Machine Learning - Computing Notes

Reinforcement Learning

November 21, 2025 by ComputingNotes

https://au.mathworks.com/content/dam/mathworks/ebook/gated/reinforcement-learning-ebook-all-chapters.pdf Reinforcement Learning: A type of machine learning in which a computer agent learns to perform a task through repeated trial and error within a dynamic environment. The goal is to learn what to do, that is, how to map situations to actions so as to maximize a numerical reward signal. E.g. RL trained computer gains ... Read More
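The trial-and-error loop described above can be sketched with a very small example. This is only an illustration, not taken from the linked ebook: the two-action "environment", the reward probabilities, the epsilon-greedy rule, and every name in the code (`run_bandit`, `reward_probs`, `epsilon`) are assumptions chosen to keep the sketch short.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy environment (all values are assumptions).

    reward_probs[a] is the hidden chance that action a pays a reward of 1.
    The agent never sees these numbers; it only sees the rewards it receives.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions       # how often each action was tried
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Trial: occasionally try a random action (exploration).
            action = rng.randrange(n_actions)
        else:
            # Otherwise take the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])
        reward = 1 if rng.random() < reward_probs[action] else 0
        counts[action] += 1
        # Incremental update of the average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, counts, total_reward

estimates, counts, total_reward = run_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated rewards:", estimates)
print("times each action was chosen:", counts)
print("action the agent prefers:", best_action)
```

The agent ends up preferring the action with the higher reward probability because that choice maximizes the reward it collects. A full RL problem also has states and delayed rewards, which this sketch leaves out.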

Optimization

July 2, 2025 by ComputingNotes

Rosenbrock Function \(f(x,y) = (1-x)^2 + 100(y-x^2)^2\) https://docs.scipy.org/doc/scipy/tutorial/optimize.html
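As a small sketch, the function above and its gradient can be written in plain Python. The gradient is derived by hand from the formula; the helper names `rosenbrock` and `rosenbrock_grad` are made up for this example and are not SciPy functions.

```python
def rosenbrock(x, y):
    """f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2"""
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def rosenbrock_grad(x, y):
    """Partial derivatives of f with respect to x and y."""
    df_dx = -2 * (1 - x) - 400 * x * (y - x ** 2)
    df_dy = 200 * (y - x ** 2)
    return df_dx, df_dy

# Both squared terms vanish at (1, 1), so that point is the global minimum.
print(rosenbrock(1, 1), rosenbrock_grad(1, 1))

# Away from the minimum the value is positive and the gradient is non-zero.
print(rosenbrock(-1.2, 1.0), rosenbrock_grad(-1.2, 1.0))
```

The minimum sits inside a long, curved, nearly flat valley, which is why this function is a common test case for optimizers.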

What is Gradient? Loss Function? Model Parameters? Gradient Descent?

June 30, 2025 by ComputingNotes

Gradient Descent: Optimization algorithm; minimizes the error between predicted and actual results by repeatedly updating the parameters in the direction opposite to the gradient.
Loss Function: measures how bad a prediction is compared to the actual true value; the aim is to minimize it, bringing it closer to zero; various loss functions are used, one of which is Mean Squared Error (MSE).
Gradient: Slope; direction ... Read More
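A minimal sketch of how these pieces fit together, assuming a one-feature linear model y = w*x + b with MSE as the loss. The data, the learning rate, the step count, and all function names are illustrative choices, not taken from the post.

```python
def mse(w, b, xs, ys):
    """Loss function: mean squared error of the predictions w*x + b."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

def mse_gradient(w, b, xs, ys):
    """Gradient: partial derivatives of the MSE with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Start from w = b = 0 and repeatedly step against the gradient."""
    w, b = 0.0, 0.0  # model parameters
    for _ in range(steps):
        dw, db = mse_gradient(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

# Toy data generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
print("w =", w, "b =", b, "loss =", mse(w, b, xs, ys))
```

The learned parameters approach w = 2 and b = 1 and the loss approaches zero. With a learning rate that is too large the updates would overshoot and diverge, so that value usually needs tuning.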

Famous Datasets

April 16, 2025 by ComputingNotes

CIFAR-10 https://www.cs.toronto.edu/~kriz/cifar.html

Tensor basics

November 27, 2024 by ComputingNotes

https://tensorly.org/stable/user_guide/tensor_basics.html Tensor – multi-dimensional array https://tensorly.org/stable/user_guide/tensor_decomposition.html
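To make "multi-dimensional array" concrete, here is a small sketch using plain nested Python lists instead of a tensor library. The `shape` helper is hypothetical and assumes non-empty, rectangular nesting; it is not part of TensorLy.

```python
def shape(tensor):
    """Return the size of each dimension of a rectangular nested list."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]  # descend one level; assumes non-empty lists
    return tuple(dims)

scalar = 5.0                                              # order 0
vector = [1.0, 2.0, 3.0]                                  # order 1
matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]             # order 2
cube = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]  # order 3

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    s = shape(t)
    print(name, "shape:", s, "order:", len(s))
```

The number of dimensions is often called the order of the tensor: a scalar has order 0, a vector order 1, a matrix order 2, and so on.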

Intro to Weights and Biases

October 30, 2024 by ComputingNotes

https://colab.research.google.com/github/wandb/examples/blob/master/colabs/intro/Intro_to_Weights_%26_Biases.ipynb

Training, Validation and Test Sets

October 28, 2024 by ComputingNotes

Training Dataset:
Validation Dataset:
Test Dataset:
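One common way to produce the three sets is to shuffle the data once and cut it into consecutive slices. The sketch below assumes a 70/15/15 split and a fixed seed; the ratios and the `split_dataset` helper are illustrative assumptions, not taken from the post.

```python
import random

def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle a copy of the samples and cut it into three disjoint parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]                # used to fit the model
    val = shuffled[n_train:n_train + n_val]   # used to tune and compare models
    test = shuffled[n_train + n_val:]         # held out for the final check
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Shuffling before slicing matters when the data is stored in some order (for example sorted by label); for time series a chronological split is usually more appropriate.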

Convolutional Neural Networks

October 10, 2024 by ComputingNotes

References: What are CNNs? Different steps:

Example code: Fine-tuning an LLM using LoRA

October 4, 2024 by ComputingNotes

Best resources:

Building Large Language Models (LLMs)

October 4, 2024 by ComputingNotes

Pre-Training. Post-Training. Language Modeling: a language model assigns a probability to a sequence of tokens, e.g. P(the, mouse, ate, the, cheese) = 0.02. Syntactic knowledge: P(the, the, mouse, ate, cheese) = 0.0001. Semantic knowledge: P(…). Autoregressive (AR) language model: predicts the next word. Steps for the input "she likely prefers": tokenize (1 = she, 2 = likely, 3 = prefers) => pass to black-box model => get probability distribution over the next word => sample & ... Read More
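The steps listed above (tokenize, pass to the model, get a distribution, sample) can be sketched as follows. The "black-box model" here is a hypothetical stand-in that returns a fixed, made-up distribution; a real language model would compute it with a neural network. The candidate words and their probabilities are assumptions for illustration only.

```python
import random

def tokenize(text, vocab):
    """Map each word to its integer ID (whitespace splitting only)."""
    return [vocab[word] for word in text.split()]

def toy_model(token_ids):
    """Hypothetical stand-in for the black-box model.

    Ignores its input and returns a fixed, made-up probability
    distribution over a few candidate next words.
    """
    return {"tea": 0.6, "coffee": 0.3, "cheese": 0.1}

def sample_next_word(distribution, rng):
    """Draw one word at random, weighted by its probability."""
    words = list(distribution)
    weights = [distribution[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]

vocab = {"she": 1, "likely": 2, "prefers": 3}  # IDs as in the notes above
token_ids = tokenize("she likely prefers", vocab)
distribution = toy_model(token_ids)
next_word = sample_next_word(distribution, random.Random(0))

print("token IDs:", token_ids)
print("distribution over next word:", distribution)
print("sampled next word:", next_word)
```

Sampling, rather than always taking the most probable word, is what lets the same prompt produce different continuations.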