Reinforcement Learning

https://au.mathworks.com/content/dam/mathworks/ebook/gated/reinforcement-learning-ebook-all-chapters.pdf Reinforcement learning: a type of machine learning in which a computer agent learns to perform a task through repeated trial-and-error interaction with a dynamic environment. The goal is to learn what to do, that is, how to map situations to actions, so as to maximize a numerical reward signal. E.g. an RL-trained computer gains ...
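
As a rough illustration of learning by trial and error to maximize reward, here is a minimal sketch of an epsilon-greedy agent on a two-armed bandit. The bandit is a deliberately simplified setting with no state, and the reward probabilities, `epsilon`, and the function name `run_bandit` are assumptions made for this example, not taken from the ebook.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Bernoulli bandit (illustrative setup)."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    value_estimates = [0.0] * n_actions  # running mean reward per action
    pull_counts = [0] * n_actions
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest estimated reward.
            action = max(range(n_actions), key=lambda a: value_estimates[a])

        # The environment returns a reward of 1 with the action's probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the running mean for the chosen action.
        pull_counts[action] += 1
        value_estimates[action] += (reward - value_estimates[action]) / pull_counts[action]
        total_reward += reward

    return value_estimates, pull_counts, total_reward

# Hypothetical environment: action 1 pays off far more often than action 0.
estimates, counts, total = run_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated values:", estimates)
print("pull counts:", counts, "best action:", best_action)
```

The agent is never told which action is better; it only sees rewards, and its estimates drift toward the true payoff rates as it keeps trying actions.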

Optimization

Rosenbrock Function \(f(x,y) = (1-x)^2 + 100(y-x^2)^2\) https://docs.scipy.org/doc/scipy/tutorial/optimize.html
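
To make the formula concrete, the sketch below evaluates the Rosenbrock function and its analytic gradient in plain Python. The helper names `rosenbrock` and `rosenbrock_grad` are chosen for this example; the partial derivatives follow directly from differentiating the formula above.

```python
def rosenbrock(x, y):
    """f(x, y) = (1 - x)^2 + 100 (y - x^2)^2"""
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def rosenbrock_grad(x, y):
    """Analytic partial derivatives of f with respect to x and y."""
    df_dx = -2 * (1 - x) - 400 * x * (y - x ** 2)
    df_dy = 200 * (y - x ** 2)
    return df_dx, df_dy

# The global minimum is at (1, 1), where both the value and the gradient vanish.
print(rosenbrock(1.0, 1.0), rosenbrock_grad(1.0, 1.0))

# Away from the minimum the function grows quickly across the curved valley y = x^2.
print(rosenbrock(-1.2, 1.0), rosenbrock_grad(-1.2, 1.0))
```

The narrow curved valley around y = x^2 is what makes this function a common test case for optimizers.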

What is Gradient? Loss Function? Model Parameters? Gradient Descent?

Gradient Descent: an optimization algorithm that minimizes the error between predicted and actual results by repeatedly updating the model parameters in the direction opposite to the gradient.
Loss Function: measures how bad a prediction is compared with the true value; the aim is to minimize it, bringing it as close to zero as possible. Several loss functions are in use; one is Mean Squared Error (MSE).
Gradient: slope; direction ...
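
The definitions above can be sketched together in a few lines: a linear model y = w*x + b (the model parameters are `w` and `b`), an MSE loss, its gradient, and a gradient descent loop. The data, learning rate, and step count are assumptions picked so that this toy example converges.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the linear model w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_gradient(w, b, xs, ys):
    """Partial derivatives of the MSE with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Start from w = b = 0 and repeatedly step against the gradient."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = mse_gradient(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

# Toy data generated from y = 2x + 1, so the ideal parameters are w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
print("w =", w, "b =", b, "loss =", mse(w, b, xs, ys))
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small makes convergence slow, so the value here is tuned to this particular dataset.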

Tensor basics

https://tensorly.org/stable/user_guide/tensor_basics.html Tensor – multi-dimensional array https://tensorly.org/stable/user_guide/tensor_decomposition.html
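
As a small illustration of "multi-dimensional array", the sketch below uses nested Python lists as a stand-in for the array types a tensor library would provide, and reads off the shape and order (number of dimensions) of each object. The `shape` helper is written for this example and assumes non-empty, rectangular nesting.

```python
def shape(tensor):
    """Return the size along each dimension of a rectangular nested list."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]  # descend one level; assumes non-empty lists
    return tuple(dims)

scalar = 5.0                                                   # order 0
vector = [1.0, 2.0, 3.0]                                       # order 1
matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]                  # order 2
tensor3 = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]    # order 3

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor3", tensor3)]:
    s = shape(value)
    print(name, "shape:", s, "order:", len(s))
```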

Building Large Language Models (LLMs)

Pre-training. Post-training. Language modeling: a language model assigns a probability to a sequence of tokens. P(the, mouse, ate, the, cheese) = 0.02. P(the, the, mouse, ate, cheese) = 0.0001 (syntactic knowledge: the ungrammatical ordering is much less likely). P(…) (semantic knowledge). Autoregressive (AR) language model: predicts the next word. Steps for the input "she likely prefers": tokenize (1 = she, 2 = likely, 3 = prefers) => pass to the black-box model => get a probability distribution over the next word => sample & ...
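
The autoregressive steps can be sketched with a toy stand-in for the black-box model. Everything here is an assumption for illustration: the whitespace tokenizer, the tiny hand-written table `NEXT_TOKEN_PROBS`, and the fact that it conditions only on the previous token (a real LLM conditions on the whole prefix), so the probabilities do not match the figures quoted above.

```python
import random

# Hypothetical toy "model": previous token -> distribution over the next token.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 1.0},
    "the": {"mouse": 0.5, "cheese": 0.5},
    "mouse": {"ate": 1.0},
    "ate": {"the": 1.0},
    "cheese": {"</s>": 1.0},
}

def tokenize(text):
    """Naive whitespace tokenizer (real tokenizers use subword units)."""
    return text.lower().split()

def sequence_probability(tokens):
    """Chain rule: multiply the probability of each token given its predecessor."""
    prob = 1.0
    prev = "<s>"
    for tok in tokens:
        prob *= NEXT_TOKEN_PROBS.get(prev, {}).get(tok, 0.0)
        prev = tok
    return prob

def sample_next(prev, rng):
    """Sample one next token from the model's distribution for this context."""
    dist = NEXT_TOKEN_PROBS[prev]
    candidates = list(dist)
    weights = [dist[t] for t in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

def generate(rng, max_tokens=10):
    """Autoregressive loop: sample a token, append it, feed it back in."""
    tokens = []
    prev = "<s>"
    for _ in range(max_tokens):
        nxt = sample_next(prev, rng)
        if nxt == "</s>":
            break
        tokens.append(nxt)
        prev = nxt
    return tokens

good = sequence_probability(tokenize("the mouse ate the cheese"))
bad = sequence_probability(tokenize("the the mouse ate cheese"))
print("grammatical:", good, "ungrammatical:", bad)
print("sampled:", " ".join(generate(random.Random(0))))
```

Even this tiny table gives the ungrammatical ordering a lower probability than the grammatical one, which is the kind of syntactic preference the sequence probabilities are meant to capture.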