Building Large Language Models (LLMs)

Pre-Training. Post-Training. Language Modeling. A language model assigns a probability to a sequence of tokens, for example P(the, mouse, ate, the, cheese) = 0.02. Syntactic knowledge: P(the, the, mouse, ate, cheese) = 0.0001. Semantic knowledge: P(…). Auto-Regressive (AR) language model: predict the next word. Steps for the input "she likely prefers": tokenize (1 - she, 2 - likely, 3 - prefers) => pass the tokens to the black-box model => get a probability distribution over the next word => sample & ...
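The autoregressive steps above (tokenize, run the model, get a distribution, sample) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary, the `toy_model` scoring rule, and the helper names are all hypothetical stand-ins, not how a real LLM or tokenizer works internally.

```python
import math
import random

# Hypothetical toy vocabulary; real tokenizers use subword units.
VOCAB = ["she", "likely", "prefers", "dogs", "cats", "tea"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB, start=1)}  # 1-she, 2-likely, 3-prefers, ...
ID_TO_TOKEN = {i: tok for tok, i in TOKEN_TO_ID.items()}


def tokenize(text):
    """Split on whitespace and map each word to its integer id."""
    return [TOKEN_TO_ID[word] for word in text.lower().split()]


def toy_model(token_ids):
    """Stand-in for the black-box model: one made-up score (logit) per vocabulary entry.

    The scores depend only on the last token; a real model uses the whole context.
    """
    last = token_ids[-1]
    return [float((last * (i + 2)) % 5) for i in range(len(VOCAB))]


def softmax(logits):
    """Turn arbitrary scores into a probability distribution."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(text, rng):
    """Tokenize, score, normalize, then sample one next word."""
    ids = tokenize(text)
    probs = softmax(toy_model(ids))
    next_id = rng.choices(list(ID_TO_TOKEN), weights=probs, k=1)[0]
    return ID_TO_TOKEN[next_id], probs


rng = random.Random(0)  # fixed seed so the sketch is repeatable
word, probs = sample_next("she likely prefers", rng)
print(word, [round(p, 3) for p in probs])
```

To generate longer text, the sampled word would be appended to the input and the same steps repeated.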

What is Transformer?

Used by: Primary Innovations: Key Terms: Visual Explanation: Example: https://aws.amazon.com/what-is/transformers-in-artificial-intelligence/ References:

Kernel Methods

Kernel methods solve non-linear problems. A kernel is a similarity measure between two data points. It implicitly maps the points into a different, typically higher-dimensional, space, e.g. Support Vector Machine, kernel K-means Clustering. These methods work in feature space, not in data space: the data is encoded into a feature space where a problem that is non-linear in the original data can be handled by a linear model. Good Resources: https://www.youtube.com/watch?v=uDAAi5aQbMU
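A small sketch of the idea, assuming a degree-2 polynomial kernel chosen purely for illustration (the post does not name a specific kernel). It checks that the kernel value equals an inner product in an explicit feature space, and that XOR-style data, which no straight line separates in the data space, is split by a linear rule in that feature space. The function names and the weight vector `w` are hypothetical.

```python
import math


def poly_kernel(x, y):
    """Degree-2 homogeneous polynomial kernel: k(x, y) = (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def feature_map(x):
    """Explicit feature map for the kernel above: 2-D data space -> 3-D feature space."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(a, b):
    """Plain inner product of two equal-length tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))


# XOR-style data: not linearly separable in the original 2-D space.
points = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [1, 1, -1, -1]

# The kernel computes the feature-space inner product without building the features.
for p in points:
    for q in points:
        assert math.isclose(poly_kernel(p, q), dot(feature_map(p), feature_map(q)), abs_tol=1e-9)

# A linear rule in feature space: the sign of w . phi(x), with a hand-picked w.
w = (0.0, 1.0, 0.0)
predictions = [1 if dot(w, feature_map(p)) > 0 else -1 for p in points]
print(predictions == labels)
```

In practice a method such as a Support Vector Machine only ever calls the kernel function, so the feature space can be very high-dimensional without being constructed explicitly.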