LM, N-Grams, RNNs, LLM, Fine Tuning, LoRA, QLoRA – Part II

Introduction to Large Language Models
https://developers.google.com/machine-learning/crash-course/llm

LLMs: What is a Large Language Model?

An LLM is a predictive model that estimates the next “token” (a word, character, or subword) in a sequence. LLMs outperform older models (such as N-grams) because they use vastly more parameters and can process far more context at once.
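
As a concrete illustration, the sketch below asks a small causal language model for its next-token distribution. This is a minimal sketch, assuming the Hugging Face transformers library, PyTorch, and the public gpt2 checkpoint — none of which the article itself prescribes.

    # Minimal next-token prediction sketch (assumes: transformers, torch, gpt2).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The capital of France is", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits        # (batch, seq_len, vocab_size)

    # Probability distribution over the vocabulary for the next token.
    probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(probs, k=5)
    for p, idx in zip(top.values, top.indices):
        print(f"{tokenizer.decode(idx.item())!r}: {p:.3f}")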

Transformer

The transformer is the most successful and most widely used architecture for building LLMs, and it remains the state of the art across language-model applications.

A full transformer consists of:

  • Encoder
  • Decoder

Encoder-only architectures (for example, BERT) and decoder-only architectures (for example, GPT) also exist.
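
The sketch below contrasts the three flavours using PyTorch’s built-in transformer modules; the module choices and dimensions are illustrative assumptions, not something the article specifies.

    # Architecture flavours via torch.nn (dimensions are arbitrary).
    import torch
    import torch.nn as nn

    d_model, nhead = 64, 4
    src = torch.rand(10, 2, d_model)   # (seq_len, batch, d_model)
    tgt = torch.rand(7, 2, d_model)

    # Full transformer: encoder + decoder, e.g. for translation.
    full = nn.Transformer(d_model=d_model, nhead=nhead)
    print(full(src, tgt).shape)        # torch.Size([7, 2, 64])

    # Encoder-only stack (the shape BERT-style models use).
    layer = nn.TransformerEncoderLayer(d_model, nhead)
    encoder = nn.TransformerEncoder(layer, num_layers=2)
    print(encoder(src).shape)          # torch.Size([10, 2, 64])

    # Decoder-only models (GPT-style) reuse the same attention blocks
    # with a causal mask rather than a separate decoder class.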

Self-Attention

Self-attention allows the model to weigh the relationship between every pair of tokens in a sequence, no matter how far apart they are.
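
In the standard scaled dot-product formulation, queries, keys, and values are all projections of the same input — which is what makes the attention “self”. The following is a minimal from-scratch sketch, assuming PyTorch; it is not taken from the article.

    # Scaled dot-product self-attention from scratch (assumes torch).
    import torch

    def self_attention(x, w_q, w_k, w_v):
        q, k, v = x @ w_q, x @ w_k, x @ w_v        # project the same input
        scores = q @ k.T / k.shape[-1] ** 0.5      # pairwise similarity
        weights = torch.softmax(scores, dim=-1)    # each row sums to 1
        return weights @ v                         # weighted mix of values

    seq_len, d_model = 5, 8
    x = torch.rand(seq_len, d_model)
    w_q, w_k, w_v = (torch.rand(d_model, d_model) for _ in range(3))
    print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([5, 8])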

Multi-Head Attention

LLMs stack multiple “heads” of attention. Each head focuses on a different aspect of language: one might track grammar, while another tracks pronoun references. By stacking these layers, the model builds a complex, abstract understanding of the text.
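
PyTorch ships this mechanism as nn.MultiheadAttention. The sketch below — an illustrative assumption, not part of the crash course — runs four heads over a toy sequence, each head attending over its own 16-dimensional slice of the embedding.

    # Multi-head self-attention via PyTorch's built-in module.
    import torch
    import torch.nn as nn

    d_model, num_heads, seq_len = 64, 4, 10     # each head sees 64 / 4 = 16 dims
    x = torch.rand(seq_len, 1, d_model)         # (seq_len, batch, d_model)

    mha = nn.MultiheadAttention(embed_dim=d_model, num_heads=num_heads)
    out, weights = mha(x, x, x)                 # self-attention: Q = K = V = x

    print(out.shape)      # torch.Size([10, 1, 64]), same shape as the input
    print(weights.shape)  # torch.Size([1, 10, 10]), averaged over the heads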

How LLMs Generate Text

Functionally, LLMs are sophisticated autocomplete engines. When you ask a question, the model views it as the first part of a sequence and calculates the most probable “completion” (the answer), token by token.
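
That loop can be made explicit. The sketch below greedily appends the single most probable token eight times, reusing the gpt2 checkpoint assumed earlier; real systems typically sample from the distribution rather than always taking the argmax.

    # Greedy token-by-token generation (assumes: transformers, torch, gpt2).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids
    for _ in range(8):                          # extend by 8 tokens
        with torch.no_grad():
            logits = model(ids).logits
        next_id = logits[0, -1].argmax()        # most probable next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)

    print(tokenizer.decode(ids[0]))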
