Introduction to Large Language Models
https://developers.google.com/machine-learning/crash-course/llm
LLMs: What is a Large Language Model?
An LLM is a predictive model that estimates the next “token” (a word, character, or subword) in a sequence. LLMs outperform older models (such as N-grams) because they use vastly more parameters and can process significantly more context at once.
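To make this concrete, here is a toy sketch (the vocabulary and logits are invented, not from the course): a model emits one score per vocabulary token, and softmax turns those scores into a probability distribution over the next token.

```python
import numpy as np

# Toy sketch: the vocabulary and logits here are invented, not real model
# output. An LLM assigns one score (logit) to every token in its vocabulary;
# softmax converts those scores into next-token probabilities.
vocab = ["cat", "dog", "sat", "the"]
logits = np.array([0.2, 0.1, 2.5, 0.4])  # hypothetical scores for the next token

probs = np.exp(logits - logits.max())
probs /= probs.sum()  # softmax: probabilities now sum to 1

print(dict(zip(vocab, probs.round(3))))
print("most probable next token:", vocab[int(np.argmax(probs))])  # "sat"
```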

Transformer
The transformer is the most successful and widely used architecture for building LLMs, and it is the state-of-the-art architecture across language-model applications.
A full transformer consists of:
- Encoder
- Decoder

Encoder-only and decoder-only architectures also exist, as sketched below.
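As a rough illustration, this sketch builds each variant from PyTorch's stock transformer layers; the hyperparameters are arbitrary placeholders, and real LLMs add token embeddings, positional information, and an output head on top of these stacks.

```python
import torch.nn as nn

# Hyperparameters below are arbitrary placeholders, not from the course.
d_model, nhead, num_layers = 512, 8, 6

# Full transformer: an encoder paired with a decoder.
full = nn.Transformer(d_model=d_model, nhead=nhead,
                      num_encoder_layers=num_layers,
                      num_decoder_layers=num_layers)

# Encoder-only stack (BERT-style models keep just this half).
encoder_only = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead),
    num_layers=num_layers)

# Decoder-only stack (GPT-style models keep just this half).
decoder_only = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead),
    num_layers=num_layers)
```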
Self-Attention
Self-attention allows the model to understand the relationship between words in a sentence, regardless of distance.
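Here is a minimal NumPy sketch of scaled dot-product self-attention, with randomly initialized projection matrices standing in for learned weights: every token produces a query, key, and value, and each output position is a weighted mix of the values from all positions, near or far.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of embeddings x."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v          # queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])      # relevance of every token to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ v                           # each output mixes all positions' values

rng = np.random.default_rng(0)
seq_len, d = 5, 8                                # hypothetical sizes
x = rng.normal(size=(seq_len, d))                # stand-in token embeddings
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)    # (5, 8): one attended vector per token
```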

Multi-Head Attention
LLMs stack multiple “heads” of attention. Each head focuses on a different aspect of language: one might track grammar while another tracks pronoun references. By stacking these layers, the model builds a complex, abstract understanding of the text.
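A minimal sketch of the idea, again with random matrices standing in for learned weights: each head runs its own attention independently, and the heads' outputs are concatenated (real transformers then apply a learned output projection, omitted here).

```python
import numpy as np

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(x, heads):
    """Run several independent attention heads and concatenate their outputs."""
    outputs = []
    for w_q, w_k, w_v in heads:                      # each head has its own projections
        q, k, v = x @ w_q, x @ w_k, x @ w_v
        a = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # this head's attention pattern
        outputs.append(a @ v)
    return np.concatenate(outputs, axis=-1)          # heads concatenated side by side

rng = np.random.default_rng(0)
seq_len, d_model, n_heads = 5, 16, 4                 # hypothetical sizes
d_head = d_model // n_heads
x = rng.normal(size=(seq_len, d_model))
heads = [tuple(rng.normal(size=(d_model, d_head)) for _ in range(3))
         for _ in range(n_heads)]
print(multi_head_attention(x, heads).shape)          # (5, 16)
```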

How LLMs Generate Text
Functionally, LLMs are sophisticated autocomplete engines. When you ask a question, the model views it as the first part of a sequence and calculates the most probable “completion” (the answer), token by token.
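A toy illustration of this loop, with a hand-written bigram table standing in for a trained model (a real LLM conditions on the entire sequence, not just the last token): the prompt seeds the sequence, and the model keeps appending the most probable next token until it emits an end marker.

```python
# Invented bigram table standing in for a real model's next-token distribution.
next_token_probs = {
    "the":  {"sky": 0.6, "cat": 0.4},
    "sky":  {"is": 0.9, "<end>": 0.1},
    "is":   {"blue": 0.7, "grey": 0.3},
    "blue": {"<end>": 1.0},
}

tokens = ["the", "sky"]                     # the user's prompt
while tokens[-1] != "<end>":
    dist = next_token_probs[tokens[-1]]     # a real LLM conditions on the whole sequence
    tokens.append(max(dist, key=dist.get))  # greedy decoding: take the most probable token

print(" ".join(tokens[:-1]))  # "the sky is blue"
```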