What is a Transformer?

Used by:

  1. OpenAI’s popular ChatGPT
  2. BERT Model (Bidirectional Encoder Representations from Transformers)

Primary Innovations:

  1. Positional Encoding
  2. Self-Attention: each token weighs every other token in the sequence and pays more attention to the relevant ones
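
The two ideas above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: it uses the sinusoidal positional encoding scheme (one common choice; the notes above do not say which scheme is meant), and it sets queries, keys, and values equal to the input, whereas a real Transformer uses learned projection matrices and multiple attention heads. The function names and the toy embeddings are made up for this example.

```python
import math

def positional_encoding(num_positions, d_model):
    """Sinusoidal positional encoding: one d_model-sized vector per position."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency; even uses sin, odd uses cos.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (assumption: no learned weights)."""
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all token vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, x)) for c in range(d)])
    return outputs, weights

# Hypothetical toy embeddings for a 3-token sequence (d_model = 4).
embeddings = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
pe = positional_encoding(len(embeddings), 4)
# Add position information to each embedding before attention.
inputs = [[e + p for e, p in zip(e_row, p_row)] for e_row, p_row in zip(embeddings, pe)]
outputs, weights = self_attention(inputs)
```

Each row of `weights` shows how much one token attends to every token in the sequence, and `outputs` holds the resulting context-aware vectors. Without the positional encoding step, attention alone would treat the sequence as an unordered set.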

Key Terms:

  1. Tokens: Text is converted into numerical representations called tokens; each token is then contextualized against the other tokens within the scope of the context window https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)
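
To make the idea of tokens concrete, here is a toy sketch that maps words to integer IDs and truncates the result to a context window. It is an assumption-heavy simplification: real models typically use subword tokenizers rather than whitespace splitting, and the names `build_vocab`, `encode`, and `context_window` are hypothetical, chosen only for this example.

```python
def build_vocab(texts):
    """Assign an integer ID to every distinct lowercase word; 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(text, vocab, context_window=8):
    """Convert text to token IDs and keep only what fits in the context window."""
    ids = [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]
    return ids[:context_window]

vocab = build_vocab(["the cat sat on the mat"])
token_ids = encode("the cat sat on the hat", vocab)
print(token_ids)  # [1, 2, 3, 4, 1, 0]  ("hat" is not in the vocabulary, so it maps to <unk>)
```

The model never sees the raw text, only these IDs (and the embedding vectors looked up from them), and it can only relate tokens that fall inside the same context window.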

Visual Explanation:

  1. Attention in transformers, visually explained | Chapter 6, Deep Learning https://www.youtube.com/watch?v=eMlx5fFNoYc

Example:
https://aws.amazon.com/what-is/transformers-in-artificial-intelligence/

References:

  1. What is a transformer model? https://www.ibm.com/topics/transformer-model
  2. https://www.datacamp.com/tutorial/how-transformers-work
  3. https://towardsdatascience.com/transformers-141e32e69591
  4. https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/#.XIWlzBNKjOR
  5. https://jalammar.github.io/illustrated-transformer/
  6. https://towardsdatascience.com/openai-gpt-2-understanding-language-generation-through-visualization-8252f683b2f8
