Large Language Model (LLM)

A large neural network trained on massive amounts of text to understand and generate language.

A large language model is a Transformer-based neural network trained on hundreds of billions to trillions of tokens of text. The 2020 GPT-3 paper brought the category into the mainstream; ChatGPT, Claude Sonnet, Gemini, and Llama 3 later normalized day-to-day use. Training an LLM typically proceeds through Pre-training and then Post-training, with the latter often including RLHF (reinforcement learning from human feedback). LLMs are now the core engine behind generative assistants as well as RAG, AI Agent, and Coding Agent applications.
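
To make "generate language" concrete, here is a minimal sketch of the autoregressive loop an LLM runs at inference time: predict the next token from what has been produced so far, append it, and repeat. It assumes a toy bigram count table as a stand-in for the Transformer, so it illustrates only the loop, not the model. The names train_bigram and generate and the sample corpus are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token (toy stand-in for training)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_new_tokens=5):
    """Greedy autoregressive loop: repeatedly append the most frequent next token."""
    out = [start]
    for _ in range(max_new_tokens):
        followers = counts.get(out[-1])  # .get avoids creating empty entries
        if not followers:
            break  # no known continuation for the last token
        out.append(followers.most_common(1)[0][0])
    return out


# Hypothetical whitespace-tokenized corpus; real LLMs use subword tokenizers.
corpus = "the model reads text and the model predicts the next token".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the", max_new_tokens=3)))
```

A real LLM replaces the count table with a Transformer that conditions on the whole preceding context rather than only the last token, and it usually samples from the predicted distribution instead of always taking the top choice.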