Architecture
Context Window
The maximum amount of text (measured in tokens) an AI model can process and remember at any single moment during an interaction.
The context window is the "working memory" of a large language model. It sets the maximum number of tokens (whole words or sub-word pieces) that the model can take into account in a single request, typically counting both the prompt it reads and the output it generates.
If a conversation or document exceeds the model's context window, the excess cannot be processed. Depending on the application, the request is either rejected or the oldest content is dropped, so the model effectively "forgets" the earliest information provided. In recent years, context windows have expanded dramatically, growing from about 4,000 tokens to over 2,000,000 tokens, allowing models to analyze entire books, codebases, or hours of video transcripts in a single prompt.
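The "forgetting" described above can be illustrated with a minimal Python sketch. It assumes one common strategy, dropping the oldest messages first, and it approximates tokens by splitting on whitespace; real models use sub-word tokenizers, so actual counts will differ. The names fit_to_window and count_tokens are hypothetical and chosen only for this illustration.

```python
def fit_to_window(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined token count fits in max_tokens.

    count_tokens is a stand-in (whitespace split), not a real model tokenizer.
    """
    kept = []
    used = 0
    # Walk from newest to oldest so the earliest messages are the ones dropped.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["my name is Ada", "what is a token", "and what is a context window"]
print(fit_to_window(history, max_tokens=12))
```

With a budget of 12 whitespace-delimited tokens, the first message no longer fits and is left out, which is why a model in this situation could not recall the name given at the start of the conversation.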