Fundamentals
Context Window
Quick Answer
The maximum amount of text (tokens) an LLM can process in a single request.
The context window defines the maximum number of tokens a model can handle in a single interaction, counting the input prompt and the generated output together. This is a hard technical limit built into the model's architecture. Context windows vary widely, from 4K tokens in older models to 200K+ tokens in modern ones. A larger context window lets the model consider more information at once, which is crucial for long-document analysis, multi-turn conversations, and complex reasoning tasks. However, larger windows come with increased computational cost and potential latency, so managing context effectively is an essential part of prompt engineering.
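Managing context in practice often means trimming conversation history so the prompt plus the expected output stays within the window. The sketch below illustrates the idea with a rough ~4-characters-per-token heuristic; the function names and the heuristic are illustrative assumptions, and real systems would use the model's own tokenizer for exact counts.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (~4 characters per token for English).
    Real systems should use the model's actual tokenizer instead."""
    return max(1, len(text) // 4)


def trim_history(messages: list[str], context_window: int = 8192,
                 reserve_for_output: int = 1024) -> list[str]:
    """Drop the oldest messages until the prompt fits the input budget,
    leaving room in the window for the model's generated output."""
    budget = context_window - reserve_for_output
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so the most recent turns are kept.
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))


history = ["old question " * 50, "older answer " * 50, "latest question"]
fits = trim_history(history, context_window=100, reserve_for_output=20)
# Only the newest message fits the 80-token input budget.
```

Dropping the oldest turns first is the simplest policy; production systems often combine it with summarizing older turns instead of discarding them outright.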
Last verified: 2026-04-08