Fundamentals

Context Window

Quick Answer

The maximum amount of text (tokens) an LLM can process in a single request.

The context window defines the maximum combined length of input and output tokens a model can handle in a single interaction. This is a hard technical limit built into the model's architecture. Context windows vary widely, from 4K tokens in older models to 200K+ tokens in modern ones. A larger context window lets the model consider more information at once, which is crucial for long-document analysis, multi-turn conversations, and complex reasoning tasks. However, larger windows come with increased computational cost and potential latency. Managing context effectively is essential for prompt engineering.
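Because the window covers input and output together, a common practice is to budget tokens before sending a request. Below is a minimal sketch of such a check; the ~4 characters-per-token ratio is a rough rule of thumb for English text (real tokenizers give exact counts), and the function names and the 1,024-token output reserve are illustrative assumptions, not any particular API.

```python
# Rough token-budget check before sending a request to an LLM.
# Assumes ~4 characters per token, a common heuristic for English
# text; a real tokenizer would give exact counts.

def approx_tokens(text: str) -> int:
    """Estimate token count with the ~4 chars/token heuristic."""
    return max(1, len(text) // 4)

def fits_context(prompt: str, context_window: int,
                 reserve_for_output: int = 1024) -> bool:
    """True if the prompt plus reserved output room fits the window."""
    return approx_tokens(prompt) + reserve_for_output <= context_window

# Example: the same prompt against a larger and a smaller window.
prompt = "Summarize the following document: " + "lorem ipsum " * 500
print(fits_context(prompt, context_window=4096))  # fits
print(fits_context(prompt, context_window=2048))  # does not fit
```

Reserving headroom for the model's reply matters because output tokens count against the same window: a prompt that "fits" with no reserve can still force the response to be truncated.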

Last verified: 2026-04-08
