LLM Context Window Comparison 2026
Compare context window sizes across 25 large language models. Larger context windows let you process longer documents and maintain richer conversation histories.
Data verified Apr 3, 2026
Context Window by Model
[Chart: context window size by model, grouped by provider (OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, xAI, Cohere, Microsoft, Alibaba). See the ranked table below for the full data.]
All Models — Ranked by Context Window
| Model | Provider | Context Window (tokens) | Max Output (tokens) | Input ($/M tokens) | Output ($/M tokens) |
|---|---|---|---|---|---|
| Llama 4 Scout | Meta | 10M | 32,768 | $0.100 | $0.300 |
| Gemini 2.5 Pro | Google | 1M | 65,536 | $1.25 | $10.00 |
| Llama 4 Maverick | Meta | 1M | 32,768 | $0.200 | $0.600 |
| Gemini 2.0 Flash | Google | 1M | 8,192 | $0.100 | $0.400 |
| Gemini 2.0 Flash Lite | Google | 1M | 8,192 | $0.075 | $0.300 |
| GPT-4.1 | OpenAI | 1M | 32,768 | $2.00 | $8.00 |
| GPT-4.1 Mini | OpenAI | 1M | 32,768 | $0.400 | $1.60 |
| GPT-4.1 Nano | OpenAI | 1M | 32,768 | $0.100 | $0.400 |
| o4-mini | OpenAI | 200K | 100,000 | $1.10 | $4.40 |
| Claude Opus 4 | Anthropic | 200K | 32,000 | $15.00 | $75.00 |
| o3-mini | OpenAI | 200K | 100,000 | $1.10 | $4.40 |
| Claude Sonnet 4 | Anthropic | 200K | 64,000 | $3.00 | $15.00 |
| Claude Haiku 4 | Anthropic | 200K | 8,192 | $0.800 | $4.00 |
| DeepSeek R1 | DeepSeek | 128K | 8,192 | $0.550 | $2.19 |
| Grok 3 | xAI | 128K | 16,384 | $3.00 | $15.00 |
| DeepSeek V3 | DeepSeek | 128K | 8,192 | $0.270 | $1.10 |
| GPT-4o | OpenAI | 128K | 16,384 | $2.50 | $10.00 |
| Qwen 2.5 Max | Alibaba | 128K | 8,192 | $0.160 | $0.640 |
| Mistral Large | Mistral | 128K | 8,192 | $2.00 | $6.00 |
| GPT-4o Mini | OpenAI | 128K | 16,384 | $0.150 | $0.600 |
| Grok 3 Mini | xAI | 128K | 16,384 | $0.300 | $0.500 |
| Command R+ | Cohere | 128K | 4,096 | $2.50 | $10.00 |
| Mistral Small | Mistral | 128K | 8,192 | $0.100 | $0.300 |
| Command R | Cohere | 128K | 4,096 | $0.150 | $0.600 |
| Phi-4 | Microsoft | 16K | 4,096 | $0.070 | $0.140 |
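The per-million-token prices in the table translate directly into per-request cost. A minimal sketch of the arithmetic, with a few models and prices hardcoded from the table above (in practice, exact token counts come from each provider's API usage report):

```python
# Estimate request cost from per-million-token prices.
# Prices are USD per 1M tokens, copied from the comparison table.

PRICING = {  # model: (input $/M tokens, output $/M tokens)
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
    "Gemini 2.0 Flash": (0.100, 0.400),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    in_price, out_price = PRICING[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: summarize a 50K-token document into a 1K-token answer.
print(f"${request_cost('GPT-4o', 50_000, 1_000):.4f}")  # prints "$0.1350"
```

Note how input tokens dominate the bill for long-document workloads, which is why the cheaper input rates on models like Gemini 2.0 Flash Lite matter more than their output rates.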
Frequently Asked Questions
- What is a context window?
- A context window is the maximum number of tokens (words and word pieces) that a language model can process in a single request. It includes both the input prompt and the generated output. Larger context windows allow you to send longer documents, maintain longer conversation histories, and process more data in a single API call.
- Which LLM has the largest context window?
- As of 2026, Llama 4 Scout leads with a 10 million token context window. A cluster of roughly 1M-token models follows: Gemini 2.5 Pro, Gemini 2.0 Flash and Flash Lite, Llama 4 Maverick, and the GPT-4.1 family. Below that, Claude Opus 4, Claude Sonnet 4, o3-mini, and o4-mini offer 200K tokens, while GPT-4o and most remaining models provide 128K.
- Does context window size affect price?
- Context window size itself doesn't directly affect per-token pricing, but larger context windows mean you can send more tokens per request, which increases total cost. Some providers offer cached input pricing at a discount for repeated content within the context window. Models with very large context windows (like Gemini) may also have different rate limits.
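Since the context window covers both the prompt and the generated output, a request only succeeds if the two together fit. A minimal sketch of that budget check, using the common rough heuristic of ~4 characters per token for English text (a simplification; real token counts vary by model and tokenizer) and window sizes taken from the table above:

```python
# Check that a prompt plus the reserved output budget fits a model's
# context window. The 4-chars-per-token ratio is a rough heuristic for
# English text, not a real tokenizer.

CONTEXT_WINDOWS = {  # tokens, from the comparison table
    "GPT-4o": 128_000,
    "Claude Sonnet 4": 200_000,
    "Llama 4 Scout": 10_485_760,  # 10M
}

def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits(model: str, prompt: str, max_output_tokens: int) -> bool:
    """True if the prompt plus reserved output tokens fit the window."""
    return estimate_tokens(prompt) + max_output_tokens <= CONTEXT_WINDOWS[model]

doc = "x" * 600_000  # ~150K estimated tokens
print(fits("GPT-4o", doc, 4_096))          # prints "False" (exceeds 128K)
print(fits("Claude Sonnet 4", doc, 4_096)) # prints "True" (fits in 200K)
```

The same document can therefore be a single request on one model and require chunking on another, which is often a bigger practical difference than the per-token price.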