LLM Context Window Comparison 2026

Compare context window sizes across 25 large language models. Larger context windows let you process longer documents and maintain richer conversation histories.

Data verified Apr 3, 2026

Context Window by Model

Providers covered: OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, xAI, Cohere, Microsoft, Alibaba.
All Models — Ranked by Context Window

| Model | Provider | Context Window (tokens) | Max Output (tokens) | Input $/M | Output $/M |
|---|---|---|---|---|---|
| Llama 4 Scout | Meta | 10,485,760 | 32,768 | $0.10 | $0.30 |
| Gemini 2.5 Pro | Google | 1,048,576 | 65,536 | $1.25 | $10.00 |
| Llama 4 Maverick | Meta | 1,048,576 | 32,768 | $0.20 | $0.60 |
| Gemini 2.0 Flash | Google | 1,048,576 | 8,192 | $0.10 | $0.40 |
| Gemini 2.0 Flash Lite | Google | 1,048,576 | 8,192 | $0.075 | $0.30 |
| GPT-4.1 | OpenAI | 1,047,576 | 32,768 | $2.00 | $8.00 |
| GPT-4.1 Mini | OpenAI | 1,047,576 | 32,768 | $0.40 | $1.60 |
| GPT-4.1 Nano | OpenAI | 1,047,576 | 32,768 | $0.10 | $0.40 |
| o4-mini | OpenAI | 200,000 | 100,000 | $1.10 | $4.40 |
| Claude Opus 4 | Anthropic | 200,000 | 32,000 | $15.00 | $75.00 |
| o3-mini | OpenAI | 200,000 | 100,000 | $1.10 | $4.40 |
| Claude Sonnet 4 | Anthropic | 200,000 | 64,000 | $3.00 | $15.00 |
| Claude Haiku 4 | Anthropic | 200,000 | 8,192 | $0.80 | $4.00 |
| DeepSeek R1 | DeepSeek | 128,000 | 8,192 | $0.55 | $2.19 |
| Grok 3 | xAI | 128,000 | 16,384 | $3.00 | $15.00 |
| DeepSeek V3 | DeepSeek | 128,000 | 8,192 | $0.27 | $1.10 |
| GPT-4o | OpenAI | 128,000 | 16,384 | $2.50 | $10.00 |
| Qwen 2.5 Max | Alibaba | 128,000 | 8,192 | $0.16 | $0.64 |
| Mistral Large | Mistral | 128,000 | 8,192 | $2.00 | $6.00 |
| GPT-4o Mini | OpenAI | 128,000 | 16,384 | $0.15 | $0.60 |
| Grok 3 Mini | xAI | 128,000 | 16,384 | $0.30 | $0.50 |
| Command R+ | Cohere | 128,000 | 4,096 | $2.50 | $10.00 |
| Mistral Small | Mistral | 128,000 | 8,192 | $0.10 | $0.30 |
| Command R | Cohere | 128,000 | 4,096 | $0.15 | $0.60 |
| Phi-4 | Microsoft | 16,384 | 4,096 | $0.07 | $0.14 |
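The ranking above can be reproduced directly from the raw token counts. A minimal Python sketch using a hand-transcribed sample of the table (the "M" figures in the source are exact token counts, e.g. 1,048,576 = 2^20):

```python
# Sample of the comparison table: (model, context window in tokens).
# Values transcribed from the table above; this is a sketch, not an API.
MODELS = [
    ("Claude Opus 4", 200_000),
    ("Gemini 2.5 Pro", 1_048_576),
    ("GPT-4.1", 1_047_576),
    ("GPT-4o", 128_000),
    ("Llama 4 Scout", 10_485_760),
    ("Phi-4", 16_384),
]

# Rank models largest context window first, as in the table.
ranked = sorted(MODELS, key=lambda m: m[1], reverse=True)

for model, window in ranked:
    print(f"{model}: {window:,} tokens")
```

Note that sorting by the raw counts (rather than rounded labels like "1M") preserves the small gap between Gemini's 1,048,576-token window and GPT-4.1's 1,047,576-token window.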

Frequently Asked Questions

What is a context window?
A context window is the maximum number of tokens (words and word pieces) that a language model can process in a single request. It includes both the input prompt and the generated output. Larger context windows allow you to send longer documents, maintain longer conversation histories, and process more data in a single API call.
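Because the window covers both input and output, a request must budget for both. A minimal sketch of that check, using the rough ~4-characters-per-token heuristic for English text (an assumption; a real tokenizer such as tiktoken gives exact counts):

```python
def fits_in_context(prompt: str, max_output_tokens: int,
                    context_window: int, chars_per_token: float = 4.0) -> bool:
    """Rough check that prompt plus planned output fit in a model's
    context window. Token count is estimated from character length
    using a ~4 chars/token heuristic for English text."""
    est_prompt_tokens = len(prompt) / chars_per_token
    return est_prompt_tokens + max_output_tokens <= context_window

# A 400,000-character document (~100K estimated tokens) plus 16,384
# output tokens fits in GPT-4o's 128K window, but the same document
# does not fit in Phi-4's 16,384-token window.
doc = "x" * 400_000
fits_gpt4o = fits_in_context(doc, 16_384, 128_000)   # True
fits_phi4 = fits_in_context(doc, 4_096, 16_384)      # False
```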
Which LLM has the largest context window?
As of 2026, Llama 4 Scout leads with a 10 million token context window. Gemini 2.5 Pro, Llama 4 Maverick, the Gemini 2.0 Flash models, and the GPT-4.1 family follow at roughly 1 million tokens each. Anthropic's Claude Opus 4 and Claude Sonnet 4, along with OpenAI's o3-mini and o4-mini, offer 200K tokens, while GPT-4o provides 128K tokens.
Does context window size affect price?
Context window size itself doesn't directly affect per-token pricing, but larger context windows mean you can send more tokens per request, which increases total cost. Some providers offer cached input pricing at a discount for repeated content within the context window. Models with very large context windows (like Gemini) may also have different rate limits.
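The cost arithmetic is simple: tokens in each direction times the per-million rate. A sketch using GPT-4o's rates from the table above ($2.50/M input, $10.00/M output; cached-input discounts ignored for simplicity):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Total USD cost of one request, given per-million-token rates."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000

# Filling GPT-4o's full 128K window and generating its 16,384-token
# maximum output: (128,000 * $2.50 + 16,384 * $10.00) / 1,000,000
cost = request_cost(128_000, 16_384, 2.50, 10.00)
print(f"${cost:.4f}")  # $0.4838 per request
```

This is why large windows raise total spend even at unchanged per-token rates: a single maxed-out 1M-token request costs roughly eight times a maxed-out 128K one at the same input rate.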