LLM Token Pricing Explained: What You're Actually Paying For
Quick answer: Tokens are the unit of LLM pricing. Roughly 1 token ≈ 0.75 words in English. Input tokens (your prompt) and output tokens (the model's response) are priced separately — output tokens typically cost 3-5× as much as input tokens. A typical query costs anywhere from about $0.00002 (a short classification on a budget model) to about $0.005 (a chatbot turn on a frontier model).
What is a token?
A token is the unit that language models process text in. Tokens are not the same as words, characters, or syllables — they're subword units determined by the model's tokenizer.
Rough approximations for English text:
- 1 token ≈ 4 characters
- 1 token ≈ 0.75 words
- 1,000 tokens ≈ 750 words ≈ 1.5 pages of text
- 1 million tokens ≈ 750,000 words ≈ a 1,500-page novel
For code, expect roughly 1 token per 3-4 characters — code produces more tokens per visible character than prose because of whitespace, symbols, and identifiers. For non-English text, languages with larger character sets, such as Chinese, typically use more tokens per word than English.
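The rules of thumb above can be folded into a quick back-of-the-envelope estimator. This is a rough sketch built on the ~4-characters-per-token heuristic, not a real tokenizer — exact counts always depend on the specific model's tokenizer:

```python
def estimate_tokens(text: str, is_code: bool = False) -> int:
    """Rough token estimate: ~4 chars/token for English prose,
    ~3.5 chars/token for code, which tokenizes less efficiently."""
    chars_per_token = 3.5 if is_code else 4.0
    return max(1, round(len(text) / chars_per_token))

# ~750 words of prose is about 4,000 characters, i.e. roughly 1,000 tokens.
print(estimate_tokens("x" * 4000))  # → 1000
```

For billing-accurate counts, use the model vendor's own tokenizer or token-counting endpoint rather than a character heuristic.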
You can count tokens using our AI Token Counter tool — it shows estimated token counts per model.
Input tokens vs output tokens
All major LLM APIs charge separately for input tokens (your prompt, system prompt, conversation history) and output tokens (the model's generated response).
Output tokens cost more because:
- Generation is computationally more expensive than processing input
- Output is sequential (each token requires a forward pass), while input can be processed in parallel
Typical output/input price ratio by provider:
- OpenAI GPT-4o: 3× (output costs 3× as much per token)
- Anthropic Claude Sonnet 4: 5×
- Google Gemini 2.5 Pro: ~4×
- Smaller/cheaper models: typically 4×
Real-world cost examples
Example 1: Customer support chatbot
- System prompt: 500 tokens
- User message: 50 tokens
- Response: 200 tokens
- Total: 750 tokens per turn (550 input, 200 output)
- At Claude Sonnet 4 pricing ($3/1M input, $15/1M output): (550 × $3 + 200 × $15) / 1,000,000 = $0.00165 + $0.003 = $0.00465 per conversation turn
- At 100,000 turns/month: $465/month
Example 2: Document summarization
- Document context: 8,000 tokens
- Instruction: 100 tokens
- Summary output: 500 tokens
- Total: 8,600 tokens (8,100 input, 500 output)
- At GPT-4.1 pricing: (8,100 × $2 + 500 × $8) / 1,000,000 = $0.0162 + $0.004 = $0.0202 per document
- At 10,000 documents/month: $202/month
Example 3: High-volume classification
- Prompt: 200 tokens
- Output: 5 tokens ("positive", "negative", etc.)
- Total: 205 tokens
- At GPT-4.1 Nano pricing: (200 × $0.10 + 5 × $0.40) / 1,000,000 = $0.000022 per request
- At 10 million requests/month: $220/month
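All three examples apply the same arithmetic, which a small helper captures. A minimal sketch — prices are in dollars per million tokens, as quoted throughout this article:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost of one API request; prices are $ per million tokens."""
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000

# Example 1: chatbot turn at Claude Sonnet 4 pricing ($3 in, $15 out)
print(request_cost(550, 200, 3.00, 15.00))   # ≈ $0.00465

# Example 3: classification at GPT-4.1 Nano pricing ($0.10 in, $0.40 out)
print(request_cost(200, 5, 0.10, 0.40))      # ≈ $0.000022
```

Multiply the per-request figure by your monthly request volume to get a first-pass monthly estimate.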
Context window and pricing
The context window is the maximum total number of tokens (input + output) a model can process in a single request. A 200K context window means you can pass up to 200,000 tokens per call.
Important: context window size doesn't add cost by itself. You pay for the tokens you actually send, not the maximum you could send. A 1,000-token prompt costs the same whether the model has a 4K or 200K context window.
However, larger requests do cost more, because you're sending more input tokens: passing a 100,000-token document to a model costs 100× as much in input charges as a 1,000-token prompt.
How to calculate your monthly LLM bill
Formula:
Monthly cost = (Monthly input tokens × input price per million / 1,000,000)
+ (Monthly output tokens × output price per million / 1,000,000)
To estimate your monthly tokens:
Monthly tokens = (average tokens per request) × (requests per month)
Input tokens = monthly tokens × input ratio (typically 0.6-0.8)
Output tokens = monthly tokens × output ratio (typically 0.2-0.4)
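Putting the formula and the typical ratios together, here's a hypothetical monthly-bill estimator. The default 70/30 input/output split is an illustrative assumption drawn from the ranges above — replace it with your measured traffic mix:

```python
def monthly_cost(requests_per_month: int, avg_tokens_per_request: int,
                 input_price: float, output_price: float,
                 input_ratio: float = 0.7) -> float:
    """Estimated monthly bill in dollars; prices are $ per million tokens."""
    total_tokens = requests_per_month * avg_tokens_per_request
    input_tokens = total_tokens * input_ratio
    output_tokens = total_tokens * (1 - input_ratio)
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000

# 100,000 requests/month at 750 tokens each, Claude Sonnet 4 pricing:
# 75M tokens → 52.5M input × $3 + 22.5M output × $15 = $495/month
print(round(monthly_cost(100_000, 750, 3.00, 15.00)))  # → 495
```

Chat-heavy workloads skew toward input (system prompts and history dominate); generation-heavy workloads skew toward output, so tune `input_ratio` per use case.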
Use the LLMversus cost calculator to run this calculation across all major models simultaneously, or the token counter tool to measure your specific prompt's token cost before committing to a model.
Pricing trends 2023-2026
Token prices have fallen dramatically:
- GPT-4 at launch (2023): $30/1M input, $60/1M output
- GPT-4o (2024): $5/1M input, $15/1M output
- GPT-4.1 (2026): $2/1M input, $8/1M output
- GPT-4.1 Nano (2026): $0.10/1M input, $0.40/1M output
Frontier model pricing has dropped by more than 90% in three years. The same quality level that cost $30/1M in 2023 now costs $2-3/1M. Budget accordingly — your LLM costs should fall year over year even as your usage grows.