Tags: tokens · llm-pricing · input-tokens · output-tokens · cost-calculation

LLM Token Pricing Explained: What You're Actually Paying For


Quick answer: Tokens are the unit of LLM pricing. Roughly 1 token ≈ 0.75 words in English. Input tokens (your prompt) and output tokens (the model's response) are priced separately — output tokens typically cost 3-5× more than input tokens. A typical request costs roughly $0.00002-$0.005 depending on the model and prompt size.


What is a token?

A token is the unit that language models process text in. Tokens are not the same as words, characters, or syllables — they're subword units determined by the model's tokenizer.

Rough approximations for English text:

  • 1 token ≈ 4 characters
  • 1 token ≈ 0.75 words
  • 1,000 tokens ≈ 750 words ≈ 1.5 pages of text
  • 1 million tokens ≈ 750,000 words ≈ a 1,500-page novel

For code, expect roughly 1 token per 3-4 characters (more tokens per visible character because of whitespace and symbols). Non-English text generally tokenizes less efficiently: languages like Chinese typically use more tokens for the same amount of content than English does.

You can count tokens using our AI Token Counter tool — it shows estimated token counts per model.
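The rules of thumb above are easy to encode. Here is a minimal character-based estimator — a rough sketch, not a real tokenizer; exact counts require the model's own tokenizer (e.g. OpenAI's tiktoken library):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate for English prose: ~4 characters per token.

    For code, a smaller chars_per_token (around 3.5) is closer; exact
    counts always require the model's actual tokenizer.
    """
    return max(1, round(len(text) / chars_per_token))

# ~750 short words of English prose
sample = "word " * 750              # 3,750 characters
print(estimate_tokens(sample))      # 938 (close to the 1,000-tokens-per-750-words rule)
```

This is only useful for ballpark budgeting; per-model counts can differ by 10-20% from any character heuristic.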


Input tokens vs output tokens

All major LLM APIs charge separately for input tokens (your prompt, system prompt, conversation history) and output tokens (the model's generated response).

Output tokens cost more because:

  1. Generation is computationally more expensive than processing input
  2. Output is sequential (each token requires a forward pass), while input can be processed in parallel

Typical output/input price ratio by provider:

  • OpenAI GPT-4o: 4× (output costs 4× more per token)
  • Anthropic Claude Sonnet 4: 5×
  • Google Gemini 2.5 Pro: ~4×
  • Smaller/cheaper models: typically 4×


Real-world cost examples

Example 1: Customer support chatbot

  • System prompt: 500 tokens
  • User message: 50 tokens
  • Response: 200 tokens
  • Total: 750 tokens per turn (550 input, 200 output)
  • At Claude Sonnet 4 pricing: (550 × $3 + 200 × $15) / 1,000,000 = $0.00165 + $0.003 = $0.00465 per conversation turn
  • At 100,000 turns/month: $465/month

Example 2: Document summarization

  • Document context: 8,000 tokens
  • Instruction: 100 tokens
  • Summary output: 500 tokens
  • Total: 8,600 tokens (8,100 input, 500 output)
  • At GPT-4.1 pricing: (8,100 × $2 + 500 × $8) / 1,000,000 = $0.0162 + $0.004 = $0.0202 per document
  • At 10,000 documents/month: $202/month

Example 3: High-volume classification

  • Prompt: 200 tokens
  • Output: 5 tokens ("positive", "negative", etc.)
  • Total: 205 tokens
  • At GPT-4.1 Nano pricing: (200 × $0.10 + 5 × $0.40) / 1,000,000 = $0.00002 + $0.000002 = $0.000022 per request
  • At 10 million requests/month: $220/month
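All three examples run the same arithmetic, which can be wrapped in a small helper (the function name is ours; prices are USD per million tokens):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float,
                     output_price_per_m: float) -> float:
    """Cost of a single request, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example 2: document summarization at GPT-4.1 pricing ($2 in / $8 out)
per_doc = request_cost_usd(8_100, 500, 2.00, 8.00)
print(per_doc)                 # 0.0202
print(per_doc * 10_000)        # ≈ $202/month at 10,000 documents
```

Plugging in any provider's per-million rates gives a like-for-like comparison before you commit to a model.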


Context window and pricing

Context window is the maximum total tokens (input + output) a model can process in a single request. A 200K context window means you can pass up to 200,000 tokens per call.

Important: context window size doesn't add cost by itself. You pay for the tokens you actually send, not the maximum you could send. A 1,000-token prompt costs the same whether the model has a 4K or 200K context window.

However, larger context requests cost more because you're sending more input tokens. Passing a 100,000-token document to a model costs 100,000× the price of passing a 1-token prompt.


How to calculate your monthly LLM bill

Formula:

Monthly cost = (Monthly input tokens × input price per million / 1,000,000)
             + (Monthly output tokens × output price per million / 1,000,000)

To estimate your monthly tokens:

Monthly tokens = (average tokens per request) × (requests per month)
Input tokens = monthly tokens × input ratio (typically 0.6-0.8)
Output tokens = monthly tokens × output ratio (typically 0.2-0.4)
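The estimate above can be sketched as a function. The 0.7 default input ratio is an assumption for illustration, not a universal constant — measure your own traffic:

```python
def monthly_cost_usd(requests_per_month: int,
                     avg_tokens_per_request: float,
                     input_price_per_m: float,
                     output_price_per_m: float,
                     input_ratio: float = 0.7) -> float:
    """Estimated monthly bill; input_ratio is the input share of total tokens."""
    total_tokens = requests_per_month * avg_tokens_per_request
    input_tokens = total_tokens * input_ratio
    output_tokens = total_tokens * (1 - input_ratio)
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# 1M requests/month, 1,000 tokens each, $2/$8 per million, 80% input
print(monthly_cost_usd(1_000_000, 1_000, 2.00, 8.00, input_ratio=0.8))  # ≈ $3,200
```

Because output tokens carry the higher rate, the input/output split matters: shifting the same traffic from 80% input to 60% input roughly doubles the output portion of the bill.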

Use the LLMversus cost calculator to run this calculation across all major models simultaneously, or the token counter tool to measure your specific prompt's token cost before committing to a model.


Pricing trends 2023-2026

Token prices have fallen dramatically:

  • GPT-4 at launch (2023): $30/1M input, $60/1M output
  • GPT-4o (2024): $5/1M input, $15/1M output
  • GPT-4.1 (2025): $2/1M input, $8/1M output
  • GPT-4.1 Nano (2025): $0.10/1M input, $0.40/1M output

Frontier model pricing has dropped by more than 90% since 2023. The quality level you paid $30/1M for in 2023 now costs $2-3/1M. Budget accordingly — your LLM costs should be going down year over year even as your usage grows.
