Context Compression

Quick Answer

Techniques for reducing context size while preserving necessary information.

Context compression reduces token consumption by removing redundant or irrelevant context before it reaches the model. Common techniques include summarization, selective retrieval, and information filtering. Compression trades context fidelity for cost: lossy approaches that drop details require careful tuning, because compression quality directly affects output quality. In practice, compression can reduce token costs by roughly 30-50%, and it is particularly useful for RAG applications, where retrieved documents often contain large amounts of text irrelevant to the query.
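A minimal sketch of the filtering approach: score each retrieved chunk by word overlap with the query and keep only the top-scoring chunks before building the prompt. The function name, scoring rule, and example data are illustrative assumptions, not part of any particular library; production systems typically use embedding similarity or a summarization model instead of raw word overlap.

```python
# Illustrative context compression via selective filtering (hypothetical
# helper, not from any specific library). Each chunk is scored by word
# overlap with the query; only the top `keep` chunks survive, reducing
# the tokens sent to the model.

def compress_context(query: str, chunks: list[str], keep: int = 2) -> list[str]:
    """Keep the `keep` chunks with the highest word overlap with the query."""
    query_words = set(query.lower().split())

    def overlap(chunk: str) -> int:
        # Crude relevance score: count of shared lowercase words.
        return len(query_words & set(chunk.lower().split()))

    # Rank chunks by relevance, highest first, and drop the rest.
    ranked = sorted(chunks, key=overlap, reverse=True)
    return ranked[:keep]


chunks = [
    "The billing cycle resets on the first of each month.",
    "Context compression reduces token usage for LLM calls.",
    "Summarization shortens long documents before retrieval.",
    "Our office is located in Berlin.",
]
kept = compress_context("how does context compression reduce tokens", chunks)
```

Because this filtering is lossy (discarded chunks are gone for good), the `keep` threshold is exactly the fidelity-versus-cost knob the paragraph above describes.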

Last verified: 2026-04-08
