
Context Caching

Quick Answer

The capability to cache a long context and reuse it across multiple requests.

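Context caching stores a large context, such as documents or code, so that multiple queries can reuse it instead of resending and reprocessing it with every request. This reduces both latency and cost, which makes it particularly valuable for RAG and document analysis. Caching usually requires a minimum cache size, so very small contexts may not be cacheable. Cache invalidation, meaning when to refresh the cached content, also requires consideration, because a stale cache yields answers based on outdated material. Context caching is becoming standard in modern APIs and enables new application patterns, such as asking many questions against a single cached document.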
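The ideas above (reuse across queries, a minimum cache size, and invalidation) can be illustrated with a minimal sketch. This is a toy in-memory stand-in, not any provider's API: the class ContextCache, its parameters min_chars and ttl_seconds, and the time-based expiry policy are all assumptions made for illustration. Real services typically cache on the server side and measure size in tokens rather than characters.

```python
import hashlib
import time
from dataclasses import dataclass


@dataclass
class CacheEntry:
    context: str
    created_at: float
    hits: int = 0


class ContextCache:
    """Toy in-memory stand-in for a provider-side context cache (hypothetical)."""

    def __init__(self, min_chars=1000, ttl_seconds=300.0, clock=time.monotonic):
        self.min_chars = min_chars      # assumed minimum cacheable size, in characters
        self.ttl_seconds = ttl_seconds  # assumed time-to-live before a refresh is needed
        self._clock = clock             # injectable clock, useful for testing expiry
        self._entries = {}

    def put(self, context):
        """Store a context and return its key, or None if it is too small to cache."""
        if len(context) < self.min_chars:
            return None
        key = hashlib.sha256(context.encode("utf-8")).hexdigest()
        self._entries[key] = CacheEntry(context, self._clock())
        return key

    def get(self, key):
        """Return the cached context, or None if it is missing or has expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        if self._clock() - entry.created_at > self.ttl_seconds:
            del self._entries[key]  # invalidate the stale entry
            return None
        entry.hits += 1
        return entry.context


# Usage: cache one large document, then reuse it for several questions.
document = "lorem ipsum " * 200
cache = ContextCache(min_chars=1000, ttl_seconds=300.0)
key = cache.put(document)
prompts = []
for question in ["Summarize section 1.", "List the key terms."]:
    context = cache.get(key)  # reused instead of being rebuilt per query
    prompts.append(f"{context}\n\nQuestion: {question}")
```

Keying the entry by a hash of the content means an edited document gets a new key automatically, while the time-to-live covers the case where the source changes without the caller noticing. Both are design choices of this sketch, not requirements of context caching in general.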

Last verified: 2026-04-08
