Top-K Sampling

Quick Answer

A sampling method that selects from the K most likely next tokens.

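Top-K sampling restricts the model to the K most probable next tokens at each generation step. With K=40, every token outside the top 40 by probability is discarded, the remaining probabilities are renormalized to sum to 1, and the next token is sampled from that reduced set. This keeps the model from occasionally emitting nonsensical text drawn from the long tail of very low-probability tokens, while preserving diversity when several tokens are plausible. Top-K is simpler than nucleus (top-p) sampling but less adaptive: K stays fixed whether the distribution is sharply peaked or nearly flat, whereas top-p varies the number of candidates to cover a target share of probability mass. Many practitioners rely on top-p alone, though top-K can still be useful for specific applications, such as when a hard cap on the number of candidate tokens is wanted.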
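
The filtering and renormalization described above can be sketched in a few lines of plain Python. This is a minimal illustration, not any particular library's implementation: the function names top_k_filter and top_k_sample, and the toy logits dictionary, are assumptions made up for this example. Real inference stacks apply the same idea to tensors over the whole vocabulary.

```python
import math
import random


def top_k_filter(logits, k):
    """Return renormalized probabilities over the k highest-scoring tokens.

    `logits` is a hypothetical dict mapping token -> unnormalized score.
    """
    if not logits or k < 1:
        raise ValueError("need at least one token and k >= 1")
    # Keep the k highest-scoring tokens; everything else is dropped.
    kept = sorted(logits.items(), key=lambda item: item[1], reverse=True)[:k]
    # Softmax over the survivors only (subtract the max for numerical stability).
    max_score = kept[0][1]
    weights = [(token, math.exp(score - max_score)) for token, score in kept]
    total = sum(weight for _, weight in weights)
    return {token: weight / total for token, weight in weights}


def top_k_sample(logits, k, rng=random):
    """Draw one token from the renormalized top-k distribution."""
    probs = top_k_filter(logits, k)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Toy scores for illustration only; not taken from a real model.
logits = {"the": 3.0, "a": 2.5, "cat": 1.0, "zebra": -4.0}
print(top_k_filter(logits, k=2))   # only "the" and "a" remain
print(top_k_sample(logits, k=2))   # one of "the" or "a"
```

Setting k=1 reduces this to greedy decoding, since only the single most probable token survives the filter. Setting k to the vocabulary size leaves the distribution unchanged.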

Last verified: 2026-04-08
