Requests Per Minute (RPM)

Quick Answer

A rate limit that caps the number of requests a client may send per minute.

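RPM is a request-frequency limit: it caps how many API calls a client can make in one minute, regardless of how large each call is. Providers use RPM limits to prevent a single user from overwhelming shared infrastructure. Limits vary by provider and tier; 3,500 RPM for a standard tier is a typical figure. Every request counts toward the limit, including requests sent concurrently, so applications that issue many parallel calls usually need a queue or throttle to stay under it. For token-heavy applications, the tokens-per-minute (TPM) limit is usually reached first, which makes RPM the less relevant constraint. Knowing your RPM limit helps with architecture planning.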
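As a rough illustration of the queueing idea, here is a minimal client-side throttle in Python. It is a sketch under stated assumptions, not any provider's actual algorithm: it assumes the limit is enforced over a rolling 60-second window, and the name `SlidingWindowLimiter` and its parameters are hypothetical. Real providers may count requests differently (for example, with fixed windows or token buckets), so check their documentation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side throttle: at most `rpm` requests per rolling window.

    Assumes a rolling 60-second window; a provider's real enforcement may differ.
    Not thread-safe: wrap acquire() in a lock if several threads share one limiter.
    """

    def __init__(self, rpm, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.rpm = rpm
        self.window = window
        self.clock = clock   # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests still inside the window

    def acquire(self):
        """Block until one more request fits in the window, then record it."""
        while True:
            now = self.clock()
            # Drop timestamps that have aged out of the window.
            while self.sent and now - self.sent[0] >= self.window:
                self.sent.popleft()
            if len(self.sent) < self.rpm:
                self.sent.append(now)
                return
            # Window is full: wait until the oldest request leaves it.
            self.sleep(self.window - (now - self.sent[0]))
```

Calling `limiter.acquire()` before each API call makes excess requests wait their turn instead of being rejected by the server. With the 3,500 RPM example figure, that averages out to roughly 58 requests per second.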

Last verified: 2026-04-08