Rate limits - Croma API

Per-organization buckets

Rate limits are enforced per organization, not per key. Every key issued to the same org shares one bucket, so adding keys doesn’t multiply your quota. The default limit is 100 requests per day per organization. Some endpoints have tighter ceilings. For example, Web Search is capped at 10 requests per hour. Each endpoint’s page notes its limit.
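To make the quotas above concrete, here is a minimal sketch of the arithmetic for pacing requests evenly. It assumes you want to spread calls uniformly across the window; the page does not say whether windows are fixed or rolling, so treat the result as a conservative average rather than a guarantee. The function name `min_spacing_seconds` is hypothetical.

```python
def min_spacing_seconds(limit: int, window_seconds: int) -> float:
    """Average gap between requests that keeps you within `limit` per window.

    Assumption: even pacing. The API's actual window semantics are not documented here.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return window_seconds / limit


DAY = 24 * 60 * 60
HOUR = 60 * 60

# Default quota: 100 requests per day, shared by every key in the organization.
default_gap = min_spacing_seconds(100, DAY)

# Web Search: 10 requests per hour.
web_search_gap = min_spacing_seconds(10, HOUR)

print(default_gap, web_search_gap)  # 864.0 360.0
```

Because the bucket is shared across the whole organization, the gap applies to all keys combined, not to each key separately.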

Quota in response headers

Rate-limit state comes back as HTTP headers on every response (not in the body):

Header	Meaning
`X-RateLimit-Limit`	Requests allowed in the current window.
`X-RateLimit-Remaining`	Requests left before you’re throttled.
`X-RateLimit-Reset`	ISO 8601 timestamp at which the window resets.
`X-Request-Id`	Unique id for the request (`req_…`); include it in support reports.
`X-Cache`	`HIT` or `MISS` on cacheable endpoints. Cached hits still count against your quota.
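The quota headers in the table can be read into a small structure on the client. The sketch below is one possible approach, not an official client: `RateLimitState` and `parse_rate_limit` are hypothetical names, and `headers` is assumed to be any string-to-string mapping your HTTP library exposes. It returns `None` when a header is missing or malformed, since the headers are not guaranteed to be present.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    """Hypothetical container for the three X-RateLimit-* headers."""
    limit: int
    remaining: int
    reset_at: datetime


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the quota state, or None if a header is missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        limit = int(lowered["x-ratelimit-limit"])
        remaining = int(lowered["x-ratelimit-remaining"])
        reset_raw = lowered["x-ratelimit-reset"]
        # Older Python versions reject a trailing "Z", so rewrite it as an offset.
        reset_at = datetime.fromisoformat(reset_raw.replace("Z", "+00:00"))
    except (KeyError, ValueError):
        return None
    return RateLimitState(limit, remaining, reset_at)


state = parse_rate_limit({
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "2026-05-23T18:00:00.000Z",
})
```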

When you exceed the limit

Over-quota requests return HTTP 429 with a `rate_limit_error` envelope in the body and a `Retry-After` header giving the wait in seconds:

{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limited",
    "message": "Rate limit exceeded. Try again in 42 seconds."
  }
}

Retry-After: 42
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-05-23T18:00:00.000Z

Wait until the `Retry-After` interval has elapsed (or until the `X-RateLimit-Reset` timestamp), then retry.
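A retry loop along these lines might look like the following sketch. It is transport-agnostic by assumption: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, and `call_with_retry` and `retry_delay_seconds` are illustrative names, not part of any Croma client. The delay prefers `Retry-After`, falls back to `X-RateLimit-Reset`, and uses a fixed default if neither is usable.

```python
import time
from datetime import datetime, timezone
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape: (status code, headers, raw body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay_seconds(headers: Mapping[str, str],
                        now: Optional[datetime] = None,
                        default: float = 1.0) -> float:
    """Seconds to wait: Retry-After first, then X-RateLimit-Reset, then `default`."""
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a number of seconds; try the reset timestamp instead
    reset = lowered.get("x-ratelimit-reset")
    if reset is not None:
        try:
            reset_at = datetime.fromisoformat(reset.replace("Z", "+00:00"))
            current = now or datetime.now(timezone.utc)
            return max(0.0, (reset_at - current).total_seconds())
        except (ValueError, TypeError):
            pass  # malformed or timezone-naive timestamp
    return default


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send`, waiting and retrying while it returns 429, up to `max_attempts` calls."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        sleep(retry_delay_seconds(headers))
        response = send()
    return response
```

If the last attempt is still a 429, the function returns that response so the caller can decide what to do. Given the daily and hourly windows described on this page, waits can be long, so a small `max_attempts` is usually sensible.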

The limiter fails open: if the rate-limit backend is briefly unavailable, requests are allowed through and no X-RateLimit-* headers are emitted. Don’t depend on the headers always being present.
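One way to cope with responses that lack the headers is to remember the last value you saw and leave it untouched when a response carries none. The sketch below assumes that policy; `QuotaTracker` is a hypothetical helper, and `headers` is again any string-to-string mapping.

```python
from typing import Mapping, Optional


class QuotaTracker:
    """Hypothetical helper that remembers the last observed X-RateLimit-Remaining."""

    def __init__(self) -> None:
        self.remaining: Optional[int] = None  # None means "never observed"

    def observe(self, headers: Mapping[str, str]) -> None:
        lowered = {name.lower(): value for name, value in headers.items()}
        raw = lowered.get("x-ratelimit-remaining")
        if raw is None:
            return  # header absent (for example, limiter failed open): keep the old value
        try:
            self.remaining = int(raw)
        except ValueError:
            pass  # malformed value: keep the old value

    def exhausted(self) -> bool:
        """True only when a response has explicitly reported zero remaining."""
        return self.remaining is not None and self.remaining <= 0


tracker = QuotaTracker()
tracker.observe({"X-RateLimit-Remaining": "3"})
tracker.observe({})  # no headers on this response; the tracker still reports 3
```

Treating "never observed" as "not exhausted" mirrors the fail-open behavior: a missing header is not evidence that you are throttled.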

Next: Errors

The error envelope and every error code.
