Per-organization buckets

Rate limits are enforced per organization, not per key. Every key issued to the same org shares one bucket, so adding keys doesn’t multiply your quota. The default limit is 100 requests per day per organization. Some endpoints have tighter ceilings. For example, Web Search is capped at 10 requests per hour. Each endpoint’s page notes its limit.

Quota in response headers

Rate-limit state comes back as HTTP headers on every response (not in the body):
| Header | Meaning |
| --- | --- |
| X-RateLimit-Limit | Requests allowed in the current window. |
| X-RateLimit-Remaining | Requests left before you’re throttled. |
| X-RateLimit-Reset | ISO timestamp when the window resets. |
| X-Request-Id | Unique id for the request (req_…); include it in support reports. |
| X-Cache | HIT or MISS on cacheable endpoints. Cached hits still count against your quota. |
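
As an illustration of reading these headers on the client, here is a minimal Python sketch. The names `RateLimitState` and `parse_rate_limit_headers` are hypothetical helpers, not part of any official SDK. The sketch assumes the headers arrive as a plain string mapping and that the reset timestamp looks like the example on this page. Every field is optional, so a response without the headers parses to `None` values instead of raising.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    """Hypothetical container for the documented headers; all fields optional."""
    limit: Optional[int]
    remaining: Optional[int]
    reset_at: Optional[datetime]
    request_id: Optional[str]
    cache: Optional[str]


def _to_int(value: Optional[str]) -> Optional[int]:
    # Missing or malformed values become None rather than raising.
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def _to_datetime(value: Optional[str]) -> Optional[datetime]:
    if value is None:
        return None
    try:
        # Older Python versions do not accept a trailing "Z", so map it to an offset.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitState:
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    return RateLimitState(
        limit=_to_int(h.get("x-ratelimit-limit")),
        remaining=_to_int(h.get("x-ratelimit-remaining")),
        reset_at=_to_datetime(h.get("x-ratelimit-reset")),
        request_id=h.get("x-request-id"),
        cache=h.get("x-cache"),
    )
```

Logging `request_id` alongside any failure is a cheap way to have it ready for a support report.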

When you exceed the limit

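Over-quota requests return HTTP 429 with a rate_limit_error envelope in the body and a Retry-After header giving the wait in seconds. The example below shows the response body followed by the relevant response headers: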
{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limited",
    "message": "Rate limit exceeded. Try again in 42 seconds."
  }
}
Retry-After: 42
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-05-23T18:00:00.000Z
Wait until the Retry-After interval has elapsed (or until the time given in X-RateLimit-Reset), then retry.

The limiter fails open: if the rate-limit backend is briefly unavailable, requests are allowed through and no X-RateLimit-* headers are emitted. Don’t depend on the headers always being present.
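
The back-off behaviour described above could be sketched in Python as follows. This is an illustrative sketch, not an official client. `seconds_to_wait` and `call_with_backoff` are hypothetical names, `send` stands in for whatever function performs your HTTP request and returns `(status, headers, body)`, and the one-second fallback and three-attempt cap are arbitrary assumptions. The wait is taken from Retry-After first, then from X-RateLimit-Reset, and falls back to a default when neither header is present.

```python
import time
from datetime import datetime, timezone
from typing import Any, Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], Any]  # (status, headers, body)


def seconds_to_wait(headers: Mapping[str, str],
                    now: Optional[datetime] = None,
                    default: float = 1.0) -> float:
    """Pick a wait time: Retry-After, then X-RateLimit-Reset, then a default."""
    h = {k.lower(): v for k, v in headers.items()}
    retry_after = h.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # fall through to the reset timestamp
    reset = h.get("x-ratelimit-reset")
    if reset is not None:
        try:
            reset_at = datetime.fromisoformat(reset.replace("Z", "+00:00"))
            current = now or datetime.now(timezone.utc)
            return max(0.0, (reset_at - current).total_seconds())
        except (ValueError, TypeError):
            pass  # malformed or timezone-less timestamp
    # Headers can be absent (the limiter fails open), so never assume them.
    return default


def call_with_backoff(send: Callable[[], Response],
                      max_attempts: int = 3,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send(); on 429, wait as instructed and retry up to max_attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        sleep(seconds_to_wait(headers))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays. Because the default quota is small, a cap on attempts matters more here than aggressive retrying.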

Next: Errors

The error envelope and every error code.