Rate limits

Requests are rate-limited per token. Every response carries headers describing your current budget so you can pace yourself instead of guessing.

Headers

Header	Meaning
`X-RateLimit-Limit`	Max requests allowed in the current window.
`X-RateLimit-Remaining`	Requests left in the current window.
`X-RateLimit-Reset`	When the window resets (epoch seconds).

When you exceed the limit, the API responds with 429 Too Many Requests and the same headers; X-RateLimit-Reset tells you when to try again.
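As a rough illustration of reading these headers, the sketch below parses the three values into a small object and computes how long is left in the window. The names RateLimitBudget and parse_budget are hypothetical helpers, not part of the API, and the code assumes all three header values are plain integers as the table describes.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitBudget:
    """Hypothetical container for the three rate-limit headers."""
    limit: int
    remaining: int
    reset_epoch: int  # epoch seconds, from X-RateLimit-Reset

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds until the window resets; never negative."""
        current = time.time() if now is None else now
        return max(0.0, self.reset_epoch - current)


def parse_budget(headers: Mapping[str, str]) -> Optional[RateLimitBudget]:
    """Return the budget, or None if a header is missing or not an integer.

    Note: a plain dict is case-sensitive; HTTP client libraries usually
    expose a case-insensitive mapping for response headers.
    """
    try:
        return RateLimitBudget(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_epoch=int(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None
```

Returning None rather than raising lets callers treat a response without usable headers as "budget unknown" and fall back to their default pacing.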

Backing off

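On a 429, wait until the time given by Retry-After (if the response includes it) or X-RateLimit-Reset before retrying. For 5xx responses, or when no usable hint is present, use exponential backoff with jitter: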

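import random
import time

def call_with_backoff(do_request, max_attempts=5):
    for attempt in range(max_attempts):
        resp = do_request()
        if resp.status_code not in (429, 500, 502, 503, 504):
            return resp
        if attempt == max_attempts - 1:
            break  # out of attempts; don't sleep before giving up
        # Default: exponential backoff with jitter.
        wait = (2 ** attempt) + random.random()
        # On a 429, prefer the server's hint when it parses as a number.
        if resp.status_code == 429:
            retry_after = resp.headers.get("Retry-After")
            reset = resp.headers.get("X-RateLimit-Reset")
            try:
                if retry_after is not None:
                    wait = max(0.0, float(retry_after))  # delay in seconds
                elif reset is not None:
                    # Epoch seconds; jitter avoids a zero wait if the reset has passed.
                    wait = max(0.0, float(reset) - time.time()) + random.random()
            except ValueError:
                pass  # unparseable hint (e.g. an HTTP-date); keep the backoff value
        time.sleep(wait)
    return resp  # caller handles the final failure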

Staying under the limit

Page larger, call less. Use limit: 100 on searches instead of requesting many small pages.
Project fields. Request only the fields you need, so that responses (and your processing) stay lean.
Cache read-mostly data (pipelines, stage metadata) rather than refetching it per operation.
Spread bulk work. For imports and exports, watch X-RateLimit-Remaining and slow down as it approaches zero rather than sprinting into a 429.
One token per workload. Limits are per token, so isolate a heavy batch job behind its own token to avoid starving your interactive traffic.
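One possible way to "slow down as the budget approaches zero" is to spread the remaining requests evenly over the time left in the window. The sketch below is only an illustration of that policy: pacing_delay and its reserve parameter (requests held back as a safety margin) are hypothetical names, and the inputs are assumed to come from X-RateLimit-Remaining and X-RateLimit-Reset.

```python
def pacing_delay(remaining: int, seconds_until_reset: float, reserve: int = 5) -> float:
    """Seconds to pause before the next request so the budget lasts the window.

    remaining:            value of X-RateLimit-Remaining
    seconds_until_reset:  X-RateLimit-Reset minus the current epoch time
    reserve:              hypothetical safety margin of requests left unused
    """
    if seconds_until_reset <= 0:
        return 0.0  # the window has already reset; no need to wait
    usable = remaining - reserve
    if usable <= 0:
        return seconds_until_reset  # budget spent: wait for the reset
    # Spread the usable requests evenly across the rest of the window.
    return seconds_until_reset / usable
```

With plenty of budget the delay is negligible (105 remaining and 50 seconds left gives 0.5 seconds per request); as the budget shrinks relative to the time left, the delay grows, and once only the reserve remains the job simply waits for the reset. A bulk job would call time.sleep(pacing_delay(...)) after each response.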
