UnifAPI Docs

Rate limits

Per-workspace and per-endpoint limits, headers, and retry behavior.

UnifAPI enforces two layers of rate limiting: a per-workspace limit that smooths traffic across all endpoints, and a per-endpoint limit that mirrors what the upstream provider will tolerate.

Headers

Every response, whether 2xx or 429, carries the X-RateLimit-* headers below. 429 responses also carry Retry-After.

Header                 Meaning
X-RateLimit-Limit      Requests allowed in the current window
X-RateLimit-Remaining  Requests left in the current window
X-RateLimit-Reset      Unix timestamp (seconds) when the window resets
Retry-After            Seconds to wait before retrying (429 only)

When both a workspace and an endpoint limit apply, the headers report whichever is tighter.
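
As an illustration of reading these headers on the client, here is a minimal sketch. The names `HeaderReader`, `RateLimitInfo`, and `readRateLimit` are hypothetical helpers and not part of UnifAPI. The sketch assumes the three X-RateLimit-* values are plain integers, as the table describes.

```typescript
// Minimal shape shared by the Fetch API's Headers object and simple stubs.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical container for the values described in the table above.
interface RateLimitInfo {
  limit: number;     // requests allowed in the current window
  remaining: number; // requests left in the current window
  resetAt: Date;     // when the window resets
}

// Converts a header value to a number; missing or blank values become NaN.
function toNumber(value: string | null): number {
  return value === null || value.trim() === '' ? NaN : Number(value);
}

// Returns null if any of the three headers is missing or not numeric.
function readRateLimit(headers: HeaderReader): RateLimitInfo | null {
  const limit = toNumber(headers.get('X-RateLimit-Limit'));
  const remaining = toNumber(headers.get('X-RateLimit-Remaining'));
  const resetSeconds = toNumber(headers.get('X-RateLimit-Reset'));
  if ([limit, remaining, resetSeconds].some(n => Number.isNaN(n))) return null;
  // X-RateLimit-Reset is a Unix timestamp in seconds; Date expects milliseconds.
  return { limit, remaining, resetAt: new Date(resetSeconds * 1000) };
}
```

With a fetch response, `readRateLimit(res.headers)` would return the parsed values, which a client could use to pause before `remaining` reaches zero.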

Defaults

Plan           Workspace limit  Per-endpoint limit
Free           60 req/min       Whatever the upstream allows
Pay-as-you-go  600 req/min      Whatever the upstream allows
Enterprise     Custom           Custom

Handling 429

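// Retries a request on 429, waiting for the server-specified Retry-After each time.
async function call(url: string, init: RequestInit, attempt = 0): Promise<Response> {
  const res = await fetch(url, init);
  // Return anything that is not a 429, and give up after 5 retries.
  if (res.status !== 429 || attempt >= 5) return res;

  // Retry-After is in seconds; fall back to 1 second if it is missing or not a number.
  const parsed = Number(res.headers.get('Retry-After') ?? NaN);
  const retryAfter = Number.isFinite(parsed) && parsed >= 0 ? parsed : 1;
  await new Promise(r => setTimeout(r, retryAfter * 1000));
  return call(url, init, attempt + 1);
}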

Two rules to live by:

  1. Honor Retry-After. UnifAPI returns the exact value the upstream provider asked for — guessing usually makes things worse.
  2. Back off on repeated 429s. If you hit the limit five times in a row, the workspace is undersized for the workload. Upgrade or shard across workspaces.
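
The two rules can be combined into a single delay calculation, sketched below. The helpers `parseRetryAfter` and `retryDelayMs` are hypothetical, and the 1-second base and 30-second cap for the fallback are assumptions, not UnifAPI policy. The HTTP specification also allows Retry-After to be an HTTP-date, so the sketch accepts both forms even though this page documents seconds.

```typescript
// Parses Retry-After, given either as seconds or as an HTTP-date.
// Returns a delay in milliseconds, or null if the header is absent or unparseable.
function parseRetryAfter(value: string | null, nowMs: number = Date.now()): number | null {
  if (value === null || value.trim() === '') return null;
  const seconds = Number(value);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const dateMs = Date.parse(value);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
}

// Delay before retry number `attempt` (0-based).
// Rule 1: prefer the server's Retry-After value.
// Rule 2: without one, back off exponentially (assumed base 1 s, cap 30 s).
function retryDelayMs(
  retryAfter: string | null,
  attempt: number,
  nowMs: number = Date.now(),
  baseMs: number = 1000,
  capMs: number = 30000,
): number {
  const fromHeader = parseRetryAfter(retryAfter, nowMs);
  if (fromHeader !== null) return fromHeader;
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

A retry loop would call `retryDelayMs(res.headers.get('Retry-After'), attempt)` and sleep for the result before the next attempt.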

Burst behavior

Limits are token-bucket: a workspace that has been idle for a minute can briefly burst above its steady-state limit. Don't rely on this for production traffic — design for the steady-state rate.
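
To make the burst behavior concrete, here is a minimal token-bucket sketch. It illustrates the general technique only. The `TokenBucket` class is hypothetical, and the capacity and refill rate used with it are assumptions, since UnifAPI's actual bucket parameters are not documented here.

```typescript
// Token bucket: holds up to `capacity` tokens, refilled continuously at `refillPerSec`.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSec: number;
  private tokens: number;
  private lastMs: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full, as after an idle period
    this.lastMs = nowMs;
  }

  // Spends one token and returns true if a request is allowed at time nowMs.
  tryTake(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastMs) / 1000;
    // Refill for the elapsed time, never exceeding capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

With an assumed capacity of 60 and a refill of 1 token per second (60 req/min), an idle bucket accepts 60 requests at once, then only one more per second. That is why a burst works briefly but sustained traffic settles at the steady-state rate.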

Asking for more

Need a higher limit? Email support@unifapi.com with your workspace ID (in the dashboard) and a rough QPS estimate.
