Conventions.
Cross-cutting rules that apply to every /v1 endpoint. Read these once and you can skim the reference.
§ 03.1 Authentication
Every authenticated request carries a Bearer token in the standard Authorization header:
Authorization: Bearer gpuai_live_AbCdEf0123456789...
Keys are issued from the dashboard or via gpuctl auth login. The server stores only a SHA-256 hash of the raw key; the plaintext is shown exactly once at creation.
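A minimal sketch of both halves of this scheme, using only the Python standard library. The helper names (auth_headers, hash_key, key_matches) are hypothetical, and the server-side check illustrates hash-only storage in general; it is not a description of how the API implements it.

```python
import hashlib
import hmac


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header for an authenticated request."""
    return {"Authorization": f"Bearer {api_key}"}


def hash_key(api_key: str) -> str:
    """Hex SHA-256 digest of the raw key: the only form a server needs to keep."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def key_matches(presented_key: str, stored_hash: str) -> bool:
    """Compare digests in constant time (illustrative server-side check)."""
    return hmac.compare_digest(hash_key(presented_key), stored_hash)
```

Because only the digest is kept, a lost key cannot be recovered from the server; you would issue a new one.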
§ 03.1.1 Scopes
Keys can carry shortcut scopes or per-resource permissions. Shortcuts:
- full_access: read + write on everything.
- read_only: read on everything, no writes.
Per-resource scopes set none, read, or write independently on the four resource families:
{
"instances": "write",
"ssh_keys": "write",
"billing": "read",
"webhooks": "write"
}
Calls failing scope checks return 403 insufficient_scope. Calls that fail authentication return 401: unauthenticated when no Bearer key is supplied, invalid_api_key when the key is malformed, revoked, expired, or unknown.
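The scope rules can be modeled in a few lines. This is a sketch under two assumptions the text above does not state outright: that write implies read on the same resource family, and that a family missing from a per-resource mapping is treated as none. The names LEVELS, SHORTCUTS, expand, and allows are hypothetical.

```python
# Ordering of per-resource permission levels, weakest to strongest.
# Assumption: "write" also grants "read" on the same family.
LEVELS = {"none": 0, "read": 1, "write": 2}

# Shortcut scopes expand to one level applied to every resource family.
SHORTCUTS = {"full_access": "write", "read_only": "read"}

FAMILIES = ("instances", "ssh_keys", "billing", "webhooks")


def expand(scopes):
    """Turn a shortcut name or a per-resource mapping into a full mapping."""
    if isinstance(scopes, str):
        level = SHORTCUTS[scopes]
        return {family: level for family in FAMILIES}
    # Assumption: families absent from the mapping default to "none".
    return {family: scopes.get(family, "none") for family in FAMILIES}


def allows(scopes, family: str, needed: str) -> bool:
    """True if the key's scopes grant at least `needed` on `family`."""
    granted = expand(scopes)[family]
    return LEVELS[granted] >= LEVELS[needed]
```

With the example mapping above, allows(scopes, "billing", "write") is False, which is the situation that would surface as 403 insufficient_scope.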
§ 03.2 Errors (RFC 7807)
Every 4xx and 5xx response uses the application/problem+json envelope from RFC 7807. The shape is stable across the entire API:
{
"type": "https://api.gpu.ai/errors/validation_failed",
"title": "Unprocessable Entity",
"status": 422,
"detail": "gpu_count must be between 1 and 8",
"code": "validation_failed",
"request_id": "8b1a8a9c0d1e2f3a4b5c6d7e8f9a0b1c"
}

| Code | Status | Meaning |
|---|---|---|
| unauthenticated | 401 | No Bearer key supplied. |
| invalid_api_key | 401 | Key is malformed, revoked, expired, or does not match a known hash. |
| insufficient_scope | 403 | Key authenticated but lacks the scope required for this operation. |
| not_found | 404 | Resource does not exist, or is not visible to your key (org-scoped). |
| idempotency_conflict | 409 | A request with this Idempotency-Key is already in progress. Retry later. |
| idempotency_mismatch | 422 | Idempotency key reused with a different request body within the 24h window. |
| validation_failed | 422 | Body or query parameters failed validation. See `detail`. |
| invalid_gpu_type | 422 | Unknown gpu_type. Pull /v1/gpu-types for the canonical list. |
| insufficient_balance | 402 | Account balance cannot cover the requested operation. |
| quota_exceeded | 402 | Org quota reached for this resource family. |
| rate_limited | 429 | Per-key sliding window exceeded. Honor the Retry-After header. |
| operation_failed | 500 | An async operation transitioned to `failed`. See operation.error for detail. |
| internal_error | 500 | Unexpected server-side fault. Safe to retry on idempotent ops. |
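Because the envelope is uniform, one small wrapper can turn any failed response into an exception. The sketch below assumes you already have the status, Content-Type, and body text in hand; ApiProblem, parse_problem, and is_retryable are hypothetical names, and the RETRYABLE set is this sketch's reading of the table rather than an official list.

```python
import json


class ApiProblem(Exception):
    """An RFC 7807 error envelope raised as a Python exception."""

    def __init__(self, problem: dict):
        self.status = problem.get("status")
        self.code = problem.get("code")
        self.detail = problem.get("detail", "")
        self.request_id = problem.get("request_id")
        super().__init__(f"{self.status} {self.code}: {self.detail}")


# Assumption: codes whose table entry suggests retrying. internal_error is
# only safe to retry for idempotent operations, so callers should check that.
RETRYABLE = {"idempotency_conflict", "rate_limited", "internal_error"}


def parse_problem(status: int, content_type: str, body: str):
    """Return an ApiProblem for 4xx/5xx responses, or None on success."""
    if status < 400:
        return None
    if not content_type.startswith("application/problem+json"):
        # Not the documented envelope (e.g. a proxy error page): keep the raw body.
        return ApiProblem({"status": status, "code": "unknown", "detail": body})
    return ApiProblem(json.loads(body))


def is_retryable(problem: ApiProblem) -> bool:
    """True if the error code is one this sketch treats as worth retrying."""
    return problem.code in RETRYABLE
```

Branching on code rather than on status keeps 422 validation_failed distinct from 422 idempotency_mismatch, which share a status but call for different fixes.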
§ 03.3 Pagination
List endpoints return a cursor-paginated envelope. Pass ?limit (1–200, default 50) and ?cursor (opaque string returned in the previous page).
GET /v1/instances?limit=25
↓
{
"data": [ ... 25 items ... ],
"next_cursor": "MjAyNi0wNS0wOFQxNzo0MTo0MlR8aW5zXzAxSFg..."
}
GET /v1/instances?limit=25&cursor=MjAyNi0wNS0wOFQxNzo0MTo0MlR8aW5zXzAxSFg...
↓
{
"data": [ ... next 25 items ... ],
"next_cursor": null // no more pages
}
The cursor is opaque: treat it as an unparseable string. We guarantee that round-tripping the same cursor returns the same page, but we make no commitments about its internal format. When next_cursor is null, you've reached the end.
Invalid cursors return 422 validation_failed.
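The loop above generalizes to a small generator. In this sketch, fetch_page is a hypothetical callable you supply: it receives the query parameters and returns the decoded JSON envelope, so the pagination logic stays independent of whichever HTTP client you use.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[dict], dict], limit: int = 50) -> Iterator:
    """Yield every item from a cursor-paginated list endpoint.

    `fetch_page` (hypothetical) takes the query parameters and returns the
    decoded envelope: {"data": [...], "next_cursor": "..." or None}.
    """
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    cursor: Optional[str] = None
    while True:
        params = {"limit": limit}
        if cursor is not None:
            params["cursor"] = cursor  # passed back verbatim, never parsed
        page = fetch_page(params)
        yield from page["data"]
        cursor = page["next_cursor"]
        if cursor is None:  # a null next_cursor marks the last page
            return
```

A caller would write something like `for instance in iter_items(my_fetch, limit=25): ...` and never touch the cursor directly.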
§ 03.4 Idempotency
Every POST, PATCH, and DELETE accepts an Idempotency-Key header. Always send one for writes — it makes retries safe.
Idempotency-Key: 8b1a8a9c-0d1e-2f3a-4b5c-6d7e8f9a0b1c
Behavior, scoped to (api_key_id, key) over a 24-hour window:
- First request: handler runs; status, body, and Operation-Id are cached.
- Replay (same body): the stored response is replayed byte-for-byte. Safe to retry indefinitely within 24h.
- Replay (different body): 422 idempotency_mismatch. Use a fresh key if you genuinely want a new operation.
- In-flight: 409 idempotency_conflict. Another request with the same key is still running. Back off and retry.
Compatible with Stripe's SDK conventions — if you've wired Stripe-style retry helpers, they work as-is.
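The four outcomes can be modeled with a small in-memory store, which is handy for reasoning about client retries or for test doubles. This is a sketch, not the server's implementation: IdempotencyStore and its methods are hypothetical, bodies are compared by SHA-256 digest, and checking for a body mismatch before the in-flight case is an assumption.

```python
import hashlib
import time


class IdempotencyStore:
    """In-memory model of the replay rules, keyed by (api_key_id, key)."""

    WINDOW_SECONDS = 24 * 60 * 60

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # (api_key_id, key) -> entry dict

    def begin(self, api_key_id: str, key: str, body: bytes):
        """Classify a request as "run", "replay", "mismatch", or "conflict"."""
        slot = (api_key_id, key)
        digest = hashlib.sha256(body).hexdigest()
        entry = self._entries.get(slot)
        if entry is not None and self._clock() - entry["started"] >= self.WINDOW_SECONDS:
            entry = None  # window elapsed: treat the key as fresh
        if entry is None:
            self._entries[slot] = {"digest": digest, "started": self._clock(), "response": None}
            return ("run", None)
        if entry["digest"] != digest:
            return ("mismatch", None)  # maps to 422 idempotency_mismatch
        if entry["response"] is None:
            return ("conflict", None)  # maps to 409 idempotency_conflict
        return ("replay", entry["response"])

    def finish(self, api_key_id: str, key: str, response):
        """Cache the handler's response so later replays return it unchanged."""
        self._entries[(api_key_id, key)]["response"] = response
```

On the client side the practical rule is to generate one key per logical operation and reuse that same key, with the same body, on every retry of it.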
§ 03.5 Rate limits
Per-key sliding window: 100 requests/second sustained plus a 200-request burst. Counter resets every second.
When you exceed the window, the next request returns 429 rate_limited with a Retry-After header (seconds until the counter rolls over):
HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/problem+json
{
"code": "rate_limited",
"status": 429,
...
}
If our rate-limit cache hits an outage, we fail open rather than dropping your traffic: your requests will go through, and we get paged. Need higher limits for a specific key? Contact support@gpu.ai.
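A client can honor Retry-After with a short retry loop. The sketch below assumes a hypothetical send callable that performs one request and returns (status, headers, body), with headers keyed exactly as "Retry-After"; the one-second fallback for a missing or unparseable header is this sketch's choice, not documented behavior.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it stops returning 429 or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts:
            return response
        # Retry-After is given in seconds; fall back to 1s if it is absent
        # or unparseable (the fallback is an assumption of this sketch).
        try:
            delay = float(headers.get("Retry-After", "1"))
        except ValueError:
            delay = 1.0
        sleep(max(delay, 0.0))
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the loop testable without real delays. For writes, pair this with an Idempotency-Key so a retried request cannot run twice.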
§ 03.6 Request IDs
Every response carries an X-Request-Id header. The same id is echoed in the request_id field of every error envelope. Include it whenever you reach out for support — it lets us pull the full request trace in seconds.
You can supply your own (up to 128 chars) by setting X-Request-Id on the request. We honor it verbatim if present, otherwise we mint a 32-char hex id.
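If you want to correlate your own logs with ours, you can attach an id on the client side. A minimal sketch: with_request_id is a hypothetical helper, and generating a 32-character hex id locally simply mirrors the server's format; any string up to 128 characters should work according to the rule above.

```python
import secrets

MAX_REQUEST_ID_LENGTH = 128


def with_request_id(headers: dict, request_id=None) -> dict:
    """Return a copy of `headers` carrying an X-Request-Id."""
    if request_id is None:
        # 16 random bytes -> 32 hex characters, mirroring the server-minted format.
        request_id = secrets.token_hex(16)
    if not 0 < len(request_id) <= MAX_REQUEST_ID_LENGTH:
        raise ValueError("X-Request-Id must be 1 to 128 characters")
    return {**headers, "X-Request-Id": request_id}
```

Log the id you sent alongside the outcome of the call, so you have it ready if you need to contact support.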