gpu.ai Docs
UPDATED 2026.05.11
CH·03 API

Conventions.

Cross-cutting rules that apply to every /v1 endpoint. Read these once and you can skim the reference.

§ 03.1 Authentication

Every authenticated request carries a Bearer token in the standard Authorization header:

HEADER
Authorization: Bearer gpuai_live_AbCdEf0123456789...

Keys are issued from the dashboard or via gpuctl auth login. The server stores only a SHA-256 hash of the raw key — the plaintext is shown exactly once at creation.
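As a minimal sketch of the header format above, the helper below builds the Authorization header for a request. The function names are hypothetical, and `key_fingerprint` only illustrates what "storing a SHA-256 hash" means; the exact input and encoding the server hashes is an assumption.

```python
import hashlib


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header carried by every authenticated request."""
    if not api_key:
        raise ValueError("api_key must be non-empty")
    return {"Authorization": f"Bearer {api_key}"}


def key_fingerprint(api_key: str) -> str:
    """SHA-256 hex digest of the raw key.

    Illustrative only: the server's exact hashing input is an assumption.
    A digest like this cannot be reversed into the plaintext key, which is
    why the key can only be shown once at creation.
    """
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()
```

Since the plaintext cannot be recovered later, it is sensible to load the key from a secret store or environment variable rather than hard-coding it.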

§ 03.1.1 Scopes

Keys can carry shortcut scopes or per-resource permissions. Shortcuts:

  • full_access — read + write on everything.
  • read_only — read on everything, no writes.

Per-resource scopes set none, read, or write independently on the four resource families:

SCOPE OBJECT
{
  "instances": "write",
  "ssh_keys":  "write",
  "billing":   "read",
  "webhooks":  "write"
}

Calls failing scope checks return 403 insufficient_scope. Calls with no Bearer key return 401 unauthenticated; calls with a malformed, revoked, expired, or unknown key return 401 invalid_api_key.
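The scope rules above can be modeled client-side to predict whether a call would pass. This is a sketch of one reading, not the service's implementation: it assumes that write implies read, that a family missing from the scope object counts as none, and all names (`LEVELS`, `expand`, `allows`) are hypothetical.

```python
LEVELS = {"none": 0, "read": 1, "write": 2}
FAMILIES = ("instances", "ssh_keys", "billing", "webhooks")

# Shortcut scopes expanded to per-resource form.
SHORTCUTS = {
    "full_access": {family: "write" for family in FAMILIES},
    "read_only": {family: "read" for family in FAMILIES},
}


def expand(scopes):
    """Normalize a shortcut name or a scope object to a full per-family dict."""
    if isinstance(scopes, str):
        return dict(SHORTCUTS[scopes])
    # Assumption: a family omitted from the scope object means "none".
    return {family: scopes.get(family, "none") for family in FAMILIES}


def allows(scopes, family: str, needed: str) -> bool:
    """True if the key's scopes grant at least `needed` on `family`."""
    granted = expand(scopes)[family]
    # Assumption: "write" includes "read".
    return LEVELS[granted] >= LEVELS[needed]
```

With the scope object shown above, `allows(scope, "billing", "write")` is False, which corresponds to the case where the API would answer 403 insufficient_scope.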

§ 03.2 Errors (RFC 7807)

Every 4xx and 5xx response uses the application/problem+json envelope from RFC 7807. The shape is stable across the entire API:

PROBLEM+JSON
{
  "type":       "https://api.gpu.ai/errors/validation_failed",
  "title":      "Unprocessable Entity",
  "status":     422,
  "detail":     "gpu_count must be between 1 and 8",
  "code":       "validation_failed",
  "request_id": "8b1a8a9c0d1e2f3a4b5c6d7e8f9a0b1c"
}

Code                  Status  Meaning
unauthenticated       401     No Bearer key supplied.
invalid_api_key       401     Key is malformed, revoked, expired, or does not match a known hash.
insufficient_scope    403     Key authenticated but lacks the scope required for this operation.
not_found             404     Resource does not exist, or is not visible to your key (org-scoped).
idempotency_conflict  409     A request with this Idempotency-Key is already in progress. Retry later.
idempotency_mismatch  422     Idempotency key reused with a different request body within the 24h window.
validation_failed     422     Body or query parameters failed validation. See `detail`.
invalid_gpu_type      422     Unknown gpu_type. Pull /v1/gpu-types for the canonical list.
insufficient_balance  402     Account balance cannot cover the requested operation.
quota_exceeded        402     Org quota reached for this resource family.
rate_limited          429     Per-key sliding window exceeded. Honor the Retry-After header.
operation_failed      500     An async operation transitioned to `failed`. See operation.error for detail.
internal_error        500     Unexpected server-side fault. Safe to retry on idempotent ops.
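Because the envelope shape is stable, a client can parse every error the same way. The sketch below assumes the six fields shown in the example envelope; the `Problem` class and the `RETRYABLE` grouping are hypothetical and reflect one reading of the table (codes whose description says to retry), not an official classification.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    type: str
    title: str
    status: int
    detail: str
    code: str
    request_id: str


def parse_problem(body: str) -> Problem:
    """Parse an application/problem+json body into a Problem."""
    raw = json.loads(body)
    return Problem(
        type=raw.get("type", ""),
        title=raw.get("title", ""),
        status=int(raw.get("status", 0)),
        detail=raw.get("detail", ""),
        code=raw.get("code", ""),
        request_id=raw.get("request_id", ""),
    )


# Assumption: codes the table describes as retryable. internal_error is
# only safe to retry for idempotent operations.
RETRYABLE = {"idempotency_conflict", "rate_limited", "internal_error"}


def is_retryable(problem: Problem) -> bool:
    return problem.code in RETRYABLE
```

Branching on `code` rather than `status` keeps handlers precise, since several codes share a status (for example, three distinct codes map to 422).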

§ 03.3 Pagination

List endpoints return a cursor-paginated envelope. Pass ?limit (1–200, default 50) and ?cursor (the opaque next_cursor string returned with the previous page).

GET /v1/instances?limit=25
↓
{
  "data": [ ... 25 items ... ],
  "next_cursor": "MjAyNi0wNS0wOFQxNzo0MTo0MlR8aW5zXzAxSFg..."
}

GET /v1/instances?limit=25&cursor=MjAyNi0wNS0wOFQxNzo0MTo0MlR8aW5zXzAxSFg...
↓
{
  "data": [ ... next 25 items ... ],
  "next_cursor": null   // no more pages
}

The cursor is opaque — treat it as an unparseable string. We guarantee that round-tripping the same cursor returns the same page, but we make no commitments about its internal format. When next_cursor is null, you've reached the end.

Invalid cursors return 422 validation_failed.
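The paging loop described above can be sketched as a generator that follows next_cursor until it is null. To stay transport-agnostic, the sketch takes a caller-supplied `fetch_page` function (hypothetical) that performs the HTTP GET and returns the decoded envelope; `page_query` and `iter_items` are likewise illustrative names.

```python
from typing import Any, Callable, Iterator, Optional
from urllib.parse import urlencode


def page_query(limit: int = 50, cursor: Optional[str] = None) -> str:
    """Build the query string for one page, enforcing the documented limit range."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params: dict = {"limit": limit}
    if cursor is not None:
        params["cursor"] = cursor  # passed back untouched; never parsed
    return urlencode(params)


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[Any]:
    """Yield every item across all pages.

    fetch_page(cursor) must return the decoded {"data": [...], "next_cursor": ...}
    envelope; cursor is None for the first page.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("next_cursor")
        if cursor is None:  # null next_cursor marks the last page
            break
```

Keeping the cursor as an untouched string, as here, avoids depending on its internal format.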

§ 03.4 Idempotency

Every POST, PATCH, and DELETE accepts an Idempotency-Key header. Always send one for writes — it makes retries safe.

HEADER
Idempotency-Key: 8b1a8a9c-0d1e-2f3a-4b5c-6d7e8f9a0b1c

Behavior, scoped to (api_key_id, key) over a 24-hour window:

  • First request: handler runs; status, body, and Operation-Id are cached.
  • Replay (same body): the stored response is replayed byte-for-byte. Safe to retry indefinitely within 24h.
  • Replay (different body): 422 idempotency_mismatch. Use a fresh key if you genuinely want a new operation.
  • In-flight: 409 idempotency_conflict. Another request with the same key is still running. Back off and retry.

This behavior is compatible with Stripe's SDK conventions: if you've wired Stripe-style retry helpers, they work as-is.
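The four outcomes listed above can be captured in a toy in-memory model. This is a sketch for reasoning about client behavior, not the service's implementation: the class name, the body digest comparison, and the choice to check for a body mismatch before the in-flight case are all assumptions.

```python
import hashlib
import time


class IdempotencyStore:
    """Toy model of idempotency outcomes, scoped to (api_key_id, key)."""

    WINDOW_SECONDS = 24 * 3600

    def __init__(self, clock=time.time):
        self._entries = {}
        self._clock = clock

    def begin(self, api_key_id: str, key: str, body: bytes):
        """Return (outcome, cached_response) for an incoming write."""
        now = self._clock()
        slot = (api_key_id, key)
        digest = hashlib.sha256(body).hexdigest()
        entry = self._entries.get(slot)
        if entry is None or now - entry["at"] >= self.WINDOW_SECONDS:
            # First request (or window expired): the handler runs.
            self._entries[slot] = {"at": now, "digest": digest, "response": None}
            return ("run", None)
        if entry["digest"] != digest:
            return ("mismatch", None)  # maps to 422 idempotency_mismatch
        if entry["response"] is None:
            return ("conflict", None)  # maps to 409 idempotency_conflict
        return ("replay", entry["response"])  # stored response is replayed

    def finish(self, api_key_id: str, key: str, response) -> None:
        """Cache the handler's result so later replays return it."""
        self._entries[(api_key_id, key)]["response"] = response
```

On the client side, the practical consequence is to generate one key per logical operation and reuse that same key on every retry of it.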

§ 03.5 Rate limits

Per-key sliding window: 100 requests/second sustained, plus a 200-request burst. The counter resets every second.

When you exceed the window, the next request returns 429 rate_limited with a Retry-After header (seconds until the counter rolls over):

HTTP RESPONSE
HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/problem+json

{
  "code":   "rate_limited",
  "status": 429,
  ...
}

If our rate-limit cache hits an outage we fail open rather than dropping your traffic — your requests will go through, and we get paged. Need higher limits for a specific key? Contact support@gpu.ai.
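One way to honor Retry-After on a 429 is a small retry wrapper. The sketch below is transport-agnostic: `send` is a caller-supplied function (hypothetical) that performs the request and returns a (status, headers, body) tuple. The 1-second fallback when the header is missing or unparseable and the attempt cap are assumptions, not documented behavior.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Seconds to wait, read from Retry-After (header names are case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered["retry-after"]))
    except (KeyError, ValueError):
        return default  # assumption: fall back to one second


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), waiting out 429 responses up to max_attempts total tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts:
            return response
        sleep(retry_delay(headers))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays. For writes, pairing the retry with a fixed Idempotency-Key keeps repeated attempts safe.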

§ 03.6 Request IDs

Every response carries an X-Request-Id header. The same id is echoed in the request_id field of every error envelope. Include it whenever you reach out for support — it lets us pull the full request trace in seconds.

You can supply your own id (up to 128 chars) by setting X-Request-Id on the request. If present, we honor it verbatim; otherwise we mint a 32-char hex id.
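A client that wants to correlate its own logs with server traces can set the header itself. In this sketch the helper name is hypothetical, and generating a 32-char hex id client-side simply mirrors the default shape described above; any string up to 128 characters would do.

```python
import secrets
from typing import Optional

MAX_REQUEST_ID_LEN = 128


def request_id_header(request_id: Optional[str] = None) -> dict:
    """Build an X-Request-Id header, minting a 32-char hex id if none is given."""
    if request_id is None:
        request_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    if not request_id or len(request_id) > MAX_REQUEST_ID_LEN:
        raise ValueError("request id must be 1 to 128 characters")
    return {"X-Request-Id": request_id}
```

Logging the id alongside each outgoing request makes it straightforward to quote it when contacting support.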