Conventions.

Cross-cutting rules that apply to every /v1 endpoint. Read these once and you can skim the reference.

§ 03.1 Authentication

Every authenticated request carries a Bearer token in the standard Authorization header:

HEADER

Authorization: Bearer gpuai_live_AbCdEf0123456789...

Keys are issued from the dashboard or via gpuctl auth login. The server stores only a SHA-256 hash of the raw key — the plaintext is shown exactly once at creation.
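A minimal Python sketch of both halves of this scheme: building the header on the client, and checking a presented key against a stored SHA-256 digest. The function names and the hex storage format are assumptions for illustration; the server's actual implementation is not documented here.

```python
import hashlib
import hmac


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the Authorization header for an authenticated request."""
    return {"Authorization": f"Bearer {api_key}"}


def hash_key(api_key: str) -> str:
    """Hex SHA-256 digest of the raw key (assumed storage format)."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def key_matches(presented_key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare digests in constant time."""
    return hmac.compare_digest(hash_key(presented_key), stored_hash)
```

Because only the digest is kept, a lost key cannot be recovered from the server side; it has to be reissued.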

§ 03.1.1 Scopes

Keys can carry shortcut scopes or per-resource permissions. Shortcuts:

full_access — read + write on everything.
read_only — read on everything, no writes.

Per-resource scopes set none, read, or write independently on the four resource families:

SCOPE OBJECT

{
  "instances": "write",
  "ssh_keys":  "write",
  "billing":   "read",
  "webhooks":  "write"
}

Calls failing scope checks return 403 insufficient_scope. Calls with no Bearer key return 401 unauthenticated; calls with a malformed, revoked, expired, or unknown key return 401 invalid_api_key.
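The scope rules above can be modeled client-side to predict whether a call will hit 403 insufficient_scope. This is a sketch under two assumptions not stated in the text: that write implies read on the same resource family, and that a family missing from the scope object counts as none. The `allows` helper is hypothetical.

```python
# Ordering of permission levels (assumption: write implies read).
LEVELS = {"none": 0, "read": 1, "write": 2}

# Shortcut scopes expand to one level applied to every resource family.
SHORTCUTS = {"full_access": "write", "read_only": "read"}


def allows(scopes, resource: str, needed: str) -> bool:
    """Return True if `scopes` grants at least `needed` on `resource`.

    `scopes` is either a shortcut name or a per-resource scope object.
    """
    if isinstance(scopes, str):
        granted = SHORTCUTS[scopes]
    else:
        # Assumption: an unlisted resource family is treated as "none".
        granted = scopes.get(resource, "none")
    return LEVELS[granted] >= LEVELS[needed]
```

With the scope object shown above, `allows(scopes, "billing", "write")` is False, so a billing write would be expected to fail the scope check.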

§ 03.2 Errors (RFC 7807)

Every 4xx and 5xx response uses the application/problem+json envelope from RFC 7807. The shape is stable across the entire API:

PROBLEM+JSON

{
  "type":       "https://api.gpu.ai/errors/validation_failed",
  "title":      "Unprocessable Entity",
  "status":     422,
  "detail":     "gpu_count must be between 1 and 8",
  "code":       "validation_failed",
  "request_id": "8b1a8a9c0d1e2f3a4b5c6d7e8f9a0b1c"
}

Code	Status	Meaning
`unauthenticated`	`401`	No Bearer key supplied.
`invalid_api_key`	`401`	Key is malformed, revoked, expired, or does not match a known hash.
`insufficient_scope`	`403`	Key authenticated but lacks the scope required for this operation.
`not_found`	`404`	Resource does not exist, or is not visible to your key (org-scoped).
`idempotency_conflict`	`409`	A request with this Idempotency-Key is already in progress. Retry later.
`idempotency_mismatch`	`422`	Idempotency key reused with a different request body within the 24h window.
`validation_failed`	`422`	Body or query parameters failed validation. See `detail`.
`invalid_gpu_type`	`422`	Unknown gpu_type. Pull /v1/gpu-types for the canonical list.
`insufficient_balance`	`402`	Account balance cannot cover the requested operation.
`quota_exceeded`	`402`	Org quota reached for this resource family.
`rate_limited`	`429`	Per-key sliding window exceeded. Honor the Retry-After header.
`operation_failed`	`500`	An async operation transitioned to `failed`. See operation.error for detail.
`internal_error`	`500`	Unexpected server-side fault. Safe to retry on idempotent ops.
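Since the envelope is stable, a client can map it onto a single exception type. The sketch below is one way to do that; the `ApiError` class and the choice of which codes count as retryable are assumptions drawn from the table, not part of the API.

```python
import json

# Codes this sketch treats as retryable. internal_error is only safe to
# retry on idempotent operations, so callers should check that themselves.
RETRYABLE_CODES = {"idempotency_conflict", "rate_limited", "internal_error"}


class ApiError(Exception):
    """Exception built from an application/problem+json body."""

    def __init__(self, problem: dict):
        self.code = problem.get("code", "unknown")
        self.status = problem.get("status", 0)
        self.title = problem.get("title", "")
        self.detail = problem.get("detail", "")
        self.request_id = problem.get("request_id", "")
        super().__init__(
            f"{self.status} {self.code}: {self.detail} "
            f"(request_id={self.request_id})"
        )

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_problem(body: str) -> ApiError:
    """Decode a problem+json response body into an ApiError."""
    return ApiError(json.loads(body))
```

Branching on `code` rather than `status` keeps handlers precise, because several codes share a status (for example, three different codes return 422).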

§ 03.3 Pagination

List endpoints return a cursor-paginated envelope. Pass ?limit (1–200, default 50) and ?cursor (the opaque next_cursor string from the previous page; omit it to fetch the first page).

GET /v1/instances?limit=25
↓
{
  "data": [ ... 25 items ... ],
  "next_cursor": "MjAyNi0wNS0wOFQxNzo0MTo0MlR8aW5zXzAxSFg..."
}

GET /v1/instances?limit=25&cursor=MjAyNi0wNS0wOFQxNzo0MTo0MlR8aW5zXzAxSFg...
↓
{
  "data": [ ... next 25 items ... ],
  "next_cursor": null   // no more pages
}

The cursor is opaque — treat it as an unparseable string. We guarantee that round-tripping the same cursor returns the same page, but we make no commitments about its internal format. When next_cursor is null, you've reached the end.

Invalid cursors return 422 validation_failed.
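Walking every page reduces to a loop that feeds each next_cursor back in until it is null. In this sketch, `fetch_page` is a hypothetical callable you supply: it should issue the GET with the given cursor (or none for the first page) and return the decoded envelope.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item from a cursor-paginated list endpoint."""
    cursor = None  # no cursor on the first request
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        # The cursor is passed back verbatim; it is never parsed.
        cursor = page.get("next_cursor")
        if cursor is None:
            return  # null next_cursor means the last page was reached
```

Because this is a generator, a caller that stops iterating early never requests the remaining pages.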

§ 03.4 Idempotency

Every POST, PATCH, and DELETE accepts an Idempotency-Key header. Always send one for writes — it makes retries safe.

HEADER

Idempotency-Key: 8b1a8a9c-0d1e-2f3a-4b5c-6d7e8f9a0b1c

Behavior, scoped to (api_key_id, key) over a 24-hour window:

First request: handler runs; status, body, and Operation-Id are cached.
Replay (same body): the stored response is replayed byte-for-byte. Safe to retry indefinitely within 24h.
Replay (different body): 422 idempotency_mismatch. Use a fresh key if you genuinely want a new operation.
In-flight: 409 idempotency_conflict. Another request with the same key is still running. Back off and retry.

This behavior is compatible with Stripe's SDK conventions: if you've wired Stripe-style retry helpers, they work as-is.
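On the client, the important detail is to generate the key once per logical operation and reuse it on every retry. The sketch below backs off while the server reports an in-flight request; `send` is a hypothetical callable that performs the HTTP write and returns `(status, decoded_body)`, and the backoff schedule is an arbitrary choice.

```python
import time
import uuid
from typing import Callable


def send_with_idempotency(
    send: Callable[[dict, dict], tuple[int, dict]],
    body: dict,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, dict]:
    """Send a write with one Idempotency-Key reused across all attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # One key per logical operation, never one per attempt.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(max_attempts):
        status, response = send(headers, body)
        in_flight = (
            status == 409 and response.get("code") == "idempotency_conflict"
        )
        if not in_flight:
            return status, response
        if attempt < max_attempts - 1:
            # Exponential backoff: 0.1s, 0.2s, 0.4s, ... capped at 2s.
            sleep(min(2 ** attempt * 0.1, 2.0))
    return status, response
```

A 422 idempotency_mismatch is returned to the caller rather than retried, since it signals that the body changed and a fresh key is needed.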

§ 03.5 Rate limits

Per-key sliding window: 100 requests/second sustained plus a 200-request burst. Counter resets every second.

When you exceed the window, the next request returns 429 rate_limited with a Retry-After header (seconds until the counter rolls over):

HTTP RESPONSE

HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/problem+json

{
  "code":   "rate_limited",
  "status": 429,
  ...
}

If our rate-limit cache hits an outage we fail open rather than dropping your traffic — your requests will go through, and we get paged. Need higher limits for a specific key? Contact support@gpu.ai.
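Honoring Retry-After can be as small as the sketch below. The `call` argument is a hypothetical callable returning `(status, headers, body)`, and the fallback delay of one second is an assumption for responses where the header is missing or not a number.

```python
import time
from typing import Callable, Mapping


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds, falling back to `default`."""
    # Note: a plain dict lookup is case-sensitive; most HTTP client
    # libraries expose a case-insensitive header mapping instead.
    value = headers.get("Retry-After")
    try:
        return max(float(value), 0.0)
    except (TypeError, ValueError):
        return default


def call_with_rate_limit_retry(
    call: Callable[[], tuple],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple:
    """Invoke `call`, waiting out 429 responses up to `max_attempts` times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = call()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        sleep(retry_after_seconds(headers))
```

If the final attempt is still rate limited, the 429 response is returned as-is so the caller can decide what to do next.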

§ 03.6 Request IDs

Every response carries an X-Request-Id header. The same id is echoed in the request_id field of every error envelope. Include it whenever you reach out for support — it lets us pull the full request trace in seconds.

You can supply your own (up to 128 chars) by setting X-Request-Id on the request. We honor it verbatim if present, otherwise we mint a 32-char hex id.
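If you want to correlate your own logs with ours, you can mint the id client-side. A small sketch, with hypothetical helper names, that enforces the 128-character limit before sending and defaults to a 32-char hex id of the same shape the server mints:

```python
import secrets
from typing import Optional

MAX_REQUEST_ID_LEN = 128


def new_request_id() -> str:
    """Generate a random 32-character hex id (16 random bytes)."""
    return secrets.token_hex(16)


def request_id_header(request_id: Optional[str] = None) -> dict[str, str]:
    """Build the X-Request-Id header, generating an id if none is given."""
    rid = request_id or new_request_id()
    if len(rid) > MAX_REQUEST_ID_LEN:
        raise ValueError(f"request id exceeds {MAX_REQUEST_ID_LEN} characters")
    return {"X-Request-Id": rid}
```

Logging the id you sent alongside the response status gives you something to quote in a support request even when the response body is lost.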