Supplier API
A lightweight REST API that GPU providers implement to list inventory, provision instances, and integrate with GPU.ai. Designed for small GPU farms and colocations without existing APIs.
§ 07.1 How it works
You implement three REST endpoints on your infrastructure. GPU.ai polls your inventory every 30 seconds, sends provision requests when customers order GPUs, and polls instance status for lifecycle tracking. You don't need to call GPU.ai — we call you.
- Contact integrations@gpu.ai to receive your `client_id` and `client_secret`.
- Implement the three endpoints described below.
- GPU.ai configures your base URL and starts polling automatically.
§ 07.2 Authentication
GPU.ai authenticates to your endpoints using Bearer tokens obtained via the OAuth2 client credentials flow. GPU.ai calls its own token endpoint with your credentials, receives a short-lived token, and sends it in the Authorization header on every request.
Authorization: Bearer eyJhbGciOiJS...

Tokens expire after 1 hour. GPU.ai handles token refresh automatically — your endpoints just need to validate the Bearer token on each request.
§ 07.3 Base URL
All endpoints are relative to your base URL, configured during onboarding. For example, if your base URL is https://api.acme-gpu.com, the availability endpoint would be at https://api.acme-gpu.com/v1/gpu/available.
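A minimal sketch of how the full endpoint URLs are composed, using the example base URL from this section:

```python
BASE_URL = "https://api.acme-gpu.com"  # example base URL from the docs

def endpoint(path: str) -> str:
    """Join a spec path (always starting with /v1/...) onto the base URL."""
    return BASE_URL.rstrip("/") + path
```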
§ 07.4 Endpoints
GET /v1/gpu/available

Returns your current GPU offerings with pricing and availability. GPU.ai polls this every 30 seconds. Only return GPUs that are currently provisionable.
Response body
- `offerings` (array, required)

{
"offerings": [
{
"gpu_type": "h100_sxm",
"gpu_count": 1,
"vram_per_gpu_gb": 80,
"cpu_cores": 24,
"ram_gb": 128,
"storage_gb": 500,
"price_per_hour": 3.49,
"tier": "on_demand",
"region": "US",
"datacenter_location": "US-East-1",
"stock_status": "High",
"available_count": 12
}
]
}

POST /v1/instances

Provisions a new GPU instance. The startup_script field contains a bootstrap script that must be executed on instance boot — it establishes the SSH tunnel back to GPU.ai. Return immediately with your instance ID; GPU.ai polls status separately.
Request body
- `instance_id` (string, required)
- `gpu_type` (string, required): canonical GPU type identifier (e.g. `h100_sxm`, `a100_80gb`).
- `gpu_count` (integer, required)
- `tier` (string, required): `on_demand` or `spot`.
- `region` (string)
- `ssh_public_keys` (string[])
- `docker_image` (string)
- `startup_script` (string)

Response body

- `upstream_id` (string, required)
- `status` (string, required): initially `creating`.
- `cost_per_hour` (number)
- `estimated_ready_seconds` (integer)
- `datacenter_location` (string)
- `region` (string)

{
"upstream_id": "sup-12345",
"status": "creating",
"cost_per_hour": 3.49,
"estimated_ready_seconds": 60,
"datacenter_location": "US-East-1",
"region": "US"
}

GET /v1/instances/{id}

Returns the current status of a provisioned instance. GPU.ai polls this to track lifecycle transitions.
Response body
- `upstream_id` (string, required)
- `status` (string, required): one of `creating`, `running`, `stopping`, `terminated`, `error`.
- `ip` (string)
- `cost_per_hour` (number)
- `uptime_seconds` (integer)

{
"upstream_id": "sup-12345",
"status": "running",
"ip": "10.0.1.55",
"cost_per_hour": 3.49,
"uptime_seconds": 3600
}

DELETE /v1/instances/{id}

Terminates a running instance and releases all resources. This must be idempotent — terminating an already-terminated instance should return 204 without error.
Returns 204 No Content on success (no response body).
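The endpoints above can be sketched as framework-agnostic handlers over an in-memory store. Everything below the spec-defined field names (the store, the hardcoded offering, the `sup-N` ID scheme) is hypothetical; a real supplier would call its own orchestration layer. Each handler returns a `(status_code, body)` pair:

```python
import itertools

_instances: dict = {}       # hypothetical in-memory instance store
_ids = itertools.count(1)   # illustrative upstream_id sequence

def get_available():
    """GET /v1/gpu/available: only list offerings that are provisionable now."""
    return 200, {"offerings": [{
        "gpu_type": "h100_sxm", "gpu_count": 1, "vram_per_gpu_gb": 80,
        "cpu_cores": 24, "ram_gb": 128, "storage_gb": 500,
        "price_per_hour": 3.49, "tier": "on_demand", "region": "US",
        "datacenter_location": "US-East-1", "stock_status": "High",
        "available_count": 12,
    }]}

def post_instances(body: dict):
    """POST /v1/instances: accept the order and return immediately.
    The actual boot (and running the startup_script) happens asynchronously."""
    uid = f"sup-{next(_ids)}"
    _instances[uid] = {"upstream_id": uid, "status": "creating",
                       "cost_per_hour": 3.49, "uptime_seconds": 0}
    return 200, {"upstream_id": uid, "status": "creating",
                 "cost_per_hour": 3.49, "estimated_ready_seconds": 60,
                 "datacenter_location": "US-East-1", "region": "US"}

def get_instance(uid: str):
    """GET /v1/instances/{id}: report current lifecycle status."""
    inst = _instances.get(uid)
    return (200, inst) if inst else (404, None)

def delete_instance(uid: str):
    """DELETE /v1/instances/{id}: idempotent — unknown or already-terminated
    IDs still return 204 with no body."""
    _instances.pop(uid, None)
    return 204, None
```

Note how `delete_instance` never errors on a repeated call, matching the idempotency requirement above.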
§ 07.5 GPU type identifiers
Use GPU.ai's canonical GPU type strings in your gpu_type fields. These must match exactly.
| Identifier | GPU | VRAM |
|---|---|---|
| h200_sxm | NVIDIA H200 SXM | 141 GB |
| h100_sxm | NVIDIA H100 SXM | 80 GB |
| h100_pcie | NVIDIA H100 PCIe | 80 GB |
| a100_80gb | NVIDIA A100 | 80 GB |
| l40s | NVIDIA L40S | 48 GB |
| rtx_4090 | NVIDIA RTX 4090 | 24 GB |
| rtx_3090 | NVIDIA RTX 3090 | 24 GB |
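Since the identifiers must match exactly, it is worth normalizing your internal labels before emitting them. A small sketch, using the canonical set from the table above (the normalization rule itself is an assumption about typical internal naming, not part of the spec):

```python
# Canonical GPU.ai type identifiers, copied from the table above.
CANONICAL_GPU_TYPES = {
    "h200_sxm", "h100_sxm", "h100_pcie", "a100_80gb",
    "l40s", "rtx_4090", "rtx_3090",
}

def normalize_gpu_type(raw: str) -> str:
    """Map an internal label like 'H100-SXM' to a canonical identifier.
    Raises ValueError rather than emitting a string GPU.ai won't recognize."""
    candidate = raw.strip().lower().replace("-", "_")
    if candidate not in CANONICAL_GPU_TYPES:
        raise ValueError(f"unknown gpu_type: {raw!r}")
    return candidate
```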
§ 07.6 Error handling
GPU.ai retries on 5xx errors with exponential backoff (up to 3 attempts). Return 429 with a Retry-After header if you need to rate limit. 4xx errors (except 429) are not retried.
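For intuition, the retry schedule can be sketched as below. The base delay and doubling factor are illustrative assumptions; the doc specifies only "exponential backoff, up to 3 attempts":

```python
def retry_delays(attempts: int = 3, base: float = 1.0) -> list[float]:
    """Illustrative exponential backoff schedule (base, 2*base, 4*base, ...).
    The actual delays GPU.ai uses are not specified here."""
    return [base * (2 ** i) for i in range(attempts)]

# A rate-limited response should pair 429 with a Retry-After header,
# e.g. (status, headers, body):
RATE_LIMITED = (429, {"Retry-After": "30"}, None)
```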
§ 07.7 OpenAPI spec
The full OpenAPI 3.0.3 specification is available for download. Import it into Postman, Swagger UI, or your API client of choice to explore the endpoints interactively.
View openapi.yaml on GitHub

§ 07.8 Get started
Ready to integrate? Contact integrations@gpu.ai with your company name, datacenter locations, and GPU inventory. We'll provision your credentials and walk you through onboarding.