{"openapi":"3.1.0","info":{"title":"NUREV Inference API","version":"1.0.0","description":"OpenAI-compatible inference API. Routes requests through NUREV's edge worker fleet (operator hardware) with automatic fallback to Together.ai and OpenRouter. Same request/response shape as the OpenAI API; most OpenAI SDK clients work by changing the `base_url` to `https://api.app.nurev.io/v1`.\n\n**Authentication**: Bearer token. API keys are prefixed with `nrev_` and issued via the admin console. Contact sales@nurev.io for pilot access.\n\n**Status**: Pre-launch, invitation-only pilot. Architecture verified end-to-end; no real customer traffic yet. Pricing, SLA, and contract terms are negotiated with sales for each partner during the pilot phase.","contact":{"name":"NUREV API support","email":"support@nurev.io","url":"https://nurev.io"},"license":{"name":"Proprietary"}},"servers":[{"url":"https://api.app.nurev.io","description":"Production"}],"security":[{"bearerAuth":[]}],"tags":[{"name":"Inference","description":"Synchronous inference endpoints. Embeddings serve from edge workers at 5–50ms; chat completions currently route through commercial providers (Together.ai / OpenRouter) with edge support shipping in v2."},{"name":"Batch","description":"Asynchronous batch processing for large embedding workloads. Queue an inputs array, receive a webhook on completion, fetch results via a signed object-storage URL."},{"name":"Models","description":"Model catalog. Returns the union of models served across all currently connected edge workers plus the commercial fallback providers."}],"paths":{"/v1/embeddings":{"post":{"tags":["Inference"],"summary":"Generate embeddings","description":"OpenAI-compatible embeddings endpoint. Returns a vector representation for each input string. 
Server preferentially routes to a connected edge worker that has the requested model loaded; falls through to Together.ai / OpenRouter if no worker is available.\n\nResponses are deterministic and cached server-side for ~1 hour (cache-hit p95 ~5ms, miss p95 ~150ms). Cache hits are billed at retail; provider attribution distinguishes them in your usage report.","operationId":"createEmbeddings","requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/EmbeddingsRequest"},"examples":{"single":{"summary":"Single input string","value":{"model":"nomic-embed-text","input":"The quick brown fox jumps over the lazy dog."}},"multiple":{"summary":"Multiple inputs (returns array preserving order)","value":{"model":"nomic-embed-text","input":["First chunk of text","Second chunk of text","Third chunk of text"]}}}}}},"responses":{"200":{"description":"Embeddings generated successfully","content":{"application/json":{"schema":{"$ref":"#/components/schemas/EmbeddingsResponse"}}}},"400":{"$ref":"#/components/responses/BadRequest"},"401":{"$ref":"#/components/responses/Unauthorized"},"403":{"$ref":"#/components/responses/Forbidden"},"429":{"$ref":"#/components/responses/RateLimited"},"500":{"$ref":"#/components/responses/ServerError"},"503":{"$ref":"#/components/responses/Unavailable"}}}},"/v1/chat/completions":{"post":{"tags":["Inference"],"summary":"Generate chat completion","description":"OpenAI-compatible chat completions endpoint. Same request/response shape as `https://api.openai.com/v1/chat/completions`.\n\n**Note**: edge worker support for chat completions ships in v2. Today, requests route through Together.ai / OpenRouter — same wire format, our markup applied, same billing pipeline. 
Streaming via Server-Sent Events ships in v2 alongside edge support.","operationId":"createChatCompletion","requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/ChatCompletionRequest"}}}},"responses":{"200":{"description":"Chat completion generated successfully","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ChatCompletionResponse"}}}},"400":{"$ref":"#/components/responses/BadRequest"},"401":{"$ref":"#/components/responses/Unauthorized"},"403":{"$ref":"#/components/responses/Forbidden"},"429":{"$ref":"#/components/responses/RateLimited"},"500":{"$ref":"#/components/responses/ServerError"},"503":{"$ref":"#/components/responses/Unavailable"}}}},"/v1/models":{"get":{"tags":["Models"],"summary":"List available models","description":"Returns the union of all models currently servable across the NUREV fleet — edge workers (per-worker-pulled), Together.ai, and OpenRouter. Shape matches the OpenAI `/v1/models` response.","operationId":"listModels","responses":{"200":{"description":"Model list","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ModelList"}}}},"401":{"$ref":"#/components/responses/Unauthorized"}}}},"/v1/batch/embeddings":{"post":{"tags":["Batch"],"summary":"Submit a batch embedding job","description":"Asynchronously embed a large array of inputs. Returns a `batch_id` immediately; processing happens in the background. Subscribe to completion via the optional webhook (HMAC-SHA-256 signed — see `webhookSecret` returned on customer creation) or poll `GET /v1/batch/{id}`.\n\nResult vectors land in a customer-isolated object-storage URL accessible via `GET /v1/batch/{id}/results`. 
The URL is a pre-signed link valid for 24 hours.","operationId":"createBatchEmbeddings","requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/BatchEmbeddingsRequest"}}}},"responses":{"201":{"description":"Batch accepted for processing","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Batch"}}}},"400":{"$ref":"#/components/responses/BadRequest"},"401":{"$ref":"#/components/responses/Unauthorized"},"403":{"$ref":"#/components/responses/Forbidden"},"429":{"$ref":"#/components/responses/RateLimited"}}}},"/v1/batch/{id}":{"get":{"tags":["Batch"],"summary":"Get batch status","operationId":"getBatch","parameters":[{"name":"id","in":"path","required":true,"schema":{"type":"string","format":"uuid"}}],"responses":{"200":{"description":"Batch status","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Batch"}}}},"401":{"$ref":"#/components/responses/Unauthorized"},"404":{"description":"Batch not found or not owned by this customer"}}}},"/v1/batch/{id}/results":{"get":{"tags":["Batch"],"summary":"Fetch batch results","description":"Returns a pre-signed object-storage URL (valid for 24 hours) that points to the result vectors. Only returns 200 once the batch is `completed`.","operationId":"getBatchResults","parameters":[{"name":"id","in":"path","required":true,"schema":{"type":"string","format":"uuid"}}],"responses":{"200":{"description":"Signed results URL","content":{"application/json":{"schema":{"type":"object","properties":{"url":{"type":"string","format":"uri","description":"Pre-signed URL valid for 24 hours"},"expires_at":{"type":"string","format":"date-time"}}}}}},"401":{"$ref":"#/components/responses/Unauthorized"},"404":{"description":"Batch not found or not yet complete"}}}}},"components":{"securitySchemes":{"bearerAuth":{"type":"http","scheme":"bearer","bearerFormat":"nrev_*","description":"API key prefixed with `nrev_`. 
Pass as `Authorization: Bearer nrev_...` on every request."}},"schemas":{"EmbeddingsRequest":{"type":"object","required":["model","input"],"properties":{"model":{"type":"string","description":"Model identifier. Currently supported on edge: `nomic-embed-text`, `mxbai-embed-large`, `all-minilm`, `snowflake-arctic-embed`. Commercial fallback supports a wider set — see `GET /v1/models`.","examples":["nomic-embed-text"]},"input":{"oneOf":[{"type":"string","description":"Single input string"},{"type":"array","items":{"type":"string"},"description":"Multiple input strings (returns array preserving order)"}]},"encoding_format":{"type":"string","enum":["float","base64"],"default":"float","description":"Vector encoding. `float` returns plain JSON arrays; `base64` returns base64-encoded float32 binary for transport efficiency. Same OpenAI semantics."}}},"EmbeddingsResponse":{"type":"object","properties":{"object":{"type":"string","enum":["list"]},"data":{"type":"array","items":{"type":"object","properties":{"object":{"type":"string","enum":["embedding"]},"index":{"type":"integer"},"embedding":{"oneOf":[{"type":"array","items":{"type":"number"}},{"type":"string","description":"base64-encoded float32 array if encoding_format=base64"}]}}}},"model":{"type":"string"},"usage":{"type":"object","properties":{"prompt_tokens":{"type":"integer"},"total_tokens":{"type":"integer"}}}}},"ChatCompletionRequest":{"type":"object","required":["model","messages"],"properties":{"model":{"type":"string"},"messages":{"type":"array","items":{"type":"object","required":["role","content"],"properties":{"role":{"type":"string","enum":["system","user","assistant"]},"content":{"type":"string"}}}},"temperature":{"type":"number","minimum":0,"maximum":2,"default":1},"max_tokens":{"type":"integer","minimum":1},"top_p":{"type":"number","minimum":0,"maximum":1,"default":1},"stream":{"type":"boolean","default":false,"description":"Server-Sent Events streaming. 
Ships in v2."}}},"ChatCompletionResponse":{"type":"object","properties":{"id":{"type":"string"},"object":{"type":"string","enum":["chat.completion"]},"created":{"type":"integer"},"model":{"type":"string"},"choices":{"type":"array","items":{"type":"object","properties":{"index":{"type":"integer"},"message":{"type":"object","properties":{"role":{"type":"string","enum":["assistant"]},"content":{"type":"string"}}},"finish_reason":{"type":"string","enum":["stop","length","content_filter"]}}}},"usage":{"type":"object","properties":{"prompt_tokens":{"type":"integer"},"completion_tokens":{"type":"integer"},"total_tokens":{"type":"integer"}}}}},"BatchEmbeddingsRequest":{"type":"object","required":["model","inputs"],"properties":{"model":{"type":"string"},"inputs":{"type":"array","items":{"type":"string"},"minItems":1,"maxItems":100000,"description":"Input strings to embed. Returned vectors preserve order."},"webhook_url":{"type":"string","format":"uri","description":"Optional. POST {event, batch_id, status, totals} to this URL on completion. 
Signed with HMAC-SHA-256 using your customer `webhookSecret`; verify the `X-NUREV-Signature: sha256=<hex>` header before trusting the payload."}}},"Batch":{"type":"object","properties":{"id":{"type":"string","format":"uuid"},"object":{"type":"string","enum":["batch"]},"status":{"type":"string","enum":["pending","processing","completed","failed","cancelled"]},"model":{"type":"string"},"total_jobs":{"type":"integer"},"completed_jobs":{"type":"integer"},"failed_jobs":{"type":"integer"},"total_tokens":{"type":"integer"},"created_at":{"type":"string","format":"date-time"},"completed_at":{"type":["string","null"],"format":"date-time"}}},"ModelList":{"type":"object","properties":{"object":{"type":"string","enum":["list"]},"data":{"type":"array","items":{"type":"object","properties":{"id":{"type":"string"},"object":{"type":"string","enum":["model"]},"owned_by":{"type":"string"}}}}}},"Error":{"type":"object","properties":{"error":{"type":"object","properties":{"message":{"type":"string"},"type":{"type":"string","enum":["invalid_request_error","authentication_error","permission_error","rate_limit_error","server_error"]},"retry_after":{"type":"integer","description":"Seconds to wait before retrying (rate_limit_error only)"}}}}}},"responses":{"BadRequest":{"description":"Missing or malformed request parameter","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Error"}}}},"Unauthorized":{"description":"Missing or invalid API key","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Error"}}}},"Forbidden":{"description":"Account suspended or monthly spend limit reached","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Error"}}}},"RateLimited":{"description":"Per-key RPM or TPM limit exceeded","headers":{"Retry-After":{"schema":{"type":"integer"},"description":"Seconds until the limit resets"}},"content":{"application/json":{"schema":{"$ref":"#/components/schemas/Error"}}}},"ServerError":{"description":"Internal 
server error","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Error"}}}},"Unavailable":{"description":"All upstream providers unavailable. Retry with exponential backoff.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Error"}}}}}}}