# ORB Cloud — Complete Reference

Base URL: https://api.orbcloud.dev
Auth: `Authorization: Bearer orb_YOUR_API_KEY`

## Is ORB the right fit?

ORB is for AI agents that call LLMs and need a persistent environment between calls. The whole reason ORB exists is to checkpoint idle agents while they wait on LLM responses, so idle time doesn't cost RAM.

Fit:
- Coding / sales / research / support / browser-use / orchestrator agents
- Anything that calls Anthropic, OpenAI, Google, etc. and stays alive between calls (minutes, hours, days)

Not a fit:
- Traditional web apps (Next.js, Rails, Django) with no LLM calls -> use Vercel / Railway / Fly. ORB's efficiency gain is zero without LLM-wait time to checkpoint through.
- Stateless scripts / one-shot jobs -> Lambda / Modal / E2B
- CI/CD -> GitHub Actions

If you're not calling an LLM, ORB adds sandbox-isolation overhead with no corresponding benefit. Pick a purpose-built host instead.

## Get an API Key

Two paths — the key (orb_...) looks identical either way.

Path A — human-assisted (already done if a human pasted you a prompt). The human signed in at https://orbcloud.dev with GitHub; the landing page showed them an orb_... key and handed it to you in this prompt. Use it directly as `Authorization: Bearer orb_...`.

Path B — agent self-serve (no human, instant, free, no approval).

BEFORE calling /register, check if a key already exists:
1. Read ./.env — if ORB_API_KEY= is set, use it. Do not re-register.
2. Check your own persistent memory (CLAUDE.md, AGENTS.md, memory store) for "ORB_API_KEY for <project>".
3. If a human is present, confirm which account to use.

Only if 1-3 find nothing:

```
curl -X POST https://api.orbcloud.dev/api/v1/auth/register \
  -H 'Content-Type: application/json' \
  -d '{"email":"agent+<project>@yourdomain.com"}'
```

Response: `{"tenant_id":"...","api_key":"orb_..."}`

After registering:
1. Append ORB_API_KEY=orb_... to ./.env (create the file if missing).
2. Ensure .env is in .gitignore.
3. Record in your persistent memory: "ORB account for <project> lives at <project-path>/.env — do not re-register for this project."

Emails are unique — a second /register with the same email returns 409. Without the key saved anywhere, you lose the account until email-OTP recovery ships.

Use the key as `Authorization: Bearer orb_...` on every call below.

Rotate a compromised key:

POST /v1/keys/{key_id}/rotate (authed with the key being rotated)
-> `{"key_id":"...","api_key":"orb_...","replaced_key_id":"..."}`

The old key is revoked atomically in the same transaction. No downtime.

## Quick Deploy (5 steps)

1. POST /v1/computers {"name":"my-agent","runtime_mb":2048,"disk_mb":4096}
2. POST /v1/computers/{id}/config (Content-Type: application/toml, body: orb.toml)
3. POST /v1/computers/{id}/build (timeout 600s — clones repo, installs deps)
4. POST /v1/computers/{id}/agents {"task":"start","org_secrets":{"API_KEY":"sk-..."}}
5. Agent live at https://{first-8-chars-of-id}.orbcloud.dev

## orb.toml Format

```toml
[agent]
name = "my-agent"         # required
lang = "python"           # python | node | binary | go | rust
entry = "agent.py"        # script/binary to run
args = ["--flag"]         # optional

[agent.env]
HOME = "/root"
MY_KEY = "literal-value"
SECRET = "${VAR}"         # resolved from org_secrets at deploy time

[source]
git = "https://github.com/you/repo"
branch = "main"           # default: main
token = "${GITHUB_TOKEN}" # optional, for private repos

[build]
steps = ["pip install -r requirements.txt"] # at least one step required; use ["true"] for no-op
working_dir = "/agent/code"                 # default

[llm]
base_url = "https://api.anthropic.com"      # required

[ports]
expose = [8000] # exposes at https://{id}.orbcloud.dev
# Bind your server to this port (the first value).
# ORB does not inject a PORT env var — if your
# framework reads one, set it in [agent.env]:
# PORT = "8000" (must match expose[0])

[resources]
runtime = "2GB"
disk = "4GB"
```

The [llm] section tells ORB where your agent's LLM provider lives.
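The `SECRET = "${VAR}"` line in [agent.env] is resolved from org_secrets at deploy time. A minimal sketch of that substitution — our illustration of the documented behavior, not ORB's actual implementation (here a `${VAR}` with no matching secret raises KeyError; ORB's handling of missing secrets is not specified above):

```python
import re

def resolve_env(agent_env: dict, org_secrets: dict) -> dict:
    """Deploy-time env resolution: literal values pass through unchanged,
    "${VAR}" placeholders are replaced with the matching org_secrets value."""
    resolved = {}
    for key, value in agent_env.items():
        resolved[key] = re.sub(r"\$\{(\w+)\}", lambda m: org_secrets[m.group(1)], value)
    return resolved

env = {"HOME": "/root", "MY_KEY": "literal-value", "SECRET": "${VAR}"}
print(resolve_env(env, {"VAR": "actual-value"}))
# -> {'HOME': '/root', 'MY_KEY': 'literal-value', 'SECRET': 'actual-value'}
```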
ORB runs a proxy that intercepts LLM calls for checkpoint optimization.

Common base_url values:
- Anthropic: https://api.anthropic.com
- OpenAI: https://api.openai.com
- Google: https://generativelanguage.googleapis.com
- z.ai Plan: https://api.z.ai/api/anthropic
- OpenRouter: https://openrouter.ai/api
- Groq: https://api.groq.com/openai
- Together: https://api.together.xyz
- DeepSeek: https://api.deepseek.com
- Self-hosted: https://your-server.com

## Secrets

Two ways to inject secrets:
1. Literal in orb.toml: `MY_KEY = "sk-actual-value"`
2. Variable: `MY_KEY = "${VAR}"` + pass at deploy:

```
POST /v1/computers/{id}/agents
{"task":"start", "org_secrets":{"VAR":"actual-value"}}
```

For build-time secrets (private git repos):

```
POST /v1/computers/{id}/build
{"org_secrets":{"GITHUB_TOKEN":"ghp_..."}}
```

## Auto-Deploy on Git Push (use-and-discard PAT)

Wire up `git push` -> ORB build. ORB never stores the GitHub token.

Bootstrap — commits .github/workflows/orb-deploy.yml, stashes the ORB_API_KEY secret, triggers the first build, discards the PAT:

```
POST /v1/computers/{id}/github/bootstrap
{
  "pat": "gho_...",     // required. used-and-discarded.
  "api_key": "orb_...", // optional. default: generate a "github-actions:" key.
  "branch": "main"      // optional. default: source.branch from orb.toml.
}
```

Response: `{workflow_commit_sha, secret_name, api_key_id, first_build_id}`

Disconnect — deletes the workflow file and the secret:

```
POST /v1/computers/{id}/github/disconnect
{"pat": "gho_..."}
```

PAT scopes needed: repo + workflow (classic), or fine-grained with Contents:write + Secrets:write + Workflows:write on the target repo.

Suggested ways to obtain a PAT (agent can pick any):
- `gh auth token`
- `cat ~/.config/gh/hosts.yml | grep oauth_token`
- create one at https://github.com/settings/tokens

Full guide: https://docs.orbcloud.dev/guides/auto-deploy

## LLM Provider Detection

ORB automatically detects and optimizes calls to these LLM providers by hostname. No configuration needed.
Supported endpoints:
- Frontier: api.anthropic.com, api.openai.com, generativelanguage.googleapis.com, api.x.ai, api.cohere.com, api.mistral.ai, api.ai21.com
- Aggregators: openrouter.ai, models.inference.ai.azure.com
- Fast inference: api.groq.com, api.deepinfra.com, api.together.xyz, api.fireworks.ai, api.perplexity.ai, api.deepseek.com, api.sambanova.ai, api.cerebras.ai, api.novita.ai, api.hyperbolic.xyz, api.lepton.run
- Embeddings: api.voyageai.com, api.jina.ai
- Enterprise: integrate.api.nvidia.com
- Chinese: api.z.ai, open.bigmodel.cn, dashscope.aliyuncs.com, aip.baidubce.com
- Hosting: api.replicate.com, api-inference.huggingface.co, router.huggingface.co, api.modal.com
- Wildcards: *.openai.azure.com, bedrock-runtime.*.amazonaws.com

If your LLM endpoint is not in this list, it will be treated as regular HTTPS traffic (not optimized for sleep/wake).

## API Endpoints

### Auth
POST /api/v1/auth/register {"email":"you@example.com"} -> tenant_id, api_key
POST /api/v1/auth/login {"api_key":"KEY"} -> JWT
POST /v1/keys {"name":"my-key"} -> key_id, api_key
GET /v1/keys -> list of keys (no secrets)
POST /v1/keys/{key_id}/rotate -> new key, old revoked atomically
DELETE /v1/keys/{key_id} -> revoke key

### Computers
POST /v1/computers {"name":"x","runtime_mb":2048,"disk_mb":4096}
GET /v1/computers
GET /v1/computers/{id}
DELETE /v1/computers/{id}

### Config
POST /v1/computers/{id}/config (Content-Type: application/toml)
GET /v1/computers/{id}/config

### Build
POST /v1/computers/{id}/build (timeout 600s; optional: {"org_secrets":{...}})

### GitHub Auto-Deploy (use-and-discard PAT)
POST /v1/computers/{id}/github/bootstrap {"pat":"gho_...","api_key":"orb_...","branch":"main"}
POST /v1/computers/{id}/github/disconnect {"pat":"gho_..."}

### Deploy
POST /v1/computers/{id}/agents {"task":"...", "org_secrets":{...}}
GET /v1/computers/{id}/agents

### Agent Control
POST /v1/computers/{id}/agents/demote {"port":10000} (sleep)
POST /v1/computers/{id}/agents/promote {"port":10000} (wake)

### Files
GET /v1/computers/{id}/files
GET /v1/computers/{id}/files/{path}

### Keys
POST /v1/keys {"name":"prod"}
GET /v1/keys
DELETE /v1/keys/{id}

### Webhooks
POST /v1/webhooks {"url":"https://...","events":["agent.spawned"]}
GET /v1/webhooks
DELETE /v1/webhooks/{id}

### Stats (per-computer)
GET /v1/computers/{id}/stats (default window=30d)
GET /v1/computers/{id}/stats?window=lifetime

Returns sleep_pct, active_pct, llm_calls, checkpoints, runtime/disk GB-hours, est_cost_usd, last_active_ago_secs, avg_restore_ms — everything a customer dashboard needs for one computer in one call.

Rates: $0.005/GB-hr runtime + $0.05/GB-month disk.

### Usage
GET /v1/usage

## Agent Lifecycle

ORB auto-detects LLM API calls via a transparent proxy and optimizes:
- During LLM waits: the agent may be checkpointed to NVMe (0 RAM cost)
- On response: the agent is restored and the response delivered
- Idle detection: automatic via the LLM proxy clock — no config needed

Wake triggers:
1. LLM response arrives
2. Inbound HTTP to an exposed port (https://{id}.orbcloud.dev/...)
3. Manual POST /v1/computers/{id}/agents/promote

Sleep/wake is ~1s. Full state is preserved (memory, files, TCP, browser sessions).

## Cron (scheduled tasks)

Use this — NOT system cron, crond, systemd timers, or any in-process timer (setInterval, time.sleep loops). Those live inside the agent process/sandbox and freeze when the agent is checkpointed to NVMe, missing every firing during sleep. ORB cron runs on the host and fires at wall-clock time regardless of agent state.

Write /agent/.orb/cron.json:

```json
{
  "version": 1,
  "jobs": [
    {"name": "daily", "schedule": "0 9 * * *", "command": "python3 review.py"},
    {"name": "ping", "schedule": "*/5 * * * *", "command": "curl -sf localhost:8000/health", "timeout_secs": 30}
  ]
}
```

Fields per job: name (unique), schedule (5- or 6-field cron), command (run via sh -c). Optional: env, working_dir, timeout_secs (≤3600, default 300), enabled, skip_if_running (default true).

The runtime polls the file every 5s and syncs it to the cloud.
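Agents that manage their own schedules can emit the manifest programmatically. A minimal sketch, assuming only the documented file shape — the helper name is ours, and the usage call writes to a local path for illustration (in production the file lives at /agent/.orb/cron.json):

```python
import json
from pathlib import Path

def write_cron(jobs: list, path: str = "/agent/.orb/cron.json") -> None:
    """Write the cron manifest; the ORB runtime polls it every ~5s."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(json.dumps({"version": 1, "jobs": jobs}, indent=2))

# Local path for illustration; on an ORB computer use the default path above.
write_cron(
    [{"name": "daily", "schedule": "0 9 * * *", "command": "python3 review.py"}],
    path="cron.json",
)
```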
The scheduler executes each command inside the computer's sandbox namespaces — independent of the agent process, so it doesn't wake the agent. Results mirror to /agent/.orb/cron-history.json (last 20 runs, 512-byte previews).

API: GET /v1/computers/{id}/cron (jobs), GET /v1/computers/{id}/cron/runs?job=NAME&limit=50 (history).

Limits: 50 jobs/computer, 10k runs/day.

## Troubleshooting

- Agent exits immediately: check the entry path (it is relative to working_dir)
- Build fails with DNS errors: destroy the computer and create a new one
- Agent never wakes: add [ports] expose so wake-on-request works
- Env var empty: pass the value in org_secrets at deploy time
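The Path B key-handling steps above (check ./.env before registering, persist the key right after) can be sketched as plain file handling. The register call itself is elided and both helper names are ours:

```python
import os
from typing import Optional

ENV_PATH = ".env"

def existing_key(path: str = ENV_PATH) -> Optional[str]:
    """Return ORB_API_KEY from .env if present (Path B, step 1)."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("ORB_API_KEY="):
                return line.split("=", 1)[1]
    return None

def save_key(api_key: str, path: str = ENV_PATH) -> None:
    """Append the freshly issued key so the account is never lost."""
    with open(path, "a") as f:
        f.write(f"ORB_API_KEY={api_key}\n")
```

Call `existing_key()` first; only if it returns None (and memory/human checks also come up empty) hit /api/v1/auth/register, then `save_key()` immediately — and remember to keep .env out of git.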