PHEWSH API

One API key. One endpoint. Any model. Budget-aware routing across your whole AI stack.

Keep your stack and unify the layer underneath it. Point any tool that speaks the OpenAI API at one endpoint and route 100+ models through it. Bind the key to a Decision Gate budget and every call from every tool is held to the limit you set, so agents cannot overspend it.

Works with the tools you already use: Claude Code, Cursor, Hermes, OpenClaw, Codex, Continue, LM Studio, Open WebUI, MCP, and your own code.

Endpoint

Base URL
https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions

Your API Key

Setup Guides

Hermes Agent
When Hermes asks for your inference provider during setup:

API base URL:
https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions
API key: Your phewsh API key (above)
Model: anthropic/claude-sonnet-4 or any OpenRouter model ID

Or set in ~/.hermes/.env:
OPENAI_API_KEY=your-phewsh-key
OPENAI_BASE_URL=https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions
Cursor
Settings → Models → OpenAI API Key → Override Base URL:
https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions
Enter your phewsh API key as the OpenAI API key.
Continue
In ~/.continue/config.json:
{
  "models": [{
    "provider": "openai",
    "model": "anthropic/claude-sonnet-4",
    "apiBase": "https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions",
    "apiKey": "your-phewsh-key"
  }]
}
cURL / Any Code
Basic call (no project budget enforcement):
curl https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions \
  -H "Authorization: Bearer YOUR_PHEWSH_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
See the Decision Gate section below to add budget enforcement.
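The same basic call can be made from code. The following is a minimal Python sketch using only the standard library; the helper names `build_chat_request` and `send` are illustrative and not part of any phewsh SDK. Building the request is kept separate from sending it, so the payload can be inspected without touching the network.

```python
import json
import urllib.request

GATEWAY_URL = "https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions"


def build_chat_request(api_key, model, prompt):
    """Build (but do not send) a POST request mirroring the cURL call above."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_chat_request("YOUR_PHEWSH_KEY", "anthropic/claude-sonnet-4", "Hello!"))` would then perform the same request as the cURL example.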
phewsh CLI
The phewsh CLI already supports this natively. Just run:
phewsh login --set-key
# Choose "Custom (OpenAI-compatible)"
# Paste the endpoint URL
# Paste your phewsh API key
Or use the built-in provider support:
phewsh ai run -p openrouter "your prompt"

Decision Gate — Budget Enforcement

Easiest: bind your key to a project in Your API Key above. Every tool using that key is then gated automatically, with no headers needed.
Per-request: pass the header below to target a project explicitly (this overrides the key binding for that call).

Either way, the gateway pre-checks your remaining budget before the model call — if exhausted it returns 402 budget_exhausted. After each successful call, real spend is atomically recorded to your project.
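The pre-check and record cycle described above can be pictured with a toy in-memory model. This is only an illustration of the idea, not the gateway's implementation: the class `BudgetGate`, its method names, and the use of a thread lock to stand in for "atomic" recording are all assumptions made for this sketch.

```python
import threading


class BudgetGate:
    """Toy gate: pre-check before the model call, record spend after it."""

    def __init__(self, budget_usd):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0
        self._lock = threading.Lock()

    def precheck(self):
        """Return 402 if the budget is used up, otherwise 200."""
        with self._lock:
            return 402 if self.spent_usd >= self.budget_usd else 200

    def record(self, cost_usd):
        """Add the real cost of a completed call while holding the lock."""
        with self._lock:
            self.spent_usd += cost_usd
```

In this sketch a caller would run `precheck()` first, make the model call only on 200, and then pass the actual cost to `record()`.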

Add this header
x-phewsh-project: <your-project-id>
Find your project ID in phewsh.com/intent → open a project → Settings, or in .intent/project.json if using the CLI.
cURL with budget enforcement
curl https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions \
  -H "Authorization: Bearer YOUR_PHEWSH_KEY" \
  -H "Content-Type: application/json" \
  -H "x-phewsh-project: YOUR_PROJECT_ID" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Response when budget is exhausted (402):
{
  "error": {
    "message": "Project budget exhausted ($5.00 of $5.00 spent). Raise the Decision Gate budget at phewsh.com/intent.",
    "type": "budget_exhausted"
  }
}
When x-phewsh-project is omitted and the key is not bound to a project, calls proceed against your global credit pool with no project-level budget check.
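Client code will usually want to tell a budget stop apart from other failures. The sketch below, in Python, checks a status code and response body against the 402 shape shown above; the exception class `BudgetExhausted` and the function `check_gateway_response` are hypothetical names chosen for this example.

```python
import json


class BudgetExhausted(Exception):
    """Raised when the gateway reports 402 with type budget_exhausted."""


def check_gateway_response(status, body_text):
    """Return the parsed body, or raise BudgetExhausted on a 402 budget error."""
    payload = json.loads(body_text)
    error = payload.get("error") if isinstance(payload, dict) else None
    if status == 402 and error and error.get("type") == "budget_exhausted":
        raise BudgetExhausted(error.get("message", "Project budget exhausted"))
    return payload
```

With Python's `urllib`, a 402 surfaces as `urllib.error.HTTPError`; its `code` attribute and `read()` method supply the two arguments.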
phewsh CLI (budget-enforced)
If you're in a project directory synced to PHEWSH, the CLI attaches the project header automatically:
phewsh login --token <jwt-from-phewsh.com/intent>
cd /your-project    # must have .intent/project.json with a real project id
phewsh ai run -p phewsh "your prompt"
# → routes through the PHEWSH gateway with x-phewsh-project attached
# → spend increments; hard-stops at your Decision Gate budget

Models

[Model list: loaded live from the gateway]

Any OpenRouter-compatible chat model routes through the gateway — pass its full ID (e.g. anthropic/claude-sonnet-4) as the model field. The list above is the set we tune defaults for, pulled live from GET /models — it stays current as the gateway changes.

Billing is per token, at each model's live rate plus a small phewsh margin — never per “generation.” Prices shown are $ in / $ out per million tokens. Your prepaid balance is held in credits worth $0.125 of usage each, and the gateway pre-checks that your balance covers a request before running it.
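The arithmetic behind per-token billing can be sketched in a few lines of Python. The credit value of $0.125 comes from the text above; everything else is an assumption for illustration: the function names, the example rates, and the `margin` parameter, which is modeled here as a simple fraction because the actual phewsh margin is not stated.

```python
CREDIT_VALUE_USD = 0.125  # from the text: one credit covers $0.125 of usage


def request_cost_usd(tokens_in, tokens_out, usd_in_per_m, usd_out_per_m, margin=0.0):
    """Cost of one request; rates are $ per million tokens.

    margin is a hypothetical fractional markup (0.0 means none).
    """
    base = (tokens_in / 1_000_000) * usd_in_per_m
    base += (tokens_out / 1_000_000) * usd_out_per_m
    return base * (1 + margin)


def credits_needed(cost_usd):
    """Convert a dollar cost into credits."""
    return cost_usd / CREDIT_VALUE_USD
```

For example, at hypothetical rates of $3 in / $15 out per million tokens, a request with 2,000 input tokens and 500 output tokens costs 0.006 + 0.0075 = $0.0135 before margin, or about 0.108 credits.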

Routing Policies (optional)

Don't want to hard-code a model? Name a route_policy and the gateway picks one for you, but only when you ask. An explicit model always wins, and every response carries the x-phewsh-model and x-phewsh-route headers, so the gateway never picks a model you can't see. Policies are served live from the same registry as the model list.

cURL with a routing policy
Send no model, name a policy instead:
curl https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions \
  -H "Authorization: Bearer YOUR_PHEWSH_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "route_policy": "coding_agent",
    "messages": [{"role": "user", "content": "Refactor this function."}]
  }'
The gateway resolves the policy and tells you what it chose, in the response headers:
x-phewsh-route: coding_agent
x-phewsh-model: anthropic/claude-sonnet-4
You can also pass it as a header (x-phewsh-route: coding_agent) for clients that can't edit the body. An unknown policy returns 400 rather than silently defaulting.
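In client code, the choice between an explicit model and a policy, and reading back what the gateway chose, might look like the following Python sketch. The helpers `build_payload` and `resolved_route` are hypothetical; only the `model` and `route_policy` fields and the two response header names come from the text above.

```python
def build_payload(messages, model=None, route_policy=None):
    """Build a request body with either an explicit model or a route_policy."""
    payload = {"messages": messages}
    if model is not None:
        # An explicit model always wins, so the policy is left out.
        payload["model"] = model
    elif route_policy is not None:
        payload["route_policy"] = route_policy
    return payload


def resolved_route(headers):
    """Return (route, model) from response headers, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-phewsh-route"), lowered.get("x-phewsh-model")
```

Logging the pair returned by `resolved_route` on every call is one simple way to keep a record of which model actually served each request.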

Complete the loop

Step 1: Set your gate budget → In the Intent app, Decision Gate section.
Step 2: Route calls via CLI → phewsh ai run -p phewsh auto-attaches the project header.
See it: Watch the 402 fire → 2-min terminal walkthrough. Budget hits zero. System hard-stops.