Keep your stack and unify the layer underneath it. Point any tool that speaks the OpenAI API at one endpoint and route 100+ models through it. Bind the key to a Decision Gate budget and every call from every tool is held to the limit you set, so agents cannot overspend it.
https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions
API key: Your phewsh API key (above)
Model: anthropic/claude-sonnet-4 or any OpenRouter model ID
~/.hermes/.env:
OPENAI_API_KEY=your-phewsh-key
OPENAI_BASE_URL=https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions
https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions
Enter your phewsh API key as the OpenAI API key.
~/.continue/config.json:
{
  "models": [{
    "provider": "openai",
    "model": "anthropic/claude-sonnet-4",
    "apiBase": "https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions",
    "apiKey": "your-phewsh-key"
  }]
}
curl https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions \
  -H "Authorization: Bearer YOUR_PHEWSH_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
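The same request can also be built from a script. The following is a minimal Python sketch using only the standard library. It assumes the gateway accepts the OpenAI-style body shown in the curl example and returns JSON. The helper names build_chat_request and send_chat_request are hypothetical, not part of any phewsh SDK.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
GATEWAY_URL = "https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions"


def build_chat_request(api_key, model, prompt):
    """Build (but do not send) a POST request mirroring the curl example."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send_chat_request(req, timeout=60):
    """Send the request and decode the response body (assumed to be JSON)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

To make a call, pass the result of build_chat_request("YOUR_PHEWSH_KEY", "anthropic/claude-sonnet-4", "Hello!") to send_chat_request. Building and sending are kept separate so the request can be inspected or logged before any tokens are spent.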
See the Decision Gate section below to add budget enforcement.
phewsh login --set-key
# Choose "Custom (OpenAI-compatible)"
# Paste the endpoint URL
# Paste your phewsh API key
Or use the built-in provider support:
phewsh ai run -p openrouter "your prompt"
Easiest: bind your key to a project in Your API Key above. Every tool using that key is then gated automatically, with no headers needed.
Per-request: pass the header below to target a project explicitly (this overrides the key binding for that call).
Either way, the gateway pre-checks your remaining budget before the model call and returns 402 budget_exhausted if it is exhausted. After each successful call, real spend is atomically recorded to your project.
x-phewsh-project: <your-project-id>
Find the project ID in .intent/project.json if using the CLI.
curl https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions \
  -H "Authorization: Bearer YOUR_PHEWSH_KEY" \
  -H "Content-Type: application/json" \
  -H "x-phewsh-project: YOUR_PROJECT_ID" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Response when budget is exhausted (402):
{
  "error": {
    "message": "Project budget exhausted ($5.00 of $5.00 spent). Raise the Decision Gate budget at phewsh.com/intent.",
    "type": "budget_exhausted"
  }
}
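A client or agent loop will usually want to treat this response as a hard stop rather than a retryable error. The sketch below shows one way to detect it in Python, assuming the error body has exactly the shape shown above. The function name parse_budget_error is hypothetical.

```python
import json


def parse_budget_error(status, body_text):
    """Return the gateway's message for a 402 budget_exhausted response, else None."""
    if status != 402:
        return None
    try:
        payload = json.loads(body_text)
    except ValueError:
        # Not JSON, so it cannot be the documented error shape.
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict) and error.get("type") == "budget_exhausted":
        return error.get("message", "")
    return None
```

With urllib, a 402 surfaces as urllib.error.HTTPError; its code attribute and the decoded result of its read() method can be passed straight to this function. When the return value is not None, the caller should stop issuing requests until the budget is raised.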
When x-phewsh-project is omitted and the key is not bound to a project, calls proceed against your global credit pool with no project-level budget check.
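The precedence described in this section (explicit header first, then the key binding, then the global pool) can be summarized in a few lines. This is only an illustrative model of the documented behavior, not the gateway's actual code, and budget_scope is a hypothetical name.

```python
def budget_scope(header_project_id=None, key_bound_project_id=None):
    """Model which budget a call is checked against, per the rules in the text."""
    if header_project_id:
        # x-phewsh-project overrides the key binding for this call.
        return ("project", header_project_id)
    if key_bound_project_id:
        # No header: the project bound to the key applies.
        return ("project", key_bound_project_id)
    # Neither: the global credit pool, with no project-level check.
    return ("global", None)
```

For example, budget_scope("proj-a", "proj-b") resolves to project "proj-a", because the per-request header takes priority over the binding.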
phewsh login --token <jwt-from-phewsh.com/intent>
cd /your-project # must have .intent/project.json with a real project id
phewsh ai run -p phewsh "your prompt"
# → routes through the PHEWSH gateway with x-phewsh-project attached
# → spend increments; hard-stops at your Decision Gate budget
Any OpenRouter-compatible chat model routes through the gateway: pass its full ID (e.g. anthropic/claude-sonnet-4) as the model field.
The list above is the set we tune defaults for, pulled live from
GET /models — it stays current as the gateway changes.
Billing is per token, at each model's live rate plus a small phewsh margin — never per “generation.” Prices shown are $ in / $ out per million tokens. Your prepaid balance is held in credits worth $0.125 of usage each, and the gateway pre-checks that your balance covers a request before running it.
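The arithmetic behind per-token billing can be sketched as follows. The text does not state the size or form of the phewsh margin, so it is modeled here as a fractional markup that defaults to zero; that is an assumption. The prices in the usage note are illustrative, not quoted rates.

```python
CREDIT_VALUE_USD = 0.125  # from the text: one credit covers $0.125 of usage


def estimate_cost_usd(tokens_in, tokens_out, price_in_per_m, price_out_per_m, margin=0.0):
    """Estimate the cost of one call from token counts and $ per million tokens.

    margin is a hypothetical fractional markup (0.05 means 5%); the real
    margin is not given in the text.
    """
    base = (tokens_in * price_in_per_m + tokens_out * price_out_per_m) / 1_000_000
    return base * (1 + margin)


def usd_to_credits(usd):
    """Convert a dollar amount of usage into credits."""
    return usd / CREDIT_VALUE_USD
```

With illustrative prices of $3 in and $15 out per million tokens, a call using 1,000 input and 500 output tokens costs about $0.0105 before margin, or roughly 0.084 credits.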
Don't want to hard-code a model? Name a route_policy and the gateway picks one for you, but only when you ask. An explicit model always wins, and every response carries x-phewsh-model and x-phewsh-route headers, so the gateway never picks a model you can't see. Policies are served live from the same registry.
To use one, omit model and name a policy instead:
curl https://fpnpfnahwaztdlxuayyv.supabase.co/functions/v1/chat-completions \
  -H "Authorization: Bearer YOUR_PHEWSH_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "route_policy": "coding_agent",
    "messages": [{"role": "user", "content": "Refactor this function."}]
  }'
The gateway resolves the policy and tells you what it chose, in the response headers:
x-phewsh-route: coding_agent
x-phewsh-model: anthropic/claude-sonnet-4
You can also pass it as a header (x-phewsh-route: coding_agent) for clients that can't edit the body. An unknown policy returns 400 rather than silently defaulting.
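On the client side, choosing between an explicit model and a policy, and then reading back what the gateway chose, might look like the sketch below. The function names are hypothetical. Raising an error when neither field is given is a client-side choice made here; the text does not say how the gateway treats such a request.

```python
def build_routed_body(messages, model=None, route_policy=None):
    """Build a request body with either an explicit model or a route policy."""
    if model:
        # An explicit model always wins, so the policy is not sent.
        return {"model": model, "messages": messages}
    if route_policy:
        return {"route_policy": route_policy, "messages": messages}
    raise ValueError("pass either model or route_policy")


def routing_info(headers):
    """Return (route, model) from response headers, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-phewsh-route"), lowered.get("x-phewsh-model")
```

Logging the pair returned by routing_info alongside each response keeps a record of which model a policy resolved to at the time of the call.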