Introduction
SAP (Sustainable AI Protocol) is an open protocol for measuring AI usage across applications. It provides a standard way to capture, store, and aggregate telemetry data — tokens, estimated energy, and carbon — for every AI inference.
Think of it like analytics for AI usage. Every time your application calls an AI model, SAP records what happened: which model, how many tokens, and what that costs in estimated energy. The data aggregates into personal stats (per user) and global stats (across all integrations).
SAP is currently tracking real AI usage in production across the PHEWSH ecosystem, with support for external integrations. Optionally, it exposes usage to end users via a small UI component — sometimes called "the missing button" in AI interfaces.
Personal stats are scoped to a user or session. Global stats aggregate all events across all integrations.
What it does
- Records one event per AI inference (model, tokens, estimated energy, CO₂, water)
- Aggregates usage at the personal, session, daily, hourly, and global levels
- Provides a standard event schema and API for any application
- Optionally surfaces usage data to end users via the SAP button component
What it does not do
- It does not capture prompt content — only metadata
- It does not require user authentication: anonymous tracking works with only the Supabase anon key
- It does not claim perfect accuracy — energy values are model-based estimates designed for relative comparison, with support for integrating real telemetry as it becomes available
Honest about estimates. SAP uses model-based estimates derived from published research, not direct measurement from data centers. No AI provider currently exposes per-request energy data. SAP is designed for increasing fidelity — estimates today, verified telemetry when infrastructure providers publish real numbers.
Why integrate SAP
- Standardizes AI usage tracking across models and providers — one event format, one aggregation model
- Provides immediate usage analytics without building custom pipelines
- Enables optional user-facing transparency via the SAP button ("the missing button") — builds user trust
- Creates a foundation for cost optimization, efficiency feedback, and future compliance tooling
- Minimal integration cost — one API call per inference, fire-and-forget, never blocks your app
Quick start
The simplest integration is a single API call after each AI inference. This example uses the Supabase RPC endpoint that SAP runs on:
    // After your AI call completes, fire and forget.
    // YOUR_SUPABASE_ANON_KEY is a placeholder for the Supabase anon key.
    fetch('https://fpnpfnahwaztdlxuayyv.supabase.co/rest/v1/rpc/track_sap_event', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'apikey': YOUR_SUPABASE_ANON_KEY,
        'Authorization': `Bearer ${YOUR_SUPABASE_ANON_KEY}`
      },
      body: JSON.stringify({
        p_source: 'external',
        p_model: 'anthropic/claude-sonnet-4',
        p_prompt_tokens: 450,
        p_completion_tokens: 820,
        p_kwh: 0.00076,
        p_co2_g: 0.38,
        p_water_ml: 1.37
      })
    }).catch(() => {
      // Swallow network errors so tracking never affects the app
    });
That's it. One call per inference. The call is designed to be fire-and-forget and should never block your application or user experience. The backend atomically updates global, daily, and hourly rollups. No additional setup required.
If you don't want to calculate energy estimates yourself, you can omit p_kwh, p_co2_g, and p_water_ml. The database function then falls back to fixed defaults (0.0004 kWh, 0.2 g CO₂, 0.72 mL water).
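As a minimal sketch, the request body can be built so that the estimate fields are sent only when you have computed them. The helper name build_sap_payload is hypothetical and not part of SAP; the p_* parameter names and the 500 and 1800 multipliers are taken from this page.

```python
import json
from typing import Optional


def build_sap_payload(
    model: str,
    prompt_tokens: int,
    completion_tokens: int,
    source: str = "external",
    kwh: Optional[float] = None,
) -> dict:
    """Build the JSON body for track_sap_event (hypothetical helper)."""
    payload = {
        "p_source": source,
        "p_model": model,
        "p_prompt_tokens": prompt_tokens,
        "p_completion_tokens": completion_tokens,
    }
    if kwh is not None:
        # Send estimates only when one was computed; otherwise the
        # database function applies its documented defaults.
        payload["p_kwh"] = kwh
        payload["p_co2_g"] = kwh * 500      # grams of CO2 per kWh
        payload["p_water_ml"] = kwh * 1800  # millilitres of water per kWh
    return payload


# Without an estimate, only the four metadata fields are sent.
print(json.dumps(build_sap_payload("anthropic/claude-sonnet-4", 450, 820)))
```

Leaving the three fields out together keeps the stored values consistent with each other, since the defaults follow the same ratios.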
Event schema
Each AI inference produces one row in the sap_events table:
| Column | Type | Description |
|---|---|---|
| id | UUID | Auto-generated primary key |
| user_id | UUID | Authenticated user (optional, null for anonymous) |
| session_id | TEXT | Client-generated session ID (per tab or per session, used to group related interactions) |
| source | TEXT | One of: intent, cli, styletree, external |
| model | TEXT | Full model slug (e.g. anthropic/claude-sonnet-4) |
| prompt_tokens | INTEGER | Input token count |
| completion_tokens | INTEGER | Output token count |
| estimated_kwh | DECIMAL | Estimated energy (kilowatt-hours) |
| estimated_co2_g | DECIMAL | Estimated CO₂ (grams) |
| estimated_water_ml | DECIMAL | Estimated cooling water (milliliters) |
| created_at | TIMESTAMPTZ | Event timestamp |
Privacy. No prompt content is captured. Only metadata: which model, how many tokens, when. Events are anonymous by default — user attribution only happens when the user is explicitly authenticated.
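For client code that wants a typed view of a row, the schema above could be mirrored roughly as follows. This is a sketch, not an official SAP type: the class name SapEvent, the total_tokens property, and the non-negative token check are assumptions added for illustration. Field names and types follow the table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class SapEvent:
    """One row of sap_events. Metadata only, no prompt content."""
    id: UUID
    user_id: Optional[UUID]      # None for anonymous events
    session_id: Optional[str]
    source: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    estimated_kwh: Decimal
    estimated_co2_g: Decimal
    estimated_water_ml: Decimal
    created_at: datetime

    def __post_init__(self) -> None:
        # Assumption: token counts are never negative.
        if self.prompt_tokens < 0 or self.completion_tokens < 0:
            raise ValueError("token counts must be non-negative")

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


event = SapEvent(
    id=uuid4(), user_id=None, session_id="tab-1", source="external",
    model="anthropic/claude-sonnet-4", prompt_tokens=450, completion_tokens=820,
    estimated_kwh=Decimal("0.00076"), estimated_co2_g=Decimal("0.38"),
    estimated_water_ml=Decimal("1.37"), created_at=datetime.now(timezone.utc),
)
print(event.total_tokens)  # 1270
```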
API
SAP uses Supabase RPC and REST endpoints.
Track an event
POST /rest/v1/rpc/track_sap_event
Records a single inference event and atomically updates all aggregation tables. Returns the event UUID.
| Parameter | Type | Required | Description |
|---|---|---|---|
| p_user_id | UUID | No | Authenticated user ID |
| p_session_id | TEXT | No | Client session ID |
| p_source | TEXT | No | App identifier (default: intent) |
| p_model | TEXT | No | Model slug |
| p_prompt_tokens | INTEGER | No | Input tokens |
| p_completion_tokens | INTEGER | No | Output tokens |
| p_kwh | DECIMAL | No | Energy estimate (default: 0.0004) |
| p_co2_g | DECIMAL | No | CO₂ estimate (default: 0.2) |
| p_water_ml | DECIMAL | No | Water estimate (default: 0.72) |
Headers required: apikey (Supabase anon key), Authorization: Bearer {key}, Content-Type: application/json.
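If you want to keep the returned event UUID instead of discarding it, the response body can be parsed as below. This sketch assumes the endpoint returns the UUID as a bare JSON string, which is how PostgREST-style RPC endpoints usually return scalar results. The helper name parse_event_id is hypothetical.

```python
import json
from uuid import UUID


def parse_event_id(response_body: bytes) -> UUID:
    """Parse the body returned by track_sap_event into a UUID.

    Assumes the body is a JSON string such as "123e4567-...".
    """
    value = json.loads(response_body)
    if not isinstance(value, str):
        raise ValueError(f"expected a JSON string, got {type(value).__name__}")
    return UUID(value)


# Example with a made-up UUID, not a real event:
print(parse_event_id(b'"123e4567-e89b-12d3-a456-426614174000"'))
```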
Read global stats
GET /rest/v1/global_stats?select=*&id=eq.1
Returns a single row: total_count, total_kwh, total_co2, total_water, sap_tracked_count.
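A minimal sketch of reading the global row with only the standard library. It assumes the read endpoint takes the same apikey and Authorization headers as the tracking call, and that the response is a JSON array containing one row. The function names and the SAP_BASE_URL constant are illustrative.

```python
import json
import urllib.request

SAP_BASE_URL = "https://fpnpfnahwaztdlxuayyv.supabase.co"


def global_stats_request(anon_key: str) -> urllib.request.Request:
    """Build the GET request for the single global_stats row."""
    url = f"{SAP_BASE_URL}/rest/v1/global_stats?select=*&id=eq.1"
    headers = {"apikey": anon_key, "Authorization": f"Bearer {anon_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_global_stats(anon_key: str, timeout: float = 5.0) -> dict:
    """Send the request and return the row as a dict.

    Assumes the response is a JSON array with one element.
    """
    with urllib.request.urlopen(global_stats_request(anon_key), timeout=timeout) as resp:
        rows = json.load(resp)
    return rows[0]
```

Calling fetch_global_stats(anon_key)["total_kwh"] would then give the running energy total, under the assumptions above.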
Read personal stats
GET /rest/v1/sap_my_stats?select=*&user_id=eq.{id}
Requires user JWT. Returns: total_events, total_kwh, total_co2_g, total_water_ml, per-source breakdown, avg_tokens, date range.
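The personal stats request differs mainly in its credentials. The sketch below assumes the usual Supabase convention: the anon key stays in the apikey header and the user's JWT goes in the Authorization header. The function name and argument names are hypothetical.

```python
import urllib.request
from urllib.parse import urlencode

SAP_BASE_URL = "https://fpnpfnahwaztdlxuayyv.supabase.co"


def personal_stats_request(user_id: str, user_jwt: str, anon_key: str) -> urllib.request.Request:
    """Build the GET request for one user's row in sap_my_stats."""
    # safe="*" keeps select=* readable; other characters are percent-encoded.
    query = urlencode({"select": "*", "user_id": f"eq.{user_id}"}, safe="*")
    url = f"{SAP_BASE_URL}/rest/v1/sap_my_stats?{query}"
    headers = {
        "apikey": anon_key,                     # assumption: anon key identifies the project
        "Authorization": f"Bearer {user_jwt}",  # user JWT, as required by this endpoint
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```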
Energy estimates
SAP uses model-specific coefficients to estimate energy per inference. These are based on published benchmarks (Patterson et al. 2021, ML CO2 Impact, MLCommons estimates) and are conservative.
Model coefficients
Base kWh per ~1000-token inference:
| Model | kWh |
|---|---|
| google/gemini-flash | 0.00020 |
| anthropic/claude-3-haiku | 0.00025 |
| deepseek/deepseek-chat | 0.00030 |
| moonshotai/kimi-k2 | 0.00040 |
| anthropic/claude-sonnet-4 | 0.00060 |
| anthropic/claude-opus-4 | 0.00200 |
Scaling
kwh = base_coefficient * clamp(total_tokens / 1000, 0.5, 4.0)
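The scaling rule can be sketched in a few lines. The coefficients are the ones listed in the table above. Two points are assumptions: total_tokens is taken to be prompt plus completion tokens, and unknown models fall back to the 0.0004 kWh default.

```python
BASE_KWH = {
    "google/gemini-flash": 0.00020,
    "anthropic/claude-3-haiku": 0.00025,
    "deepseek/deepseek-chat": 0.00030,
    "moonshotai/kimi-k2": 0.00040,
    "anthropic/claude-sonnet-4": 0.00060,
    "anthropic/claude-opus-4": 0.00200,
}
DEFAULT_BASE_KWH = 0.0004  # assumption: fallback for models not in the table


def estimate_kwh(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Scale the per-model base coefficient by token count, clamped to [0.5, 4.0]."""
    total_tokens = prompt_tokens + completion_tokens
    scale = min(max(total_tokens / 1000, 0.5), 4.0)
    return BASE_KWH.get(model, DEFAULT_BASE_KWH) * scale


# 1270 tokens on claude-sonnet-4: 0.0006 * 1.27
print(f"{estimate_kwh('anthropic/claude-sonnet-4', 450, 820):.6f}")  # 0.000762
```

The clamp means very short calls are never counted below half the base coefficient, and very long calls are capped at four times it.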
Derived metrics
| Metric | Formula | Source |
|---|---|---|
| CO₂ (grams) | kwh * 500 | US grid average (EPA 2023) |
| Water (mL) | kwh * 1800 | Data center evaporative cooling estimate |
Water estimates represent cooling usage based on evaporative data center systems and are directional, not exact. Not all data centers use evaporative cooling — some use closed-loop or air cooling systems that consume significantly less water.
These are estimates. Actual energy varies by data center, hardware, load, region, and cooling system. No AI provider currently exposes per-request energy or water data.
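As a worked example of the derived metrics, the two multipliers from the table can be applied to the default 0.0004 kWh estimate. The function and constant names are illustrative.

```python
CO2_G_PER_KWH = 500      # grams of CO2 per kWh (US grid average figure used by SAP)
WATER_ML_PER_KWH = 1800  # millilitres of water per kWh (evaporative cooling estimate)


def derive_metrics(kwh: float) -> dict:
    """Derive CO2 and water estimates from an energy estimate."""
    return {
        "co2_g": kwh * CO2_G_PER_KWH,
        "water_ml": kwh * WATER_ML_PER_KWH,
    }


metrics = derive_metrics(0.0004)
print(f"{metrics['co2_g']:.2f} g CO2, {metrics['water_ml']:.2f} mL water")
```

For 0.0004 kWh this gives 0.2 g CO₂ and 0.72 mL water, which matches the default values used when a client omits the estimate fields.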
Aggregation
Aggregation is handled entirely server-side. Clients only send events. Every call to track_sap_event atomically updates three aggregation layers:
- Global: a single row in global_stats that tracks totals across all users and sources
- Daily: one row per day in sap_daily_stats
- Hourly: one row per date + hour in sap_hourly_stats
Personal stats are computed on read via the sap_my_stats view, which aggregates all events for a given user_id with per-source breakdowns.
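The aggregation model can be illustrated with an in-memory sketch: three rollups updated on every write, and personal stats computed on read. This is not the database implementation. The class name, the count and kwh bucket fields, and the by_source key are hypothetical. A real backend would perform the updates inside one transaction to get atomicity.

```python
from collections import defaultdict
from datetime import datetime, timezone


class Rollups:
    """In-memory stand-in for global_stats, sap_daily_stats and sap_hourly_stats."""

    def __init__(self) -> None:
        self.events = []
        self.global_stats = {"count": 0, "kwh": 0.0}
        self.daily = defaultdict(lambda: {"count": 0, "kwh": 0.0})
        self.hourly = defaultdict(lambda: {"count": 0, "kwh": 0.0})

    def track(self, user_id, source, kwh, at: datetime) -> None:
        """Store the event and update all three rollups together."""
        self.events.append({"user_id": user_id, "source": source, "kwh": kwh})
        buckets = (
            self.global_stats,
            self.daily[at.date()],
            self.hourly[(at.date(), at.hour)],
        )
        for bucket in buckets:
            bucket["count"] += 1
            bucket["kwh"] += kwh

    def my_stats(self, user_id) -> dict:
        """Computed on read, like the sap_my_stats view."""
        mine = [e for e in self.events if e["user_id"] == user_id]
        by_source = defaultdict(int)
        for e in mine:
            by_source[e["source"]] += 1
        return {
            "total_events": len(mine),
            "total_kwh": sum(e["kwh"] for e in mine),
            "by_source": dict(by_source),
        }


rollups = Rollups()
when = datetime(2025, 1, 1, 13, 30, tzinfo=timezone.utc)
rollups.track("u1", "cli", 0.001, when)
rollups.track("u1", "external", 0.002, when)
rollups.track(None, "cli", 0.001, when)  # anonymous event
print(rollups.global_stats["count"], rollups.my_stats("u1")["total_events"])  # 3 2
```

Anonymous events still count toward the global, daily, and hourly totals. They only drop out of the per-user read.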
Backend integration
Backend-only integration is the primary pattern. No UI required. Call the API after each inference.
TypeScript
    import { trackSapEvent } from '@/lib/sap';

    // After your AI call:
    trackSapEvent({
      userId: user?.id,
      source: 'external',
      model: 'anthropic/claude-sonnet-4',
      promptTokens: 450,
      completionTokens: 820,
    });
The trackSapEvent function is fire-and-forget — it calculates energy estimates, sends the event, and never throws. It won't block your application or surface errors to users.
Python
    import threading

    import requests

    ANON_KEY = "YOUR_SUPABASE_ANON_KEY"  # placeholder: replace with the Supabase anon key

    def track_sap_event(model, prompt_tokens, completion_tokens, source="external"):
        total = prompt_tokens + completion_tokens
        # Generic 0.0004 kWh coefficient, scaled by token count and clamped to 0.5x-4x
        kwh = 0.0004 * max(0.5, min(4.0, total / 1000))

        def _send():
            try:
                requests.post(
                    "https://fpnpfnahwaztdlxuayyv.supabase.co/rest/v1/rpc/track_sap_event",
                    json={
                        "p_source": source, "p_model": model,
                        "p_prompt_tokens": prompt_tokens,
                        "p_completion_tokens": completion_tokens,
                        "p_kwh": kwh, "p_co2_g": kwh * 500,
                        "p_water_ml": kwh * 1800,
                    },
                    headers={
                        "apikey": ANON_KEY,
                        "Authorization": f"Bearer {ANON_KEY}",
                        "Content-Type": "application/json",
                    },
                    timeout=5,
                )
            except requests.RequestException:
                pass  # Tracking must never surface errors to the caller

        # Send in a background thread so the caller is never blocked
        threading.Thread(target=_send).start()
Any language
SAP is just a POST to a REST endpoint. Any language that can make HTTP requests can integrate. The pattern is always the same: after your AI call completes, fire off the tracking call in the background.
Frontend integration
For browser-based apps, SAP provides a session tracking module that generates a per-tab session ID, maintains a counter, and lets UI components subscribe to tracking events.
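    import { useEffect, useState } from 'react';
    import { getSapSessionCount, onSapTrack } from '@/lib/sap';
    function SapCounter() { // illustrative component name
      const [count, setCount] = useState(getSapSessionCount());
      useEffect(() => {
        return onSapTrack((n) => setCount(n)); // returned function is the effect cleanup
      }, []);
      return <span>{count}</span>;
    }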
The session ID persists for the lifetime of the browser tab. When a user authenticates, all session activity can be attributed to their account, enabling persistent personal usage tracking over time. Anonymous usage works instantly; login enhances it.
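The behaviour described above (a session ID, a counter, and subscribers that are notified on each tracked event) follows a standard observer pattern. The sketch below shows that pattern in Python as a language-neutral illustration. It is not the code of the SAP browser module, and all names in it are hypothetical.

```python
import uuid
from typing import Callable, List


class SapSession:
    """Holds one session ID, a counter, and a list of subscribers."""

    def __init__(self) -> None:
        self.session_id = str(uuid.uuid4())  # generated once per session
        self._count = 0
        self._listeners: List[Callable[[int], None]] = []

    def get_count(self) -> int:
        return self._count

    def on_track(self, listener: Callable[[int], None]) -> Callable[[], None]:
        """Subscribe to tracking events and return an unsubscribe function."""
        self._listeners.append(listener)

        def unsubscribe() -> None:
            if listener in self._listeners:
                self._listeners.remove(listener)

        return unsubscribe

    def record_track(self) -> None:
        """Call after each tracked inference to notify subscribers with the new count."""
        self._count += 1
        for listener in list(self._listeners):
            listener(self._count)


session = SapSession()
seen: List[int] = []
unsubscribe = session.on_track(seen.append)
session.record_track()
session.record_track()
unsubscribe()
session.record_track()  # no longer delivered to the listener
print(seen, session.get_count())  # [1, 2] 3
```

Returning the unsubscribe function from on_track is what lets a UI framework use the subscription directly as a cleanup hook.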
Optional UI
The core of SAP is backend measurement. The UI is optional but encouraged — it gives end users visibility into their AI usage, which most interfaces don't provide. It can be embedded, customized, or omitted entirely depending on the application.
The reference implementation is a small button that sits in the chat interface. It shows a session generation count. Clicking it opens a dashboard popup with personal and global stats. The button spins briefly each time an inference is tracked — a subtle signal that measurement is happening.
This is sometimes called "the missing button" — a piece of UI that arguably should exist in every AI chat interface but currently doesn't.
This is not required for integration. Most apps integrate SAP backend-only. The UI is for apps that want to give users visibility.
For non-React apps, an embeddable script is available:
<!-- Auto-attaches to prompt inputs, adds SAP button -->
<script src="https://phewsh.com/SustainableAiProtocol/sap-embed.js"></script>
Status
Live
- Tracking real AI inference usage in production (PHEWSH Intent Engine, PHEWSH CLI)
- Model-specific energy coefficients for 12+ models
- Per-inference token counting (input + output)
- Global + personal + daily + hourly aggregation
- Supabase RPC API — open to external integrations
- Anonymous and authenticated tracking
In progress
- Personal dashboard (usage over time)
- Embeddable widget for third-party apps
- Usage budgeting (daily/monthly limits)
- Efficiency feedback per prompt
Future
- Region-specific carbon intensity (beyond US grid average)
- Research partnerships for coefficient validation
- Data center partnerships for verified telemetry
- Compute abstraction layer (beyond AI workloads)
SAP is open source. A PHEWSH open standard.