Configuration Reference
Complete reference for all ModelTrack configuration options: environment variables, JSON config files, and runtime settings.
Proxy environment variables
| Variable | Default | Description |
|---|---|---|
| PROXY_PORT | 8080 | Port the proxy listens on |
| DATA_DIR | ../data | Directory for JSONL cost events and config files |
| ANTHROPIC_BASE_URL | https://api.anthropic.com | Upstream Anthropic API endpoint |
| OPENAI_BASE_URL | https://api.openai.com | Upstream OpenAI API endpoint |
| LOG_LEVEL | info | Logging verbosity (debug, info, warn, error) |
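
As an illustration of how these variables and defaults fit together, here is a minimal Python sketch of a loader. The `ProxyConfig` class and `load_proxy_config` function are hypothetical names, not part of ModelTrack, and rejecting unknown `LOG_LEVEL` values is an assumption about behaviour the table does not specify.

```python
import os
from dataclasses import dataclass

# Log levels listed in the table above.
VALID_LOG_LEVELS = {"debug", "info", "warn", "error"}


@dataclass(frozen=True)
class ProxyConfig:
    """Hypothetical container for the proxy settings in the table above."""
    port: int
    data_dir: str
    anthropic_base_url: str
    openai_base_url: str
    log_level: str


def load_proxy_config(env=None) -> ProxyConfig:
    """Read proxy settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    log_level = env.get("LOG_LEVEL", "info")
    if log_level not in VALID_LOG_LEVELS:
        # Assumption: an unknown level is treated as a configuration error.
        raise ValueError(f"unsupported LOG_LEVEL: {log_level!r}")
    return ProxyConfig(
        port=int(env.get("PROXY_PORT", "8080")),
        data_dir=env.get("DATA_DIR", "../data"),
        anthropic_base_url=env.get("ANTHROPIC_BASE_URL", "https://api.anthropic.com"),
        openai_base_url=env.get("OPENAI_BASE_URL", "https://api.openai.com"),
        log_level=log_level,
    )
```

Calling `load_proxy_config({})` yields the defaults from the table; passing a dictionary instead of reading `os.environ` keeps the sketch easy to test.
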
Cache settings
The proxy includes an in-memory LRU response cache. Identical requests (same model, messages, system prompt, and temperature) return cached responses, saving both cost and latency.
| Variable | Default | Description |
|---|---|---|
| CACHE_ENABLED | true | Enable or disable the response cache |
| CACHE_TTL_SECONDS | 3600 | Time-to-live for cached entries (seconds) |
| CACHE_MAX_ENTRIES | 1000 | Maximum number of cached responses |
The cache key is a SHA-256 hash of the model, messages, system prompt, and temperature. Streaming requests are not cached.
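
A cache key of this kind could be derived roughly as follows. This is a sketch, not the proxy's actual code: the JSON serialization (sorted keys, compact separators) is an assumption, since the exact byte layout that gets hashed is not documented here, and `cache_key` and `is_cacheable` are hypothetical helper names.

```python
import hashlib
import json


def cache_key(model, messages, system, temperature) -> str:
    """SHA-256 over the four fields that define an identical request."""
    # Assumption: the fields are serialized as canonical JSON before hashing.
    payload = json.dumps(
        {
            "model": model,
            "messages": messages,
            "system": system,
            "temperature": temperature,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_cacheable(request: dict) -> bool:
    """Streaming requests bypass the cache."""
    return not request.get("stream", False)
```

Two requests that differ only in a field outside the key (for example `max_tokens`) would map to the same entry under this scheme, which is worth keeping in mind when tuning `CACHE_TTL_SECONDS`.
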
API environment variables
| Variable | Default | Description |
|---|---|---|
| PORT | 3001 | API server listen port |
| DATA_DIR | /data | Directory for cost events and config files |
| DB_PATH | /data/modeltrack.db | SQLite database file path |
| SLACK_WEBHOOK_URL | - | Slack incoming webhook for alerts and reports |
Budget configuration
Budgets are defined in data/budgets.json. The proxy reads this file on startup and watches for changes.
```json
{
  "budgets": [
    {
      "team": "ml-research",
      "app": "",
      "monthly_limit": 500.00,
      "action": "block"
    },
    {
      "team": "product",
      "app": "chatbot",
      "monthly_limit": 200.00,
      "action": "warn"
    },
    {
      "team": "product",
      "app": "search",
      "monthly_limit": 100.00,
      "action": "block"
    }
  ]
}
```

| Field | Type | Description |
|---|---|---|
| team | string | Team name (matched against X-ModelTrack-Team header) |
| app | string | App name (empty string means all apps for that team) |
| monthly_limit | number | Monthly budget in USD |
| action | string | block (reject requests) or warn (log + allow) |
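
The matching rules in the table can be sketched in Python as below. The helper names `find_budget` and `decide` are hypothetical. Two behaviours are assumptions rather than documented facts: an exact `app` match takes precedence over a team-wide entry, and the action fires once spend reaches the limit.

```python
# Example entries mirroring data/budgets.json above.
BUDGETS = [
    {"team": "ml-research", "app": "", "monthly_limit": 500.00, "action": "block"},
    {"team": "product", "app": "chatbot", "monthly_limit": 200.00, "action": "warn"},
    {"team": "product", "app": "search", "monthly_limit": 100.00, "action": "block"},
]


def find_budget(budgets, team, app):
    """Return the budget entry for (team, app), or None if nothing matches."""
    team_wide = None
    for budget in budgets:
        if budget["team"] != team:
            continue
        if budget["app"] == app:
            return budget  # assumption: an exact app match wins
        if budget["app"] == "":
            team_wide = budget  # empty app covers every app for the team
    return team_wide


def decide(budget, spent_usd):
    """Return "allow", "warn", or "block" for the current monthly spend."""
    if budget is None or spent_usd < budget["monthly_limit"]:
        return "allow"
    # Assumption: the configured action applies once spend reaches the limit.
    return budget["action"]
```

For example, `decide(find_budget(BUDGETS, "product", "chatbot"), 250.0)` evaluates to `"warn"`, while any `ml-research` app falls back to the team-wide 500 USD entry.
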
Routing configuration
Routing rules are defined in data/routing.json. They allow the proxy to automatically downgrade to cheaper models when teams approach their budget limits.
```json
{
  "rules": [
    {
      "name": "budget-downgrade",
      "trigger": "budget_percent_above",
      "threshold": 70,
      "from_models": ["claude-sonnet-4-6"],
      "to_model": "claude-haiku-4-5",
      "provider": "anthropic",
      "action": "downgrade",
      "enabled": true
    },
    {
      "name": "openai-budget-downgrade",
      "trigger": "budget_percent_above",
      "threshold": 80,
      "from_models": ["gpt-4o"],
      "to_model": "gpt-4o-mini",
      "provider": "openai",
      "action": "downgrade",
      "enabled": true
    }
  ]
}
```

| Field | Type | Description |
|---|---|---|
| name | string | Human-readable rule name |
| trigger | string | Trigger type (e.g., budget_percent_above) |
| threshold | number | Budget percentage that activates the rule (0-100) |
| from_models | string[] | Models to downgrade from |
| to_model | string | Cheaper model to route to |
| provider | string | Provider (anthropic, openai) |
| action | string | Action to take (downgrade) |
| enabled | boolean | Whether the rule is active |
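
One way to read these rules is as a function from the requested model and the team's current budget usage to the model actually called. The Python sketch below illustrates that reading. `route_model` is a hypothetical name, and two details are assumptions: the first matching rule wins, and "above" means strictly greater than the threshold.

```python
# Example rules mirroring data/routing.json above.
RULES = [
    {
        "name": "budget-downgrade",
        "trigger": "budget_percent_above",
        "threshold": 70,
        "from_models": ["claude-sonnet-4-6"],
        "to_model": "claude-haiku-4-5",
        "provider": "anthropic",
        "action": "downgrade",
        "enabled": True,
    },
    {
        "name": "openai-budget-downgrade",
        "trigger": "budget_percent_above",
        "threshold": 80,
        "from_models": ["gpt-4o"],
        "to_model": "gpt-4o-mini",
        "provider": "openai",
        "action": "downgrade",
        "enabled": True,
    },
]


def route_model(rules, provider, model, budget_percent):
    """Return the model to call, downgrading when an enabled rule applies."""
    for rule in rules:
        if not rule["enabled"]:
            continue
        if rule["trigger"] != "budget_percent_above" or rule["action"] != "downgrade":
            continue
        if rule["provider"] != provider or model not in rule["from_models"]:
            continue
        # Assumption: the rule activates only when usage exceeds the threshold.
        if budget_percent > rule["threshold"]:
            return rule["to_model"]
    return model
```

With the example rules, a team at 75% of its budget has `claude-sonnet-4-6` requests routed to `claude-haiku-4-5`, while `gpt-4o` requests stay on `gpt-4o` until usage passes 80%.
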
Collector namespace map
The collector uses data/namespace_map.json to map Kubernetes namespaces to ModelTrack teams for infrastructure cost attribution.
```json
{
  "ml-research": "ml-research",
  "product-chatbot": "product",
  "default": "platform"
}
```

Keys are Kubernetes namespace names; values are ModelTrack team names. This allows the collector to attribute infrastructure costs (GPU, compute) to the correct team in the dashboard.
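
The attribution step can be pictured as a lookup followed by a sum per team. The Python sketch below is illustrative only: `costs_by_team` is a hypothetical name, and grouping unmapped namespaces under an `"unattributed"` label is an assumption, since the behaviour for namespaces missing from the map is not described here.

```python
from collections import defaultdict

# Example mapping mirroring data/namespace_map.json above.
NAMESPACE_MAP = {
    "ml-research": "ml-research",
    "product-chatbot": "product",
    "default": "platform",
}


def costs_by_team(namespace_costs, namespace_map, unmapped="unattributed"):
    """Sum per-namespace costs (USD) into per-team totals."""
    totals = defaultdict(float)
    for namespace, cost in namespace_costs.items():
        # Assumption: namespaces absent from the map go into a catch-all bucket.
        team = namespace_map.get(namespace, unmapped)
        totals[team] += cost
    return dict(totals)
```

Note that `"default"` in the example is the Kubernetes namespace literally named `default`, mapped to the `platform` team; the sketch does not treat it as a fallback key.
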
Proxy route paths
The proxy routes requests based on URL path:
| Path | Provider | Description |
|---|---|---|
| /v1/messages | Anthropic | Anthropic Messages API |
| /v1/chat/completions | OpenAI | OpenAI Chat Completions API |
| /bedrock/v1/messages | AWS Bedrock | AWS Bedrock Messages API |
| /azure/v1/chat/completions | Azure OpenAI | Azure OpenAI Chat Completions API |
| /healthz | - | Health check endpoint |
| /stats | - | Proxy statistics |
| /cache/stats | - | Cache hit rate and savings |
| /routing/stats | - | Routing decisions and savings |
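
The table amounts to a lookup from request path to upstream provider. A small Python sketch of that lookup is shown below. The `classify` function is a hypothetical helper, and exact-path matching is an assumption; the proxy may match on prefixes instead.

```python
# Paths forwarded to an upstream provider, as listed in the table above.
PROVIDER_ROUTES = {
    "/v1/messages": "Anthropic",
    "/v1/chat/completions": "OpenAI",
    "/bedrock/v1/messages": "AWS Bedrock",
    "/azure/v1/chat/completions": "Azure OpenAI",
}

# Paths answered by the proxy itself.
UTILITY_PATHS = {"/healthz", "/stats", "/cache/stats", "/routing/stats"}


def classify(path):
    """Return (kind, provider) for a request path."""
    if path in PROVIDER_ROUTES:
        return ("provider", PROVIDER_ROUTES[path])
    if path in UTILITY_PATHS:
        return ("utility", None)
    return ("unknown", None)
```
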