"I left an agent running overnight. Woke up to a $280 bill."
Yeah. We've all been there. tokenclaw is the hard stop that should have existed from day one.
OpenAI silently converted their hard caps to email alerts. Anthropic only has workspace-level monthly limits. Neither has per-key budgets. Neither has daily limits. You find out what your agent spent after it already spent it.
Dashboards show you numbers. Alerts tell you after the fact. Neither one actually prevents your agent from burning through $200 at 3am while you're asleep.
Agent infinite loops. Runaway retries. A "quick test" that scans your monorepo twice. Monthly limits don't catch this. Daily ones do.
Multiple agents sharing one API key. No way to know which one is spending what. Per-key budgets fix that.
Not a soft limit. Not a warning. Not an email you'll see tomorrow. A 429 that stops the spend right now.
Per API key. Daily, weekly, or monthly. Budget gone? Proxy returns 429. Agent can't spend past the limit. Done.
Anthropic and OpenAI auto-detected from request path. Also covers Groq, Together, Fireworks — anything OpenAI-compatible. One port, one proxy.
One key per agent. See exactly who's spending what. Longest-prefix matching handles key rotation. Unregistered keys still tracked, just not limited.
Warning at 80%. Blocked at 100%. Account-level alerts get louder if you don't acknowledge them. Hourly until you say "this is fine."
No cloud. No telemetry. No accounts. No third party sees your requests. Just localhost. The way it should be.
Even without the proxy, scans 9 AI tools on your machine (Claude Code, Cursor, Windsurf, Aider, etc.) and shows what you've been spending.
tokenclaw set --key sk-ant-proj-research --budget 10/day — the research agent gets $10/day. The deploy agent gets $100. A runaway on one can't touch the other.
tokenclaw proxy — runs on localhost:4040. Set ANTHROPIC_BASE_URL or OPENAI_BASE_URL to point at it. One env var per agent. That's it.
The proxy watches every request. Budget hit? Blocked. Slack alert fired. The 429 response tells the agent exactly what happened and the CLI command to increase the limit. No surprises in the morning.
tokenclaw is the difference between an alert and a stop.