Notes from building the billing layer for AI apps.
Per-user metering patterns, edge-router architecture, SDK + CLI integration walkthroughs. Every post is written by the engineers shipping the thing — no content-marketing filler.
How I cut my AI app's OpenAI bill 60% with per-user API keys
A $5,247 bill I couldn't attribute. The naive Postgres approach that didn't scale. The per-user-keys pattern that fixed it — with the actual numbers, code, and the 60% cost reduction in 6 weeks.
The hidden infrastructure you build when you ship AI chat
A chat textarea is a 30-minute build. The persistence, streaming reassembly, dedup, tool-call audit trail, and history endpoint that turn it into a real product take 3+ weeks. A walkthrough of what breaks, in what order, and how to skip most of it.
qlaud vs OpenRouter: when to pick which
OpenRouter is a router. qlaud is a router PLUS per-user billing, threads, semantic search, and 100+ vendor MCP connectors — one platform, one bill.
qlaud vs LiteLLM: same proxy idea, different layer
LiteLLM is a self-hosted proxy. qlaud is hosted with per-user billing, threads, semantic search, and 100+ tool connectors built in. When to pick which.
47 tools out of the box: 35 vendor MCP connectors + 12 first-party builtins
Every qlaud account ships with 100+ vendor MCP connectors and 12 first-party builtins (E2B, web search, image gen, email). Auto-discovered.
Stop building auth, billing, and a database for your AI app
The default AI-app stack is Clerk + Stripe + Postgres + a custom proxy. qlaud collapses three of those four into one platform. You write the chat UI.
Build a ChatGPT clone in 200 lines (no database)
End-to-end Next.js + Clerk + qlaud tutorial: per-user keys, streaming chat with auto-history, semantic search, 100+ connectors live — no Postgres, no Pinecone.
Per-user AI billing in 5 minutes (without rebuilding metering)
The four-step playbook with qlaud: mint per-user keys, cap spend, meter usage, bill. It works with the official SDKs you already use. Ship in 5 minutes.
Point your coding agent at qlaud — Claude Code, Codex, Aider, Cline, Cursor
Point Claude Code, Codex, Aider, Cline, or Cursor at qlaud — one base URL, per-user spend caps, every frontier model. Setup snippets for each agent.
Building qlaud on Cloudflare AI Gateway in five days
How qlaud was built on Cloudflare AI Gateway in five days — architecture, primitives, and the trade-offs that made shipping per-user billing fast.
Why Claude Code on DeepSeek V3 costs 30× less than Claude direct
DeepSeek V3 via qlaud delivers Claude-class agentic coding for ~30× less than Claude direct. The pricing math and cap pattern that make it work.
Want new posts in your inbox?
We ship an engineering deep-dive roughly once every two weeks. No drip sequences, no sales pitches.
We’ll only send engineering posts and major product launches. Unsubscribe any time by replying to any email.