Blog

Notes from building the billing layer for AI apps.

Per-user metering patterns, edge-router architecture, SDK + CLI integration walkthroughs. Every post is written by the engineers shipping the thing — no content-marketing filler.

Architecture · Apr 30, 2026 · 9 min read

How I cut my AI app's OpenAI bill 60% with per-user API keys

A $5,247 bill I couldn't attribute. The naive Postgres approach that didn't scale. The per-user-keys pattern that fixed it — with the actual numbers, code, and the 60% cost reduction in 6 weeks.

qlaud team · Engineering

Architecture · Apr 30, 2026 · 11 min read

The hidden infrastructure you build when you ship AI chat

A chat textarea is a 30-minute build. The persistence, streaming reassembly, dedup, tool-call audit trail, and history endpoint that turn it into a real product take 3+ weeks. A walkthrough of what breaks, in what order, and how to skip most of it.

qlaud team · Engineering

Product · Apr 28, 2026 · 8 min read

qlaud vs OpenRouter: when to pick which

OpenRouter is a router. qlaud is a router PLUS per-user billing, threads, semantic search, and 100+ vendor MCP connectors — one platform, one bill.

qlaud team · Engineering

Product · Apr 28, 2026 · 7 min read

qlaud vs LiteLLM: same proxy idea, different layer

LiteLLM is a self-hosted proxy. qlaud is hosted with per-user billing, threads, semantic search, and 100+ tool connectors built in. When to pick which.

qlaud team · Engineering

Product · Apr 28, 2026 · 9 min read

47 tools out of the box: 35 vendor MCP connectors + 12 first-party builtins

Every qlaud account ships with 100+ vendor MCP connectors and 12 first-party builtins (E2B, web search, image gen, email). Auto-discovered.

qlaud team · Engineering

Architecture · Apr 28, 2026 · 7 min read

Stop building auth, billing, and a database for your AI app

The default AI-app stack is Clerk + Stripe + Postgres + a custom proxy. qlaud collapses three of those four into one platform. You write the chat UI.

qlaud team · Engineering

Tutorials · Apr 28, 2026 · 10 min read

Build a ChatGPT clone in 200 lines (no database)

End-to-end Next.js + Clerk + qlaud tutorial: per-user keys, streaming chat with auto-history, semantic search, 100+ connectors live — no Postgres, no Pinecone.

qlaud team · Engineering

Tutorials · Apr 25, 2026 · 7 min read

Per-user AI billing in 5 minutes (without rebuilding metering)

The four-step qlaud playbook: mint per-user keys, cap spend, meter usage, and bill, all with the official SDKs you already use. Ship in 5 minutes.

qlaud team · Engineering

Tutorials · Apr 25, 2026 · 6 min read

Point your coding agent at qlaud — Claude Code, Codex, Aider, Cline, Cursor

Point Claude Code, Codex, Aider, Cline, or Cursor at qlaud — one base URL, per-user spend caps, every frontier model. Setup snippets for each agent.

qlaud team · Engineering

Architecture · Apr 25, 2026 · 8 min read

Building qlaud on Cloudflare AI Gateway in five days

How qlaud was built on Cloudflare AI Gateway in five days — architecture, primitives, and the trade-offs that made shipping per-user billing fast.

qlaud team · Engineering

Benchmarks · Apr 25, 2026 · 6 min read

Why Claude Code on DeepSeek V3 costs 30× less than Claude direct

DeepSeek V3 via qlaud delivers Claude-class agentic coding for ~30× less than Claude direct. The pricing math and cap pattern that make it work.

qlaud team · Engineering

