Mint a key per user.
Cap their spend.
Get paid.
qlaud is the billing layer for AI apps. Mint a qlaud key for every end-user you onboard, set their monthly cap, and we track spend per-key — Claude Opus, GPT-5.4, Sora 2, Eleven, Whisper, all included. End of month, pull the per-user invoice and bill them through Stripe.
No card required to start. Mint your first end-user key in two API calls.
Build with qlaud
Three steps.
Mint a per-user key with a hard cap
Server-side, in any Node framework — Next.js API route, Express, Hono, Fastify.
// Mint a per-user qlaud key with a hard cap.
// — In Next.js API route, Express, Hono, Fastify… same call.
const r = await fetch('https://api.qlaud.ai/v1/keys', {
  method: 'POST',
  headers: {
    'x-api-key': process.env.QLAUD_MASTER_KEY,
    'content-type': 'application/json',
  },
  body: JSON.stringify({
    name: 'user_42',   // any unique label (your user id works)
    max_spend_usd: 5,  // hard cap, enforced gateway-side
  }),
});
const { secret: userKey, max_spend_usd } = await r.json();
// userKey === 'qlk_live_…'  the API key for that user
// max_spend_usd === 5       the cap qlaud now enforces
The returned secret is shown once, so store it however you store secrets in your stack. The max_spend_usd cap is enforced gateway-side; once it is exceeded, requests with that key return 402.
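As a minimal sketch of what calling code might do with that 402, the helper below maps a gateway status code to an outcome your app can act on. Only the 402-on-cap-exceeded behavior comes from the text above; the names classifyGatewayStatus, GatewayOutcome, and messageFor, the message wording, and the handling of every other status are illustrative assumptions.

```typescript
// Hypothetical helper: names and non-402 handling are assumptions, not qlaud API.
type GatewayOutcome = 'ok' | 'cap_exceeded' | 'other_error';

function classifyGatewayStatus(status: number): GatewayOutcome {
  if (status >= 200 && status < 300) return 'ok';
  // Per the text above: once max_spend_usd is exceeded, the key gets a 402.
  if (status === 402) return 'cap_exceeded';
  return 'other_error';
}

// Example copy only; replace with whatever your product shows its users.
function messageFor(outcome: GatewayOutcome): string {
  switch (outcome) {
    case 'ok':
      return '';
    case 'cap_exceeded':
      return 'You have reached your spending limit.';
    case 'other_error':
      return 'Something went wrong. Please try again.';
  }
}
```

In a request handler you would pass the response status (for example r.status from a fetch call) to classifyGatewayStatus and branch on the result, for instance to show an upgrade prompt instead of a generic error.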
Use any official frontier SDK with that key
Same code your users would already write — only the base URL + API key change.
// userKey from step 1 above.
import OpenAI from 'openai';
const client = new OpenAI({
  baseURL: 'https://api.qlaud.ai/v1',
  apiKey: userKey,
});
await client.chat.completions.create({
  model: 'claude-sonnet-4-6', // any qlaud catalog slug
  messages: [{ role: 'user', content: 'hello' }],
});
We don't ship a qlaud SDK. The official OpenAI / Anthropic / ElevenLabs / LangChain packages already work because we pass requests through verbatim. The same applies to xAI Grok, DeepSeek, Qwen, Kimi, and MiniMax via the OpenAI SDK pointed at api.qlaud.ai/v1.
Pull spend at month-end, bill how you want
One GET call returns spend per key. Use the numbers in whatever billing system you already run.
// Pull per-user spend from qlaud. Use the numbers however you bill
// your end-users — Stripe, Paddle, Lemon Squeezy, in-app invoicing,
// custom logic. qlaud bills YOU for the qlaud wallet; YOU bill your users.
const usage = await fetch('https://api.qlaud.ai/v1/usage', {
  headers: { 'x-api-key': process.env.QLAUD_MASTER_KEY! },
}).then(r => r.json());
for (const k of usage.by_key) {
  // k.key_name was your label when minting (e.g. 'user_42').
  // k.cost_micros already includes our 7% markup.
  console.log(`${k.key_name}: $${(k.cost_micros / 1_000_000).toFixed(4)}`);
  // → Hand to your billing tool. Example with Stripe:
  // stripe.invoiceItems.create({
  //   customer: stripeCustomerFor(k.key_name),
  //   amount: Math.round(k.cost_micros / 10_000), // micro-dollars → cents
  //   currency: 'usd',
  // });
}
You see only your own users, never another qlaud customer's data. cost_micros already includes our flat 7% markup, so it's what you owe us; charge your users anything you want on top. qlaud doesn't require any specific billing tool: Stripe, Paddle, Lemon Squeezy, or custom invoicing all work.
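To make the "charge your users anything you want on top" arithmetic concrete, here is a rough sketch that turns per-key usage into invoice amounts in cents. The key_name and cost_micros fields come from the snippet above; the toInvoiceLines function, the InvoiceLine shape, and the marginBps parameter (your own margin, in basis points) are hypothetical names introduced for illustration.

```typescript
// Shape inferred from the usage snippet; only these two fields are used here.
interface KeyUsage {
  key_name: string;
  cost_micros: number; // millionths of a dollar, qlaud's 7% markup already included
}

interface InvoiceLine {
  keyName: string;
  amountCents: number;
}

// marginBps is YOUR margin on top of qlaud's cost, in basis points (2000 = 20%).
// Hypothetical parameter, not part of the qlaud API.
function toInvoiceLines(byKey: KeyUsage[], marginBps: number): InvoiceLine[] {
  return byKey
    .filter((k) => k.cost_micros > 0) // skip keys with no spend
    .map((k) => {
      const billedMicros = (k.cost_micros * (10_000 + marginBps)) / 10_000;
      // 10,000 micro-dollars = 1 cent; round to the nearest whole cent.
      return { keyName: k.key_name, amountCents: Math.round(billedMicros / 10_000) };
    });
}

// Made-up sample data, for illustration only.
const sampleLines = toInvoiceLines(
  [
    { key_name: 'user_42', cost_micros: 5_000_000 },
    { key_name: 'user_43', cost_micros: 0 },
  ],
  2000,
);
console.log(sampleLines); // user_42 is billed 600 cents; user_43 is skipped
```

Rounding to the nearest cent is one choice among several; rounding up, or carrying sub-cent remainders into the next month, may suit your billing policy better.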
See it built end-to-end — chatai
Full production-quality chat app on top of the same three steps above, with streaming responses, live tool dispatch (web search, image generation), semantic search, and per-user spend caps. Two services, ~600 lines of glue, no database to provision — user state lives in Clerk's privateMetadata.
Already using a coding agent?
Swap with two env vars.
Claude Code, Codex, Aider, Cline, Cursor: every popular coding tool reads either an OpenAI or an Anthropic base-URL setting (env var, TOML config, or in-app field). Drop in qlaud's URL and your qlk_live_… key, hit save, and every call from the tool is metered and capped against that key.
# Same Claude Code CLI. Routes through qlaud.
export ANTHROPIC_BASE_URL=https://api.qlaud.ai
export ANTHROPIC_API_KEY=qlk_live_...
claude
# cache_control markers preserved verbatim; keep the ~75% input-cost
# discount. Tool use, vision blocks, thinking blocks all forwarded.
Per-user keys, hard caps
POST /v1/keys with max_spend_usd — qlaud enforces the cap gateway-side on every request. Your user can never overspend by accident. Zero billing code in your app.
Per-user usage, ready to invoice
GET /v1/usage returns spend, requests, and tokens — broken down by every key you minted. Pipe it straight into Stripe at month-end. We do the metering, you keep the customer relationship.
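One thing you could build on the per-key breakdown is an early warning before a user hits their cap. The sketch below assumes you keep your own record of each key's max_spend_usd and that the cap is compared against the same cost_micros figure the usage endpoint reports; neither is confirmed by the text above. keysNearCap and its threshold parameter are hypothetical names.

```typescript
// Only the fields shown in the usage snippet earlier are assumed here.
interface KeyUsage {
  key_name: string;
  cost_micros: number; // millionths of a dollar
}

// capsUsd maps your key label to the max_spend_usd you set when minting.
// Returns the labels whose spend has reached `threshold` (0 to 1) of their cap.
function keysNearCap(
  byKey: KeyUsage[],
  capsUsd: Map<string, number>,
  threshold = 0.8,
): string[] {
  return byKey
    .filter((k) => {
      const cap = capsUsd.get(k.key_name);
      if (cap === undefined || cap <= 0) return false; // unknown or no cap: skip
      return k.cost_micros / 1_000_000 >= cap * threshold;
    })
    .map((k) => k.key_name);
}
```

Running this on a schedule against the by_key array and a caps map loaded from your own store would let you email or notify users in-app before their requests start failing.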
Frontier models included
Claude Opus 4.7 for code, GPT-5.4 for general, Sora 2 for video, Eleven for voice — all behind the same key. No per-provider integration, no per-provider billing. One signup, one wallet.
Time to first token, head to head.
Lower is better. Cache hits are essentially free for repeated prompts — measured from the closest qlaud edge.
Approximate, p50, single-prompt warm path. Live measurements from the next probe land here automatically. Full methodology in the blog.
Stop building billing infrastructure.
Sign up, mint your master key, ship per-user AI billing in an afternoon.