For founders building AI apps

Mint a key per user.
Cap their spend.
Get paid.

qlaud is the billing layer for AI apps. Mint a qlaud key for every end-user you onboard, set their monthly cap, and we track spend per-key — Claude Opus, GPT-5.4, Sora 2, Eleven, Whisper, all included. End of month, pull the per-user invoice and bill them through Stripe.

No card required to start. Mint your first end-user key in two API calls.

Build with qlaud

Three steps. One language. Pick yours — every snippet flips together.

1

Mint a per-user key with a hard cap

Server-side, in any Node framework — Next.js API route, Express, Hono, Fastify.

// Mint a per-user qlaud key with a hard cap.
// — In Next.js API route, Express, Hono, Fastify… same call.
const r = await fetch('https://api.qlaud.ai/v1/keys', {
  method: 'POST',
  headers: {
    'x-api-key': process.env.QLAUD_MASTER_KEY,
    'content-type': 'application/json',
  },
  body: JSON.stringify({
    name: 'user_42',          // any unique label (your user id works)
    max_spend_usd: 5,         // hard cap — enforced gateway-side
  }),
});
const { secret: userKey, max_spend_usd } = await r.json();
// userKey === 'qlk_live_…'         the api key for that user
// max_spend_usd === 5              the cap qlaud now enforces

The returned secret is shown once — store it however you store secrets in your stack. The max_spend_usd is enforced gateway-side; once exceeded, requests with that key return 402.

2

Use any official frontier SDK with that key

Same code your users would already write — only the base URL + API key change.

// userKey from step 1 above.
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.qlaud.ai/v1',
  apiKey: userKey,
});

await client.chat.completions.create({
  model: 'claude-sonnet-4-6',          // any qlaud catalog slug
  messages: [{ role: 'user', content: 'hello' }],
});

We don't ship a qlaud SDK. The official OpenAI / Anthropic / ElevenLabs / LangChain package already works because we passthrough verbatim. Same applies to xAI Grok, DeepSeek, Qwen, Kimi, MiniMax via the OpenAI SDK pointed at api.qlaud.ai/v1.

3

Pull spend at month-end, bill how you want

One GET call returns spend per key. Use the numbers in whatever billing system you already run.

// Pull per-user spend from qlaud. Use the numbers however you bill
// your end-users — Stripe, Paddle, Lemon Squeezy, in-app invoicing,
// custom logic. qlaud bills YOU for the qlaud wallet; YOU bill your users.
const usage = await fetch('https://api.qlaud.ai/v1/usage', {
  headers: { 'x-api-key': process.env.QLAUD_MASTER_KEY! },
}).then(r => r.json());

for (const k of usage.by_key) {
  // k.key_name was your label when minting (e.g. 'user_42').
  // k.cost_micros already includes our 7% markup.
  console.log(`${k.key_name}: $${(k.cost_micros / 1_000_000).toFixed(4)}`);

  // → Hand to your billing tool. Example with Stripe:
  //   stripe.invoiceItems.create({
  //     customer: stripeCustomerFor(k.key_name),
  //     amount: Math.floor(k.cost_micros / 100),
  //   });
}

You see only your own users — never another qlaud customer's data. cost_micros already includes our flat 7% markup, so it's what you owe us; charge your users anything you want on top. qlaud doesn't require any specific billing tool — Stripe, Paddle, Lemon Squeezy, custom invoicing, all fine.

q.
Reference appNext.js · Clerk · qlaud · MIT

See it built end-to-end — chatai

Full production-quality chat app on top of the same three steps above, with streaming responses, live tool dispatch (web search, image generation), semantic search, and per-user spend caps. Two services, ~600 lines of glue, no database to provision — user state lives in Clerk's privateMetadata.

Already using a coding agent?
Swap with two env vars.

Claude Code, Codex, Aider, Cline, Cursor — every popular coding tool reads either an OpenAI or Anthropic base-URL setting (env var, TOML config, or in-app field). Drop in qlaud's URL and your qlk_live_… key, hit save, every call from the tool now gets metered + capped against that key.

Anthropic’s official CLI. Two env vars, cache_control + tool use preserved verbatim.
# Same Claude Code CLI. Routes through qlaud.
export ANTHROPIC_BASE_URL=https://api.qlaud.ai
export ANTHROPIC_API_KEY=qlk_live_...

claude
# cache_control markers preserved verbatim — keep the ~75% input-cost
# discount. Tool use, vision blocks, thinking blocks all forwarded.

Per-user keys, hard caps

POST /v1/keys with max_spend_usd — qlaud enforces the cap gateway-side on every request. Your user can never overspend by accident. Zero billing code in your app.

Per-user usage, ready to invoice

GET /v1/usage returns spend, requests, and tokens — broken down by every key you minted. Pipe it straight into Stripe at month-end. We do the metering, you keep the customer relationship.

Frontier models included

Claude Opus 4.7 for code, GPT-5.4 for general, Sora 2 for video, Eleven for voice — all behind the same key. No per-provider integration, no per-provider billing. One signup, one wallet.

Time to first token, head to head.

Lower is better. Cache hits are essentially free for repeated prompts — measured from the closest qlaud edge.

qlaud (cache hit)
50ms
qlaud → DeepSeek
280ms
DeepSeek direct
310ms
OpenRouter → DeepSeek
410ms
Claude direct
430ms

Approximate, p50, single-prompt warm path. Live measurements from the next probe land here automatically. Full methodology in the blog.

Stop building billing infrastructure.

Sign up, mint your master key, ship per-user AI billing in an afternoon.

qlaud — the AI Router — qlaud