OpenRouter is the canonical unified LLM router — one base URL, every frontier model, native SDK shapes preserved. qlaud is a unified router PLUS the four other layers most AI apps end up rebuilding: per-user billing, conversation state, vendor connectors, and first-party tool builtins. This post is the honest side-by-side so you can pick the right one for your app — not a diss-track.
What OpenRouter does well
OpenRouter solved the "every provider has a different SDK" problem. You point your existing OpenAI client at openrouter.ai/api/v1, pass any model slug from their catalog, and traffic routes to the right provider with native request/response shapes preserved. They handle:
- Routing across hundreds of models (Anthropic, OpenAI, Google, DeepSeek, Mistral, Llama hosts, etc.)
- Per-account budget + a unified bill
- Provider failover when a region goes down
- Streaming, tool-use, vision, JSON mode — the standard inference surface
For an internal-tools team or a single-tenant app where the only consumer is "the company", OpenRouter is excellent.
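The drop-in pattern is just a base-URL swap. A minimal sketch with Python's standard library (the model slug and key are placeholders; any slug from the OpenRouter catalog works the same way):

```python
import json
import urllib.request

OPENROUTER_BASE = "https://openrouter.ai/api/v1"

def chat_request(model: str, messages: list, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-shaped chat completion request routed through OpenRouter."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{OPENROUTER_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Any catalog slug; OpenRouter forwards it to the right provider.
req = chat_request(
    "anthropic/claude-3.5-sonnet",
    [{"role": "user", "content": "hello"}],
    api_key="sk-or-...",  # placeholder key
)
```

In practice you'd make the same call through the official OpenAI SDK by pointing its `base_url` at OpenRouter; the request body is identical.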
Where it stops
OpenRouter is one layer: routing. Most production AI apps need four more, and either you build them or you bolt on three more vendors:
- Per-end-user billing. OpenRouter gives you ONE account-level budget. To charge YOUR end-users by their usage, you build per-user metering, hard spend caps, and a billing export — usually with Stripe + a custom proxy.
- Conversation state. Building a chatbot? You maintain the message history yourself, in your own database. At scale that's a Postgres table + index + retention policy + GDPR delete plumbing.
- Vendor connectors (Linear, GitHub, ClickUp, Notion, Stripe, etc.). Want your model to read a Linear issue or close a Stripe refund? You set up MCP servers per vendor, handle each end-user's OAuth, store credentials encrypted, and dispatch tool calls. Composio handles this for you — but that's a separate product, separate dashboard, separate bill.
- First-party tools (code execution, web search, image generation). E2B for code, Brave/Tavily for web search, OpenAI Images for generation — each is a separate vendor key, separate billing, separate code path.
What qlaud is
qlaud is the same router (Cloudflare AI Gateway under the hood, same native-shape passthrough), with the other four layers built in. One platform, one bill, one dashboard, one SDK pattern.
1. Routing — same as OpenRouter
Same primitive: point your SDK at api.qlaud.ai, pass any model slug, native shapes preserved. Anthropic cache_control, OpenAI response_format, ElevenLabs voice cloning, Deepgram diarization — all forwarded verbatim.
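A sketch of what "forwarded verbatim" means in practice: an Anthropic-shaped body carrying `cache_control` reaches the provider untouched. (The payload shape follows Anthropic's Messages API; nothing qlaud-specific is in the body itself.)

```python
import json

# Anthropic-shaped request body; provider-native fields like cache_control
# pass through the gateway unmodified.
body = {
    "model": "claude-3-5-sonnet-latest",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are a support bot. <long policy document here>",
            "cache_control": {"type": "ephemeral"},  # provider-native field
        }
    ],
    "messages": [{"role": "user", "content": "How do refunds work?"}],
}

wire = json.dumps(body)
# The gateway does not strip or rewrite provider-specific fields:
assert "cache_control" in wire
```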
2. Per-end-user billing — qlaud-only
Mint a key per end-user with a hard max_spend_usd cap. The gateway enforces the cap before forwarding to the provider — over-cap requests return 402 immediately, no upstream burn. Pull per-user usage rollups via /v1/usage and bill however you want (Stripe, Paddle, Lemon Squeezy, in-app credits). See the dedicated walkthrough: Per-user AI billing in 5 minutes.
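The cap semantics can be sketched locally. The `max_spend_usd` field and the 402 behavior are from the description above; the `/v1/keys` path and `user_id` field in the mint payload are assumptions for illustration, since this post doesn't name the key-minting endpoint. The check function is a local model of what the gateway does before forwarding, not qlaud code:

```python
from dataclasses import dataclass

# Payload for minting a per-end-user key with a hard cap.
# (/v1/keys and user_id are illustrative assumptions.)
mint_payload = {"user_id": "user_42", "max_spend_usd": 5.00}

@dataclass
class UserKey:
    max_spend_usd: float
    spent_usd: float = 0.0

def gateway_check(key: UserKey, est_cost_usd: float) -> int:
    """Local illustration of the gateway's pre-forward cap check:
    over-cap requests get a 402 before any upstream tokens are burned."""
    if key.spent_usd + est_cost_usd > key.max_spend_usd:
        return 402  # Payment Required; nothing is forwarded upstream
    return 200

key = UserKey(max_spend_usd=5.00, spent_usd=4.99)
print(gateway_check(key, est_cost_usd=0.02))  # over cap -> 402
```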
3. Threads + semantic search — qlaud-only
POST /v1/threads creates a conversation. Each subsequent message persists, and qlaud auto-loads prior turns into the model's context. Hit GET /v1/search?q=… to semantic-search every conversation in the account — Vectorize index built in. No Postgres, no pgvector, no retention cron. Your chatbot has a real backend without you writing one.
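The two endpoints named above compose into a small client-side surface. A sketch (the thread-creation body is an assumption; the post only names the endpoints and the `q` query parameter):

```python
import json
from urllib.parse import urlencode

BASE = "https://api.qlaud.ai"

# Create a conversation; subsequent messages persist server-side and
# prior turns are auto-loaded into the model's context.
create_thread = ("POST", f"{BASE}/v1/threads",
                 json.dumps({"title": "support-chat"}))  # body fields assumed

# Semantic search across every conversation in the account:
query = urlencode({"q": "refund policy for annual plans"})
search = ("GET", f"{BASE}/v1/search?{query}")

print(search[1])
```

Note the absence of any schema, index, or retention code: the only state your app keeps is the thread ID.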
4. 35 vendor MCP connectors — qlaud-only
Linear, GitHub, ClickUp, Asana, Notion, Stripe, Sentry, HubSpot, Salesforce, Intercom, Vercel, Supabase, Datadog, PayPal, Plaid — 35 catalog vendors, all per-user-capable, default-enabled. Your end-user just says "create a Linear issue for this bug" and the model auto-discovers Linear via qlaud_search_tools, prompts the user once for their Linear API key via a hosted connect URL (credentials never enter the chat), then dispatches the call. Zero registration code on your side.
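"Zero registration code" means concretely that the request body is just the message. A sketch (the model slug is a placeholder; discovery via qlaud_search_tools and the connect-URL prompt happen server-side, so nothing connector-related appears in the payload):

```python
import json

# The end-user's message is the entire integration surface: no `tools`
# array, no function schemas. qlaud discovers Linear server-side via
# qlaud_search_tools and runs the per-user connect flow itself.
body = {
    "model": "anthropic/claude-3.5-sonnet",  # placeholder slug
    "messages": [
        {"role": "user", "content": "Create a Linear issue for this bug"}
    ],
}

wire = json.dumps(body)
assert "tools" not in body  # nothing to register on your side
```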
5. 12 first-party tool builtins — qlaud-only
E2B code execution, Brave web search, OpenAI image generation, Resend email send, Slack post-message, Linear / Zendesk / GitHub ticketing, Notion append, GitHub file/code search — all first-party builtins where qlaud holds the upstream key and bills you a flat passthrough. Your model gets "run this Python and return the output" or "search the web for X" with zero per-vendor API key management.
Side-by-side
| Capability | OpenRouter | qlaud |
|---|---|---|
| Unified inference router (Claude, GPT, Gemini, …) | ✓ | ✓ |
| Native SDK passthrough (cache_control, etc.) | ✓ | ✓ |
| Per-account budget | ✓ | ✓ |
| Per-end-user keys with hard spend caps | — | ✓ |
| Per-end-user usage breakdown for billing | — | ✓ |
| Stripe wallet top-up flow | partial | ✓ |
| Managed conversation threads | — | ✓ |
| Semantic search across conversations | — | ✓ |
| Vendor MCP catalog (Linear, GitHub, etc.) | — | ✓ (35) |
| Hosted per-user connect URL flow | — | ✓ |
| First-party tool builtins (E2B, web search, …) | — | ✓ (12) |
| Background async batch jobs | — | ✓ |
| One bill for inference + tools + connectors | inference only | ✓ |
Honest pricing
OpenRouter: ~5% per inference call, plus ~5.5% on Stripe card top-ups (roughly 10.5% all-in if you fund with a card). qlaud: flat 7% on inference, no separate top-up fee. Which is cheaper depends on how you fund: card-funded, qlaud comes out a few points ahead; funded another way, OpenRouter wins by about two points.
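To make the comparison concrete, here's the arithmetic using the approximate rates above. Compounding OpenRouter's two fees (rather than simply adding them) gives ~10.2% all-in for card funding:

```python
OPENROUTER_INFERENCE_FEE = 0.05   # ~5% per inference call
OPENROUTER_TOPUP_FEE = 0.055      # ~5.5% on Stripe card top-ups
QLAUD_INFERENCE_FEE = 0.07        # flat 7%, no separate top-up fee

def openrouter_all_in(card_funded: bool) -> float:
    """Effective fee rate; the top-up fee compounds with the per-call fee."""
    if not card_funded:
        return OPENROUTER_INFERENCE_FEE
    return 1 - (1 - OPENROUTER_TOPUP_FEE) * (1 - OPENROUTER_INFERENCE_FEE)

print(f"OpenRouter, card-funded:   {openrouter_all_in(True):.1%}")   # ~10.2%
print(f"OpenRouter, other funding: {openrouter_all_in(False):.1%}")  # 5.0%
print(f"qlaud:                     {QLAUD_INFERENCE_FEE:.1%}")       # 7.0%
```

So on inference fees alone, qlaud is ~3 points cheaper for card-funded accounts and ~2 points more expensive otherwise.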
qlaud wins clearly when you also need tools and connectors. Composio, the closest equivalent to qlaud's connector layer on its own, is priced separately, so you'd be paying both OpenRouter (routing) and Composio (connectors); qlaud rolls them into one bill.
Pick OpenRouter when
- You only need a model router. No end-users, no chatbot backend, no third-party tool calls.
- You already have a billing layer and tool dispatch loop you like, and don't want to migrate them.
- You want the largest possible model catalog (OpenRouter carries some long-tail Llama hosts qlaud doesn't).
Pick qlaud when
- You ship a SaaS or AI app to end-users you bill — per-user keys + spend caps + usage rollups are built-in.
- You're building a chatbot and don't want to maintain a Postgres-of-messages — threads + semantic search are managed.
- Your model needs to call Linear / GitHub / Notion / Stripe / Slack / E2B / web search — 47 tools (35 MCP + 12 builtins) are out of the box, no per-vendor wiring.
- You want one bill instead of three (router + Composio + Stripe + custom DB).
Migration is two env vars
Same OpenAI-shape and Anthropic-shape APIs as OpenRouter. Swap these and your code keeps working:
```shell
# before
OPENAI_BASE_URL=https://openrouter.ai/api/v1
ANTHROPIC_BASE_URL=https://openrouter.ai/api

# after
OPENAI_BASE_URL=https://api.qlaud.ai/v1
ANTHROPIC_BASE_URL=https://api.qlaud.ai
```

Then sign up for qlaud, top up $5, mint a master key. Your existing routing keeps working; the per-user / threads / tools layers are there when you need them.