OpenRouter is the canonical unified LLM router — one base URL, every frontier model, native SDK shapes preserved. qlaud is a unified router PLUS the four other layers most AI apps end up rebuilding: per-user billing, conversation state, vendor connectors, and first-party tool builtins. This post is the honest side-by-side so you can pick the right one for your app — not a diss-track.
What OpenRouter does well
OpenRouter solved the "every provider has a different SDK" problem. You point your existing OpenAI client at openrouter.ai/api/v1, pass any model slug from their catalog, and traffic routes to the right provider with native request/response shapes preserved. They handle:
- Routing across hundreds of models (Anthropic, OpenAI, Google, DeepSeek, Mistral, Llama hosts, etc.)
- Per-account budget + a unified bill
- Provider failover when a region goes down
- Streaming, tool-use, vision, JSON mode — the standard inference surface
For an internal-tools team or a single-tenant app where the only consumer is "the company", OpenRouter is excellent.
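The drop-in pattern is just a base-URL swap. A minimal sketch with Python's standard library (the model slug and key are placeholders; any slug from the OpenRouter catalog works the same way):

```python
import json
import urllib.request

OPENROUTER_BASE = "https://openrouter.ai/api/v1"

def chat_request(model: str, messages: list, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-shaped chat completion request routed through OpenRouter."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{OPENROUTER_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Any catalog slug; OpenRouter forwards it to the right provider.
req = chat_request(
    "anthropic/claude-3.5-sonnet",
    [{"role": "user", "content": "hello"}],
    api_key="sk-or-...",  # placeholder key
)
```

In practice you'd make the same call through the official OpenAI SDK by pointing its `base_url` at OpenRouter; the request body is identical.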
Where it stops
OpenRouter is one layer: routing. Most production AI apps need four more, and either you build them or you bolt on three more vendors:
- Per-end-user billing. OpenRouter gives you ONE account-level budget. To charge YOUR end-users by their usage, you build per-user metering, hard spend caps, and a billing export — usually with Stripe + a custom proxy.
- Conversation state. Building a chatbot? You maintain the message history yourself, in your own database. At scale that's a Postgres table + index + retention policy + GDPR delete plumbing.
- Vendor connectors (Linear, GitHub, ClickUp, Notion, Stripe, etc.). Want your model to read a Linear issue or close a Stripe refund? You set up MCP servers per vendor, handle each end-user's OAuth, store credentials encrypted, and dispatch tool calls. Composio handles this for you — but that's a separate product, separate dashboard, separate bill.
- First-party tools (code execution, web search, image generation). E2B for code, Brave/Tavily for web search, OpenAI Images for generation — each is a separate vendor key, separate billing, separate code path.
What qlaud is
qlaud is the same router (Cloudflare AI Gateway under the hood, same native-shape passthrough), with the other four layers built in. One platform, one bill, one dashboard, one SDK pattern.
1. Routing — same as OpenRouter
Same primitive: point your SDK at api.qlaud.ai, pass any model slug, native shapes preserved. Anthropic cache_control, OpenAI response_format, ElevenLabs voice cloning, Deepgram diarization — all forwarded verbatim.
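A sketch of what "forwarded verbatim" means in practice: an Anthropic-shaped body carrying `cache_control` reaches the provider untouched. (The payload shape follows Anthropic's Messages API; nothing qlaud-specific is in the body itself.)

```python
import json

# Anthropic-shaped request body; provider-native fields like cache_control
# pass through the gateway unmodified.
body = {
    "model": "claude-3-5-sonnet-latest",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are a support bot. <long policy document here>",
            "cache_control": {"type": "ephemeral"},  # provider-native field
        }
    ],
    "messages": [{"role": "user", "content": "How do refunds work?"}],
}

wire = json.dumps(body)
# The gateway does not strip or rewrite provider-specific fields:
assert "cache_control" in wire
```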
2. Per-end-user billing — qlaud-only
Mint a key per end-user with a hard max_spend_usd cap. The gateway enforces the cap before forwarding to the provider — over-cap requests return 402 immediately, no upstream burn. Pull per-user usage rollups via /v1/usage and bill however you want (Stripe, Paddle, Lemon Squeezy, in-app credits). See the dedicated walkthrough: Per-user AI billing in 5 minutes.
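The cap semantics can be sketched locally. The `max_spend_usd` field and the 402 behavior are from the description above; the `/v1/keys` path and `user_id` field in the mint payload are assumptions for illustration, since this post doesn't name the key-minting endpoint. The check function is a local model of what the gateway does before forwarding, not qlaud code:

```python
from dataclasses import dataclass

# Payload for minting a per-end-user key with a hard cap.
# (/v1/keys and user_id are illustrative assumptions.)
mint_payload = {"user_id": "user_42", "max_spend_usd": 5.00}

@dataclass
class UserKey:
    max_spend_usd: float
    spent_usd: float = 0.0

def gateway_check(key: UserKey, est_cost_usd: float) -> int:
    """Local illustration of the gateway's pre-forward cap check:
    over-cap requests get a 402 before any upstream tokens are burned."""
    if key.spent_usd + est_cost_usd > key.max_spend_usd:
        return 402  # Payment Required; nothing is forwarded upstream
    return 200

key = UserKey(max_spend_usd=5.00, spent_usd=4.99)
print(gateway_check(key, est_cost_usd=0.02))  # over cap -> 402
```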
3. Threads + semantic search — qlaud-only
POST /v1/threads creates a conversation. Each subsequent message persists, and qlaud auto-loads prior turns into the model's context. Hit GET /v1/search?q=… to semantic-search every conversation in the account — Vectorize index built in. No Postgres, no pgvector, no retention cron. Your chatbot has a real backend without you writing one.
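The two endpoints named above compose into a small client-side surface. A sketch (the thread-creation body is an assumption; the post only names the endpoints and the `q` query parameter):

```python
import json
from urllib.parse import urlencode

BASE = "https://api.qlaud.ai"

# Create a conversation; subsequent messages persist server-side and
# prior turns are auto-loaded into the model's context.
create_thread = ("POST", f"{BASE}/v1/threads",
                 json.dumps({"title": "support-chat"}))  # body fields assumed

# Semantic search across every conversation in the account:
query = urlencode({"q": "refund policy for annual plans"})
search = ("GET", f"{BASE}/v1/search?{query}")

print(search[1])
```

Note the absence of any schema, index, or retention code: the only state your app keeps is the thread ID.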
4. 35 vendor MCP connectors — qlaud-only
Linear, GitHub, ClickUp, Asana, Notion, Stripe, Sentry, HubSpot, Salesforce, Intercom, Vercel, Supabase, Datadog, PayPal, Plaid — 35 catalog vendors, all per-user-capable, default-enabled. Your end-user just says "create a Linear issue for this bug" and the model auto-discovers Linear via qlaud_search_tools, prompts the user once for their Linear API key via a hosted connect URL (credentials never enter the chat), then dispatches the call. Zero registration code on your side.
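"Zero registration code" means concretely that the request body is just the message. A sketch (the model slug is a placeholder; discovery via qlaud_search_tools and the connect-URL prompt happen server-side, so nothing connector-related appears in the payload):

```python
import json

# The end-user's message is the entire integration surface: no `tools`
# array, no function schemas. qlaud discovers Linear server-side via
# qlaud_search_tools and runs the per-user connect flow itself.
body = {
    "model": "anthropic/claude-3.5-sonnet",  # placeholder slug
    "messages": [
        {"role": "user", "content": "Create a Linear issue for this bug"}
    ],
}

wire = json.dumps(body)
assert "tools" not in body  # nothing to register on your side
```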
5. 12 first-party tool builtins — qlaud-only
E2B code execution, Brave web search, OpenAI image generation, Resend email send, Slack post-message, Linear / Zendesk / GitHub ticketing, Notion append, GitHub file/code search — all first-party builtins where qlaud holds the upstream key and bills you a flat passthrough. Your model gets "run this Python and return the output" or "search the web for X" with zero per-vendor API key management.
Side-by-side
| Capability | OpenRouter | qlaud |
|---|---|---|
| Unified inference router (Claude, GPT, Gemini, …) | ✓ | ✓ |
| Native SDK passthrough (cache_control, etc.) | ✓ | ✓ |
| Per-account budget | ✓ | ✓ |
| Per-end-user keys with hard spend caps | — | ✓ |
| Per-end-user usage breakdown for billing | — | ✓ |
| Stripe wallet top-up flow | partial | ✓ |
| Managed conversation threads | — | ✓ |
| Semantic search across conversations | — | ✓ |
| Vendor MCP catalog (Linear, GitHub, etc.) | — | ✓ (35) |
| Hosted per-user connect URL flow | — | ✓ |
| First-party tool builtins (E2B, web search, …) | — | ✓ (12) |
| Background async batch jobs | — | ✓ |
| One bill for inference + tools + connectors | inference only | ✓ |
Honest pricing
OpenRouter: ~5% per inference call, plus ~5.5% on Stripe card top-ups (roughly 10.5% all-in if you fund with a card). qlaud: flat 7% on inference, no separate top-up fee. Which is cheaper depends on how you fund: card-funded, qlaud comes out a few points ahead; funded another way, OpenRouter wins by about two points.
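To make the comparison concrete, here's the arithmetic using the approximate rates above. Compounding OpenRouter's two fees (rather than simply adding them) gives ~10.2% all-in for card funding:

```python
OPENROUTER_INFERENCE_FEE = 0.05   # ~5% per inference call
OPENROUTER_TOPUP_FEE = 0.055      # ~5.5% on Stripe card top-ups
QLAUD_INFERENCE_FEE = 0.07        # flat 7%, no separate top-up fee

def openrouter_all_in(card_funded: bool) -> float:
    """Effective fee rate; the top-up fee compounds with the per-call fee."""
    if not card_funded:
        return OPENROUTER_INFERENCE_FEE
    return 1 - (1 - OPENROUTER_TOPUP_FEE) * (1 - OPENROUTER_INFERENCE_FEE)

print(f"OpenRouter, card-funded:   {openrouter_all_in(True):.1%}")   # ~10.2%
print(f"OpenRouter, other funding: {openrouter_all_in(False):.1%}")  # 5.0%
print(f"qlaud:                     {QLAUD_INFERENCE_FEE:.1%}")       # 7.0%
```

So on inference fees alone, qlaud is ~3 points cheaper for card-funded accounts and ~2 points more expensive otherwise.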
qlaud wins clearly when you also need tools and connectors. Composio, the closest equivalent to qlaud's connector layer on its own, is priced separately, so you'd be paying both OpenRouter (routing) and Composio (connectors); qlaud rolls them into one bill.
Pick OpenRouter when
- You only need a model router. No end-users, no chatbot backend, no third-party tool calls.
- You already have a billing layer and tool dispatch loop you like, and don't want to migrate them.
- You want the largest possible model catalog (OpenRouter carries some long-tail Llama hosts qlaud doesn't).
Pick qlaud when
- You ship a SaaS or AI app to end-users you bill — per-user keys + spend caps + usage rollups are built-in.
- You're building a chatbot and don't want to maintain a Postgres-of-messages — threads + semantic search are managed.
- Your model needs to call Linear / GitHub / Notion / Stripe / Slack / E2B / web search — 47 tools (35 MCP + 12 builtins) are out of the box, no per-vendor wiring.
- You want one bill instead of three (router + Composio + Stripe + custom DB).
Migration is two env vars
Same OpenAI-shape and Anthropic-shape APIs as OpenRouter. Swap these and your code keeps working:
```shell
# before
OPENAI_BASE_URL=https://openrouter.ai/api/v1
ANTHROPIC_BASE_URL=https://openrouter.ai/api

# after
OPENAI_BASE_URL=https://api.qlaud.ai/v1
ANTHROPIC_BASE_URL=https://api.qlaud.ai
```

Then sign up for qlaud, top up $5, mint a master key. Your existing routing keeps working; the per-user / threads / tools layers are there when you need them.