Send the next user turn. qlaud loads prior messages into the model's context, persists the assistant reply, returns the response. No `messages` table to maintain. No retention cron. Pagination + GDPR-delete are API calls.
POST /v1/threads/:id/messages
Linear, GitHub, ClickUp, Notion, Stripe, Sentry, HubSpot, Salesforce, Zapier, 97 more — auto-enabled per end-user. They authorize their own account via a one-time hosted URL. Zero per-vendor wiring on your side.
/connectors · 105 live
Semantic search
Every message indexed
Vectorize-backed semantic search across every conversation in your account, scoped by end-user. No embedding pipeline, no pgvector index, no separate vector store bill.
GET /v1/search?q=…
Mint a key per end-user with a hard `max_spend_usd` cap. The gateway enforces it BEFORE forwarding to the provider — over-cap requests return 402, no upstream burn. Pull per-user usage rollups for invoicing.
POST /v1/keys
The whole conversation, with auto-discovered Linear, in one call
No tools array, no Linear setup, no DB write, no embedding job.
POST https://api.qlaud.ai/v1/threads/:thread_id/messages
Authorization: Bearer qlk_live_<your_end_user's_key>
{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"content": "Create a Linear ticket: 'Login button broken on Safari iOS'",
"stream": true
}
# What happens, autonomously, inside one HTTP call:
# 1. qlaud loads prior thread messages into context
# 2. tools_mode defaults to "dynamic" (no tools array passed)
# → 4 meta-tools injected (qlaud_search_tools, get_tool_schemas,
# multi_execute, manage_connections)
# 3. Model calls qlaud_search_tools({intent: "linear ticket"})
# 4. Catalog returns Linear (auto-enabled, not yet connected for this user)
# 5. Model calls qlaud_manage_connections({action: "connect", tool: "qlaud-mcp/linear"})
# 6. qlaud mints a hosted URL, model relays it: "open this to authorize Linear"
# 7. End-user pastes their Linear API key in the qlaud-hosted form
# 8. Model calls linear/create_issue → ticket lands in Linear
# 9. qlaud persists the assistant reply, indexes it for search,
# debits the user's wallet, returns the streamed response
One Threads call. The whole AI app backend.
Top up $5, mint a master key, ship a chatbot in 200 lines.