Building a ChatGPT clone is a 2024 cliché — except every production version still needs auth, per-user billing, message persistence, semantic search, and tool integrations, which adds up to weeks of plumbing. This is the end-to-end Next.js + Clerk + qlaud version that ships in ~200 lines because qlaud owns the bottom three layers. Full source is open at github.com/qlaud/chatai.
What we're building
- Sign-in via Clerk (email + Google + GitHub)
- Per-user qlaud key minted on signup with a $5 spend cap
- Chat UI with streaming, message history, infinite-scroll pagination
- Tool calls work out of the box — "send an email", "create a Linear ticket", "run this Python" all dispatch to qlaud's 47 catalog tools without per-tool wiring
- Semantic search across the user's past conversations
What we're NOT building (because qlaud handles it):
- A messages table or any database for chat state
- An OpenAI/Anthropic proxy
- A metering pipeline or spend-cap enforcement
- An embedding pipeline + vector index
- A tool dispatch state machine
- Per-vendor MCP server registration
1. Project setup
```bash
npx create-next-app@latest chatai --typescript --tailwind --app
cd chatai
pnpm add @clerk/nextjs
```

Add Clerk env vars to .env.local:

```bash
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_test_…
CLERK_SECRET_KEY=sk_test_…
CLERK_WEBHOOK_SECRET=whsec_…
QLAUD_MASTER_KEY=qlk_live_…  # mint at qlaud.ai/keys
```

2. Mint a per-user qlaud key on signup
Wire a Clerk webhook handler that fires on user.created, mints a qlaud key with a $5 cap, and stashes it in Clerk privateMetadata so it's available on every authenticated request.
```ts
// src/app/api/webhooks/clerk/route.ts
import { Webhook } from 'svix';
import { clerkClient } from '@clerk/nextjs/server';

export async function POST(req: Request) {
  const wh = new Webhook(process.env.CLERK_WEBHOOK_SECRET!);

  // svix throws on a bad signature — answer 400, not a generic 500
  let event: { type: string; data: { id: string } };
  try {
    event = wh.verify(await req.text(), {
      'svix-id': req.headers.get('svix-id')!,
      'svix-timestamp': req.headers.get('svix-timestamp')!,
      'svix-signature': req.headers.get('svix-signature')!,
    }) as { type: string; data: { id: string } };
  } catch {
    return new Response('invalid signature', { status: 400 });
  }

  if (event.type === 'user.created') {
    const qlaud = await fetch('https://api.qlaud.ai/v1/keys', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${process.env.QLAUD_MASTER_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        name: `user_${event.data.id}`,
        scope: 'standard',
        max_spend_usd: 5,
      }),
    }).then((r) => r.json());

    await (await clerkClient()).users.updateUserMetadata(event.data.id, {
      privateMetadata: {
        qlaud_key_id: qlaud.id,
        qlaud_key: qlaud.secret, // shown ONCE — store now
      },
    });
  }

  return Response.json({ ok: true });
}
```

3. Server-only helper to read the user's qlaud key
```ts
// src/lib/user-state.ts
import 'server-only';
import { auth, clerkClient } from '@clerk/nextjs/server';

export async function getQlaudKey(): Promise<string> {
  const { userId } = await auth();
  if (!userId) throw new Error('not signed in');
  const u = await (await clerkClient()).users.getUser(userId);
  const key = u.privateMetadata.qlaud_key as string | undefined;
  if (!key) throw new Error('no qlaud key for user');
  return key;
}
```

4. Chat route — streaming, with tools, no DB
The chat route creates a thread on first message, then streams replies. tools_mode defaults to 'dynamic' — qlaud injects the 4 meta-tools and the model auto-discovers Linear / GitHub / web search / email / etc. on demand.
```ts
// src/app/api/chat/route.ts
import { getQlaudKey } from '@/lib/user-state';

const QLAUD_BASE = 'https://api.qlaud.ai';

export async function POST(req: Request) {
  const { threadId, message } = await req.json();
  const key = await getQlaudKey();

  // Create thread on first message
  let tid = threadId;
  if (!tid) {
    const t = await fetch(`${QLAUD_BASE}/v1/threads`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ end_user_id: 'self' }), // single-tenant chat app
    }).then((r) => r.json());
    tid = t.id;
  }

  // Stream the assistant reply — qlaud loads prior messages, persists this one,
  // dispatches any tool calls, returns Anthropic-shape SSE.
  const upstream = await fetch(`${QLAUD_BASE}/v1/threads/${tid}/messages`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'claude-sonnet-4-6',
      max_tokens: 2048,
      content: message,
      stream: true,
    }),
  });

  return new Response(upstream.body, {
    headers: {
      'content-type': 'text/event-stream',
      'x-thread-id': tid,
    },
  });
}
```

5. Message history — paginated, cursor-based
```ts
// src/app/api/threads/[id]/messages/route.ts
import { getQlaudKey } from '@/lib/user-state';

export async function GET(
  req: Request,
  { params }: { params: Promise<{ id: string }> },
) {
  const { id } = await params;
  const key = await getQlaudKey();
  const url = new URL(req.url);
  const beforeSeq = url.searchParams.get('before_seq');
  const r = await fetch(
    `https://api.qlaud.ai/v1/threads/${id}/messages?` +
      new URLSearchParams({
        order: 'desc',
        limit: '30',
        ...(beforeSeq ? { before_seq: beforeSeq } : {}),
      }),
    { headers: { Authorization: `Bearer ${key}` } },
  );
  return new Response(r.body, { headers: { 'content-type': 'application/json' } });
}
```

6. The chat UI (client component)
```tsx
// src/app/chat/page.tsx — simplified for clarity, full source on github
'use client';
import { useState } from 'react';

type Msg = { role: 'user' | 'assistant'; text: string };

export default function ChatPage() {
  const [messages, setMessages] = useState<Msg[]>([]);
  const [input, setInput] = useState('');
  const [threadId, setThreadId] = useState<string | null>(null);

  async function send() {
    if (!input.trim()) return;
    const userMsg: Msg = { role: 'user', text: input };
    const placeholder: Msg = { role: 'assistant', text: '' };
    setMessages((m) => [...m, userMsg, placeholder]);
    setInput('');

    const r = await fetch('/api/chat', {
      method: 'POST',
      body: JSON.stringify({ threadId, message: input }),
    });
    setThreadId(r.headers.get('x-thread-id'));

    const reader = r.body!.getReader();
    const decoder = new TextDecoder();
    let assistantText = '';
    let buffer = ''; // SSE events can split across chunks — keep the partial tail
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop()!; // last element may be an incomplete line
      // Parse Anthropic-shape SSE — content_block_delta events carry text
      for (const line of lines) {
        if (!line.startsWith('data: ')) continue;
        try {
          const evt = JSON.parse(line.slice(6));
          if (evt.type === 'content_block_delta' && evt.delta?.text) {
            assistantText += evt.delta.text;
            setMessages((m) => {
              const next = [...m];
              next[next.length - 1] = { role: 'assistant', text: assistantText };
              return next;
            });
          }
        } catch {}
      }
    }
  }

  return (
    <div className="mx-auto max-w-2xl p-6">
      <div className="space-y-3">
        {messages.map((m, i) => (
          <div key={i} className={m.role === 'user' ? 'text-right' : ''}>
            <span className="inline-block rounded-lg border px-3 py-2">{m.text || '…'}</span>
          </div>
        ))}
      </div>
      <div className="mt-4 flex gap-2">
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === 'Enter' && send()}
          className="flex-1 rounded-md border px-3 py-2"
          placeholder="Ask anything… ('search the web for X', 'create a Linear ticket', …)"
        />
        <button onClick={send} className="rounded-md bg-black px-4 py-2 text-white">Send</button>
      </div>
    </div>
  );
}
```

That's the whole app
Tally it up: ~30 lines for the Clerk webhook, ~5 for the user-state helper, ~30 for the chat route, ~15 for the messages-list route, ~80 for the UI component — about 160 lines of TypeScript, plus boilerplate. Under 200 total.
What you get for those 200 lines:
- Streaming chat with conversation memory across turns
- Per-user budget enforcement (request > cap → 402)
- Tool calls — say "send an email to bob@…" or "create a Linear ticket: Fix login" and it just works (the model auto-discovers email + Linear via qlaud's meta-tools)
- Cursor-based pagination for infinite scroll
- Per-user usage rollup at /v1/usage for invoicing (Stripe, Paddle, whatever)
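The 402 case is worth one explicit client-side branch: when a request would push a user past the cap, qlaud refuses it, and the UI should say so instead of rendering an empty bubble. A minimal sketch that assumes nothing beyond the status code; the `ChatOutcome` type and the notice strings are invented names for illustration:

```typescript
// Classify the chat route's response before touching the stream.
// ChatOutcome and the wording below are hypothetical, not part of qlaud's API.
type ChatOutcome =
  | { kind: 'stream'; body: ReadableStream<Uint8Array> }
  | { kind: 'capped'; notice: string }
  | { kind: 'error'; notice: string };

function classifyChatResponse(
  status: number,
  body: ReadableStream<Uint8Array> | null,
): ChatOutcome {
  if (status === 402) {
    // Spend-cap enforcement: the per-user key was minted with max_spend_usd: 5
    return { kind: 'capped', notice: 'You have hit the $5 usage cap for this account.' };
  }
  if (!body || status >= 400) {
    return { kind: 'error', notice: `Chat request failed (HTTP ${status}).` };
  }
  return { kind: 'stream', body };
}
```

In `send()`, call this on the `fetch` result and only enter the reader loop for the `'stream'` case, rendering the notice as the assistant bubble otherwise.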
Bonus: semantic search across past chats
Add one route. Every persisted message is auto-indexed in Vectorize.
```ts
// src/app/api/search/route.ts
import { getQlaudKey } from '@/lib/user-state';

export async function GET(req: Request) {
  const q = new URL(req.url).searchParams.get('q')!;
  const key = await getQlaudKey();
  const r = await fetch(
    `https://api.qlaud.ai/v1/search?q=${encodeURIComponent(q)}&limit=10`,
    { headers: { Authorization: `Bearer ${key}` } },
  );
  return new Response(r.body, { headers: { 'content-type': 'application/json' } });
}
```

Hook a search input into your UI, hit /api/search?q=…, render the snippets. No embedding pipeline, no vector index to maintain.
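The client half can be two small helpers. A sketch that assumes the route proxies qlaud's JSON through unchanged; the `{ results: [{ snippet, thread_id }] }` response shape is an assumption to verify against qlaud's /v1/search reference:

```typescript
// Build the query string for the /api/search route above. Pure, so it's easy
// to test; URLSearchParams handles the encoding.
export function buildSearchUrl(q: string, limit = 10): string {
  return `/api/search?${new URLSearchParams({ q, limit: String(limit) })}`;
}

// Fetch and unwrap results. The results/snippet/thread_id field names are
// assumed, not confirmed by qlaud's docs.
export async function searchChats(
  q: string,
): Promise<{ snippet: string; thread_id: string }[]> {
  const r = await fetch(buildSearchUrl(q));
  if (!r.ok) return []; // swallow errors in the UI path; log if you care
  const data = await r.json();
  return data.results ?? [];
}
```

Wire `searchChats` to a debounced input, link each snippet to its thread, done.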
Get the full source
The reference implementation is open source — github.com/qlaud/chatai. Includes the polished UI (markdown rendering, code highlighting, tool-call rendering, mobile responsive), error states, the Clerk webhook handler with svix verification, and a Vercel deploy config. Fork it to start your own.
Sign up for qlaud, top up $5, mint a master key. The stack above is shippable the same afternoon.