Catalog

17 models. One base URL.

Browse every model qlaud routes to. Customer-facing prices include our 7% markup. Models with multiple hosts can be requested with :nitro for the fastest tool-verified host or :floor for the cheapest live host.

Showing all 17 models

DeepSeek V3

DeepSeekText Generation

General-purpose chat + coding model with strong tool-use reliability. Default workhorse for most coding-agent loops.

66K contextFunction calling
deepseek
in $0.289 · out $1.18/ 1M tokens

DeepSeek R1

DeepSeekText Generation

Reasoning-tuned variant with separate chain-of-thought stream. Translates into Anthropic thinking blocks; rendered natively in Claude Code.

66K contextFunction callingReasoning
deepseek
in $0.589 · out $2.34/ 1M tokens

Claude Opus 4.7

AnthropicText Generation

Frontier coding model — 87.6% on SWE-bench Verified. Best agentic-loop reliability. Routed via Anthropic native API: cache_control markers, image content blocks, and thinking blocks all preserved verbatim — Claude Code customers see ~75% input-cost reduction from prompt cache without changing client code.

200K contextFunction callingReasoningVision
anthropic
in $5.35 · out $26.75/ 1M tokens

Claude Sonnet 4.6

AnthropicText Generation

Mid-tier Anthropic model. Strong tool use + vision; the default Claude Code CLI uses this slug. Native-passthrough route preserves prompt cache + image + thinking blocks.

200K contextFunction callingVision
anthropic
in $3.21 · out $16.05/ 1M tokens

Gemini 2.5 Pro

GoogleText Generation

Frontier multimodal — text, vision, code. 2M-token context, strong reasoning. Routed via Google AI Studio native passthrough.

2M contextFunction callingReasoningVision
google-ai-studio
in $1.34 · out $10.70/ 1M tokens

Grok 4

xAIText Generation

xAI frontier model — strong on reasoning + math + code. 256K context. Routed via xAI native passthrough.

256K contextFunction callingReasoning
grok
in $3.21 · out $16.05/ 1M tokens

Qwen 3 Coder

AlibabaText Generation

Qwen 3 Coder — Alibaba's strongest open-source coding model. 128K context, OpenAI-compat tool use. Cost-effective frontier coder.

128K contextFunction calling
alibaba-cloud
in $0.428 · out $1.71/ 1M tokens

Kimi K2

MoonshotText Generation

Moonshot Kimi K2 — agentic-coding specialist. Strong tool-use, 128K context, very fast. Excellent fallback in coding rotations.

128K contextFunction calling
moonshot
in $0.642 · out $2.68/ 1M tokens

MiniMax M2.7

MiniMaxText Generation

MiniMax M2.7 — frontier coding model. ~80% on SWE-bench Verified, strong tool-use, 200K context. Cheaper alternative to Sonnet 4.6 when prompt cache is not the bottleneck.

200K contextFunction callingReasoning
minimaxi
in $0.321 · out $1.28/ 1M tokens

GPT Image 1

OpenAIImage Generation

OpenAI image generation — high-fidelity, prompt-faithful, strong at typography in images. Default for /v1/images/generations.

0 context
openai
in $0.0000 · out $0.0000/ 1M tokens

Eleven v3 (English)

ElevenLabsText-to-Speech

World-class TTS — natural prosody, expressive voices, 30+ languages. Default for /v1/audio/speech when ElevenLabs is the preferred TTS provider.

0 context
elevenlabs
in $0.0000 · out $0.0000/ 1M tokens

GPT-4o TTS

OpenAIText-to-Speech

OpenAI text-to-speech via gpt-4o-mini-tts. Cheaper baseline; good prosody on standard English. Fallback when ElevenLabs is not preferred.

0 context
openai
in $0.0000 · out $0.0000/ 1M tokens

Deepgram Nova-3

DeepgramSpeech-to-Text

Best-in-class real-time speech-to-text — low latency, strong accent coverage, speaker diarization. Default for /v1/audio/transcriptions when Deepgram is preferred.

0 context
deepgram
in $0.0000 · out $0.0000/ 1M tokens

Whisper Large v3

OpenAISpeech-to-Text

OpenAI Whisper — strong multilingual transcription, batch-mode. Fallback when Deepgram is not preferred.

0 context
openai
in $0.0000 · out $0.0000/ 1M tokens

Perplexity Sonar Pro

PerplexityWeb Search

Web-grounded answer engine — citations, freshness, structured search. Used for /v1/search and any task that needs live-web context.

200K context
perplexity
in $3.21 · out $16.05/ 1M tokens

OpenAI text-embedding-3-large

OpenAIEmbeddings

Default embeddings — 3072-dim, strong retrieval quality. Used for /v1/embeddings.

8K context
openai
in $0.139 · out $0.0000/ 1M tokens

Claude 3.5 Sonnet (Claude Code default — routed to Sonnet 4.6)

AnthropicText Generation

Backwards-compatible alias for the slug Claude Code CLI ships with as default. Routed to Anthropic Sonnet 4.6 via native passthrough — same Anthropic model, same prompt cache, same agentic UX.

200K contextFunction callingVision
anthropic
in $3.21 · out $16.05/ 1M tokens

Provider coverage

DeepSeekGroqOpenAIAnthropicMistralCerebrasxAICloudflare Workers AIOpenRouterMiniMaxElevenLabsDeepgramPerplexityCartesiaGoogle AI StudioxAI GrokAlibaba QwenMoonshot Kimi

All routed through Cloudflare AI Gateway. We add new providers as Cloudflare adds them to its supported list.

JSON catalog

Want the full machine-readable catalog? It's public and edge-cached:

curl https://api.qlaud.ai/v1/catalog
Get started