OpenRouter

Workspace Guardrails, Speech and Transcription APIs, Model Fusion, and 20 new models shipped in May.

This month brought centralized security and governance for every request, voice in and out through the API key you already use, parallel model queries synthesized into one answer, and Private Models. We also announced our Series B fundraising and crossed 100 trillion tokens a month. Here is everything that shipped.

Read the full May recap on the blog →

Browse all models

Platform features

Workspace Guardrails

Centralized security and governance for every request routed through your workspace. Set per-member and per-key spend limits, lock traffic to a model and provider allowlist, enforce zero data retention, block prompt-injection attempts that match any of 30+ OWASP-derived patterns, and redact PII before it reaches a provider. Layer the rules into one guardrail or scope them to specific keys and members, with no code changes.

Read the Guardrails docs → · Read the announcement →

Speech and Transcription APIs

Add text-to-speech and speech-to-text to any application through the same API key and billing you already use. Speech-to-text is live with Whisper, GPT-4o Mini Transcribe, and Voxtral; text-to-speech exposes supported_voices in the models API. Provider failover and upstream error passthrough are built into both.
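
As a rough illustration of reading supported_voices from the models API, the sketch below filters a models listing for entries that advertise voices. The response shape (a top-level data list whose entries carry an id and an optional supported_voices list of strings), the model ids, and the voice names are all assumptions made for this example; check the API reference for the real schema.

```python
from typing import Any


def voices_by_model(models_response: dict[str, Any]) -> dict[str, list[str]]:
    """Map model id -> voices for every entry that lists supported_voices."""
    result: dict[str, list[str]] = {}
    for model in models_response.get("data", []):
        voices = model.get("supported_voices")
        if voices:  # skip entries with a missing or empty voice list
            result[model["id"]] = list(voices)
    return result


# Made-up payload standing in for a models API response (assumed shape).
SAMPLE_RESPONSE = {
    "data": [
        {"id": "example/tts-small", "supported_voices": ["voice-a", "voice-b"]},
        {"id": "example/stt-base"},
        {"id": "example/chat-large", "supported_voices": []},
    ]
}

if __name__ == "__main__":
    print(voices_by_model(SAMPLE_RESPONSE))
```

Working from a lookup like this lets an application validate a requested voice before sending a text-to-speech call, rather than discovering an unsupported voice from an upstream error.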

Browse audio models →

Model Fusion

Route one prompt to multiple models in parallel and synthesize their responses into a single, higher-quality answer. Model Fusion runs as an API plugin, a server tool, and in the chatroom composer. You get an ensemble of models in one call instead of betting on a single one.
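
The fan-out-then-synthesize pattern described above can be sketched on the client side as follows. This is not how the Model Fusion plugin is implemented or invoked; the function names (fuse, model_a, model_b, join_answers) are hypothetical, and the "models" are plain local functions so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

ModelFn = Callable[[str], str]
Synthesizer = Callable[[str, dict[str, str]], str]


def fuse(prompt: str, models: dict[str, ModelFn], synthesize: Synthesizer) -> str:
    """Send one prompt to every model in parallel, then merge the answers."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in models.items()}
        # result() blocks until each model has answered (or re-raises its error).
        answers = {name: future.result() for name, future in futures.items()}
    return synthesize(prompt, answers)


# Stand-in "models": local functions instead of real API calls.
def model_a(prompt: str) -> str:
    return f"A says: {prompt.upper()}"


def model_b(prompt: str) -> str:
    return f"B says: {prompt[::-1]}"


def join_answers(prompt: str, answers: dict[str, str]) -> str:
    # A real synthesizer would likely be another model call that reconciles
    # the drafts; here the answers are simply concatenated in name order.
    return " | ".join(answers[name] for name in sorted(answers))


if __name__ == "__main__":
    print(fuse("hi", {"a": model_a, "b": model_b}, join_answers))
```

The hosted feature does this work in a single call, so a hand-rolled version like this is mainly useful for understanding the latency and cost trade-off: total latency tracks the slowest model plus the synthesis step, and cost is the sum of every call.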

Read the Model Fusion docs →

Private Models (Enterprise)

Route to your own custom, fine-tuned, or dedicated endpoints through the standard completions and responses API. Your private models get the same routing, guardrails, observability, and billing as any public model on the platform. Available exclusively on the Enterprise plan.

Read the Private Models docs →

Pareto Code Router

Set min_coding_score and route to the cheapest code-capable model that clears your bar. Configure a default coding-quality tier per workspace in plugin settings.
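
To make the selection rule concrete, here is a minimal sketch of "cheapest model that clears the bar." Only the min_coding_score name comes from the text above; the Candidate type, the score scale, the prices, and the model names are invented for illustration and do not reflect the router's actual data or tie-breaking.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    coding_score: float    # hypothetical benchmark score; higher is better
    price_per_mtok: float  # hypothetical price per million tokens


def cheapest_above(candidates: list[Candidate], min_coding_score: float) -> Optional[Candidate]:
    """Return the lowest-priced candidate whose score meets the threshold."""
    eligible = [c for c in candidates if c.coding_score >= min_coding_score]
    if not eligible:
        return None  # nothing clears the bar
    # Cheapest first; on a price tie, prefer the higher score.
    return min(eligible, key=lambda c: (c.price_per_mtok, -c.coding_score))


# Made-up catalog for the example.
CATALOG = [
    Candidate("example/small-coder", 55.0, 0.20),
    Candidate("example/mid-coder", 72.0, 0.90),
    Candidate("example/big-coder", 88.0, 6.00),
]

if __name__ == "__main__":
    print(cheapest_above(CATALOG, min_coding_score=70))
```

Raising the threshold trades cost for quality: with this made-up catalog, a bar of 50 selects the small model, while a bar of 70 moves to the mid-priced one.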

Try the Pareto Code Router →

Workspace controls

IP allowlist enforcement. Requests from IPs outside an API key's allowlist are now rejected with a 403. Previously the allowlist was observe-only.
BYOK management API. List, create, update, and delete provider keys across workspaces, grouped by priority with drag-and-drop reordering.
Observability destinations API. CRUD endpoints for Datadog, Langfuse, LangSmith, and other integrations via management key.
Per-provider ZDR. Separate Zero Data Retention toggles for non-frontier, Anthropic, OpenAI, and Google providers.
Copy guardrails across workspaces. Standardize safety policies in a few clicks.
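
For the IP allowlist item in the list above, the difference between observe-only and enforced behavior can be sketched like this. The check_request function, the enforce flag, and support for CIDR ranges are assumptions for illustration; the recap only states that disallowed requests now receive a 403.

```python
import ipaddress


def check_request(client_ip: str, allowlist: list[str], enforce: bool = True) -> int:
    """Return an HTTP status code: 200 if the request proceeds, 403 if blocked."""
    if not allowlist:
        return 200  # no allowlist configured, so there is nothing to enforce
    ip = ipaddress.ip_address(client_ip)
    allowed = any(
        ip in ipaddress.ip_network(entry, strict=False) for entry in allowlist
    )
    if allowed:
        return 200
    # Observe-only mode lets the request through (a real system would log it);
    # enforced mode rejects it.
    return 403 if enforce else 200


if __name__ == "__main__":
    rules = ["203.0.113.0/24"]  # documentation-range addresses, not real hosts
    print(check_request("203.0.113.7", rules))
    print(check_request("198.51.100.9", rules))
    print(check_request("198.51.100.9", rules, enforce=False))
```

Clients that previously worked from unlisted addresses under observe-only mode should expect failures after this change, so it is worth reviewing each key's allowlist before relying on it in production.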

Read the workspaces docs →

Also shipped

Model comparison page. Compare up to five models side by side on pricing, context length, and benchmarks, with a Highlight best toggle and provider-coded charts for Intelligence, Coding, and Agentic metrics.
Presets API. Version a preset straight from a request body, now with Anthropic Messages and Responses skins, plus TypeScript and Python SDKs.
Human-in-the-loop tools. An SDK tool type that pauses execution and waits for human input, for agents that need judgment mid-task.
Session stickiness. Requests sharing a session_id pin to the same provider and model across turns for better cache hits.
Auto router cost_quality_tradeoff. A 0 to 10 dial replacing the old binary toggle for finer cost-versus-quality control.
Requests tab in logs. Request-level drill-down with request-ID filtering and time-picker shorthand.
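
The session stickiness item in the list above boils down to remembering the first routing decision for each session_id. The sketch below shows that idea in isolation; StickyRouter, make_round_robin, and the provider and model names are hypothetical, and the real router's selection and expiry logic are not described in this recap.

```python
import itertools
from typing import Callable, Optional

Route = tuple[str, str]  # (provider, model)


def make_round_robin(routes: list[Route]) -> Callable[[], Route]:
    """Return a chooser that cycles through the given routes."""
    cycle = itertools.cycle(routes)
    return lambda: next(cycle)


class StickyRouter:
    """Pin each session_id to the first route chosen for it."""

    def __init__(self, choose: Callable[[], Route]) -> None:
        self._choose = choose
        self._pins: dict[str, Route] = {}

    def route(self, session_id: Optional[str]) -> Route:
        if session_id is None:
            return self._choose()  # no session: pick a route on every request
        if session_id not in self._pins:
            self._pins[session_id] = self._choose()
        return self._pins[session_id]


if __name__ == "__main__":
    router = StickyRouter(
        make_round_robin([("provider-a", "model-x"), ("provider-b", "model-x")])
    )
    for sid in ["s1", "s2", "s1"]:
        print(sid, router.route(sid))
```

Keeping a conversation on one provider matters for caching because prompt caches are typically held per provider, so a turn that lands elsewhere has to reprocess the full history.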

New models

Twenty models launched in May across text, speech, image, video, and code, including Claude Opus 4.8, Gemini 3.5 Flash, Grok 4.3, Qwen3.7 Max, Grok Imagine Video, and Recraft V3, V4, and V4 Pro.

Browse all models →
