May Release Spotlight

Nick Livermore · 6/1/2026

We closed our $113M Series B(opens in new tab), and we're now routing 100 trillion tokens a month. Here's everything else that shipped in May.

Workspace Guardrails

Centralized security and governance for every request routed through your workspace. Set per-member and per-key spend limits, lock traffic to a model and provider allowlist, enforce zero data retention, block prompt injection against 30+ OWASP-derived patterns, and redact PII before it reaches a provider. Layer the rules into one guardrail, or scope them to specific API keys and members, with no code changes.

Docs(opens in new tab) · Announcement(opens in new tab)

Speech and Transcription APIs

Add voice to any application through the same API key you already use. Speech-to-text is live with Whisper, GPT-4o Mini Transcribe, and Voxtral; text-to-speech exposes supported_voices in the models API. Provider failover and upstream error passthrough are built into both.

Browse audio models(opens in new tab) · Announcement(opens in new tab)

Model Fusion

Route your prompt to multiple models in parallel and synthesize their responses into a single, higher-quality answer. Model Fusion is now available as an API plugin, a server tool, and in the chatroom composer. You get an ensemble of experts in a single call instead of relying on one model.

Try Model Fusion(opens in new tab) · Docs(opens in new tab)

Model Comparison

Compare up to five models side by side on pricing, context length, and benchmark scores. The rebuilt comparison page includes a "Highlight best" toggle, provider-coded benchmark charts for Intelligence, Coding, and Agentic metrics, and interactive slot cards to quickly add models.

Compare models(opens in new tab)

Private Models (Enterprise)

Route to your own custom, fine-tuned, or dedicated model endpoints through the standard completions and responses API. Your private models get the same guardrails, observability, and billing as any public model on the platform. Available exclusively on the Enterprise plan.

Docs(opens in new tab)

Pareto Code Router

Set min_coding_score and route to the cheapest code-capable model that clears your quality bar. Your coding agents stop overpaying for good-enough code. Configurable defaults per workspace in plugin settings.

Try it(opens in new tab)

Enterprise & Workspace Controls

A set of releases for teams running OpenRouter at scale:

IP allowlist enforcement. API keys with an IP allowlist now actively block requests from unauthorized IPs with a 403, upgraded from observe-only mode. Docs(opens in new tab)
BYOK management API. Programmatically list, create, update, and delete bring-your-own-key credentials across workspaces. Keys are now grouped by priority with drag-and-drop reordering and a one-click "Test Key" for failed requests. API docs(opens in new tab)
Observability destinations API. CRUD endpoints for managing Datadog, Langfuse, LangSmith, and other observability integrations via management key. API docs(opens in new tab)
Per-provider ZDR controls. Separate Zero Data Retention toggles for non-frontier, Anthropic, OpenAI, and Google providers, so you can meet compliance requirements per provider without restricting your entire model catalog.
Copy guardrails across workspaces. Standardize safety policies across all workspaces in a few clicks via the "Copy to..." menu.

Also shipped this month

Presets API. Create or version a preset directly from an inference request body, now with Anthropic Messages and Responses skins, plus TypeScript and Python SDK support. Docs(opens in new tab)

Human-in-the-loop tools. A new SDK tool type that pauses execution and waits for human input before returning results, for agents that need human judgment mid-task. Blog post(opens in new tab)

Session-id provider stickiness. Requests sharing a session_id now route to the same provider and pin to the same concrete model across turns, improving cache hit rates for multi-turn agentic workflows. Docs(opens in new tab)

Auto router cost_quality_tradeoff. A 0 to 10 integer replacing the old binary toggle for finer control over cost versus quality when using the auto router. Docs(opens in new tab)

Redesigned model pages. New model page header, step-by-step API tab with /responses and /messages endpoints, full-screen model selector, and playground side panel for inline testing.

Requests tab in logs. Full request-level drill-down alongside generation logs, with request ID filtering and time picker shorthand (15min, 1h, 3d). Logs(opens in new tab)

Improved coding agent attribution. Cursor, GitHub Copilot, Cline, RooCode, Kilo Code, Zed, and OpenCode are now properly identified in activity logs so you can see which tools drive your usage.

Usage & Budgets on API keys. Spend charts and budget progress by guardrail layer, directly on each API key.

Rankings daily dataset. GET /api/v1/datasets/rankings-daily returns top-50 models by daily token volume for programmatic analysis.

New models

20 models launched in May, spanning text, speech, image, video, and coding:

Anthropic Claude Opus 4.8: Anthropic's latest Opus with mid-session system support, plus a fast variant
Google Gemini 3.5 Flash: Google's newest Flash model
xAI Grok 4.3: xAI's latest frontier model
xAI Grok Imagine Video: Video generation from xAI
xAI Grok Build 0.1: xAI's code generation model
Qwen Qwen3.7 Max: Qwen's latest max-tier model
Recraft V3, V4, V4 Pro: Three new image generation models
Mistral Voxtral Mini Transcribe: Mistral's speech-to-text model

Plus: Gemini 3.1 Flash Lite, GPT Chat Latest, CoBuddy (free), Ring-2.6-1T (free), Perceptron Mk1, and more.

Everything above is live now. Browse the full model catalog(opens in new tab), or tell us what's missing on Discord(opens in new tab).