
Inference Router

I designed the feature that led DigitalOcean's Deploy 2026 keynote. The Inference Router was demoed live on stage as the flagship capability of Inference Hub's Public Preview, and the Playground patterns I built were forked into the Gen AI Tool Catalog before launch. When the same UX shell serves two different product areas without modification, the design was right.

DigitalOcean · Feb – Apr 2026 · Product Design & Front-end Prototyping
Role: Product Design & Front-end Prototyping
Platform: DigitalOcean Inference Hub
Timeline: February – April 2026
Status: Shipped · Deploy 2026 Keynote
Scope: Getting Started · My Routers · Playground · Analyze · Create Router
Downstream: Gen AI Tool Catalog Playground (pattern fork)
Executive Summary (Inference Router · Public Preview)

Large language model adoption has outpaced how teams choose which model to call. A single flagship model for every prompt is expensive, slow, and brittle. The Inference Router evaluates each request and routes it to the model and task policy that best fit the workload.

The work

I designed the end-to-end product experience: a four-tab IA, a hero-led Getting Started catalog, lifecycle management in My Routers, a dual-pane Playground for side-by-side comparison with routing metadata, and an Analyze tab for operational insight.

The timeline

The work ran from February through April 2026 as part of Inference Hub's Public Preview. Iterative design and build sessions hardened the naming (preset routers), comparison affordances, sub-cent cost display, and documentation patterns, and led to downstream reuse in the Gen AI Tool Catalog Playground.

The outcome

The Inference Router was the centerpiece of DigitalOcean's Deploy 2026 keynote, demoed live on stage as the flagship capability of Inference Hub's Public Preview launch.

One model for everything is overkill. Route your requests to the right model.

Inference Hub already exposed a model catalog and a standalone Model Playground. What was missing was a routing layer that developers could configure, test, and trust. Without router-specific UX, teams either hardcoded one model ID or built custom routing logic outside the platform. Neither approach was observable, and neither aligned with DigitalOcean's benchmarking and policy primitives.

Figure: What design inherited. The original routing JSON spec from backend: technically complete, with no user experience attached to it.
Figure: Before. Every prompt routes to one flagship model regardless of task.
Figure: After. The router evaluates each request and sends it to the right model.

The people feeling this most directly were platform engineers integrating Inference Hub into production apps, ML leads defining task boundaries and model pools, and developers who needed to justify a router over a single model with real cost and latency evidence. Post-launch, operators needed to see match rates, fallbacks, and token usage without digging through logs.

The core tension: routing is intelligent only if users can see the intelligence working, in the Playground before launch and in Analyze after.

That tension shaped the whole product. Every major problem had a specific design response: model sprawl got preset routers with benchmark-backed copy; opaque routing got ResponseInfo in the Playground; sub-cent cost differences got threshold-based decimal formatting; configuration complexity got an accordion task picker with inline docs; distrust of "defaults" got a rename to "preset" with documented hybrid evaluation methodology.

Inference Router shipped as a four-tab product: Getting Started, My Routers, Playground, and Analyze. The tab order maps to how a developer actually adopts a new platform capability — first you learn, then you configure, then you test, then you operate.

Getting Started

The Getting Started hero leads with the product's argument, not its features. The headline meets developers in the framing they already use, cost and reliability, before introducing any new concepts. I positioned the docs link and Public Preview terms adjacent to the hero, visible without blocking the primary scan path. Two pathway cards bridge users to Preset Routers and the Playground, each with a single CTA, so the catalog page never becomes a dead end. The preset section was originally labeled "default" and was renamed before launch.

Figure: Getting Started. Plain-language headline, preset pathways, and Public Preview terms without blocking the scan path.
Figure: Preset routers. Curated, benchmark-backed starting points. "Preset" over "default" communicates a starting point, not a system-imposed setting.

The rename from "default" to "preset" sounds small but had a long tail. "Default" read as system-imposed and unchangeable. "Preset" communicated a curated starting point. It required updates across the UI, docs copy, and marketing before launch.

Figure: Earlier iteration. Getting Started before the hero and pathway structure were finalized.
Figure: Another pass. Entry points and CTA structure still being worked out.
Figure: Before the rename. "Default routers" still in place. Feedback made clear users read "default" as immutable and system-imposed.

My Routers

The inventory needed to be scannable and make the create path obvious. Create Router lives in the page header so it's always visible, not buried in a hero. The page title matches the tab label for consistent wayfinding. Creating a router is staged to match backend concepts: router, then tasks, then models, then fallbacks. Preset tasks use an accordion and checkbox picker so task lists can grow without becoming a flat wall of options. Each task supports up to five models and a policy choice (Optimal, Cost, Speed, or Manual Ranking for custom tasks). A five-model cap prevents unbounded pools that would break policy semantics. "Learn more" links open iframe slideouts anchored to docs.digitalocean.com, keeping users in Model Studio during first-time setup.
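
The task constraints described above (at most five models per task, and Manual Ranking only for custom tasks) can be sketched as a small validation step. This is an illustration only: `TaskConfig`, `validate_task`, and the policy identifiers are hypothetical names chosen for this example, not the Inference Hub API or its real schema.

```python
from dataclasses import dataclass

# Cap and policy names come from the text; the identifiers are assumptions.
MAX_MODELS_PER_TASK = 5
PRESET_POLICIES = {"optimal", "cost", "speed"}
CUSTOM_ONLY_POLICIES = {"manual_ranking"}


@dataclass
class TaskConfig:
    """Hypothetical shape of one task inside a router."""
    name: str
    policy: str
    models: list[str]
    is_custom: bool = False


def validate_task(task: TaskConfig) -> list[str]:
    """Return human-readable problems; an empty list means the task is valid."""
    problems = []

    # Unbounded pools would break policy semantics, so enforce the cap.
    if len(task.models) > MAX_MODELS_PER_TASK:
        problems.append(
            f"{task.name}: {len(task.models)} models exceeds the cap of "
            f"{MAX_MODELS_PER_TASK}"
        )

    # Manual Ranking is only offered on custom tasks.
    if task.is_custom:
        allowed = PRESET_POLICIES | CUSTOM_ONLY_POLICIES
    else:
        allowed = PRESET_POLICIES
    if task.policy not in allowed:
        problems.append(f"{task.name}: policy '{task.policy}' is not available")

    return problems
```

Returning a list of problems rather than raising on the first one suits a staged form, where every issue can be shown inline at once.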

Figure: My Routers. Scannable list, row actions, and Create Router always visible in the header.
Figure: Router detail. Task list, model pool, and API snippet all in one view.
Figure: Create Router. Description doubles as a routing prompt; tasks and fallbacks configured in staged modals.
Figure: Add Preset Tasks. Accordion and checkbox picker scales to many task categories; reused later in the Gen AI Tool Catalog.
Figure: Edit Preset Task. Name, description, policy selection, model pool with per-model pricing, Optimal badge.
Figure: Earlier Create Router iteration. Task and policy structure before the final layout was locked.
Figure: Manual Ranking. Available for custom tasks when teams want a deterministic priority order instead of delegating to the policy engine.

Playground

The Playground is where routing becomes concrete. Symmetric panes make for a fair A/B comparison, and either side can be a router or a model. Selector labels ("Inference Routing" vs. "Model") prevent users from accidentally comparing two models when they mean to compare routing against a baseline. ResponseInfo surfaces task match, model selected, cost, and latency per turn. Cost display uses threshold logic: values above $0.001 and below $0.01 show three decimal places. At two decimals, sub-cent costs display as $0.00, which makes routing look free when it isn't and breaks cost-led evaluations. Sample prompt bubbles appear only when a Writing Preset router is selected. They reduce the friction of inventing realistic test scenarios from scratch, and they were one of the more deliberate conditional UI decisions in the project.
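
The cost threshold can be sketched as a single formatting function. This is a minimal illustration, not the shipped front-end code: `format_cost` is a hypothetical name, and the fallback to two decimals for values outside the sub-cent band (including anything at or below $0.001, which the text does not specify) is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_cost(usd: float) -> str:
    """Format a per-turn cost in US dollars for display."""
    # Go through str() so 0.004 becomes Decimal('0.004'), not a binary artifact.
    amount = Decimal(str(usd))

    # Band from the text: above $0.001 and below $0.01 gets three decimals.
    if Decimal("0.001") < amount < Decimal("0.01"):
        step = Decimal("0.001")
    else:
        # Assumption: everything else uses conventional two-decimal currency.
        step = Decimal("0.01")

    return f"${amount.quantize(step, rounding=ROUND_HALF_UP)}"


print(format_cost(0.004))  # $0.004
print(format_cost(0.12))   # $0.12
print(format_cost(1.5))    # $1.50
```

With plain two-decimal formatting the first call would print $0.00, which is the "routing looks free" failure the threshold is there to prevent.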

Figure: Playground. Routing decisions made visible: task match, model selected, cost, and latency side by side with the qualitative response.
Figure: Dual pane. Either side can be a router or a model; symmetric layout prevents implied A/B bias.
Figure: ResponseInfo. Task matched, model used, cost, latency, tokens, fallback status.

Analyze

Analyze closes the feedback loop after launch. The charts show request volume, latency distribution, task and model match rates, and fallback frequency. The log table is filterable by router and time range and shows per-request detail: matched task, model, latency, and whether it fell back.

Those two views answer different questions. The charts tell you if the router is behaving as expected in aggregate. The logs tell you which specific request went sideways.

The hardest design decision was figuring out what not to show. Routing generates a lot of observability data, but what operators actually need post-launch is pretty specific: are task policies matching as configured, is the fallback rate reasonable, and is latency in an acceptable range. Everything else is noise until something breaks. I kept the first version tight around those three.
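
Those three questions map onto a small aggregation over the per-request log fields listed earlier (matched task, model, latency, fallback). The sketch below is illustrative only: `RouterLogEntry` and `summarize` are hypothetical names, the p95 latency statistic is my choice of summary rather than something the product specifies, and representing "no task matched" as `None` is an assumption.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class RouterLogEntry:
    """Hypothetical per-request log record."""
    matched_task: Optional[str]  # None when no task matched (assumption)
    model: str
    latency_ms: float
    fell_back: bool


def summarize(entries: list[RouterLogEntry]) -> dict:
    """Aggregate task match rates, fallback rate, and p95 latency."""
    total = len(entries)
    if total == 0:
        return {"task_match_rate": {}, "fallback_rate": 0.0, "p95_latency_ms": None}

    # Share of all requests that matched each task.
    task_counts = Counter(
        e.matched_task for e in entries if e.matched_task is not None
    )

    # Nearest-rank percentile: the smallest value covering 95% of requests.
    latencies = sorted(e.latency_ms for e in entries)
    p95_index = math.ceil(0.95 * total) - 1

    return {
        "task_match_rate": {task: n / total for task, n in task_counts.items()},
        "fallback_rate": sum(e.fell_back for e in entries) / total,
        "p95_latency_ms": latencies[p95_index],
    }
```

The charts correspond to the aggregate dictionary; the log table corresponds to the raw entries, which is where a single request that went sideways can be found.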

Figure: Earlier iteration. Analyze and Manage explored as a combined view before the tab structure was finalized.
Figure: Analyze. Request volume, latency, and task/model distribution charts.
Figure: Router logs. Filterable by router and time range, with per-request routing detail.

Design and implementation proceeded in tight loops across February–April 2026.

| Phase | Focus | Outcome |
| Feb 2026 - Foundation | Four-tab IA, Getting Started hero, catalog cards | Users can discover preset routers and understand Public Preview scope |
| Feb–Mar 2026 - Naming & trust | Rename default to preset routers; benchmark copy | Reduces confusion with platform "defaults"; reinforces evaluation story |
| Mar 2026 - Playground | Dual-pane comparison, selectors, ResponseInfo | Developers see task match and economic delta vs. single model |
| Mar 2026 - Create flow | Add Preset Task modal, policies, 5-model cap | Configuration matches backend capabilities without overwhelm |
| Apr 2026 - Writing path | Sample prompt bubbles for Writing Preset | Faster time to first meaningful comparison |
| Apr 2026 - Polish | Hero media, CTA targets, cost decimals | Getting Started ready for keynote; trustworthy comparison numbers |
| Apr 2026 - Reuse | Tool Catalog Playground fork in ui-gen-ai | Proves router playground patterns generalize to tool-calling |

Deploy 2026 keynote

The Inference Router was the headline feature of DigitalOcean's Deploy 2026 keynote, demoed live on stage as the defining capability of the Inference Hub Public Preview. The live demo followed the Getting Started to Playground path exactly, showing a preset router comparison before the audience had time to wonder what routing meant. That wasn't luck. It was the IA working as intended.

Figure: Deploy 2026 keynote. Inference Router demoed live as the headline feature of Inference Hub's Public Preview.

3 months, zero to shipped: from first design session to Public Preview in the production codebase.
4 product surfaces: Getting Started, My Routers, Playground, and Analyze, one cohesive product.
2 teams using the patterns: Playground shell forked into Gen AI Tool Catalog; two product teams now ship from the same patterns.

Pattern leverage

The Playground comparison shell, ResponseInfo strip, selector grouping, and cost formatting were forked into the Gen AI Tool Catalog Playground, giving agent-platform builders a familiar evaluation surface and reducing duplicate UX work across Inference Hub and the Agent Platform.

The hardest design problem on this project wasn't any single screen. It was making an invisible process visible. Routing happens in milliseconds in a backend system. If the UI doesn't show users what happened and why, it might as well not be there. ResponseInfo and the comparison tabs aren't ornaments. They're what makes routing legible.

Naming mattered more than I expected. "Default router" implied immutability and lack of rigor. "Preset" communicated curation and starting point without suggesting lock-in. That single rename aligned the UI, docs, and marketing around one consistent term, a small decision with a disproportionate effect on how the product was understood.

Two things I won't underestimate on another project: cost formatting and docs integration. Sub-cent precision is a UX requirement. At two decimals, routing looks free when it isn't, which breaks cost-led evaluations. Iframe slideouts anchored to real documentation sections are slower to build but meaningfully better than a Learn More link that goes nowhere useful.

I'd also prototype in production code earlier. Layout issues in the dual-pane Playground (height constraints, hover clipping, tab style conflicts) only showed up under real component CSS. Figma alone would have missed them.

Full Figma files are available for this project and can be shared on request.