Feature

Provider Routing

Smart multi-provider routing and automatic failover for the same model

nexusflow configures multiple provider endpoints for the same model. The system selects the best endpoint based on each one's real-time performance and automatically switches to another when a provider is unavailable, all transparently to the caller.

Smart Routing

When you make an API request, the system filters the provider endpoints that match the model ID and request protocol, then selects the best one based on factors such as stability and latency. When an endpoint starts failing, its priority is lowered automatically; once it recovers, it returns to normal priority with no manual intervention.

┌──────────────┐     ┌─────────────────┐     ┌──────────────────┐
│  Client      │────▶│  nexusflow      │────▶│ Provider A (main)│
│  POST /v1/   │     │  Smart Routing  │     │  Qwen            │
│  chat/compl  │     │                 │     └──────────────────┘
└──────────────┘     │  Health Checks  │     ┌──────────────────┐
                     │  Latency Watch  │────▶│  Provider B (bak)│
                     │  Auto Failover  │     │  (extensible)    │
                     └─────────────────┘     └──────────────────┘
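
The filter-then-select step described above can be sketched roughly as follows. This is an illustration only, not nexusflow's actual implementation: the `Endpoint` fields, the `select_endpoint` function, and the scoring rule (highest recent success rate first, lowest latency as the tie-breaker) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Endpoint:
    # Hypothetical endpoint record; field names are illustrative only.
    provider: str
    model_id: str
    protocol: str        # e.g. "openai" or "anthropic"
    success_rate: float  # recent share of successful requests, 0.0 to 1.0
    latency_ms: float    # recent average latency


def select_endpoint(
    endpoints: list[Endpoint], model_id: str, protocol: str
) -> Optional[Endpoint]:
    """Filter by model ID and protocol, then pick the most stable, fastest endpoint."""
    candidates = [
        e for e in endpoints if e.model_id == model_id and e.protocol == protocol
    ]
    if not candidates:
        return None
    # Assumed scoring: prefer higher success rate, break ties with lower latency.
    return min(candidates, key=lambda e: (-e.success_rate, e.latency_ms))


endpoints = [
    Endpoint("provider-a", "example-model", "openai", 0.99, 420.0),
    Endpoint("provider-b", "example-model", "openai", 0.99, 310.0),
    Endpoint("provider-b", "example-model", "anthropic", 0.95, 500.0),
]
best = select_endpoint(endpoints, "example-model", "openai")
print(best.provider if best else "no endpoint")
```

Both OpenAI-protocol endpoints have the same success rate here, so the lower-latency one is chosen. A real router would likely fold in more signals than these two.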

Health Checks

Upstream endpoint status is checked periodically; endpoints with consecutive failures are automatically marked degraded or down.

Weighted Selection

Load-balances by combining endpoint priority and weight, preferring healthy endpoints.

Transparent Recovery

Failed endpoints return to normal priority once recovered, with no manual intervention.
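
As a rough sketch of the weighted selection described above, one plausible scheme is to keep only healthy endpoints, take the best priority tier, and draw from that tier in proportion to weight. The `Candidate` record, the `pick_weighted` function, and the meaning given to `priority` (lower number is preferred) and `weight` are assumptions for illustration; the source does not specify the exact formula, and the handling of degraded endpoints is left out for brevity.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record; the field names are illustrative only.
    name: str
    priority: int  # assumed: lower number = preferred tier
    weight: int    # assumed: relative share of traffic within a tier
    healthy: bool


def pick_weighted(candidates: list[Candidate], rng: random.Random) -> str:
    """Pick from the best-priority healthy tier, weighted by `weight`."""
    healthy = [c for c in candidates if c.healthy]
    if not healthy:
        raise LookupError("no healthy endpoint available")
    best_priority = min(c.priority for c in healthy)
    tier = [c for c in healthy if c.priority == best_priority]
    names = [c.name for c in tier]
    weights = [c.weight for c in tier]
    return rng.choices(names, weights=weights, k=1)[0]


candidates = [
    Candidate("provider-a", priority=1, weight=3, healthy=True),
    Candidate("provider-b", priority=1, weight=1, healthy=True),
    Candidate("provider-c", priority=0, weight=5, healthy=False),  # unhealthy, skipped
]
rng = random.Random(0)  # seeded so the demo is repeatable
picks = [pick_weighted(candidates, rng) for _ in range(4000)]
print({c.name: picks.count(c.name) for c in candidates})
```

With weights 3 and 1, `provider-a` should receive roughly three quarters of the draws, and the unhealthy `provider-c` is never picked even though it sits in the best tier.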

Automatic Failover

When the selected endpoint returns a server error (5xx) or a rate-limit response (429), the system automatically retries the request on the next available endpoint until the request succeeds or all endpoints have been tried. Other 4xx client errors do not trigger retries.
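
The retry rule above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the actual nexusflow code: `send_with_failover`, `is_retryable`, and the `send` callback (which stands in for the real upstream call and returns a status code and body) are hypothetical names.

```python
from typing import Callable


def is_retryable(status: int) -> bool:
    """Server errors (5xx) and rate limits (429) move on to the next endpoint."""
    return status == 429 or 500 <= status <= 599


def send_with_failover(
    endpoints: list[str],
    send: Callable[[str], tuple[int, str]],
) -> tuple[int, str, list[str]]:
    """Try endpoints in order; return (status, body, endpoints_tried)."""
    if not endpoints:
        raise ValueError("no endpoints configured")
    tried: list[str] = []
    status, body = 0, ""
    for endpoint in endpoints:
        tried.append(endpoint)
        status, body = send(endpoint)
        if not is_retryable(status):
            # Success, or a client error that retrying would not fix.
            return status, body, tried
    # Every endpoint was tried; surface the last error to the caller.
    return status, body, tried


# Fake upstream responses, purely for demonstration.
responses = {
    "provider-a": (503, "unavailable"),
    "provider-b": (429, "rate limited"),
    "provider-c": (200, "ok"),
}
status, body, tried = send_with_failover(list(responses), lambda ep: responses[ep])
print(status, tried)
```

In this demo the first two endpoints return retryable errors, so the request ends up on `provider-c`. A 400 from the first endpoint would be returned immediately without trying the others.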

| Endpoint Status | Condition | Behavior |
| --- | --- | --- |
| `healthy` | Consecutive successes | Used normally, highest priority |
| `degraded` | 3 or 4 consecutive failures | Lower priority, still selectable |
| `down` | 5 or more consecutive failures | Skipped, no longer tried |
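
The thresholds in the table above can be modeled with a small counter. This is a sketch only: the `HealthTracker` class is hypothetical, and it assumes that a single success resets the failure count, which the source does not state (it only says "consecutive successes").

```python
DEGRADED_AFTER = 3  # consecutive failures before an endpoint is degraded
DOWN_AFTER = 5      # consecutive failures before an endpoint is down


class HealthTracker:
    """Tracks consecutive failures for one endpoint (illustrative only)."""

    def __init__(self) -> None:
        self.consecutive_failures = 0

    @property
    def status(self) -> str:
        if self.consecutive_failures >= DOWN_AFTER:
            return "down"
        if self.consecutive_failures >= DEGRADED_AFTER:
            return "degraded"
        return "healthy"

    def record(self, ok: bool) -> str:
        # Assumption: one success is enough to reset the counter.
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return self.status


tracker = HealthTracker()
history = [tracker.record(ok=False) for _ in range(5)]
print(history)
print(tracker.record(ok=True))
```

The first two failures leave the endpoint `healthy`, the third and fourth mark it `degraded`, and the fifth marks it `down`; the final success returns it to `healthy` under the reset assumption.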

Notes

Streaming Requests

If an endpoint fails before it starts returning data, the system can seamlessly switch to the next one. But once data has begun streaming to the client, the endpoint can no longer be switched, and the error is passed directly to the client.

We recommend implementing error handling for streaming requests on the client to handle stream interruptions.
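
One possible shape for that client-side handling is to keep whatever text arrived before the interruption and report whether the stream completed. The sketch below uses no real SDK: `StreamInterrupted`, `consume_stream`, and `flaky_stream` are hypothetical names, and a real client would catch whatever exception its HTTP or SDK layer raises.

```python
from typing import Iterable, Iterator


class StreamInterrupted(Exception):
    """Hypothetical error standing in for a mid-stream connection failure."""


def consume_stream(chunks: Iterable[str]) -> tuple[str, bool]:
    """Collect streamed chunks; return (text_received, completed)."""
    parts: list[str] = []
    try:
        for chunk in chunks:
            parts.append(chunk)
    except StreamInterrupted:
        # Data already reached the client, so no failover is possible here.
        return "".join(parts), False
    return "".join(parts), True


def flaky_stream() -> Iterator[str]:
    # Simulated upstream that fails after two chunks.
    yield "Hello, "
    yield "wor"
    raise StreamInterrupted("upstream closed the connection")


text, completed = consume_stream(flaky_stream())
print(repr(text), completed)
```

With the partial text and the `completed` flag in hand, the caller can decide whether to show the partial output, retry the whole request, or report the failure.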

Protocol Mismatch

Failover capability depends on how many endpoints are available for the model under the current request protocol, not the total number of providers. For example, if a model has two providers but only one supports the Anthropic protocol, requests via the Anthropic protocol cannot fail over.

If you need failover guarantees, use the protocol the model supports most widely (usually the OpenAI protocol), or configure fallback models with Model Fallback.
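
To make the protocol point concrete, the following sketch counts failover candidates per protocol for one model. The `providers` mapping and both helper functions are hypothetical and only mirror the two-provider example given above.

```python
# Hypothetical configuration: provider name -> protocols it supports for one model.
providers = {
    "provider-a": {"openai", "anthropic"},
    "provider-b": {"openai"},
}


def endpoints_for(protocol: str) -> list[str]:
    """Providers that can serve the model under the given protocol."""
    return sorted(name for name, protocols in providers.items() if protocol in protocols)


def can_fail_over(protocol: str) -> bool:
    """Failover needs at least two endpoints under the same protocol."""
    return len(endpoints_for(protocol)) >= 2


for protocol in ("openai", "anthropic"):
    print(protocol, endpoints_for(protocol), can_fail_over(protocol))
```

Although the model has two providers, only the OpenAI protocol has two endpoints to choose from, so only requests made through it can fail over.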

Related Docs

Model Fallback

Automatically switch to a backup model when all providers fail

Multi-Protocol Support

Learn which protocol fits each scenario