
Provider Routing

Smart routing and automatic fault tolerance across multiple providers for the same model

NexusFlow configures multiple provider endpoints for the same model. For each request, the system selects the best endpoint based on real-time performance and switches automatically when a provider is unavailable. The entire process is transparent to the caller.

Smart Routing

When you make an API request, the system filters all matching provider endpoints based on model ID and request protocol, then selects the optimal endpoint considering stability, latency, and other factors. When an endpoint encounters issues, the system automatically lowers its priority; upon recovery, it returns to normal priority without manual intervention.

┌──────────────┐     ┌─────────────────┐     ┌──────────────────┐
│  Client      │────▶│  NexusFlow      │────▶│  Provider A (Pri)│
│  POST /v1/   │     │  Smart Routing  │     │  DashScope       │
│  chat/compl  │     │                 │     └──────────────────┘
└──────────────┘     │  Health Check   │     ┌──────────────────┐
                     │  Latency Monitor│────▶│  Provider B (Bkp)│
                     │  Auto Switch    │     │  (Extensible)    │
                     └─────────────────┘     └──────────────────┘

Health Detection
Periodically checks upstream endpoint status; after consecutive failures, an endpoint is automatically marked as degraded or down.
Weighted Selection
Balances load according to endpoint priority and weight; healthy endpoints are preferred.
Transparent Recovery
A faulty endpoint automatically returns to normal priority once it recovers; no manual intervention is needed.
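
The filter-then-select behavior described above can be sketched roughly as follows. This is a minimal illustration, not NexusFlow's actual implementation: the `Endpoint` fields, the `select_endpoint` function, and the rule that healthy endpoints are always tried before degraded ones are assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    # All field names are hypothetical, chosen for this sketch only.
    name: str
    model_id: str
    protocol: str  # e.g. "openai" or "anthropic"
    weight: float = 1.0
    status: str = "healthy"  # "healthy", "degraded", or "down"


def select_endpoint(endpoints, model_id, protocol, rng=random):
    """Pick one endpoint for a request, or None if nothing is usable."""
    # Step 1: keep only endpoints that serve this model over this protocol.
    matching = [
        e for e in endpoints
        if e.model_id == model_id and e.protocol == protocol
    ]
    # Step 2: prefer healthy endpoints; fall back to degraded ones.
    for status in ("healthy", "degraded"):
        tier = [e for e in matching if e.status == status]
        if tier:
            # Step 3: weighted random choice within the chosen tier.
            return rng.choices(tier, weights=[e.weight for e in tier], k=1)[0]
    return None  # every matching endpoint is down, or none matched
```

Endpoints marked `down` never appear in either tier, so they are skipped without any special case.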

Automatic Fault Tolerance

When the selected endpoint returns a server error (5xx) or a rate-limit error (429), the system automatically retries on the next available endpoint until the request succeeds or all endpoints have been tried. Other 4xx client errors (for example, 400) do not trigger retries.
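
As a rough sketch of this retry rule, assuming the candidate endpoints are already ordered by preference: `call_with_failover` and the `send` callable are hypothetical names for illustration, not part of any NexusFlow API.

```python
def is_retryable(status_code):
    """5xx server errors and 429 rate limits trigger a switch; other codes do not."""
    return status_code == 429 or 500 <= status_code <= 599


def call_with_failover(endpoints, send):
    """Try endpoints in order until one succeeds or all have been tried.

    `send` is a hypothetical callable: send(endpoint) -> (status_code, body).
    Returns the last (status_code, body) seen, or None if `endpoints` is empty.
    """
    last = None
    for endpoint in endpoints:
        last = send(endpoint)
        status_code, _body = last
        if not is_retryable(status_code):
            # Success, or a client error that another endpoint would not fix.
            return last
    return last  # every endpoint failed; surface the final error
```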

Endpoint Status   Condition                   Behavior
healthy           Consecutive successes       Normal use, highest priority
degraded          Consecutive failures >= 3   Lower priority, still selectable
down              Consecutive failures >= 5   Skip this endpoint, no longer tried
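
The thresholds in the table can be modeled with a small counter. In the sketch below, `EndpointHealth` is a hypothetical helper, and the rule that a single success resets the failure streak is an assumption; the table does not say how many successes are needed to return to healthy.

```python
class EndpointHealth:
    """Tracks consecutive failures for one endpoint (hypothetical helper)."""

    DEGRADED_AFTER = 3  # thresholds taken from the status table
    DOWN_AFTER = 5

    def __init__(self):
        self.consecutive_failures = 0

    def record(self, success):
        # Assumption: one success is enough to reset the failure streak.
        if success:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    @property
    def status(self):
        if self.consecutive_failures >= self.DOWN_AFTER:
            return "down"
        if self.consecutive_failures >= self.DEGRADED_AFTER:
            return "degraded"
        return "healthy"
```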

Important Notes

Streaming Requests
If the endpoint fails before it starts returning data, the system can seamlessly switch to the next endpoint. However, once data has started being sent to the client, switching is no longer possible, and any error is forwarded directly to the client.

We recommend implementing streaming request error handling logic on the client side to handle stream interruptions.
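
One possible shape for such client-side handling is sketched below. It is library-agnostic: `open_stream` is a hypothetical zero-argument callable that issues the request and returns an iterator of text chunks, standing in for whatever HTTP or SDK client you use. The restart-from-scratch policy is also an assumption, not a NexusFlow recommendation.

```python
def collect_stream(open_stream, max_attempts=2):
    """Consume a streamed response, restarting from scratch if it breaks.

    `open_stream` is a hypothetical callable returning an iterator of chunks.
    """
    last_error = None
    for _ in range(max_attempts):
        chunks = []
        try:
            for chunk in open_stream():
                chunks.append(chunk)
            return "".join(chunks)
        except Exception as exc:  # narrow this to your client's error types
            # Partial output is discarded, since a new attempt starts over.
            last_error = exc
    raise RuntimeError("stream failed after retries") from last_error
```

Restarting repeats the whole request, so it suits cases where a partial answer is useless. If partial output is still valuable, surfacing the collected chunks together with the error may be the better choice.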

Protocol Mismatch
Fault tolerance depends on how many endpoints a model has available under the current request protocol, not on its total number of providers. For example, a model may have two providers, but if only one of them supports the Anthropic protocol, a request made via the Anthropic protocol has no fault tolerance.

If fault tolerance is needed, we recommend using the protocol with the broadest model support (typically the OpenAI protocol), or combining this feature with Model Fallback to configure backup models.
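
To make the counting concrete, the sketch below computes how many spare endpoints a model has per protocol. The `(model_id, protocol)` pair format and the `fallback_capacity` name are hypothetical, used only for illustration.

```python
from collections import Counter


def fallback_capacity(endpoints, model_id):
    """Return {protocol: number of spare endpoints} for one model.

    `endpoints` is a list of (model_id, protocol) pairs, a hypothetical shape.
    """
    counts = Counter(p for m, p in endpoints if m == model_id)
    # One endpoint serves the request; the rest are available for failover.
    return {protocol: n - 1 for protocol, n in counts.items()}


endpoints = [
    ("model-x", "openai"),
    ("model-x", "openai"),
    ("model-x", "anthropic"),
]
print(fallback_capacity(endpoints, "model-x"))
# {'openai': 1, 'anthropic': 0}
```

A value of 0 means a request over that protocol has nowhere to fail over to, even though the model has more than one endpoint in total.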
