nexusflow Developer Docs

One-stop access to models including Qwen, DeepSeek, GLM, Kimi, and HappyHorse. Supports three protocols (OpenAI Chat, Anthropic Messages, Responses API) with unified billing, keys, and monitoring.

Model Services

Unified model list, pricing, and capabilities

Three-Protocol Access

OpenAI Chat / Anthropic Messages / Responses API compatible endpoints

Monitoring & Evaluation

Monitoring page, error codes, and performance metrics

High-Concurrency Readiness

Rate limits, queues, task pipelines, and capacity scaling

Three Compatible Protocols

OpenAI-compatible

/v1/chat/completions

Anthropic Messages

/v1/messages

Ideal for reusing the Anthropic SDK, Claude Code-style clients, and the Messages request format.
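As a rough illustration of the Messages request format, the sketch below builds a `POST /v1/messages` request with the Python standard library only. It assumes the endpoint lives on the same host as the Quick Example, that it accepts the `x-api-key` and `anthropic-version` headers used by the Anthropic API, and that `qwen3.5-plus` is reachable through this protocol. None of these are confirmed on this page. `build_messages_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: same host as the Quick Example, without the /v1 suffix.
BASE_URL = "https://nexusflow.hk"


def build_messages_request(api_key: str, model: str, prompt: str,
                           max_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) an Anthropic Messages style request."""
    body = {
        "model": model,
        "max_tokens": max_tokens,  # required field in the Messages format
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            # Assumption: header names follow the Anthropic API convention.
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


# To send it, pass the request to urllib.request.urlopen:
#   req = build_messages_request("sk-air-your-key", "qwen3.5-plus", "Hello!")
#   with urllib.request.urlopen(req) as resp:
#       print(json.loads(resp.read()))
```

Keeping request construction separate from sending makes the payload easy to inspect before any traffic reaches the endpoint.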

Responses API

/v1/responses

Built-in tools like web search and code interpreter, simplified context management, and multi-turn support via previous_response_id.
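Multi-turn chaining through `previous_response_id` can be sketched as follows. The field names `model`, `input`, and `previous_response_id` follow the OpenAI Responses API format. Whether nexusflow accepts exactly this shape is an assumption, and `build_responses_payload` is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def build_responses_payload(model: str, user_input: str,
                            previous_response_id: Optional[str] = None
                            ) -> Dict[str, Any]:
    """Build the JSON body for POST /v1/responses.

    On the first turn, omit previous_response_id. On later turns, pass the
    `id` returned by the previous response so the server can restore the
    conversation context instead of the client resending the full history.
    """
    payload: Dict[str, Any] = {"model": model, "input": user_input}
    if previous_response_id is not None:
        payload["previous_response_id"] = previous_response_id
    return payload


# First turn: no prior context.
first = build_responses_payload("qwen3.5-plus", "Name one prime number.")

# Second turn: "resp_123" stands in for the id of the first response.
second = build_responses_payload("qwen3.5-plus", "And the next one?",
                                 previous_response_id="resp_123")
```

With the official OpenAI SDK pointed at the nexusflow `base_url`, the same fields would be passed as keyword arguments to `client.responses.create(...)`.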

Quick Start

From your first chat request to async task integration

Models Overview

Browse model services, dedicated pages, and pricing

API Reference

OpenAI Chat, Anthropic Messages, and Responses compatible protocols

Monitoring & Rate Limits

Quotas, queues, and observability for production traffic

Popular Models

Qwen3.5 Omni Plus (New)

Alibaba Cloud · Flagship omni-modal model with audio/video input and output

HappyHorse 1.0

Alibaba · Video generation page with task pipeline integration guide

Qwen3 Max (Recommended)

Alibaba Cloud · Flagship reasoning for complex tasks

Qwen3.5 Plus (Popular)

Alibaba Cloud · Cost-effective choice, balanced and efficient

DeepSeek R1

DeepSeek · Open-source reasoning with coding expertise

API Endpoints

POST /v1/chat/completions - Chat Completions
POST /v1/messages - Anthropic Messages compatible
POST /v1/responses - Responses API (built-in tools, multi-turn context)
POST /v1/embeddings - Text Embedding
POST /v1/tasks - Image / video async task submission
GET /v1/tasks/:id - Async task polling

Production Traffic Recommendations

Use sync endpoint for chat

Prefer `/v1/chat/completions` for chat and reasoning models to avoid unnecessary polling complexity.

Use async tasks for media

Route image and video through `/v1/tasks`, using task status to handle high latency and peak loads.
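The submit-then-poll pattern for `/v1/tasks` might look like the sketch below. The task response schema is not documented on this page, so the `status` field and the state names `succeeded` and `failed` are placeholders, and `poll_task` is a hypothetical helper. The HTTP call is injected as `fetch_status` so the loop stays independent of any particular client library.

```python
import time
from typing import Any, Callable, Dict

# Assumption: placeholder terminal state names; check the task docs.
TERMINAL_STATES = {"succeeded", "failed"}


def poll_task(fetch_status: Callable[[str], Dict[str, Any]],
              task_id: str,
              interval_s: float = 2.0,
              timeout_s: float = 300.0,
              sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll GET /v1/tasks/:id until the task reaches a terminal state.

    fetch_status(task_id) should perform the GET request and return the
    decoded JSON body as a dict.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

A fixed polling interval keeps the example short. For long-running video jobs, a longer or gradually increasing interval would reduce the number of status requests.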

Check limits & monitoring before launch

Confirm peak-request strategy on the rate limits, error codes, and monitoring pages to avoid discovering bottlenecks during traffic spikes.
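One common peak-request strategy is to retry rate-limited calls with exponential backoff and jitter, sketched below. This is a generic client-side pattern, not a documented nexusflow mechanism. `RateLimited` is a hypothetical exception that your own code would raise when the API signals a rate limit (commonly HTTP 429), and the delay values are illustrative.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical: raise this when the API reports a rate limit."""


def call_with_backoff(fn: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay_s: float = 0.5,
                      max_delay_s: float = 8.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Double the delay on each attempt, capped at max_delay_s.
            delay = min(max_delay_s, base_delay_s * (2 ** attempt))
            # Jitter spreads out retries from clients that failed together.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_attempts must be at least 1")
```

Tune `max_attempts` and the delay bounds against the limits published on the rate limits page rather than the defaults shown here.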

Quick Example

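from openai import OpenAI

# Point the standard OpenAI client at the nexusflow endpoint.
client = OpenAI(
    api_key="sk-air-your-key",  # replace with your own key
    base_url="https://nexusflow.hk/v1",
)

# Send a single-turn chat request through /v1/chat/completions.
response = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

# Print the assistant's reply text.
print(response.choices[0].message.content)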

Use the standard OpenAI SDK and change only the base_url to connect to nexusflow.