nexusflow

Qwen (Tongyi) Series

by Alibaba Cloud

Qwen (Tongyi) is Alibaba Cloud's self-developed large language model family. This page currently showcases the Qwen3.6 and Qwen3.5 series, which cover long-text processing, function calling, code generation, and complex reasoning for common production scenarios. Text models can be integrated through three public protocols: OpenAI Chat, Anthropic Messages, and Gemini-compatible.

Integration Protocols

Core Advantages

🇨🇳
Chinese Optimization
Native-level understanding
📚
Long Text
Million-token context
⚡
High Cost-Efficiency
Affordable pricing
🔗
Full Ecosystem
Seamless Alibaba Cloud integration

Available Models

Qwen3.7 Max

NEW
qwen3.7-max
Flagship, Latest, Thinking Mode, Agent

Flagship model of the Qwen 3.7 generation, designed for the AI agent era, with comprehensive improvements in coding, office tasks, and long-running autonomous execution. Supports switchable thinking mode, function calling, and web search, with a million-token context window.

Context Window: 1,000,000
Max Output: 65,536
Input Price: ¥12/M
Output Price: ¥36/M
Function Calling, Thinking Mode, Web Search, Million Context
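Function calling over the OpenAI Chat protocol is usually done by passing a `tools` list with the request. The following is a minimal sketch under assumptions rather than the gateway's documented behavior: the `get_weather` tool and both helper functions are hypothetical names invented for illustration, and it assumes the endpoint accepts the standard OpenAI `tools` and `tool_choice` fields unchanged.

```python
import json

# Hypothetical tool definition for illustration; not part of the original page.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}


def build_tool_request(question, model="qwen3.7-max"):
    """Return keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }


def parse_tool_arguments(arguments_json):
    """Decode the JSON string carried in tool_call.function.arguments."""
    return json.loads(arguments_json)


request = build_tool_request("What's the weather in Hangzhou?")
print(json.dumps(request, ensure_ascii=False, indent=2))
```

With the OpenAI Python SDK, the request would be sent as `client.chat.completions.create(**request)`, and any tool calls would appear in `response.choices[0].message.tool_calls`, each with a JSON-encoded `function.arguments` string that `parse_tool_arguments` can decode.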

Qwen3.6 Max Preview

NEW
qwen3.6-max-preview
Flagship, Thinking Mode

The strongest preview model in the Qwen3.6 series, suited for complex reasoning, multi-step code generation, and tool-based tasks.

Context Window: 262,144
Max Output: 65,536
Input Price: ¥9/M
Output Price: ¥54/M
Function Calling, Complex Reasoning, Code Generation

Qwen3.6 Plus

NEW
qwen3.6-plus
Recommended, Balanced

Balanced flagship model that supports a million-token context window, function calling, and built-in tools. Suited for most production scenarios.

Context Window: 1,000,000
Max Output: 65,536
Input Price: ¥2/M
Output Price: ¥12/M
Image Understanding, Function Calling, Code Generation, Million Context

Qwen3.5 Plus

NEW
qwen3.5-plus
Recommended, Balanced

Balanced-performance model suited for most production scenarios, with outstanding Chinese capability and fast response speed.

Context Window: 1,000,000
Max Output: 65,536
Input Price: ¥0.8/M
Output Price: ¥4.8/M
Image Understanding, Function Calling, Code Generation

Qwen3.5 Flash

qwen3.5-flash
Fast, Budget

High-speed model suited for latency-sensitive scenarios, and highly cost-efficient.

Context Window: 1,000,000
Max Output: 65,536
Input Price: ¥0.2/M
Output Price: ¥2/M
Function Calling, Low Cost
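To compare the models on cost, the listed prices can be turned into a rough per-request estimate. This sketch assumes that "/M" means per 1,000,000 tokens and that pricing is flat regardless of prompt length; the page states neither explicitly. The `estimate_cost` helper is a hypothetical name used only for illustration.

```python
# Prices in CNY per million tokens (input, output), copied from the model cards.
# Assumption: "/M" means per 1,000,000 tokens, with no tiered pricing.
PRICES = {
    "qwen3.7-max": (12.0, 36.0),
    "qwen3.6-max-preview": (9.0, 54.0),
    "qwen3.6-plus": (2.0, 12.0),
    "qwen3.5-plus": (0.8, 4.8),
    "qwen3.5-flash": (0.2, 2.0),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the cost in CNY of one request for the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: a 200,000-token document summarized into 4,096 output tokens.
cost = estimate_cost("qwen3.6-plus", 200_000, 4_096)
print(f"¥{cost:.4f}")  # prints ¥0.4492
```

Under these assumptions the same request on `qwen3.5-flash` would cost (200,000 × 0.2 + 4,096 × 2) / 1,000,000 ≈ ¥0.0482, which illustrates how much the model choice dominates cost for long inputs.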

Usage Example

from openai import OpenAI

client = OpenAI(
    api_key="sk-air-your-key",
    base_url="https://nexusflow.vip/v1",
)

# Use Qwen to handle long text
response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[
        {"role": "system", "content": "You are a professional document analysis assistant."},
        {"role": "user", "content": "Please summarize the main points of the following long document content... (you can input very long text here)"}
    ],
    max_tokens=4096,
)

print(response.choices[0].message.content)
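For long documents, a streamed response lets output appear as it is generated instead of after the whole completion finishes. The sketch below assumes the gateway supports the OpenAI SDK's `stream=True` option, which the page does not confirm. `collect_stream` is a hypothetical helper, and the fake chunks only imitate the shape of the SDK's streamed objects so that the block runs without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Print text deltas as they arrive and return the concatenated text."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks carry no choices (e.g. usage-only chunks)
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
            print(delta, end="", flush=True)
    print()
    return "".join(parts)


def _fake_chunk(text):
    """Build an object shaped like a streamed chunk (for local demonstration only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Simulated stream; a real one would come from the API call described below.
fake_stream = [
    _fake_chunk("Hello"),
    _fake_chunk(None),
    SimpleNamespace(choices=[]),
    _fake_chunk(", Qwen"),
]
text = collect_stream(fake_stream)
```

With a real client, the stream would be obtained by adding `stream=True` to the `client.chat.completions.create(...)` call from the example and passing the returned iterator to `collect_stream`.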
