nexusflow
Online
API Reference

Parameters Matrix

This page lists parameters based on backend actual forwarding and protocol conversion logic. Text-type models support OpenAI, Anthropic, and Gemini protocols; non-text models use image, audio, embedding, or async task interfaces based on model capabilities.

OpenAI Chat Completions

Parameter
Type/Map
Status
Description
model
string
Required
Model ID. Text, reasoning, multimodal, coding, and professional models support Chat Completions.
messages
array
Required
Chat message array, passed in sequential order: system, user, assistant, tool.
messages[].role
string
Required
system / user / assistant / tool. The tool role is used for passing back tool execution results.
messages[].content
string | array
Required
Text can be passed directly as a string; multimodal input uses a content block array.
messages[].content[].type
string
Multimodal
Common values include text / image_url; video, input_audio, and other Bailian extension content blocks depend on the specific model.
messages[].content[].text
string
Multimodal
Text content when type=text.
messages[].content[].image_url.url
string
Multimodal
Image URL or data URL; requires the model to support vision understanding.
stream
boolean
Optional
Enable SSE streaming output. Recommended for long text, reasoning models, and interactive scenarios.
stream_options.include_usage
boolean
Optional
Return usage in the final streaming response chunk. Recommended when billing, statistics, or smoke checks are needed.
temperature
number
Optional
Sampling temperature. Range typically 0 to 2; higher values produce more random output.
top_p
number
Optional
Top-p sampling threshold. It is recommended not to adjust both temperature and top_p significantly at the same time.
max_tokens
integer
Optional
Maximum output token count; cannot exceed the model's maximum output.
stop
string | string[]
Optional
Stop sequences; the model stops output after hitting one.
presence_penalty
number
Optional
Presence penalty, typical range -2 to 2, increases tendency for new topics.
frequency_penalty
number
Optional
Frequency penalty, typical range -2 to 2, reduces repetition in output.
tools
array
Optional
Function calling definition array. Only models that support tool calling will reliably return tool_calls.
tools[].type
string
Tool
Fixed as function.
tools[].function.name
string
Tool
Function name. Recommended to use letters, digits, and underscores.
tools[].function.description
string
Tool
Description of the function's purpose, affects the model's accuracy in selecting tools.
tools[].function.parameters
object
Tool
JSON Schema describing the function's input parameters.
tool_choice
string | object
Optional
Supports auto / none, or specifying {type:'function', function:{name}}. For thinking mode models, forcing specific tools is not recommended.
response_format
object
Optional
Output format control. Common values: {"type":"text"} or {"type":"json_object"}.
enable_thinking
boolean
Optional
Toggle thinking mode. Only verified hybrid thinking models can be disabled; pure thinking models will ignore false and continue returning reasoning_content.
thinking_budget
integer
Optional
Limit thinking token upper bound, passed through based on model ID prefix (qwen3.7- / qwen3.6- / qwen3.5- / qwen3-).
preserve_thinking
boolean
Optional
Pass historical reasoning_content in messages back to the model. Supports qwen3.7-max, qwen3.6-max-preview, qwen3.6-plus, kimi-k2.6.
enable_search
boolean
Optional
Web search, supported by Qwen (Tongyi) text-type models (not VL / math series).
search_options
object
Optional
Web search configuration, used together with enable_search.
enable_context_caching
boolean
Optional
Enable Context Caching. Repeated prompt prefixes are automatically cached; hits are billed at 0.1x input price. Supports Qwen (Tongyi), GLM series.
seed
integer
Optional
Random seed, supported by Qwen (Tongyi) text models for pass-through.
top_k
integer
Optional
Top-K sampling, supported by Qwen (Tongyi) text models for pass-through.
logprobs
boolean
Optional
Return log probabilities, supported by Qwen (Tongyi) text models for pass-through.
repetition_penalty
number
Optional
Repetition penalty, supported by Qwen (Tongyi) text models for pass-through.
parallel_tool_calls
boolean
Optional
Parallel tool calling, supported by Qwen (Tongyi), DeepSeek, GLM, Anthropic models.

Not Yet Supported Fields

The table below lists fields that are not yet reliably forwarded through the public Chat endpoint. Production code should not depend on these.

Parameter
Type/Map
Status
Description
max_completion_tokens
integer
Not yet forwarded
Please use the currently supported max_tokens instead.

Thinking Mode Support

This section lists the actual behavior of NexusFlow's online OpenAI Chat endpoint. Support may change with upstream model versions; production code should use explicit configuration based on model ID.

Parameter
Type/Map
Status
Description
qwen3.7-max
hybrid thinking
Supports true / false
Thinking enabled by default; true returns reasoning_content; false does not. Supports thinking_budget and preserve_thinking.
qwen3.5-flash
hybrid thinking
Supports true / false
Verified: true returns reasoning_content; false does not.
qwen3-max
hybrid thinking
Supports true / false
Verified: true returns reasoning_content; false does not.
qwq-plus
pure thinking
false cannot be disabled
Verified: true/false both return reasoning_content.
qwen-math-plus
not handled as thinking toggle
Do not pass
Verified: true/false do not yet return reasoning_content.
deepseek-r1
pure thinking
false cannot be disabled
Verified: true/false both return reasoning_content.
deepseek-v3.2
hybrid thinking
Supports true / false
Verified: true returns reasoning_content; false does not.
deepseek-v4-pro
hybrid thinking
Supports true / false
Verified: true returns reasoning_content; false does not.
glm-5.1
hybrid thinking
Supports true / false
Verified: true returns reasoning_content; false does not.

Anthropic Messages Mapping

Parameter
Type/Map
Status
model
model
Model ID, mapped to OpenAI model.
system
messages[0].role=system
System prompt. Supports string or text blocks.
messages
messages
user / assistant messages are converted to OpenAI messages.
messages[].content[].text
messages[].content
Text block. Pure text blocks are merged into a string.
messages[].content[].image
image_url
Supports url or base64 source, converted to OpenAI image_url.
messages[].content[].tool_use
assistant.tool_calls
Assistant tool call result.
messages[].content[].tool_result
role=tool
Tool execution result passed back.
max_tokens
max_tokens
Maximum output tokens.
temperature
temperature
Sampling temperature.
top_p
top_p
Nucleus sampling.
stop_sequences
stop
Stop sequence array.
stream
stream
Enable Anthropic SSE event stream.
tools
tools
Anthropic tools are converted to OpenAI function tools.
tool_choice
tool_choice
auto / none / any / tool is converted to OpenAI tool_choice.

Gemini GenerateContent Mapping

Parameter
Type/Map
Status
contents
messages
Message array. String contents are also wrapped into user text messages.
contents[].role
messages[].role
user maps to user, model maps to assistant.
contents[].parts[].text
content text
Text content.
contents[].parts[].inlineData
image_url data URL
Base64 image content, converted to image_url.
contents[].parts[].fileData
image_url
File URL, converted to image_url.
contents[].parts[].functionCall
assistant.tool_calls
Model function calling.
contents[].parts[].functionResponse
role=tool
Tool execution result.
systemInstruction
system message
System prompt, supports string or parts.
generationConfig.temperature
temperature
Sampling temperature.
generationConfig.topP
top_p
Nucleus sampling.
generationConfig.maxOutputTokens
max_tokens
Maximum output tokens.
generationConfig.stopSequences
stop
Stop sequence array.
tools[].functionDeclarations
tools
Function declarations, converted to OpenAI function tools.
toolConfig.functionCallingConfig.mode
tool_choice
AUTO / ANY / NONE map to auto / required / none respectively; some upstream models may not decline required.
streamGenerateContent
stream=true
Streaming interface. Use ?alt=sse for SSE-formatted responses.

Response Fields

Parameter
Type/Map
choices[].message.content
Non-streaming text output.
choices[].message.reasoning_content
Thinking content field that reasoning models may return.
choices[].message.tool_calls
Returned when the model requests a tool call.
choices[].delta.content
Streaming text increment.
choices[].delta.reasoning_content
Streaming thinking increment, may be returned by reasoning models.
choices[].finish_reason
stop / length / tool_calls / content_filter.
usage.prompt_tokens
Input tokens.
usage.completion_tokens
Output tokens.
usage.total_tokens
Total tokens.
usage.completion_tokens_details.reasoning_tokens
Reasoning tokens, returned by some models.
Production recommendation: Use stream=true and stream_options.include_usage=true for reasoning models; for hybrid thinking models in low-cost, low-latency scenarios, explicitly pass enable_thinking=false. More examples at Chat Completions API and Gemini Protocol.