General Chat API (Streaming)

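Receive model replies incrementally, as they are generated, through a streaming endpoint that follows the OpenAI Chat Completions protocol. Suited to chat applications and interactive scenarios.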

Quick Start

Step 1: Get your API key (create one in the console).

Step 2: Send a streaming request:

bash

curl -X POST "https://open.dieyuyun.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxx" \
  -d '{
    "model": "deepseek-v4-flash",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Briefly introduce quantum computing"}
    ]
  }'

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://open.dieyuyun.com/v1"
)

stream = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "Briefly introduce quantum computing"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

javascript

import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'sk-xxx',
  baseURL: 'https://open.dieyuyun.com/v1',
})

const stream = await client.chat.completions.create({
  model: 'deepseek-v4-flash',
  messages: [{ role: 'user', content: 'Briefly introduce quantum computing' }],
  stream: true,
})

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '')
}

Step 3: Parse the SSE stream and concatenate choices[0].delta.content from each chunk to obtain the complete reply.
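For clients that do not use an SDK, Step 3 can be sketched roughly as follows. This is a minimal illustration, not the platform's reference parser: the helper name `collect_content` is hypothetical, and the sample lines are made up to mirror the chunk format described on this page.

```python
import json
from typing import Iterable


def collect_content(sse_lines: Iterable[str]) -> str:
    """Concatenate choices[0].delta.content from raw SSE lines."""
    parts = []
    for raw in sse_lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data line.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # The stream ends with a literal [DONE] sentinel, which is not JSON.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            parts.append(text)
    return "".join(parts)


# Made-up sample lines in the documented chunk shape.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":", world"},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
print(collect_content(sample))
```

In a real client the lines would come from the HTTP response body read line by line; the function only needs an iterable of strings.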

Request Endpoint

Item	Value
Method	POST
Path	`/v1/chat/completions`
Base URL	`https://open.dieyuyun.com`
Compatible protocol	OpenAI Chat Completions

Authentication

Every request must carry a Bearer token in the request headers:

http

Authorization: Bearer sk-xxx

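Supported Models

Model	Provider	Context length	Description
deepseek-v4-flash	DeepSeek	1M	Very fast responses at low cost; suited to high-frequency calls
deepseek-v4-pro	DeepSeek	1M	Complex reasoning and high-quality output
qwen3.7-max	Tongyi Qianwen (Qwen)	1M	Tongyi Qianwen flagship model
glm-5.7	Zhipu AI	200K	Zhipu GLM flagship model
kimi-k2.6	Moonshot AI	256K	Long context; strong at long-document analysis
minimax-m3	MiniMax	1M	MiniMax flagship model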

TIP

See the console for the full model list. Model availability may change at any time.

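Standard Request Fields

Field	Type	Required	Default	Description
model	string	Yes	-	Model identifier, e.g. `deepseek-v4-flash`
messages	array	Yes	-	List of messages; each contains `role` and `content`
stream	boolean	No	false	Set to `true` to enable streaming output
stream_options	object	No	-	Streaming options; e.g. `{"include_usage": true}` returns usage in the final chunk
temperature	number	No	1	Controls output randomness; range 0~2
top_p	number	No	1	Nucleus sampling parameter; adjust either this or temperature, not both
max_tokens	integer	No	Model default	Maximum number of tokens to generate
max_completion_tokens	integer	No	Model default	Maximum number of completion tokens (newer parameter, supported by some models)
stop	string / array	No	null	Sequences at which generation stops
presence_penalty	number	No	0	Presence penalty; range -2~2
frequency_penalty	number	No	0	Frequency penalty; range -2~2
n	integer	No	1	Number of candidate replies to generate
tools	array	No	-	List of tool (function) definitions
tool_choice	string / object	No	auto	Tool-calling strategy: `auto`, `none`, `required`, or a specific function
response_format	object	No	-	Response format; e.g. `{"type": "json_object"}` forces JSON output
reasoning_effort	string	No	-	Reasoning effort (reasoning models only): `low`, `medium`, or `high`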
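A client may want to check the documented ranges before sending a request. The sketch below assembles a request body from the fields in the table and rejects out-of-range values; `build_payload` is a hypothetical helper name, and the client-side validation is an illustration rather than something the platform requires.

```python
from typing import Any, Dict, List

# Ranges taken from the field table above.
RANGES = {
    "temperature": (0, 2),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
}


def build_payload(model: str, messages: List[Dict[str, Any]], **options: Any) -> Dict[str, Any]:
    """Build a streaming request body, dropping options left as None."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for name, (low, high) in RANGES.items():
        value = options.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    payload: Dict[str, Any] = {"model": model, "messages": messages, "stream": True}
    # Only send options the caller actually set, so server defaults apply otherwise.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload


payload = build_payload(
    "deepseek-v4-flash",
    [{"role": "user", "content": "Briefly introduce quantum computing"}],
    temperature=0.2,
    stream_options={"include_usage": True},
    max_tokens=None,
)
print(payload)
```

Omitting unset options keeps the request small and leaves the defaults listed in the table to the server.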

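Message Roles

Role	Description
system	System prompt that sets the model's behavior and persona
user	User message
assistant	Previous replies from the model
tool	Result of a tool call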
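As an illustration of how these roles combine in a multi-turn history, the sketch below builds a short conversation and checks that every message uses one of the four roles in the table. `validate_messages` is a hypothetical helper, and the message texts are placeholders.

```python
from typing import Any, Dict, List

# The four roles listed in the table above.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def validate_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Raise ValueError if any message has a role outside the documented set."""
    for position, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"messages[{position}] has unsupported role: {role!r}")
    return messages


history = [
    {"role": "system", "content": "You are a professional AI assistant."},
    {"role": "user", "content": "Briefly introduce quantum computing"},
    # Earlier model output is sent back as an assistant message on the next turn.
    {"role": "assistant", "content": "Quantum computing uses qubits to process information."},
    {"role": "user", "content": "Give one practical application."},
]
validate_messages(history)
print(len(history), "messages OK")
```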

Request Examples

Basic Chat

bash

curl -X POST "https://open.dieyuyun.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxx" \
  -d '{
    "model": "deepseek-v4-flash",
    "stream": true,
    "stream_options": {"include_usage": true},
    "messages": [
      {"role": "system", "content": "You are a professional AI assistant."},
      {"role": "user", "content": "Describe the history of artificial intelligence."}
    ]
  }'

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://open.dieyuyun.com/v1"
)

stream = client.chat.completions.create(
    model="deepseek-v4-flash",
    stream=True,
    stream_options={"include_usage": True},
    messages=[
        {"role": "system", "content": "You are a professional AI assistant."},
        {"role": "user", "content": "Describe the history of artificial intelligence."}
    ]
)

for chunk in stream:
    # The last chunk contains the usage information
    if chunk.choices:
        content = chunk.choices[0].delta.content
        if content:
            print(content, end="", flush=True)
    if hasattr(chunk, 'usage') and chunk.usage:
        print(f"\n\n[usage] input: {chunk.usage.prompt_tokens}, "
              f"output: {chunk.usage.completion_tokens}")

javascript

import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'sk-xxx',
  baseURL: 'https://open.dieyuyun.com/v1',
})

const stream = await client.chat.completions.create({
  model: 'deepseek-v4-flash',
  stream: true,
  stream_options: { include_usage: true },
  messages: [
    { role: 'system', content: 'You are a professional AI assistant.' },
    { role: 'user', content: 'Describe the history of artificial intelligence.' },
  ],
})

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content
  if (content) process.stdout.write(content)

  if (chunk.usage) {
    console.log(`\n\n[usage] input: ${chunk.usage.prompt_tokens}, ` + `output: ${chunk.usage.completion_tokens}`)
  }
}

Streaming Chat with Tool Calls

json

{
  "model": "deepseek-v4-flash",
  "stream": true,
  "messages": [{ "role": "user", "content": "What is the weather like in Beijing today?" }],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather information for the specified city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "City name"
            }
          },
          "required": ["city"]
        }
      }
    }
  ]
}
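When a model answers a request like this with a tool call, the call arrives in pieces through choices[].delta.tool_calls. The sketch below reassembles those pieces. It assumes the platform follows the OpenAI convention, in which each fragment carries an `index` and the `function.arguments` string is split across chunks; this page does not spell that layout out, so treat it as an assumption. The helper name `merge_tool_calls` and the sample chunks are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List


def merge_tool_calls(chunks: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls: Dict[int, Dict[str, Any]] = {}
    for chunk in chunks:
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            for fragment in delta.get("tool_calls") or []:
                slot = calls.setdefault(
                    fragment.get("index", 0),
                    {"id": None, "name": "", "arguments": ""},
                )
                if fragment.get("id"):
                    slot["id"] = fragment["id"]
                function = fragment.get("function") or {}
                # Name and arguments arrive as string fragments; append in order.
                slot["name"] += function.get("name") or ""
                slot["arguments"] += function.get("arguments") or ""
    merged = []
    for index in sorted(calls):
        slot = calls[index]
        # Arguments are only valid JSON once every fragment has arrived.
        arguments = json.loads(slot["arguments"] or "{}")
        merged.append({"id": slot["id"], "name": slot["name"], "arguments": arguments})
    return merged


# Made-up chunks (already decoded from SSE) in the assumed layout.
chunks = [
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": ""}}]}}]},
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '{"city": '}}]}}]},
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '"Beijing"}'}}]}}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "tool_calls"}]},
]
print(merge_tool_calls(chunks))
```

Parsing the arguments only after the stream finishes avoids JSON errors on partial fragments.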

Response Format

Streaming SSE Chunks

Streaming responses are returned in Server-Sent Events (SSE) format; each chunk starts with the `data:` prefix:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1712345678,"model":"deepseek-v4-flash","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1712345678,"model":"deepseek-v4-flash","choices":[{"index":0,"delta":{"content":"Artificial"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1712345678,"model":"deepseek-v4-flash","choices":[{"index":0,"delta":{"content":" intelligence"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1712345678,"model":"deepseek-v4-flash","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":28,"completion_tokens":156,"total_tokens":184}}

data: [DONE]

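Field Descriptions

Field	Description
id	Unique identifier of the request
object	Always `chat.completion.chunk`
created	Unix timestamp
model	ID of the model actually used
choices[].delta.role	Role (appears only in the first chunk)
choices[].delta.content	Incremental text content
choices[].delta.tool_calls	Incremental tool-call information
choices[].finish_reason	Reason generation ended: `stop`, `length`, `tool_calls`, `content_filter`
usage	Returned only in the last chunk (requires `stream_options.include_usage`)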
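The fields above can be folded into a single summary once the stream ends. The following is a rough sketch over chunks that have already been decoded from JSON; `summarize_stream` is a hypothetical helper, and the sample values are copied from the example stream on this page.

```python
from typing import Any, Dict, Iterable


def summarize_stream(chunks: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Collect text, finish_reason and usage for the first choice of a stream."""
    parts = []
    finish_reason = None
    usage = None
    for chunk in chunks:
        # usage is only present when stream_options.include_usage was set.
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices") or []:
            if choice.get("index", 0) != 0:
                continue
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return {
        "content": "".join(parts),
        "finish_reason": finish_reason,
        # "length" means the reply was cut off by the token limit.
        "truncated": finish_reason == "length",
        "usage": usage,
    }


chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": "Artificial"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": " intelligence"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
     "usage": {"prompt_tokens": 28, "completion_tokens": 156, "total_tokens": 184}},
]
print(summarize_stream(chunks))
```

Iterating over `choices` instead of indexing `choices[0]` also tolerates a chunk whose `choices` list is empty.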

Error Responses

json

{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "param": null,
    "code": "invalid_api_key"
  }
}

See Error Codes for details.
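A client can turn this error body into a typed exception so that callers branch on `code` rather than on message text. The sketch below is one possible shape: the `ApiError` class and `parse_error` helper are hypothetical, and the 401 status in the usage lines is an assumed value for illustration, since this page does not state which HTTP status accompanies each error.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, status: int, message: str, error_type: Optional[str] = None,
                 code: Optional[str] = None, param: Optional[str] = None) -> None:
        super().__init__(message)
        self.status = status
        self.message = message
        self.error_type = error_type
        self.code = code
        self.param = param


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from a response body, tolerating non-JSON bodies."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        data = None
    error = data.get("error") if isinstance(data, dict) else None
    if not isinstance(error, dict):
        # Fall back to the raw text, e.g. for an HTML page from a gateway.
        return ApiError(status, body.strip() or "unknown error")
    return ApiError(
        status,
        error.get("message") or "unknown error",
        error.get("type"),
        error.get("code"),
        error.get("param"),
    )


body = ('{"error": {"message": "Invalid API key provided", '
        '"type": "invalid_request_error", "param": null, "code": "invalid_api_key"}}')
err = parse_error(401, body)  # 401 is an assumed status, not taken from this page
print(err.code, "-", err.message)
```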

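Compatibility Notes

Feature	This platform	Upstream (native OpenAI)
Streaming SSE format	Fully compatible	Native protocol
stream_options	Supports include_usage	Supported
tools / function	Supported (passed through)	Native support
response_format	Supports json_object	Native support
reasoning_effort	Supported (reasoning models only)	Supported (o1/o3 series)
Multimodal input	Supports images and audio	Native support
Endpoint path	`/v1/chat/completions`	`/v1/chat/completions`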

TIP

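The platform stays fully compatible with the OpenAI protocol: successful responses are returned in the upstream's original format and are not wrapped in additional code / data fields.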

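Best Practices

Set stream_options: adding {"include_usage": true} returns token usage in the last chunk of the stream, which helps with billing and monitoring.
Handle dropped connections: streaming output can be interrupted by network problems, so implement reconnection and timeouts.
Set max_tokens sensibly: avoid overly long generations that add unnecessary cost and latency.
Use the system role: setting model behavior in a system prompt is more efficient than repeating instructions in user messages.
Implement exponential backoff: on a 429 error, retry with exponential backoff (1s → 2s → 4s).
Consume the stream promptly: do not buffer the whole stream in memory; process and output it chunk by chunk.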
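The exponential backoff recommendation above could be sketched as follows. `RateLimited` is a hypothetical exception that the caller's own request code would raise on HTTP 429, and `with_backoff` is an illustrative helper rather than part of any SDK. The `sleep` parameter is injectable so the example runs without actually waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical exception raised by the caller's code on HTTP 429."""


def with_backoff(call: Callable[[], T], max_retries: int = 3, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry call() on RateLimited, waiting 1s, 2s, 4s between attempts by default."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")  # the loop always returns or raises


# Demo: fail twice with a simulated 429, then succeed. Delays are recorded, not slept.
delays = []
attempts = {"count": 0}


def flaky_request() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited()
    return "ok"


result = with_backoff(flaky_request, sleep=delays.append)
print(result, delays)
```

Adding random jitter to each delay is a common refinement when many clients retry at the same time.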

Rate Limits

See Rate Limits for details.