OpenAI Responses
Call models via the OpenAI Responses API, a next-generation API format that supports rich multimodal input, built-in tool calling, and conversation context management.
Quick Start
Step 1: Get your API Key from the Console.
Step 2: Send a request:
curl -X POST "https://open.dieyuyun.com/v1/responses" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-xxx" \
-d '{
"model": "deepseek-v4-flash",
"input": "Briefly describe the history of artificial intelligence"
}'

from openai import OpenAI
client = OpenAI(
api_key="sk-xxx",
base_url="https://open.dieyuyun.com/v1"
)
response = client.responses.create(
model="deepseek-v4-flash",
input="Briefly describe the history of artificial intelligence"
)
print(response.output[0].content[0].text)

import OpenAI from 'openai'
const client = new OpenAI({
apiKey: 'sk-xxx',
baseURL: 'https://open.dieyuyun.com/v1',
})
const response = await client.responses.create({
model: 'deepseek-v4-flash',
input: 'Briefly describe the history of artificial intelligence',
})
console.log(response.output[0].content[0].text)
Step 3: Read the model response from output[0].content[0].text.
Endpoint
| Item | Value |
|---|---|
| Method | POST |
| Path | /v1/responses, /v1/responses/compact (context compression) |
| Base URL | https://open.dieyuyun.com |
| Protocol | OpenAI Responses API |
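The endpoint details above, together with the headers used in the Quick Start curl call, can be sketched as a small request builder. This is a minimal illustration using only the Python standard library; `build_request` and the `WULIANG_API_KEY` environment variable are hypothetical names, not part of any SDK. The sketch builds the request without sending it.

```python
import json
import os
import urllib.request

BASE_URL = "https://open.dieyuyun.com"  # Base URL from the Endpoint table
RESPONSES_PATH = "/v1/responses"        # Path from the Endpoint table


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Responses endpoint."""
    return urllib.request.Request(
        url=BASE_URL + RESPONSES_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # WULIANG_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("WULIANG_API_KEY", "sk-xxx")
    req = build_request({"model": "deepseek-v4-flash", "input": "Hello"}, key)
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; reading the key from the environment keeps it out of source code.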
Authentication
All requests require a Bearer Token in the request header:
Authorization: Bearer sk-xxx
Supported Models
| Model | Provider | Context Length | Description |
|---|---|---|---|
| deepseek-v4-flash | DeepSeek | 1M | Fast responses at low cost; suited to high-frequency calls |
| deepseek-v4-pro | DeepSeek | 1M | Complex reasoning, high quality output |
| qwen3.7-max | Qwen | 1M | Qwen flagship model |
TIP
See the full model list in the Console. Model support for the Responses API may differ from Chat Completions.
Request Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | Model identifier, e.g. deepseek-v4-flash |
| input | string / array | Yes | — | Input content: simple string or structured message array |
| instructions | string | No | — | System instructions (equivalent to system prompt) |
| previous_response_id | string | No | — | Reference a previous response ID for multi-turn conversation |
| stream | boolean | No | false | Enable streaming output |
| stream_options | object | No | — | Streaming configuration |
| tools | array | No | — | Tool definitions |
| tool_choice | string / object | No | auto | Tool calling strategy |
| temperature | number | No | 1 | Sampling randomness |
| top_p | number | No | 1 | Nucleus sampling parameter |
| max_output_tokens | integer | No | — | Maximum output tokens |
| reasoning | object | No | — | Reasoning configuration (reasoning models only) |
| parallel_tool_calls | boolean | No | true | Allow parallel tool calls |
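As a rough illustration of the table above, the following sketch assembles a request body from the two required fields plus any optional ones. `build_payload` and `ALLOWED_OPTIONAL` are hypothetical helper names; the field names themselves come from the table. Unset optional fields are omitted so that the server-side defaults apply.

```python
from typing import Any, Union

# Optional fields listed in the Request Parameters table.
ALLOWED_OPTIONAL = {
    "instructions", "previous_response_id", "stream", "stream_options",
    "tools", "tool_choice", "temperature", "top_p", "max_output_tokens",
    "reasoning", "parallel_tool_calls",
}


def build_payload(model: str, input_content: Union[str, list], **optional: Any) -> dict:
    """Return a request body, rejecting field names the table does not list."""
    if not isinstance(model, str) or not model:
        raise ValueError("model must be a non-empty string")
    if not isinstance(input_content, (str, list)):
        raise TypeError("input must be a string or an array")
    unknown = set(optional) - ALLOWED_OPTIONAL
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {"model": model, "input": input_content}
    # Drop fields left as None so the documented defaults are used.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


example = build_payload(
    "deepseek-v4-flash",
    "Hello",
    instructions="You are a technical assistant.",
    temperature=0.2,
    max_output_tokens=None,  # omitted from the resulting body
)
```

Validating field names locally is a design choice of this sketch; it catches typos such as `max_tokens` before a request is sent.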
Input Format
The Responses API input field supports two formats:
Simple string:
{
"model": "deepseek-v4-flash",
"input": "Hello"
}
Structured array (multimodal):
{
"model": "deepseek-v4-flash",
"input": [
{
"role": "user",
"content": [
{ "type": "input_text", "text": "What's in this image?" },
{ "type": "input_image", "image_url": "https://example.com/photo.jpg" }
]
}
]
}
Input Content Types
| Type | Description | Fields |
|---|---|---|
| input_text | Text input | text |
| input_image | Image input | image_url or image_file |
| input_file | File input | file |
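The content types above can be composed with a few small helpers. This is a minimal sketch covering only `input_text` and `input_image` with an `image_url`; the helper names (`text_part`, `image_part`, `user_message`) are hypothetical and exist only for illustration.

```python
def text_part(text: str) -> dict:
    """Build an input_text content part."""
    return {"type": "input_text", "text": text}


def image_part(image_url: str) -> dict:
    """Build an input_image content part that points at an image URL."""
    return {"type": "input_image", "image_url": image_url}


def user_message(*parts: dict) -> dict:
    """Wrap content parts in a user-role message for the input array."""
    return {"role": "user", "content": list(parts)}


# Reproduces the structured-array example from the Input Format section.
structured_input = [
    user_message(
        text_part("What's in this image?"),
        image_part("https://example.com/photo.jpg"),
    )
]
```

The resulting list can be passed as the `input` field of a request.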
Request Examples
Multimodal Input
curl -X POST "https://open.dieyuyun.com/v1/responses" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-xxx" \
-d '{
"model": "deepseek-v4-flash",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "What's in this image?"},
{"type": "input_image", "image_url": "https://example.com/photo.jpg"}
]
}
]
}'

from openai import OpenAI
client = OpenAI(
api_key="sk-xxx",
base_url="https://open.dieyuyun.com/v1"
)
response = client.responses.create(
model="deepseek-v4-flash",
input=[
{
"role": "user",
"content": [
{"type": "input_text", "text": "What's in this image?"},
{"type": "input_image", "image_url": "https://example.com/photo.jpg"}
]
}
]
)
print(response.output[0].content[0].text)

import OpenAI from 'openai'
const client = new OpenAI({
apiKey: 'sk-xxx',
baseURL: 'https://open.dieyuyun.com/v1',
})
const response = await client.responses.create({
model: 'deepseek-v4-flash',
input: [
{
role: 'user',
content: [
{ type: 'input_text', text: "What's in this image?" },
{ type: 'input_image', image_url: 'https://example.com/photo.jpg' },
],
},
],
})
console.log(response.output[0].content[0].text)
Multi-Turn Conversation (using previous_response_id)
First request:
{
  "model": "deepseek-v4-flash",
  "instructions": "You are a technical assistant.",
  "input": "What is machine learning?"
}
Second request, which references the first response by the id it returned:
{
  "model": "deepseek-v4-flash",
  "previous_response_id": "resp_abc123",
  "input": "How is it different from deep learning?"
}
With Tool Calling
{
"model": "deepseek-v4-flash",
"instructions": "You are a helpful assistant.",
"input": "What's the weather like in Beijing today?",
"tools": [
{
"type": "function",
"name": "get_weather",
"description": "Get weather information for a city",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "City name"
}
},
"required": ["city"]
}
}
]
}
Streaming Output
from openai import OpenAI
client = OpenAI(
api_key="sk-xxx",
base_url="https://open.dieyuyun.com/v1"
)
stream = client.responses.create(
model="deepseek-v4-flash",
input="Write a quicksort algorithm in Python",
stream=True
)
for event in stream:
    # Print only text deltas; other event types carry no incremental text.
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
Response Format
Successful Response
{
"id": "resp_abc123",
"object": "response",
"created_at": 1712345678,
"status": "completed",
"model": "deepseek-v4-flash",
"output": [
{
"id": "msg_abc123",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": "The history of artificial intelligence can be traced back to the 1950s. The 1956 Dartmouth Conference is widely considered the founding moment of AI...",
"annotations": []
}
]
}
],
"parallel_tool_calls": true,
"tools": [],
"usage": {
"input_tokens": 8,
"output_tokens": 256,
"total_tokens": 264
}
}
Field Reference
| Field | Description |
|---|---|
| id | Unique response identifier |
| object | Always response |
| status | Status: completed, in_progress, failed |
| output | Output content array (replaces Chat Completions' choices) |
| output[].type | Output type: message, function_call |
| output[].content[].type | Content type: output_text |
| output[].content[].text | Model response text |
| output[].content[].annotations | Annotation information |
| usage.input_tokens | Number of input tokens |
| usage.output_tokens | Number of output tokens |
| usage.total_tokens | Total token count |
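Because `output` is an array that can hold more than one item type, indexing `output[0]` directly assumes the first item is a message. A slightly more defensive way to read the text, sketched here against a parsed JSON body (a plain `dict`), is to walk the array and collect every `output_text` part. `extract_text` is a hypothetical helper name.

```python
def extract_text(response: dict) -> str:
    """Concatenate all output_text parts found in message items."""
    chunks = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip items that are not assistant messages
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                chunks.append(part.get("text", ""))
    return "".join(chunks)


sample = {
    "id": "resp_abc123",
    "status": "completed",
    "output": [
        {"type": "some_other_item"},  # hypothetical non-message item
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Hello", "annotations": []},
                {"type": "output_text", "text": ", world", "annotations": []},
            ],
        },
    ],
    "usage": {"input_tokens": 8, "output_tokens": 3, "total_tokens": 11},
}
```

With this sample, `extract_text(sample)` returns the two text parts joined in order, and a body with no message items yields an empty string.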
Streaming SSE Events
Streaming output returns the following event types:
data: {"type":"response.created","response":{"id":"resp_abc123",...}}
data: {"type":"response.output_text.delta","delta":"Artificial"}
data: {"type":"response.output_text.delta","delta":" intelligence"}
data: {"type":"response.output_text.delta","delta":" history"}
data: {"type":"response.output_text.done","text":"Artificial intelligence history..."}
data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed",...}}
Error Response
{
"error": {
"message": "Invalid input: expected string or array",
"type": "invalid_request_error",
"param": "input",
"code": "invalid_input"
}
}
See Error Codes for details.
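The error body shown above can be turned into a Python exception so that callers do not mistake it for a normal response. This is a minimal sketch operating on an already-parsed JSON body; `ResponsesAPIError` and `raise_for_error` are hypothetical names, and only the four fields shown in the example are read.

```python
class ResponsesAPIError(Exception):
    """Carries the fields of the error object from a failed request."""

    def __init__(self, message, error_type=None, param=None, code=None):
        super().__init__(message)
        self.error_type = error_type
        self.param = param
        self.code = code


def raise_for_error(body: dict) -> dict:
    """Return the body unchanged, or raise if it contains an error object."""
    err = body.get("error")
    if err:
        raise ResponsesAPIError(
            err.get("message", "unknown error"),
            error_type=err.get("type"),
            param=err.get("param"),
            code=err.get("code"),
        )
    return body


error_body = {
    "error": {
        "message": "Invalid input: expected string or array",
        "type": "invalid_request_error",
        "param": "input",
        "code": "invalid_input",
    }
}
```

Callers can then branch on `code` or `param` in an `except ResponsesAPIError` block instead of inspecting raw dictionaries.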
Compatibility
| Feature | Wuliang AI | OpenAI Responses API |
|---|---|---|
| Response format | Fully compatible | Native protocol |
| Multimodal input | Supports text / image / file | Native support |
| previous_response_id | Supported | Native support |
| Tool calling (function) | Supported | Native support |
| Built-in tools (web/file) | Depends on upstream routing | Native support |
| Streaming output | Supported | Native support |
| /responses/compact | Supported | Native support |
| Endpoint path | /v1/responses | /v1/responses |
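For the streaming row in the table above, clients that do not use an SDK need to read the `data:` lines described under Streaming SSE Events themselves. The sketch below parses such lines and joins the text deltas; `iter_events` and `collect_text` are hypothetical names, and skipping a `[DONE]` sentinel is a defensive assumption rather than documented behavior.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each 'data:' line in an SSE stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore blank lines and other SSE fields
        data = line[len("data:"):].strip()
        if not data or data == "[DONE]":  # assumption: sentinel may appear
            continue
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Join the delta of every response.output_text.delta event."""
    return "".join(
        event["delta"]
        for event in iter_events(lines)
        if event.get("type") == "response.output_text.delta"
    )


sample_lines = [
    'data: {"type":"response.created","response":{"id":"resp_abc123"}}',
    "",
    'data: {"type":"response.output_text.delta","delta":"Artificial"}',
    'data: {"type":"response.output_text.delta","delta":" intelligence"}',
    'data: {"type":"response.output_text.delta","delta":" history"}',
    'data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed"}}',
]
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body, read incrementally.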
TIP
The Responses API uses the output array instead of Chat Completions' choices, and input instead of messages. If your application already uses the Chat Completions API, there's no need to migrate -- both formats work seamlessly.
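For readers who nevertheless want to move an existing call over, the field mapping described above can be sketched as follows. `chat_to_responses` is a hypothetical helper; it assumes that messages carry plain string content and that the endpoint accepts string content in message items, as the OpenAI Responses API does. System messages are moved into `instructions`.

```python
def chat_to_responses(model: str, messages: list) -> dict:
    """Map a Chat Completions style message list to a Responses request body."""
    system_texts = [m["content"] for m in messages if m["role"] == "system"]
    turns = [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m["role"] != "system"
    ]
    payload = {"model": model, "input": turns}
    if system_texts:
        # Several system messages are joined into one instructions string.
        payload["instructions"] = "\n".join(system_texts)
    return payload


converted = chat_to_responses(
    "deepseek-v4-flash",
    [
        {"role": "system", "content": "You are a technical assistant."},
        {"role": "user", "content": "What is machine learning?"},
    ],
)
```

Tool definitions and multimodal content parts use different shapes in the two formats and are deliberately left out of this sketch.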
Best Practices
- Choose the right API format: The Responses API is ideal for multimodal input and built-in context management; Chat Completions is better for simple conversation scenarios.
- Leverage previous_response_id: Reference previous response IDs for multi-turn conversations without manually managing message history.
- Use instructions instead of system: The Responses API uses the instructions field for system-level behavior.
- Combine with /responses/compact: For long conversations, use the compact endpoint to compress context and save tokens.
- Check SDK version: The Responses API requires a recent version of the OpenAI SDK (Python >= 1.50.0, Node.js >= 4.50.0).
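The `previous_response_id` practice above can be sketched as a loop that threads each response id into the next request. `run_conversation` is a hypothetical helper, and `send` stands for any function that posts a body to /v1/responses and returns the parsed JSON; a fake is used here so the sketch runs offline. Following the multi-turn example earlier on this page, `instructions` is sent with the first request only.

```python
from typing import Callable, List, Optional


def run_conversation(
    send: Callable[[dict], dict],
    model: str,
    turns: List[str],
    instructions: Optional[str] = None,
) -> List[dict]:
    """Send each turn in order, linking it to the previous response id."""
    previous_id = None
    responses = []
    for text in turns:
        payload = {"model": model, "input": text}
        if previous_id is None:
            if instructions:
                payload["instructions"] = instructions
        else:
            payload["previous_response_id"] = previous_id
        response = send(payload)
        previous_id = response["id"]  # id of the response just received
        responses.append(response)
    return responses


# Offline stand-in for a real HTTP call; records what would be sent.
sent_payloads = []


def fake_send(payload: dict) -> dict:
    sent_payloads.append(payload)
    return {"id": f"resp_{len(sent_payloads)}", "status": "completed"}


history = run_conversation(
    fake_send,
    "deepseek-v4-flash",
    ["What is machine learning?", "How is it different from deep learning?"],
    instructions="You are a technical assistant.",
)
```

Only the latest user text is sent on each turn; the earlier context is referenced by id rather than re-sent as message history.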
Rate Limits
See Rate Limits for details.
Related Docs
- Chat Completions (Streaming) - Use the OpenAI Chat Completions protocol
- Chat Completions (Non-Streaming) - Non-streaming conversations
- Manage API Keys - Create and configure API keys