OpenAI Responses

Call models through the OpenAI Responses API, an API format that supports multimodal input, built-in tool calling, and conversation context management.

Quick Start

Step 1: Get your API Key from the Console.

Step 2: Send a request:

bash

curl -X POST "https://open.dieyuyun.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxx" \
  -d '{
    "model": "deepseek-v4-flash",
    "input": "Briefly describe the history of artificial intelligence"
  }'

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://open.dieyuyun.com/v1"
)

response = client.responses.create(
    model="deepseek-v4-flash",
    input="Briefly describe the history of artificial intelligence"
)

print(response.output[0].content[0].text)

javascript

import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'sk-xxx',
  baseURL: 'https://open.dieyuyun.com/v1',
})

const response = await client.responses.create({
  model: 'deepseek-v4-flash',
  input: 'Briefly describe the history of artificial intelligence',
})

console.log(response.output[0].content[0].text)

Step 3: Read the model response from `output[0].content[0].text`.
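
The `output[0].content[0].text` path works when the first output item is a message. As a minimal sketch, assuming the response body has been parsed into a plain dict shaped like the Successful Response example, the hypothetical helper `collect_output_text` below gathers every `output_text` part from `message` items and skips any other item type:

```python
from typing import Any


def collect_output_text(response: dict[str, Any]) -> str:
    """Join the text of every output_text part found in message items."""
    parts: list[str] = []
    for item in response.get("output", []):
        # Skip anything that is not an assistant message (for example tool calls).
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


example = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hello!", "annotations": []}],
        }
    ]
}
print(collect_output_text(example))  # prints "Hello!"
```

When using the official OpenAI SDKs, the `output_text` property on the response object offers a similar aggregation.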

Endpoint

Item	Value
Method	POST
Path	`/v1/responses`
Path (context compression)	`/v1/responses/compact`
Base URL	`https://open.dieyuyun.com`
Protocol	OpenAI Responses API

Authentication

Every request must include a Bearer token in the `Authorization` request header:

http

Authorization: Bearer sk-xxx
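
One way to avoid hard-coding the key is to read it from the environment. The sketch below is illustrative only: the helper `build_headers` and the variable name `DIEYUYUN_API_KEY` are assumptions, not names defined by this service.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return the HTTP headers for a JSON request to /v1/responses."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }


# DIEYUYUN_API_KEY is an assumed variable name; "sk-xxx" is the placeholder key.
headers = build_headers(os.environ.get("DIEYUYUN_API_KEY", "sk-xxx"))
print(sorted(headers))  # prints "['Authorization', 'Content-Type']"
```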

Supported Models

Model	Provider	Context Length	Description
deepseek-v4-flash	DeepSeek	1M	Fast, low-cost responses for high-frequency calls
deepseek-v4-pro	DeepSeek	1M	Complex reasoning, high quality output
qwen3.7-max	Qwen	1M	Qwen flagship model

TIP

See the full model list in the Console. Model support for the Responses API may differ from Chat Completions.

Request Parameters

Field	Type	Required	Default	Description
model	string	Yes	—	Model identifier, e.g. `deepseek-v4-flash`
input	string / array	Yes	—	Input content: simple string or structured message array
instructions	string	No	—	System instructions (equivalent to system prompt)
previous_response_id	string	No	—	Reference a previous response ID for multi-turn conversation
stream	boolean	No	false	Enable streaming output
stream_options	object	No	—	Streaming configuration
tools	array	No	—	Tool definitions
tool_choice	string / object	No	auto	Tool calling strategy
temperature	number	No	1	Sampling randomness
top_p	number	No	1	Nucleus sampling parameter
max_output_tokens	integer	No	—	Maximum output tokens
reasoning	object	No	—	Reasoning configuration (reasoning models only)
parallel_tool_calls	boolean	No	true	Allow parallel tool calls
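
Because only `model` and `input` are required, a request body can leave out every option that should keep its default. The following sketch shows one way to assemble such a body; `build_request` and the `input_content` parameter name are hypothetical, and the field names come from the table above.

```python
import json
from typing import Any, Union


def build_request(model: str, input_content: Union[str, list], **optional: Any) -> dict[str, Any]:
    """Assemble a /v1/responses body, omitting optional fields left as None."""
    if not model:
        raise ValueError("model is required")
    if not isinstance(input_content, (str, list)):
        raise TypeError("input must be a string or an array")
    body: dict[str, Any] = {"model": model, "input": input_content}
    # Only send options the caller actually set, so server defaults apply otherwise.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


payload = build_request(
    "deepseek-v4-flash",
    "Hello",
    instructions="Answer in one sentence.",
    temperature=None,  # dropped: the server default is used
    max_output_tokens=64,
)
print(json.dumps(payload, indent=2))
```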

Input Format

The Responses API input field supports two formats:

Simple string:

json

{
  "model": "deepseek-v4-flash",
  "input": "Hello"
}

Structured array (multimodal):

json

{
  "model": "deepseek-v4-flash",
  "input": [
    {
      "role": "user",
      "content": [
        { "type": "input_text", "text": "What's in this image?" },
        { "type": "input_image", "image_url": "https://example.com/photo.jpg" }
      ]
    }
  ]
}

Input Content Types

Type	Description	Fields
input_text	Text input	`text`
input_image	Image input	`image_url` or `image_file`
input_file	File input	`file`
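
Small constructor functions can keep structured input consistent with the types listed above. This is a sketch only: `text_part`, `image_part`, and `user_message` are hypothetical helper names, and only the `text` and `image_url` fields shown in this document are used.

```python
from typing import Any


def text_part(text: str) -> dict[str, str]:
    """Build an input_text content part."""
    return {"type": "input_text", "text": text}


def image_part(image_url: str) -> dict[str, str]:
    """Build an input_image content part that references an image by URL."""
    return {"type": "input_image", "image_url": image_url}


def user_message(*parts: dict[str, Any]) -> dict[str, Any]:
    """Wrap content parts in a single user message."""
    return {"role": "user", "content": list(parts)}


message = user_message(
    text_part("What's in this image?"),
    image_part("https://example.com/photo.jpg"),
)
print(len(message["content"]))  # prints "2"
```

The resulting `message` can be passed as the single element of the `input` array.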

Request Examples

Multimodal Input

bash

curl -X POST "https://open.dieyuyun.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxx" \
  -d '{
    "model": "deepseek-v4-flash",
    "input": [
      {
        "role": "user",
        "content": [
          {"type": "input_text", "text": "What's in this image?"},
          {"type": "input_image", "image_url": "https://example.com/photo.jpg"}
        ]
      }
    ]
  }'

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://open.dieyuyun.com/v1"
)

response = client.responses.create(
    model="deepseek-v4-flash",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "What's in this image?"},
                {"type": "input_image", "image_url": "https://example.com/photo.jpg"}
            ]
        }
    ]
)

print(response.output[0].content[0].text)

javascript

import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'sk-xxx',
  baseURL: 'https://open.dieyuyun.com/v1',
})

const response = await client.responses.create({
  model: 'deepseek-v4-flash',
  input: [
    {
      role: 'user',
      content: [
        { type: 'input_text', text: "What's in this image?" },
        { type: 'input_image', image_url: 'https://example.com/photo.jpg' },
      ],
    },
  ],
})

console.log(response.output[0].content[0].text)

Multi-Turn Conversation (using previous_response_id)

First request:

json

{
  "model": "deepseek-v4-flash",
  "instructions": "You are a technical assistant.",
  "input": "What is machine learning?"
}

Second request, referencing the `id` returned by the first response:

json

{
  "model": "deepseek-v4-flash",
  "previous_response_id": "resp_abc123",
  "input": "How is it different from deep learning?"
}

With Tool Calling

json

{
  "model": "deepseek-v4-flash",
  "instructions": "You are a helpful assistant.",
  "input": "What's the weather like in Beijing today?",
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get weather information for a city",
      "parameters": {
        "type": "object",
        "properties": {
          "city": {
            "type": "string",
            "description": "City name"
          }
        },
        "required": ["city"]
      }
    }
  ]
}
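
When the model decides to call `get_weather`, the application has to run the function and return its result. The sketch below assumes the gateway returns tool calls in the native OpenAI Responses shape, which this page does not show: an output item of type `function_call` carrying `call_id`, `name`, and a JSON-encoded `arguments` string. If the gateway labels the item differently, adjust the type check. The local `get_weather` implementation and the `HANDLERS` registry are hypothetical.

```python
import json
from typing import Any, Callable


def get_weather(city: str) -> dict[str, str]:
    """Hypothetical local implementation; returns canned data for the sketch."""
    return {"city": city, "condition": "sunny"}


HANDLERS: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_function_calls(output: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Execute each function_call item and build function_call_output items."""
    results: list[dict[str, Any]] = []
    for item in output:
        if item.get("type") != "function_call":
            continue
        args = json.loads(item["arguments"])  # arguments arrive as a JSON string
        value = HANDLERS[item["name"]](**args)
        results.append(
            {
                "type": "function_call_output",
                "call_id": item["call_id"],  # ties the result to the request
                "output": json.dumps(value),
            }
        )
    return results


sample_output = [
    {
        "type": "function_call",
        "call_id": "call_123",
        "name": "get_weather",
        "arguments": "{\"city\": \"Beijing\"}",
    }
]
print(run_function_calls(sample_output)[0]["call_id"])  # prints "call_123"
```

In the native protocol, the returned items are sent back as `input` in a follow-up request, together with the `previous_response_id` of the response that requested the call.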

Streaming Output

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://open.dieyuyun.com/v1"
)

stream = client.responses.create(
    model="deepseek-v4-flash",
    input="Write a quicksort algorithm in Python",
    stream=True
)

for event in stream:
    # Print only text deltas; other event types (created, completed, ...) are skipped.
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)

Response Format

Successful Response

json

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1712345678,
  "status": "completed",
  "model": "deepseek-v4-flash",
  "output": [
    {
      "id": "msg_abc123",
      "type": "message",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The history of artificial intelligence can be traced back to the 1950s. The 1956 Dartmouth Conference is widely considered the founding moment of AI...",
          "annotations": []
        }
      ]
    }
  ],
  "parallel_tool_calls": true,
  "tools": [],
  "usage": {
    "input_tokens": 8,
    "output_tokens": 256,
    "total_tokens": 264
  }
}

Field Reference

Field	Description
id	Unique response identifier
object	Always `response`
status	Status: `completed`, `in_progress`, `failed`
output	Output content array (replaces Chat Completions' `choices`)
output[].type	Output type: `message`, `function_call`
output[].content[].type	Content type: `output_text`
output[].content[].text	Model response text
output[].content[].annotations	Annotation information
usage.input_tokens	Number of input tokens
usage.output_tokens	Number of output tokens
usage.total_tokens	Total token count

Streaming SSE Events

Streaming output returns the following event types:

data: {"type":"response.created","response":{"id":"resp_abc123",...}}

data: {"type":"response.output_text.delta","delta":"Artificial"}

data: {"type":"response.output_text.delta","delta":" intelligence"}

data: {"type":"response.output_text.delta","delta":" history"}

data: {"type":"response.output_text.done","text":"Artificial intelligence history..."}

data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed",...}}
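
For clients that read the stream without an SDK, the events above can be reduced to the final text by concatenating the `delta` fields. A minimal sketch, assuming each event arrives as one `data:` line containing a complete JSON object; the function name `collect_text` is hypothetical, and the `[DONE]` check covers a sentinel that some gateways append, which is an assumption rather than documented behavior here.

```python
import json
from typing import Iterable


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the delta of every response.output_text.delta event."""
    chunks: list[str] = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed sentinel; harmless if never sent
            break
        event = json.loads(payload)
        if event.get("type") == "response.output_text.delta":
            chunks.append(event["delta"])
    return "".join(chunks)


sample = [
    'data: {"type":"response.created","response":{"id":"resp_abc123"}}',
    "",
    'data: {"type":"response.output_text.delta","delta":"Artificial"}',
    'data: {"type":"response.output_text.delta","delta":" intelligence"}',
    'data: {"type":"response.output_text.delta","delta":" history"}',
    'data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed"}}',
]
print(collect_text(sample))  # prints "Artificial intelligence history"
```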

Error Response

json

{
  "error": {
    "message": "Invalid input: expected string or array",
    "type": "invalid_request_error",
    "param": "input",
    "code": "invalid_input"
  }
}

See Error Codes for details.
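
A client can turn this error object into an exception so that callers handle failures in one place. The sketch below works on the raw JSON body; `ApiError` and `raise_for_error` are hypothetical names, and the fields read are the four shown in the example above.

```python
import json
from typing import Any, Optional


class ApiError(Exception):
    """Carries the fields of the error object returned by the API."""

    def __init__(self, message: str, type_: Optional[str] = None,
                 param: Optional[str] = None, code: Optional[str] = None) -> None:
        super().__init__(message)
        self.message = message
        self.type = type_
        self.param = param
        self.code = code


def raise_for_error(body: str) -> dict[str, Any]:
    """Parse a JSON response body; raise ApiError if it carries an error object."""
    data = json.loads(body)
    err = data.get("error") if isinstance(data, dict) else None
    if err:
        raise ApiError(err.get("message", "unknown error"),
                       err.get("type"), err.get("param"), err.get("code"))
    return data


sample = (
    '{"error": {"message": "Invalid input: expected string or array", '
    '"type": "invalid_request_error", "param": "input", "code": "invalid_input"}}'
)
try:
    raise_for_error(sample)
except ApiError as exc:
    print(exc.code, exc.param)  # prints "invalid_input input"
```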

Compatibility

Feature	Wuliang AI	OpenAI Responses API
Response format	Fully compatible	Native protocol
Multimodal input	Supports text / image / file	Native support
previous_response_id	Supported	Native support
Tool calling (function)	Supported	Native support
Built-in tools (web/file)	Depends on upstream routing	Native support
Streaming output	Supported	Native support
/responses/compact	Supported	Native support
Endpoint path	`/v1/responses`	`/v1/responses`

TIP

The Responses API uses the `output` array instead of Chat Completions' `choices`, and `input` instead of `messages`. If your application already uses the Chat Completions API, you do not need to migrate: both formats are supported.

Best Practices

Choose the right API format: The Responses API is ideal for multimodal input and built-in context management; Chat Completions is better for simple conversation scenarios.
Leverage previous_response_id: Reference previous response IDs for multi-turn conversations without manually managing message history.
Use instructions instead of system: The Responses API uses the instructions field for system-level behavior.
Combine with /responses/compact: For long conversations, use the compact endpoint to compress context and save tokens.
Check SDK version: The Responses API requires a recent version of the OpenAI SDK (Python >= 1.66.0, Node.js >= 4.87.0).

Rate Limits

See Rate Limits for details.

Related Docs

Chat Completions (Streaming) - Use the OpenAI Chat Completions protocol
Chat Completions (Non-Streaming) - Non-streaming conversations
Manage API Keys - Create and configure API keys
