OpenAI Responses

Call models via the OpenAI Responses API, a next-generation API format that supports rich multimodal input, built-in tool calling, and conversation context management.

Quick Start

Step 1: Get your API Key from the Console.

Step 2: Send a request:

bash
curl -X POST "https://open.dieyuyun.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxx" \
  -d '{
    "model": "deepseek-v4-flash",
    "input": "Briefly describe the history of artificial intelligence"
  }'

python
from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://open.dieyuyun.com/v1"
)

response = client.responses.create(
    model="deepseek-v4-flash",
    input="Briefly describe the history of artificial intelligence"
)

print(response.output[0].content[0].text)

javascript
import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'sk-xxx',
  baseURL: 'https://open.dieyuyun.com/v1',
})

const response = await client.responses.create({
  model: 'deepseek-v4-flash',
  input: 'Briefly describe the history of artificial intelligence',
})

console.log(response.output[0].content[0].text)

Step 3: Read the model response from output[0].content[0].text.

Endpoint

| Item | Value |
| --- | --- |
| Method | POST |
| Path | /v1/responses, /v1/responses/compact (context compression) |
| Base URL | https://open.dieyuyun.com |
| Protocol | OpenAI Responses API |

Authentication

Every request must carry a Bearer token in the Authorization request header:

http
Authorization: Bearer sk-xxx
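
To keep the key out of source code, the header can be built from an environment variable. The following is a minimal sketch; the variable name DIEYUYUN_API_KEY and the helper build_headers are hypothetical names chosen for illustration, not part of the API.

```python
import os


def build_headers(api_key=None):
    """Return the HTTP headers for a Responses API request.

    DIEYUYUN_API_KEY is a hypothetical environment variable name
    used only in this sketch.
    """
    key = api_key or os.environ.get("DIEYUYUN_API_KEY", "")
    if not key:
        raise ValueError("no API key provided")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }


headers = build_headers("sk-xxx")
print(headers["Authorization"])  # Bearer sk-xxx
```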

Supported Models

| Model | Provider | Context Length | Description |
| --- | --- | --- | --- |
| deepseek-v4-flash | DeepSeek | 1M | Fast response, low cost, suited to high-frequency calls |
| deepseek-v4-pro | DeepSeek | 1M | Complex reasoning, high-quality output |
| qwen3.7-max | Qwen | 1M | Qwen flagship model |

TIP

See the full model list in the Console. Model support for the Responses API may differ from Chat Completions.

Request Parameters

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | Yes | | Model identifier, e.g. deepseek-v4-flash |
| input | string / array | Yes | | Input content: simple string or structured message array |
| instructions | string | No | | System instructions (equivalent to system prompt) |
| previous_response_id | string | No | | Reference a previous response ID for multi-turn conversation |
| stream | boolean | No | false | Enable streaming output |
| stream_options | object | No | | Streaming configuration |
| tools | array | No | | Tool definitions |
| tool_choice | string / object | No | auto | Tool calling strategy |
| temperature | number | No | 1 | Sampling randomness |
| top_p | number | No | 1 | Nucleus sampling parameter |
| max_output_tokens | integer | No | | Maximum output tokens |
| reasoning | object | No | | Reasoning configuration (reasoning models only) |
| parallel_tool_calls | boolean | No | true | Allow parallel tool calls |
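
As an illustration of how these fields combine, the sketch below assembles a request body and leaves out any optional field that is unset, so that the documented defaults apply. The helper name build_request is hypothetical; only the field names come from the table above.

```python
import json

# Optional fields listed in the Request Parameters table.
OPTIONAL_FIELDS = {
    "instructions", "previous_response_id", "stream", "stream_options",
    "tools", "tool_choice", "temperature", "top_p",
    "max_output_tokens", "reasoning", "parallel_tool_calls",
}


def build_request(model, input_data, **optional):
    """Build a /v1/responses request body (hypothetical helper)."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    body = {"model": model, "input": input_data}
    # Drop None values so the server-side defaults are used instead.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = build_request(
    "deepseek-v4-flash",
    "Hello",
    instructions="You are a technical assistant.",
    temperature=0.2,
    max_output_tokens=None,  # omitted from the body
)
print(json.dumps(body, indent=2))
```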

Input Format

The Responses API input field supports two formats:

Simple string:

json
{
  "model": "deepseek-v4-flash",
  "input": "Hello"
}

Structured array (multimodal):

json
{
  "model": "deepseek-v4-flash",
  "input": [
    {
      "role": "user",
      "content": [
        { "type": "input_text", "text": "What's in this image?" },
        { "type": "input_image", "image_url": "https://example.com/photo.jpg" }
      ]
    }
  ]
}

Input Content Types

| Type | Description | Fields |
| --- | --- | --- |
| input_text | Text input | text |
| input_image | Image input | image_url or image_file |
| input_file | File input | file |
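
For a local image, one possible approach is to pass the bytes as a base64 data URL in image_url. This sketch assumes the gateway accepts data URLs there, as OpenAI's own API does; that is not confirmed by this page, so test it against your model first. Both helper names are hypothetical.

```python
import base64


def image_part_from_bytes(data, mime_type="image/png"):
    """Wrap raw image bytes as an input_image part using a base64 data URL.

    Assumption: the gateway accepts data URLs in image_url.
    """
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "input_image",
        "image_url": f"data:{mime_type};base64,{encoded}",
    }


def user_message(text, *image_parts):
    """Build one user message: a text part followed by any image parts."""
    return {
        "role": "user",
        "content": [{"type": "input_text", "text": text}, *image_parts],
    }


# b"abc" stands in for real image bytes read from a file.
message = user_message("What's in this image?", image_part_from_bytes(b"abc"))
print(message["content"][1]["image_url"])  # data:image/png;base64,YWJj
```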

Request Examples

Multimodal Input

bash
curl -X POST "https://open.dieyuyun.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxx" \
  -d '{
    "model": "deepseek-v4-flash",
    "input": [
      {
        "role": "user",
        "content": [
          {"type": "input_text", "text": "What's in this image?"},
          {"type": "input_image", "image_url": "https://example.com/photo.jpg"}
        ]
      }
    ]
  }'

python
from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://open.dieyuyun.com/v1"
)

response = client.responses.create(
    model="deepseek-v4-flash",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "What's in this image?"},
                {"type": "input_image", "image_url": "https://example.com/photo.jpg"}
            ]
        }
    ]
)

print(response.output[0].content[0].text)

javascript
import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'sk-xxx',
  baseURL: 'https://open.dieyuyun.com/v1',
})

const response = await client.responses.create({
  model: 'deepseek-v4-flash',
  input: [
    {
      role: 'user',
      content: [
        { type: 'input_text', text: "What's in this image?" },
        { type: 'input_image', image_url: 'https://example.com/photo.jpg' },
      ],
    },
  ],
})

console.log(response.output[0].content[0].text)

Multi-Turn Conversation (using previous_response_id)

json
// First request
{
  "model": "deepseek-v4-flash",
  "instructions": "You are a technical assistant.",
  "input": "What is machine learning?"
}

// Second request - reference the previous response
{
  "model": "deepseek-v4-flash",
  "previous_response_id": "resp_abc123",
  "input": "How is it different from deep learning?"
}
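
The chaining above can be sketched in Python as follows. The function ask and the stand-in fake_send are hypothetical; in real use, send would POST the body to /v1/responses and return the parsed JSON response.

```python
def ask(send, model, question, previous_response_id=None, instructions=None):
    """Send one turn and return (response_id, response)."""
    body = {"model": model, "input": question}
    if instructions is not None:
        body["instructions"] = instructions
    if previous_response_id is not None:
        # Link this turn to the earlier response instead of resending history.
        body["previous_response_id"] = previous_response_id
    response = send(body)
    return response["id"], response


def fake_send(body):
    # Stand-in for the HTTP call; echoes the body with a canned response ID.
    turn = 2 if "previous_response_id" in body else 1
    return {"id": f"resp_{turn}", "echo": body}


first_id, _ = ask(
    fake_send, "deepseek-v4-flash", "What is machine learning?",
    instructions="You are a technical assistant.",
)
second_id, second = ask(
    fake_send, "deepseek-v4-flash", "How is it different from deep learning?",
    previous_response_id=first_id,
)
print(second["echo"]["previous_response_id"])  # resp_1
```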

With Tool Calling

json
{
  "model": "deepseek-v4-flash",
  "instructions": "You are a helpful assistant.",
  "input": "What's the weather like in Beijing today?",
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get weather information for a city",
      "parameters": {
        "type": "object",
        "properties": {
          "city": {
            "type": "string",
            "description": "City name"
          }
        },
        "required": ["city"]
      }
    }
  ]
}
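
When the model decides to call get_weather, your application has to run the function and send the result back. The sketch below assumes the gateway mirrors OpenAI's item shapes: a function-call output item carrying name, arguments (a JSON string), and call_id, answered with a function_call_output input item. The item type is matched loosely because it may differ by gateway; check a real response before relying on this. The get_weather implementation is a placeholder.

```python
import json


def get_weather(city):
    # Placeholder; a real implementation would query a weather service.
    return {"city": city, "forecast": "sunny"}


TOOLS = {"get_weather": get_weather}


def run_tool_calls(response):
    """Run each function call found in response["output"].

    Returns the input items for the follow-up request.
    Assumed item fields: type, name, arguments (JSON string), call_id.
    """
    results = []
    for item in response.get("output", []):
        if item.get("type") not in ("function_call", "tool_call"):
            continue
        args = json.loads(item.get("arguments") or "{}")
        result = TOOLS[item["name"]](**args)
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],
            "output": json.dumps(result),
        })
    return results


sample = {
    "id": "resp_abc123",
    "output": [{
        "type": "function_call",
        "call_id": "call_1",
        "name": "get_weather",
        "arguments": "{\"city\": \"Beijing\"}",
    }],
}
follow_up_input = run_tool_calls(sample)
print(follow_up_input)
```

The returned list would then be sent as input in a second request, together with previous_response_id set to the ID of the response that requested the call.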

Streaming Output

python
from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://open.dieyuyun.com/v1"
)

stream = client.responses.create(
    model="deepseek-v4-flash",
    input="Write a quicksort algorithm in Python",
    stream=True
)

for event in stream:
    # Print only text deltas; other event types can also carry a delta field.
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)

Response Format

Successful Response

json
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1712345678,
  "status": "completed",
  "model": "deepseek-v4-flash",
  "output": [
    {
      "id": "msg_abc123",
      "type": "message",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The history of artificial intelligence can be traced back to the 1950s. The 1956 Dartmouth Conference is widely considered the founding moment of AI...",
          "annotations": []
        }
      ]
    }
  ],
  "parallel_tool_calls": true,
  "tools": [],
  "usage": {
    "input_tokens": 8,
    "output_tokens": 256,
    "total_tokens": 264
  }
}

Field Reference

| Field | Description |
| --- | --- |
| id | Unique response identifier |
| object | Always response |
| status | Status: completed, in_progress, failed |
| output | Output content array (replaces Chat Completions' choices) |
| output[].type | Output type: message, function_call |
| output[].content[].type | Content type: output_text |
| output[].content[].text | Model response text |
| output[].content[].annotations | Annotation information |
| usage.input_tokens | Number of input tokens |
| usage.output_tokens | Number of output tokens |
| usage.total_tokens | Total token count |
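
Because output is an array that can hold more than one item, indexing output[0] directly may miss the text when another item comes first. A more defensive reader, sketched here over the raw JSON dictionary, walks the fields listed above. The helper name output_text and the reasoning item in the sample are illustrative assumptions.

```python
def output_text(response):
    """Concatenate every output_text part across all message items."""
    chunks = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as tool calls
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                chunks.append(part.get("text", ""))
    return "".join(chunks)


sample = {
    "id": "resp_abc123",
    "status": "completed",
    "output": [
        {"type": "reasoning", "summary": []},  # hypothetical non-message item
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Hello", "annotations": []}
            ],
        },
    ],
}
print(output_text(sample))  # Hello
```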

Streaming SSE Events

Streaming output returns the following event types:

data: {"type":"response.created","response":{"id":"resp_abc123",...}}

data: {"type":"response.output_text.delta","delta":"Artificial"}

data: {"type":"response.output_text.delta","delta":" intelligence"}

data: {"type":"response.output_text.delta","delta":" history"}

data: {"type":"response.output_text.done","text":"Artificial intelligence history..."}

data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed",...}}
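
If you consume the stream without the SDK, the events above can be reassembled by hand. This is a minimal sketch that assumes each event arrives as one data: line holding a complete JSON object, as in the listing; collect_text is a hypothetical helper and the sample lines stand in for a real HTTP stream.

```python
import json


def collect_text(lines):
    """Join the text deltas from an iterable of SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # ignore blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        event = json.loads(payload)
        if event.get("type") == "response.output_text.delta":
            parts.append(event.get("delta", ""))
        elif event.get("type") == "response.completed":
            break  # the response is finished
    return "".join(parts)


sample_lines = [
    'data: {"type":"response.created","response":{"id":"resp_abc123"}}',
    "",
    'data: {"type":"response.output_text.delta","delta":"Artificial"}',
    "",
    'data: {"type":"response.output_text.delta","delta":" intelligence"}',
    "",
    'data: {"type":"response.completed","response":{"id":"resp_abc123"}}',
]
print(collect_text(sample_lines))  # Artificial intelligence
```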

Error Response

json
{
  "error": {
    "message": "Invalid input: expected string or array",
    "type": "invalid_request_error",
    "param": "input",
    "code": "invalid_input"
  }
}

See Error Codes for details.
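
For logging or user-facing messages, the error object can be flattened into one line. The sketch below works on the parsed JSON body shown above; describe_error is a hypothetical helper, and the fallback strings are placeholders for missing fields.

```python
def describe_error(body):
    """Return a one-line summary of an error body, or None if there is none."""
    err = body.get("error") if isinstance(body, dict) else None
    if not err:
        return None
    kind = err.get("type", "unknown_error")
    code = err.get("code", "unknown")
    message = err.get("message", "")
    where = f" (param: {err['param']})" if err.get("param") else ""
    return f"{kind}/{code}: {message}{where}"


sample_error = {
    "error": {
        "message": "Invalid input: expected string or array",
        "type": "invalid_request_error",
        "param": "input",
        "code": "invalid_input",
    }
}
print(describe_error(sample_error))
```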

Compatibility

| Feature | Wuliang AI | OpenAI Responses API |
| --- | --- | --- |
| Response format | Fully compatible | Native protocol |
| Multimodal input | Supports text / image / file | Native support |
| previous_response_id | Supported | Native support |
| Tool calling (function) | Supported | Native support |
| Built-in tools (web/file) | Depends on upstream routing | Native support |
| Streaming output | Supported | Native support |
| /responses/compact | Supported | Native support |
| Endpoint path | /v1/responses | /v1/responses |

TIP

The Responses API uses the output array instead of Chat Completions' choices, and input instead of messages. If your application already uses the Chat Completions API, there is no need to migrate; both formats are supported.

Best Practices

  • Choose the right API format: The Responses API is ideal for multimodal input and built-in context management; Chat Completions is better for simple conversation scenarios.
  • Leverage previous_response_id: Reference previous response IDs for multi-turn conversations without manually managing message history.
  • Use instructions instead of system: The Responses API uses the instructions field for system-level behavior.
  • Combine with /responses/compact: For long conversations, use the compact endpoint to compress context and save tokens.
  • Check SDK version: The Responses API requires a recent version of the OpenAI SDK. client.responses was added in version 1.66.0 of the openai Python package and version 4.87.0 of the openai npm package.

Rate Limits

See Rate Limits for details.