Make API Calls

Overview

The Wuliang AI API provides an OpenAI-compatible interface for integrating AI capabilities into your applications. This guide covers everything you need to start making API calls, including authentication, request formatting, code examples in multiple languages, error handling, and rate limiting.

Prerequisites

  • A registered Wuliang AI account
  • At least one active API Key. See Manage API Keys for instructions.
  • A positive account balance. Check your balance on the console dashboard.

Base URL

All API requests should be sent to the following base URL:

https://open.dieyuyun.com

Authentication

All API requests require authentication via your API Key. The platform supports two authentication methods:

Authorization Header

Include your API Key in the Authorization header using the Bearer scheme:

Authorization: Bearer sk-your-api-key-here

x-api-key Header

Alternatively, send the key in the x-api-key header, which is the header the Anthropic SDK uses by default:

x-api-key: sk-your-api-key-here
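
As a minimal sketch of the two schemes above, the helper below builds the request headers for either one. The function name `build_auth_headers` and its `scheme` argument are illustrative and not part of the platform.

```python
def build_auth_headers(api_key: str, scheme: str = "bearer") -> dict:
    """Return HTTP headers carrying the API key under the chosen scheme."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    headers = {"Content-Type": "application/json"}
    if scheme == "bearer":
        # Authorization header with the Bearer scheme
        headers["Authorization"] = f"Bearer {api_key}"
    elif scheme == "x-api-key":
        # Alternative header, as sent by the Anthropic SDK
        headers["x-api-key"] = api_key
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return headers


bearer_headers = build_auth_headers("sk-your-api-key-here")
key_headers = build_auth_headers("sk-your-api-key-here", scheme="x-api-key")
```

Only one of the two headers is set per request in this sketch, which keeps it unambiguous which credential the server should read.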

Quick Start

Python (OpenAI SDK)

Install the OpenAI Python SDK:

bash
pip install openai

Send your first request:

python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key-here",
    # The OpenAI SDK appends /chat/completions to base_url, so include the /v1 prefix
    base_url="https://open.dieyuyun.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    temperature=0.7,
    max_tokens=1024
)

print(response.choices[0].message.content)

Node.js (OpenAI SDK)

Install the OpenAI Node.js SDK:

bash
npm install openai

Send your first request:

javascript
import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'sk-your-api-key-here',
  // The OpenAI SDK appends /chat/completions to baseURL, so include the /v1 prefix
  baseURL: 'https://open.dieyuyun.com/v1',
})

const response = await client.chat.completions.create({
  model: 'deepseek-v4-flash',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
  temperature: 0.7,
  max_tokens: 1024,
})

console.log(response.choices[0].message.content)

cURL

bash
curl https://open.dieyuyun.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
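
Where no SDK is available, the same call can be sketched with Python's standard library. The helper name `build_chat_request` is hypothetical. It only constructs the request, and nothing is sent until the object is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "https://open.dieyuyun.com"


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL call above."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),  # UTF-8 encoded JSON body
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "sk-your-api-key-here",
    {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
# To send it: urllib.request.urlopen(request, timeout=30)
```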

Anthropic SDK

The platform is compatible with the Anthropic Messages API:

python
import anthropic

client = anthropic.Anthropic(
    base_url="https://open.dieyuyun.com",
    api_key="sk-your-api-key-here"
)

message = client.messages.create(
    model="deepseek-v4-flash",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(message.content[0].text)

Request Parameters

Chat Completions

POST /v1/chat/completions
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model code, e.g. deepseek-v4-flash |
| messages | array | Yes | List of message objects |
| messages.role | string | Yes | Role: system, user, assistant |
| messages.content | string | Yes | Message content |
| temperature | number | No | Randomness control (0-2, default 1) |
| max_tokens | integer | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling (0-1, default 1) |
| stream | boolean | No | Enable streaming (default false) |
| stop | string/array | No | Stop sequences |
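
The parameter ranges above can be checked on the client before a request is sent. The sketch below is one way to do that. `build_chat_payload` is a hypothetical helper, and the rule that max_tokens must be at least 1 is an assumption rather than something the table states.

```python
from typing import Optional, Sequence, Union


def build_chat_payload(
    model: str,
    messages: Sequence[dict],
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
    stream: bool = False,
    stop: Union[str, Sequence[str], None] = None,
) -> dict:
    """Validate arguments against the documented ranges and build the JSON body."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if message.get("role") not in {"system", "user", "assistant"}:
            raise ValueError(f"invalid role: {message.get('role')!r}")
        if "content" not in message:
            raise ValueError("each message needs a content field")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if max_tokens is not None and max_tokens < 1:  # assumed lower bound
        raise ValueError("max_tokens must be at least 1")

    payload = {"model": model, "messages": list(messages), "stream": stream}
    # Optional fields are omitted when unset so the server applies its defaults.
    optional = {
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "stop": stop,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_chat_payload(
    "deepseek-v4-flash",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
)
```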

Streaming Requests

Set stream: true to receive Server-Sent Events (SSE). Tokens are delivered incrementally as they are generated:

python
stream = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)

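for chunk in stream:
    # A chunk may arrive with no choices or an empty delta, so guard before reading
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)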

The streaming response uses the following SSE format:

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"},"finish_reason":null}]}

data: [DONE]
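
When reading the stream without an SDK, each `data:` line has to be decoded by hand. The following is a minimal sketch of that loop run against the sample lines above. `iter_sse_content` is an illustrative name, and the parser handles only the `data:` field shown here, not the full SSE grammar.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by `data:` lines until `[DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


sample = [
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_sse_content(sample)))  # prints "Hello!"
```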

Response Format

Standard Response

json
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "deepseek-v4-flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30
  }
}

Response Fields

| Field | Description |
| --- | --- |
| id | Unique request identifier |
| object | Response object type |
| model | Model that processed the request |
| choices | Array of response choices |
| choices[].message.content | Generated text content |
| choices[].finish_reason | Reason for stopping (stop, length) |
| usage.prompt_tokens | Number of input tokens |
| usage.completion_tokens | Number of output tokens |
| usage.total_tokens | Total tokens consumed |
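
To illustrate how these fields are typically read, the sketch below extracts the generated text, the finish reason, and the token total from a parsed response. `summarize_completion` is a hypothetical helper. Treating a finish_reason of length as a truncated reply assumes the usual OpenAI-compatible meaning, where generation stopped at the max_tokens limit.

```python
def summarize_completion(response: dict) -> dict:
    """Pull the first choice's text, whether it was cut off, and token usage."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" means the reply stopped at the max_tokens limit
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
    }


example = {
    "id": "chatcmpl-xxx",
    "object": "chat.completion",
    "model": "deepseek-v4-flash",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 20, "completion_tokens": 10, "total_tokens": 30},
}
summary = summarize_completion(example)
```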

Error Handling

Error Response Format

All errors are returned in a consistent JSON format:

json
{
  "error": {
    "code": "RATE_LIMIT",
    "message": "Rate limit exceeded. Please retry after 60s.",
    "request_id": "req-xxx"
  }
}
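
Because the error body has a fixed shape, it can be converted into an exception that keeps the request_id for support queries. This is a sketch under that assumption. `ApiError` and `parse_error` are hypothetical names, and the fallback code UNKNOWN is not defined by the platform.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Hypothetical exception wrapping the documented error body."""

    def __init__(self, status: int, code: str, message: str,
                 request_id: Optional[str] = None):
        super().__init__(f"[{status} {code}] {message}")
        self.status = status
        self.code = code
        self.message = message
        self.request_id = request_id


def parse_error(status: int, body: str) -> ApiError:
    """Turn an error response body into an ApiError, tolerating malformed JSON."""
    try:
        error = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        error = {}  # body was not JSON, or not a JSON object
    if not isinstance(error, dict):
        error = {}
    return ApiError(
        status=status,
        code=error.get("code", "UNKNOWN"),  # "UNKNOWN" is a local fallback
        message=error.get("message", body),
        request_id=error.get("request_id"),
    )


err = parse_error(
    429,
    '{"error": {"code": "RATE_LIMIT", '
    '"message": "Rate limit exceeded. Please retry after 60s.", '
    '"request_id": "req-xxx"}}',
)
```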

Error Codes

| HTTP Status | Code | Description |
| --- | --- | --- |
| 400 | INVALID_PARAMS | Invalid request parameters |
| 401 | UNAUTHORIZED | Invalid or expired API Key |
| 403 | FORBIDDEN | Access denied |
| 404 | MODEL_NOT_FOUND | Model not found or offline |
| 429 | RATE_LIMIT | Rate limit exceeded |
| 500 | INTERNAL_ERROR | Internal server error |
| 502 | UPSTREAM_ERROR | Upstream model service error |
| 504 | GATEWAY_TIMEOUT | Request timeout |
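
A client usually needs to decide which of these codes are worth retrying. The split below is an assumption based on the descriptions in the table, not a rule published by the platform: rate-limit, server, upstream, and timeout errors are treated as transient, and everything else as a problem the caller must fix first.

```python
# Error codes from the table above, keyed to their HTTP status.
ERROR_STATUS = {
    "INVALID_PARAMS": 400,
    "UNAUTHORIZED": 401,
    "FORBIDDEN": 403,
    "MODEL_NOT_FOUND": 404,
    "RATE_LIMIT": 429,
    "INTERNAL_ERROR": 500,
    "UPSTREAM_ERROR": 502,
    "GATEWAY_TIMEOUT": 504,
}

# Assumption: these codes signal transient conditions that may succeed on retry.
RETRYABLE_CODES = frozenset(
    {"RATE_LIMIT", "INTERNAL_ERROR", "UPSTREAM_ERROR", "GATEWAY_TIMEOUT"}
)


def is_retryable(code: str) -> bool:
    """Return True when a request that failed with this code may be retried."""
    return code in RETRYABLE_CODES


retryable_statuses = sorted(ERROR_STATUS[code] for code in RETRYABLE_CODES)
```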

Handling Errors in Code

python
from openai import OpenAI, APIError, AuthenticationError, RateLimitError

client = OpenAI(
    api_key="sk-your-api-key-here",
    # The OpenAI SDK appends /chat/completions to base_url, so include the /v1 prefix
    base_url="https://open.dieyuyun.com/v1"
)

try:
    response = client.chat.completions.create(
        model="deepseek-v4-flash",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)
except AuthenticationError:
    print("Invalid API Key. Check your credentials.")
except RateLimitError:
    print("Rate limit exceeded. Please retry later.")
except APIError as e:
    print(f"API error: {e}")

Rate Limiting

The platform enforces rate limits to ensure fair usage and system stability.

Default Limits

| Limit | Default Value |
| --- | --- |
| RPM (Requests Per Minute) | 60 |
| TPM (Tokens Per Minute) | 100,000 |
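
A client can pace itself to stay under the default RPM rather than waiting for a 429. The sketch below tracks requests in a sliding 60-second window. The class `RequestWindow` is hypothetical, and the server's actual windowing algorithm is not documented here, so treat this as an approximation. It does not account for TPM.

```python
import time
from collections import deque
from typing import Callable


class RequestWindow:
    """Sliding one-minute window that tracks how many requests were sent."""

    def __init__(self, rpm: int = 60, clock: Callable[[], float] = time.monotonic):
        self.rpm = rpm
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.sent = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next request fits in the window."""
        now = self.clock()
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()  # drop requests older than one minute
        if len(self.sent) < self.rpm:
            return 0.0
        # Window is full: wait until the oldest request ages out
        return 60 - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was sent now."""
        self.sent.append(self.clock())
```

Typical use is to call `time.sleep(window.wait_time())` before each request and `window.record()` right after sending it.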

Rate Limit Headers

When a rate limit applies, the response includes these headers:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Total limit for the current window |
| X-RateLimit-Remaining | Remaining requests in the current window |
| Retry-After | Suggested retry wait time in seconds |
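
These headers can drive the wait between attempts. The sketch below reads them from a plain mapping. `retry_delay` and `remaining_requests` are illustrative helpers, and the 1-second default used when Retry-After is missing or unparsable is an arbitrary choice.

```python
from typing import Mapping, Optional


def _lowered(headers: Mapping[str, str]) -> dict:
    # HTTP header names are case-insensitive, so normalise before lookup.
    return {name.lower(): value for name, value in headers.items()}


def retry_delay(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Return the wait in seconds suggested by Retry-After, else `default`."""
    try:
        return max(0.0, float(_lowered(headers)["retry-after"]))
    except (KeyError, ValueError):
        return default


def remaining_requests(headers: Mapping[str, str]) -> Optional[int]:
    """Return X-RateLimit-Remaining as an int, or None when absent or invalid."""
    value = _lowered(headers).get("x-ratelimit-remaining")
    if value is not None and value.strip().isdigit():
        return int(value)
    return None


delay = retry_delay({"Retry-After": "60", "X-RateLimit-Remaining": "0"})
```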

Custom Rate Limits

You can configure per-key RPM and TPM limits when creating or editing an API Key. See Manage API Keys for details.

Best Practices

  1. Always handle errors - Implement proper error handling with retry logic for transient failures.
  2. Use streaming for long responses - Streaming reduces perceived latency by delivering tokens incrementally.
  3. Set reasonable max_tokens - Avoid unnecessarily high max_tokens values to control costs.
  4. Monitor usage - Regularly check the Request Logs to track usage patterns.
  5. Secure your API Key - Never expose your API Key in client-side code. Use environment variables or a backend proxy.
  6. Implement exponential backoff - When retrying failed requests, use exponential backoff to avoid overwhelming the API.
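
The retry and backoff advice above can be sketched as follows. The helper names, the 1-second base, the 30-second cap, and the added random jitter are all illustrative choices rather than platform requirements. In real use, `should_retry` would inspect the exception and return True only for transient failures such as rate-limit or upstream errors.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Delays that double on each attempt, capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_backoff(
    func: Callable[[], T],
    retries: int = 4,
    sleep: Callable[[float], None] = time.sleep,
    should_retry: Callable[[Exception], bool] = lambda exc: True,
) -> T:
    """Call `func`, retrying with exponential backoff on retryable errors."""
    for delay in backoff_delays(retries):
        try:
            return func()
        except Exception as exc:
            if not should_retry(exc):
                raise
            # Jitter spreads out retries from clients that failed together
            sleep(delay + random.uniform(0, delay / 2))
    return func()  # final attempt; any error propagates to the caller
```

With the defaults, a call is attempted at most five times, with base waits of 1, 2, 4, and 8 seconds between attempts.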

Notes

  • The API is compatible with the OpenAI SDK format. Any tool or library that supports the OpenAI API can be used with Wuliang AI by changing the base_url.
  • The Anthropic Messages API is also supported at /v1/messages.
  • All requests must use HTTPS. HTTP requests will be rejected.
  • Request and response payloads are UTF-8 encoded JSON.