Make API Calls
Overview
The Wuliang AI API provides an OpenAI-compatible interface for integrating AI capabilities into your applications. This guide covers everything you need to start making API calls, including authentication, request formatting, code examples in multiple languages, error handling, and rate limiting.
Prerequisites
- A registered Wuliang AI account
- At least one active API Key. See Manage API Keys for instructions.
- A positive account balance. Check your balance on the console dashboard.
Base URL
All API requests should be sent to the following base URL:
https://open.dieyuyun.com
Authentication
All API requests require authentication via your API Key. The platform supports two authentication methods:
Bearer Token (Recommended)
Include your API Key in the Authorization header using the Bearer scheme:
Authorization: Bearer sk-your-api-key-here
x-api-key Header
Alternatively, use the x-api-key header (the default for the Anthropic SDK):
x-api-key: sk-your-api-key-here
Quick Start
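Whichever client you use, both authentication methods come down to a single request header. The sketch below builds that header for either method. It assumes the key is stored in an environment variable named WULIANG_API_KEY, which is a hypothetical name and not one defined by the platform, and build_auth_headers is an illustrative helper rather than part of any SDK.

```python
import os


def build_auth_headers(api_key: str, scheme: str = "bearer") -> dict:
    """Build request headers for one of the two documented auth methods.

    scheme is "bearer" (Authorization header) or "x-api-key".
    Illustrative helper only; not part of any SDK.
    """
    if not api_key:
        raise ValueError("api_key must not be empty")
    headers = {"Content-Type": "application/json"}
    if scheme == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    elif scheme == "x-api-key":
        headers["x-api-key"] = api_key
    else:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    return headers


if __name__ == "__main__":
    # WULIANG_API_KEY is an assumed variable name; substitute your own.
    key = os.environ.get("WULIANG_API_KEY", "sk-your-api-key-here")
    # Print header names only, so the key itself never reaches the console.
    print(sorted(build_auth_headers(key)))
    print(sorted(build_auth_headers(key, scheme="x-api-key")))
```

Reading the key from the environment keeps it out of source control, in line with the key-security advice later in this guide.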
Python (OpenAI SDK)
Install the OpenAI Python SDK:
pip install openai
Send your first request:
from openai import OpenAI
client = OpenAI(
    api_key="sk-your-api-key-here",
    base_url="https://open.dieyuyun.com"
)
response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    temperature=0.7,
    max_tokens=1024
)
print(response.choices[0].message.content)
Node.js (OpenAI SDK)
Install the OpenAI Node.js SDK:
npm install openai
Send your first request:
// Top-level await requires an ES module (a .mjs file or "type": "module" in package.json).
import OpenAI from 'openai'
const client = new OpenAI({
  apiKey: 'sk-your-api-key-here',
  baseURL: 'https://open.dieyuyun.com',
})
const response = await client.chat.completions.create({
  model: 'deepseek-v4-flash',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
  temperature: 0.7,
  max_tokens: 1024,
})
console.log(response.choices[0].message.content)
cURL
curl https://open.dieyuyun.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
Anthropic SDK
The platform is compatible with the Anthropic Messages API:
import anthropic
client = anthropic.Anthropic(
    base_url="https://open.dieyuyun.com",
    api_key="sk-your-api-key-here"
)
message = client.messages.create(
    model="deepseek-v4-flash",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
print(message.content[0].text)
Request Parameters
Chat Completions
POST /v1/chat/completions
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model code, e.g. deepseek-v4-flash |
| messages | array | Yes | List of message objects |
| messages.role | string | Yes | Role: system, user, or assistant |
| messages.content | string | Yes | Message content |
| temperature | number | No | Randomness control (0-2, default 1) |
| max_tokens | integer | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling (0-1, default 1) |
| stream | boolean | No | Enable streaming (default false) |
| stop | string/array | No | Stop sequences |
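The ranges in the table can be checked on the client before a request is sent, which turns an INVALID_PARAMS round trip into an immediate local error. The following is a minimal sketch. build_chat_payload is a hypothetical helper, and the checks for a non-empty messages list and a positive max_tokens are assumptions that go beyond what the table states.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_payload(model, messages, temperature=None, max_tokens=None,
                       top_p=None, stream=False, stop=None):
    """Validate parameters against the documented ranges and return a dict."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        # Assumption: an empty messages list is never valid.
        raise ValueError("messages must contain at least one message")
    for msg in messages:
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"invalid role: {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            raise ValueError("message content must be a string")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if max_tokens is not None and max_tokens < 1:
        # Assumption: max_tokens must be a positive integer.
        raise ValueError("max_tokens must be a positive integer")

    payload = {"model": model, "messages": list(messages)}
    # Unset optional fields are omitted so the server applies its defaults.
    optional = {"temperature": temperature, "max_tokens": max_tokens,
                "top_p": top_p, "stop": stop}
    payload.update({k: v for k, v in optional.items() if v is not None})
    if stream:
        payload["stream"] = True
    return payload


if __name__ == "__main__":
    body = build_chat_payload(
        "deepseek-v4-flash",
        [{"role": "user", "content": "Hello!"}],
        temperature=0.7,
        max_tokens=1024,
    )
    # This JSON string is what would be sent as the POST body.
    print(json.dumps(body, indent=2))
```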
Streaming Requests
Set stream: true to receive Server-Sent Events (SSE). Tokens are delivered incrementally as they are generated:
stream = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
The streaming response uses the following SSE format:
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"},"finish_reason":null}]}
data: [DONE]
Response Format
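Streamed responses use the data: framing shown in the Streaming Requests section. If you consume the stream without an SDK, each line has to be decoded by hand. The sketch below shows one way to do that, assuming the lines have already been read from the HTTP response as text. iter_sse_content is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by a sequence of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    sample = [
        'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",'
        '"choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}',
        "",
        'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",'
        '"choices":[{"delta":{"content":"!"},"finish_reason":null}]}',
        "",
        "data: [DONE]",
    ]
    print("".join(iter_sse_content(sample)))
```

Run against the two sample chunks from this guide, the fragments join into the string Hello!.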
Standard Response
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "deepseek-v4-flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30
  }
}
Response Fields
| Field | Description |
|---|---|
| id | Unique request identifier |
| object | Response object type |
| model | Model that processed the request |
| choices | Array of response choices |
| choices[].message.content | Generated text content |
| choices[].finish_reason | Reason for stopping (stop, length) |
| usage.prompt_tokens | Number of input tokens |
| usage.completion_tokens | Number of output tokens |
| usage.total_tokens | Total tokens consumed |
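When you work with the raw JSON rather than an SDK object, the fields above can be read directly from the decoded body. Here is a minimal sketch, assuming a finish_reason of length carries its usual OpenAI-format meaning (generation stopped at the max_tokens limit). summarize_completion is a hypothetical helper.

```python
import json


def summarize_completion(body: dict) -> dict:
    """Extract the generated text, stop reason, and token usage."""
    choice = body["choices"][0]
    usage = body.get("usage", {})
    finish_reason = choice.get("finish_reason")
    return {
        "text": choice["message"]["content"],
        "finish_reason": finish_reason,
        # Assumption: "length" means the output was cut off at max_tokens.
        "truncated": finish_reason == "length",
        "total_tokens": usage.get("total_tokens"),
    }


if __name__ == "__main__":
    raw = """
    {
      "id": "chatcmpl-xxx",
      "object": "chat.completion",
      "model": "deepseek-v4-flash",
      "choices": [
        {"index": 0,
         "message": {"role": "assistant",
                     "content": "Hello! How can I help you today?"},
         "finish_reason": "stop"}
      ],
      "usage": {"prompt_tokens": 20, "completion_tokens": 10, "total_tokens": 30}
    }
    """
    print(summarize_completion(json.loads(raw)))
```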
Error Handling
Error Response Format
All errors are returned in a consistent JSON format:
{
  "error": {
    "code": "RATE_LIMIT",
    "message": "Rate limit exceeded. Please retry after 60s.",
    "request_id": "req-xxx"
  }
}
Error Codes
| HTTP Status | Code | Description |
|---|---|---|
| 400 | INVALID_PARAMS | Invalid request parameters |
| 401 | UNAUTHORIZED | Invalid or expired API Key |
| 403 | FORBIDDEN | Access denied |
| 404 | MODEL_NOT_FOUND | Model not found or offline |
| 429 | RATE_LIMIT | Rate limit exceeded |
| 500 | INTERNAL_ERROR | Internal server error |
| 502 | UPSTREAM_ERROR | Upstream model service error |
| 504 | GATEWAY_TIMEOUT | Request timeout |
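For clients that call the HTTP endpoint directly, the error body and status code can be folded into one structured value. The sketch below does this with the standard library only. Treating 429, 500, 502, and 504 as retryable is an assumption made for illustration, not a rule stated by the platform, and ApiFailure and parse_error are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumption: these statuses are treated as transient and worth retrying.
RETRYABLE_STATUSES = {429, 500, 502, 504}


@dataclass
class ApiFailure:
    status: int
    code: str
    message: str
    request_id: Optional[str]
    retryable: bool


def parse_error(status: int, body_text: str) -> ApiFailure:
    """Turn an HTTP status and response body into an ApiFailure."""
    try:
        error = json.loads(body_text).get("error", {})
    except (ValueError, AttributeError):
        error = {}  # body was not JSON, or not a JSON object
    if not isinstance(error, dict):
        error = {}
    return ApiFailure(
        status=status,
        code=error.get("code", "UNKNOWN"),
        message=error.get("message", body_text),
        request_id=error.get("request_id"),
        retryable=status in RETRYABLE_STATUSES,
    )


if __name__ == "__main__":
    sample = ('{"error": {"code": "RATE_LIMIT", '
              '"message": "Rate limit exceeded. Please retry after 60s.", '
              '"request_id": "req-xxx"}}')
    print(parse_error(429, sample))
```

Logging the request_id alongside the code makes it easier to match a failure against the Request Logs.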
Handling Errors in Code
from openai import OpenAI, APIError, AuthenticationError, RateLimitError
client = OpenAI(
    api_key="sk-your-api-key-here",
    base_url="https://open.dieyuyun.com"
)
try:
    response = client.chat.completions.create(
        model="deepseek-v4-flash",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)
except AuthenticationError:
    print("Invalid API Key. Check your credentials.")
except RateLimitError:
    print("Rate limit exceeded. Please retry later.")
except APIError as e:  # base class, so it must come after the specific errors
    print(f"API error: {e}")
Rate Limiting
The platform enforces rate limits to ensure fair usage and system stability.
Default Limits
| Limit | Default Value |
|---|---|
| RPM (Requests Per Minute) | 60 |
| TPM (Tokens Per Minute) | 100,000 |
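A client can throttle itself to stay under the RPM limit instead of waiting for a 429 response. The sketch below tracks request timestamps in a sliding 60-second window. Whether the server counts requests in a sliding or a fixed window is not stated in this guide, so treat the window model as an assumption. SlidingWindowLimiter is a hypothetical class.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side request counter over a sliding time window."""

    def __init__(self, max_requests=60, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self._sent = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next request fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self.clock())


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=60, window=60.0)
    delay = limiter.wait_time()
    if delay > 0:
        time.sleep(delay)
    limiter.record()  # call immediately before sending the request
    print("requests in window:", len(limiter._sent))
```

This covers RPM only. A TPM budget would need token counts per request, which this sketch does not track.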
Rate Limit Headers
When a rate limit applies, the response includes these headers:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Total limit for the current window |
| X-RateLimit-Remaining | Remaining requests in the current window |
| Retry-After | Suggested retry wait time in seconds |
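These headers can drive the wait before a retry. The following sketch reads them from a plain mapping of header names to string values. The fallback delay and the upper cap are arbitrary illustrative values, and retry_delay and remaining_requests are hypothetical helpers.

```python
def _normalize(headers) -> dict:
    """Lower-case header names, since HTTP header names are case-insensitive."""
    return {str(k).lower(): v for k, v in headers.items()}


def retry_delay(headers, default=1.0, max_delay=120.0) -> float:
    """Seconds to wait, taken from Retry-After when it is present and numeric."""
    raw = _normalize(headers).get("retry-after")
    try:
        delay = float(raw)
    except (TypeError, ValueError):
        delay = default  # header missing or not a number
    return max(0.0, min(delay, max_delay))


def remaining_requests(headers):
    """Value of X-RateLimit-Remaining as an int, or None if unavailable."""
    raw = _normalize(headers).get("x-ratelimit-remaining")
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


if __name__ == "__main__":
    sample = {"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "0",
              "Retry-After": "60"}
    print(retry_delay(sample), remaining_requests(sample))
```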
Custom Rate Limits
You can configure per-key RPM and TPM limits when creating or editing an API Key. See Manage API Keys for details.
Best Practices
- Always handle errors - Implement proper error handling with retry logic for transient failures.
- Use streaming for long responses - Streaming reduces perceived latency by delivering tokens incrementally.
- Set reasonable max_tokens - Avoid unnecessarily high max_tokens values to control costs.
- Monitor usage - Regularly check the Request Logs to track usage patterns.
- Secure your API Key - Never expose your API Key in client-side code. Use environment variables or a backend proxy.
- Implement exponential backoff - When retrying failed requests, use exponential backoff to avoid overwhelming the API.
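The retry and backoff advice above can be sketched as a small wrapper. This version uses exponential backoff with full jitter: each wait is a random value between zero and a ceiling that doubles per attempt. TransientError is a placeholder exception defined here for illustration; in real code you would raise or map it from whatever your HTTP client reports for retryable failures. The attempt count and delay values are arbitrary.

```python
import random
import time


class TransientError(Exception):
    """Placeholder for a retryable failure, such as a 429 or 502 response."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call func, retrying on TransientError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last failure
            # Ceiling doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))


if __name__ == "__main__":
    attempts = []

    def flaky():
        attempts.append(1)
        if len(attempts) < 3:
            raise TransientError("simulated transient failure")
        return "ok"

    # Short delays here only to keep the demonstration fast.
    print(call_with_backoff(flaky, base_delay=0.01), "after", len(attempts), "attempts")
```

Only failures that are safe to repeat should be raised as TransientError. Authentication and validation errors will not succeed on retry.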
Notes
- The API is compatible with the OpenAI SDK format. Any tool or library that supports the OpenAI API can be used with Wuliang AI by changing the base_url.
- The Anthropic Messages API is also supported at /v1/messages.
- All requests must use HTTPS. HTTP requests will be rejected.
- Request and response payloads are UTF-8 encoded JSON.
Related Documentation
- Manage API Keys - Create and configure API Keys
- Models & Pricing - Browse available models and pricing
- Request Logs - View and analyze API request logs
- API Reference - Complete API reference documentation
- Playground - Test API endpoints interactively