Best open-source model, reasoning rivals GPT-5
from ¥2/MTok
Alibaba flagship, strong multilingual performance
from ¥4/MTok
Zhipu AI latest, top-tier Chinese understanding
from ¥3/MTok
Deep reasoning model, specialized in math/logic
from ¥6/MTok
Change a single line, the base_url. Works with the OpenAI SDK, LangChain, LlamaIndex, and the rest of your existing toolchain.
Chinese models cost far less than their US counterparts. DeepSeek-V4 input is ¥2/MTok versus ¥20/MTok for GPT-4o, one tenth of the price, and the quality gap is closing rapidly.
Supports both SSE streaming and standard (non-streaming) responses, covering real-time chat and batch processing.
Multi-node deployment across Singapore, US-West, and Tokyo. Global average latency is under 200 ms.
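A minimal sketch of the streaming path, assuming the endpoint honours the OpenAI SDK's `stream=True` parameter in the usual way. The helper names `stream_reply` and `main` are illustrative, not part of any SDK; the base URL, API key placeholder, and model name are taken from the quickstart on this page.

```python
from typing import Iterator


def stream_reply(client, model: str, prompt: str) -> Iterator[str]:
    """Yield text fragments of the reply as they arrive (hypothetical helper)."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # request incremental chunks instead of one final response
    )
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content


def main() -> None:
    # Imported here so the helper above has no hard dependency on the SDK.
    from openai import OpenAI

    client = OpenAI(
        api_key="sk-your-api-key",
        base_url="https://api.relayai.app/v1",
    )
    for piece in stream_reply(client, "deepseek-v4", "Hello!"):
        print(piece, end="", flush=True)
    print()
```

Call `main()` with a real API key to print the reply token by token. For batch jobs, the non-streaming call in the quickstart is usually simpler, since the whole response arrives as one object.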
Works with any OpenAI SDK. Just change the base URL.
from openai import OpenAI

# Point the standard OpenAI client at the relay endpoint.
client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.relayai.app/v1",
)

# Send a single-turn chat request.
response = client.chat.completions.create(
    model="deepseek-v4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

| Model | Input (per MTok) | Output (per MTok) | Context |
|---|---|---|---|
| DeepSeek-V4 | ¥2.0 | ¥8.0 | 1M |
| Qwen3-Max | ¥4.0 | ¥12.0 | 128K |
| GLM-5-Flash | ¥1.0 | ¥4.0 | 128K |
| DeepSeek-R1 | ¥6.0 | ¥24.0 | 64K |
Prices are in CNY. For comparison, GPT-4o input costs ¥20/MTok.
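To make the per-MTok prices concrete, here is a rough cost estimator built from the table above. It is a sketch only: `PRICES_CNY_PER_MTOK` and `estimate_cost_cny` are illustrative names, the dictionary is keyed by the table's display names rather than API model IDs, and it assumes billing is strictly linear in token count with no minimums or discounts.

```python
# Prices in CNY per million tokens (MTok), copied from the pricing table:
# (input price, output price).
PRICES_CNY_PER_MTOK = {
    "DeepSeek-V4": (2.0, 8.0),
    "Qwen3-Max": (4.0, 12.0),
    "GLM-5-Flash": (1.0, 4.0),
    "DeepSeek-R1": (6.0, 24.0),
}


def estimate_cost_cny(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a request in CNY, assuming linear per-token billing."""
    input_price, output_price = PRICES_CNY_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 1M input tokens and 200K output tokens on DeepSeek-V4:
# 1.0 * 2.0 + 0.2 * 8.0 = 3.6
cost = estimate_cost_cny("DeepSeek-V4", 1_000_000, 200_000)
print(f"¥{cost:.2f}")  # prints ¥3.60
```

Output tokens are priced three to four times higher than input tokens in every row, so workloads that generate long responses will cost noticeably more than the input price alone suggests.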