Access powerful open-weight language models through an OpenAI-compatible API. Pay only for what you use.
Prices are in USD per million tokens. Use the model ID listed after each name in your API requests.
| Model | Provider | Input / M tokens | Output / M tokens |
|---|---|---|---|
| GPT OSS 120B (`gpt-oss-120b`) | OpenAI OSS | $0.07 | $0.27 |
| GPT OSS 20B (`gpt-oss-20b`) | OpenAI OSS | $0.05 | $0.19 |
| GLM-5 (`glm-5`) | Zhipu AI | $0.75 | $2.40 |
| GLM-4.7 (`glm-4.7`) | Zhipu AI | $0.45 | $1.65 |
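To make the per-million pricing concrete, here is a minimal sketch that estimates the cost of a single request from its token counts. The prices are copied from the table above; the `PRICES` dictionary and the `estimate_cost` helper are illustrative names, not part of the API.

```python
# Prices in USD per million tokens, copied from the pricing table:
# model ID -> (input price, output price)
PRICES = {
    "gpt-oss-120b": (0.07, 0.27),
    "gpt-oss-20b": (0.05, 0.19),
    "glm-5": (0.75, 2.40),
    "glm-4.7": (0.45, 1.65),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 2,000 input tokens and 500 output tokens on glm-5
print(f"${estimate_cost('glm-5', 2_000, 500):.4f}")
```

For that example, 2,000 input tokens cost $0.0015 and 500 output tokens cost $0.0012, so the request comes to about $0.0027.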
The API is fully compatible with the OpenAI SDK: set the SDK's base URL to https://api.dinference.com/v1 and use your API key.
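As a sketch of what that swap could look like with the official `openai` Python package: only the `base_url` and `api_key` constructor arguments change. The base URL is inferred from the request example on this page, and the `DINFERENCE_API_KEY` environment variable, `client_settings`, and `ask` are names assumed here for illustration.

```python
import os

# Assumption: base URL inferred from the /chat/completions endpoint on this page.
BASE_URL = "https://api.dinference.com/v1"


def client_settings(env=os.environ) -> dict:
    """Return the two constructor arguments that differ from a stock OpenAI setup."""
    return {"base_url": BASE_URL, "api_key": env["DINFERENCE_API_KEY"]}


def ask(prompt: str, model: str = "glm-5") -> str:
    """Send one user message and return the reply text (hypothetical helper)."""
    # Imported here so the settings helper works without the package installed.
    from openai import OpenAI  # third-party: pip install openai

    client = OpenAI(**client_settings())
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

With the key exported in your shell, `ask("Hello!")` should return the model's reply as a string.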
Pass your API key as a Bearer token in the Authorization header of every request.
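If you are not using an SDK, the header can be set by hand. The following is a minimal sketch using only the Python standard library; `build_chat_request` and `send` are hypothetical helper names, and the response shape read in `send` assumes the standard OpenAI chat completion format.

```python
import json
import urllib.request

API_URL = "https://api.dinference.com/v1/chat/completions"


def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with the auth header set."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # The API key travels as a Bearer token on every request.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the reply text (assumes an OpenAI-style response)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["choices"][0]["message"]["content"]
```

Keeping the builder separate from `send` makes it easy to inspect the headers and body before anything goes over the network.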
Send a POST request to /chat/completions with a model ID and messages array.

    curl https://api.dinference.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $DINFERENCE_API_KEY" \
      -d '{
        "model": "glm-5",
        "messages": [
          { "role": "user", "content": "Hello!" }
        ]
      }'

Add `"stream": true` to your request body (or `stream=True` in the SDK) to receive a server-sent events stream using the standard OpenAI streaming format.
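As a rough illustration of consuming such a stream without an SDK, the sketch below extracts the text fragments from server-sent event lines. It assumes the usual OpenAI conventions: each event is a `data:` line holding a JSON chunk with `choices[].delta.content`, and the stream ends with `data: [DONE]`. The `iter_stream_text` helper and the sample lines are made up for the example, not captured from the API.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of OpenAI-style SSE lines (str)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Illustrative sample, not real API output.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints: Hello!
```

In the OpenAI Python SDK the same fragments are exposed as `chunk.choices[0].delta.content` when iterating over a call made with `stream=True`.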
Create an account and start using the API. No credit card is required.