
OpenAI-compatible API

POST /v1/chat/completions

Creates a chat completion. Fully compatible with the OpenAI Chat Completions API.

Request headers

| Header | Description |
| --- | --- |
| Authorization: Bearer sk-ru-… | Required. Your ru-llm API key. |
| Content-Type: application/json | Required. |

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID (e.g. gpt-4o, claude-3-5-sonnet-20241022). See Models. |
| messages | array | Yes | Array of {role, content} objects. Roles: system, user, assistant. |
| stream | boolean | No | Set to true to receive a Server-Sent Events stream. Default: false. |
| max_tokens | integer | No | Maximum tokens to generate. |
| temperature | number | No | Sampling temperature 0–2. Default: 1. |
| top_p | number | No | Nucleus sampling. |
| stop | string or array | No | Stop sequences. |
| n | integer | No | Number of completions. Default: 1. |
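As a rough illustration of how the headers and body fields above fit together, the sketch below assembles a request with Python's standard library without sending it. The helper name `build_chat_request` is hypothetical, and the client-side temperature check is an assumption based on the documented 0–2 range; the gateway's own validation may differ.

```python
import json
import urllib.request

# Endpoint URL as used in the examples on this page.
API_URL = "https://api.ru-llm.relay2.xyz/v1/chat/completions"


def build_chat_request(api_key, model, messages, **options):
    """Build (but do not send) a POST request for /v1/chat/completions.

    `options` carries optional body fields such as stream, max_tokens,
    temperature, top_p, stop, and n. Fields left unset are omitted from
    the body so the gateway applies its documented defaults.
    """
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        # Assumption: reject out-of-range values locally instead of
        # waiting for the gateway to respond with an error.
        raise ValueError("temperature must be between 0 and 2")

    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "sk-ru-YOUR_KEY",
    "gpt-4o",
    [{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=64,
    temperature=0.2,
)
# With a real key in place, send it via urllib.request.urlopen(request).
```

Leaving unset options out of the body, instead of sending explicit nulls, keeps the payload close to what the curl examples send.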

Non-streaming example

curl https://api.ru-llm.relay2.xyz/v1/chat/completions \
  -H "Authorization: Bearer sk-ru-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 9,
    "total_tokens": 33
  }
}

Streaming example

curl https://api.ru-llm.relay2.xyz/v1/chat/completions \
  -H "Authorization: Bearer sk-ru-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Count to three."}],
    "stream": true
  }'

Response stream (SSE):

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""},"index":0}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"1"},"index":0}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":", 2, 3."},"index":0}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}
data: [DONE]

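Each event is a data: line carrying one JSON chunk, and the generated text arrives incrementally in choices[].delta.content. The stream ends with data: [DONE], a literal sentinel that is not JSON and should not be parsed as such.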
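To make the framing concrete, here is a minimal sketch that reassembles the reply text from a sequence of SSE lines. The function name `collect_stream_text` is hypothetical, and the sample input is copied from the response stream shown above; the sketch only follows the first choice (index 0), so requests with n greater than 1 would need per-index buffers.

```python
import json


def collect_stream_text(lines):
    """Concatenate delta.content for choice 0 from an iterable of SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


sample = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""},"index":0}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"1"},"index":0}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":", 2, 3."},"index":0}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}',
    "data: [DONE]",
]

text = collect_stream_text(sample)
print(text)  # prints: 1, 2, 3.
```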

Usage object

The usage object in non-streaming responses reports token consumption:

{
  "prompt_tokens": 24,
  "completion_tokens": 9,
  "total_tokens": 33
}

For models that support it, cached input tokens may additionally be reported in prompt_tokens_details.cached_tokens.
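Because prompt_tokens_details is only present for some models, client code is safer reading it defensively. The sketch below is one way to do that; `summarize_usage` is a hypothetical helper, and it assumes cached tokens are counted as part of prompt_tokens (as in the OpenAI API), which this page does not state explicitly.

```python
def summarize_usage(usage):
    """Return token counts from a usage object, tolerating missing cache details."""
    # prompt_tokens_details may be absent or null for models without caching.
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    return {
        "prompt_tokens": usage["prompt_tokens"],
        "cached_prompt_tokens": cached,
        # Assumption: cached tokens are a subset of prompt_tokens.
        "uncached_prompt_tokens": usage["prompt_tokens"] - cached,
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    }


# Usage object from the non-streaming example above (no cache details).
summary = summarize_usage(
    {"prompt_tokens": 24, "completion_tokens": 9, "total_tokens": 33}
)
print(summary["cached_prompt_tokens"], summary["uncached_prompt_tokens"])
```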


GET /v1/models

Lists all models available on the gateway.

curl https://api.ru-llm.relay2.xyz/v1/models \
  -H "Authorization: Bearer sk-ru-YOUR_KEY"

Response:

{
  "object": "list",
  "data": [
    {"id": "gpt-4o", "object": "model", "created": 1715000000, "owned_by": "openai"},
    {"id": "claude-3-5-sonnet-20241022", "object": "model", "created": 1710000000, "owned_by": "anthropic"}
  ]
}
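A common use of this listing is picking a valid model ID before calling the chat endpoint. As a small sketch, the hypothetical helper `models_by_owner` below groups model IDs by their owned_by field, using the response shown above as sample input.

```python
from collections import defaultdict


def models_by_owner(listing):
    """Group model IDs from a /v1/models response by their owned_by value."""
    grouped = defaultdict(list)
    for model in listing.get("data", []):
        grouped[model["owned_by"]].append(model["id"])
    # Sort IDs so the result is stable regardless of response order.
    return {owner: sorted(ids) for owner, ids in grouped.items()}


# Sample response copied from the example above.
listing = {
    "object": "list",
    "data": [
        {"id": "gpt-4o", "object": "model", "created": 1715000000, "owned_by": "openai"},
        {"id": "claude-3-5-sonnet-20241022", "object": "model", "created": 1710000000, "owned_by": "anthropic"},
    ],
}

owners = models_by_owner(listing)
print(owners)
```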

Using the OpenAI SDK

Point the OpenAI SDK’s base_url at the gateway and use your sk-ru-… key as the API key. No other changes are required.

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.ru-llm.relay2.xyz/v1",
    api_key="sk-ru-YOUR_KEY",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
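The SDK can also consume the streaming mode described earlier. The sketch below wraps that in a generator; `stream_reply` is a hypothetical helper name, and it assumes the gateway's stream is handled by the SDK the same way as OpenAI's own, which this page implies but does not demonstrate.

```python
def stream_reply(client, model, prompt):
    """Yield text fragments from a streamed chat completion."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        delta = chunk.choices[0].delta
        if delta.content:
            yield delta.content


# Usage (requires the openai package and a valid key):
#
#   from openai import OpenAI
#   client = OpenAI(
#       base_url="https://api.ru-llm.relay2.xyz/v1",
#       api_key="sk-ru-YOUR_KEY",
#   )
#   for fragment in stream_reply(client, "gpt-4o", "Count to three."):
#       print(fragment, end="", flush=True)
```

Passing the client in as an argument keeps the helper independent of how the key and base URL are configured.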

TypeScript / Node.js

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.ru-llm.relay2.xyz/v1',
  apiKey: 'sk-ru-YOUR_KEY',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);