OpenAI-compatible API

POST /v1/chat/completions

Creates a chat completion. Fully compatible with the OpenAI Chat Completions API.

Request headers

Header	Description
`Authorization: Bearer sk-ru-…`	Required. Your ru-llm API key.
`Content-Type: application/json`	Required.

Request body

Field	Type	Required	Description
`model`	string	Yes	Model ID (e.g. `gpt-4o`, `claude-3-5-sonnet-20241022`). See Models.
`messages`	array	Yes	Array of `{role, content}` objects. Roles: `system`, `user`, `assistant`.
`stream`	boolean	No	Set to `true` to receive a Server-Sent Events stream. Default: `false`.
`max_tokens`	integer	No	Maximum tokens to generate.
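`temperature`	number	No	Sampling temperature, from 0 to 2. Default: 1.
`top_p`	number	No	Nucleus sampling: only tokens within the top `top_p` probability mass are considered.
`stop`	string or array	No	One or more sequences at which generation stops.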
`n`	integer	No	Number of completions. Default: 1.
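
As a rough sketch of how the fields above combine into a request body, the snippet below builds the JSON payload with only the standard library. `build_chat_request` is a hypothetical helper, not part of the gateway or any SDK; it assumes that leaving an optional field out lets the gateway apply the default listed in the table.

```python
import json


def build_chat_request(model, messages, **options):
    """Return a JSON body for POST /v1/chat/completions (hypothetical helper)."""
    temperature = options.get("temperature")
    # The table documents a 0-2 range, so fail early on the client side.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    body = {"model": model, "messages": messages}
    # Assumption: omitted optional fields fall back to the documented defaults.
    body.update({key: value for key, value in options.items() if value is not None})
    return json.dumps(body)


payload = build_chat_request(
    "gpt-4o",
    [{"role": "user", "content": "Count to three."}],
    stream=True,
    max_tokens=64,
)
print(payload)
```

The resulting string can be sent as the `-d` argument of the curl calls in the examples that follow.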

Non-streaming example

curl https://api.ru-llm.relay2.xyz/v1/chat/completions \
  -H "Authorization: Bearer sk-ru-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 9,
    "total_tokens": 33
  }
}

Streaming example

curl https://api.ru-llm.relay2.xyz/v1/chat/completions \
  -H "Authorization: Bearer sk-ru-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Count to three."}],
    "stream": true
  }'

Response stream (SSE):

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""},"index":0}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"1"},"index":0}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":", 2, 3."},"index":0}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}

data: [DONE]

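Each event is a `data:` line carrying a JSON chunk; the generated text arrives in `choices[].delta.content`. The stream ends with the literal line `data: [DONE]`, which is not JSON.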
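
A client without an SDK has to reassemble the text from these events itself. The following is a minimal sketch of that, fed with the sample lines from the stream above rather than a live connection. `iter_content` is a hypothetical name, and the sketch assumes a single completion (`n` = 1), so fragments from different choices are not separated.

```python
import json


def iter_content(lines):
    """Yield content fragments from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip the blank lines that separate events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:  # the first and last chunks carry no text
                yield content


sample = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"1"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":", 2, 3."},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_content(sample)))  # prints "1, 2, 3."
```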

Usage object

The `usage` object in non-streaming responses reports token consumption:

{
  "prompt_tokens": 24,
  "completion_tokens": 9,
  "total_tokens": 33
}

For models that support it, cached input tokens may be reported in `prompt_tokens_details.cached_tokens`.
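
Because the cached-token field is optional, client code should read it defensively. The sketch below shows one way to do so. `summarize_usage` is a hypothetical helper, and it assumes (as in the OpenAI API) that cached tokens are already counted inside `prompt_tokens`.

```python
def summarize_usage(usage):
    """Flatten a usage object, treating missing cache details as zero."""
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    return {
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
        "cached_tokens": cached,
        # Assumption: cached tokens are a subset of prompt_tokens.
        "uncached_prompt_tokens": usage["prompt_tokens"] - cached,
    }


usage = {"prompt_tokens": 24, "completion_tokens": 9, "total_tokens": 33}
print(summarize_usage(usage))
```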

GET /v1/models

Lists all models available on the gateway.

curl https://api.ru-llm.relay2.xyz/v1/models \
  -H "Authorization: Bearer sk-ru-YOUR_KEY"

Response:

{
  "object": "list",
  "data": [
    {"id": "gpt-4o", "object": "model", "created": 1715000000, "owned_by": "openai"},
    {"id": "claude-3-5-sonnet-20241022", "object": "model", "created": 1710000000, "owned_by": "anthropic"}
  ]
}
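
A common use of this response is picking a valid `model` value for chat completion requests. As an illustration only, the sketch below groups model IDs by the `owned_by` field using the sample response above. `model_ids_by_owner` is a hypothetical helper name.

```python
import json


def model_ids_by_owner(payload):
    """Group model IDs from a /v1/models response by their owned_by value."""
    grouped = {}
    for model in payload.get("data", []):
        grouped.setdefault(model["owned_by"], []).append(model["id"])
    return grouped


sample_response = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "gpt-4o", "object": "model", "created": 1715000000, "owned_by": "openai"},
    {"id": "claude-3-5-sonnet-20241022", "object": "model", "created": 1710000000, "owned_by": "anthropic"}
  ]
}
""")

print(model_ids_by_owner(sample_response))
```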

Using the OpenAI SDK

Point the OpenAI SDK's `base_url` (`baseURL` in the Node.js SDK) at the gateway and pass your `sk-ru-…` key as the API key. No other changes are required.

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.ru-llm.relay2.xyz/v1",
    api_key="sk-ru-YOUR_KEY",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

TypeScript / Node.js

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.ru-llm.relay2.xyz/v1',
  apiKey: 'sk-ru-YOUR_KEY',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);