Anthropic-compatible API

POST /v1/messages

Creates a message completion. Fully compatible with the Anthropic Messages API.

Request headers

Header	Description
`x-api-key: sk-ru-…`	Required. Your ru-llm API key. (Also accepted as `Authorization: Bearer sk-ru-…`.)
`anthropic-version: 2023-06-01`	Required. API version string.
`Content-Type: application/json`	Required.
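The two accepted authentication styles can be sketched as a small helper. This is an illustration only: `build_headers` and its `use_bearer` flag are hypothetical names, not part of the gateway or any SDK.

```python
def build_headers(api_key: str, use_bearer: bool = False) -> dict:
    """Return the headers required for POST /v1/messages.

    build_headers and use_bearer are illustrative names, not gateway API.
    """
    headers = {
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    }
    if use_bearer:
        # Alternative form listed in the table above.
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Default form: the key goes in x-api-key.
        headers["x-api-key"] = api_key
    return headers
```

The resulting dictionary can be passed to any HTTP client that accepts a header mapping.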

Request body

Field	Type	Required	Description
`model`	string	Yes	Model ID (e.g. `claude-3-5-sonnet-20241022`, `gpt-4o`). See Models.
`messages`	array	Yes	Array of `{role, content}` objects. Roles: `user`, `assistant`.
`max_tokens`	integer	Yes	Maximum tokens to generate.
`system`	string	No	System prompt.
`stream`	boolean	No	Set to `true` for SSE streaming. Default: `false`.
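`temperature`	number	No	Sampling temperature, from 0 to 1.
`top_p`	number	No	Nucleus sampling threshold (cumulative probability cutoff).
`stop_sequences`	array	No	Array of strings. Generation stops when any of them is produced.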
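As a rough sketch of how the table above translates into a request body, the helper below assembles the required fields and applies a few client-side checks. `build_body` is a hypothetical name, and the checks that `max_tokens` is positive and that `messages` is non-empty are assumptions, not documented gateway behavior.

```python
ALLOWED_ROLES = {"user", "assistant"}

def build_body(model, messages, max_tokens, **optional):
    """Assemble a /v1/messages request body (illustrative helper)."""
    # Assumption: max_tokens must be a positive integer.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    # Assumption: at least one message is expected.
    if not messages:
        raise ValueError("messages must contain at least one entry")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    temperature = optional.get("temperature")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    body = {"model": model, "messages": messages, "max_tokens": max_tokens}
    # Optional fields (system, stream, top_p, ...) are included only when set.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

Serializing the result with `json.dumps` gives a payload of the same shape as the `-d` argument in the curl examples.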

Non-streaming example

curl https://api.ru-llm.relay2.xyz/v1/messages \
  -H "x-api-key: sk-ru-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Response:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "The capital of France is Paris."}
  ],
  "model": "claude-3-5-sonnet-20241022",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 19,
    "output_tokens": 9
  }
}
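Because `content` is an array of typed blocks rather than a single string, a client usually joins the `text` blocks to get the reply. A minimal sketch, using the response above as sample data (`extract_text` is a hypothetical helper name):

```python
def extract_text(response: dict) -> str:
    """Concatenate the text of all blocks whose type is "text"."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )

# Sample data copied from the response shown above.
sample_response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "The capital of France is Paris."}],
    "model": "claude-3-5-sonnet-20241022",
    "stop_reason": "end_turn",
    "stop_sequence": None,
    "usage": {"input_tokens": 19, "output_tokens": 9},
}

print(extract_text(sample_response))
```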

Streaming example

curl https://api.ru-llm.relay2.xyz/v1/messages \
  -H "x-api-key: sk-ru-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Count to three."}],
    "stream": true
  }'

Response stream (SSE):

event: message_start
data: {"type":"message_start","message":{"id":"msg_abc","type":"message","role":"assistant","content":[],"model":"claude-3-5-sonnet-20241022","stop_reason":null,"usage":{"input_tokens":14,"output_tokens":1}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"1, 2, 3."}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":9}}

event: message_stop
data: {"type":"message_stop"}

The streaming event sequence is: `message_start` → `content_block_start` → one or more `content_block_delta` → `content_block_stop` → `message_delta` → `message_stop`.
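To illustrate how a client might consume this sequence, the sketch below parses SSE lines and concatenates the `text_delta` fragments until `message_stop`. The helper names `iter_sse_events` and `collect_text` are hypothetical, and the sample stream is shortened and split into two deltas purely for illustration.

```python
import json

def iter_sse_events(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            # A blank line terminates one event.
            yield event, json.loads("\n".join(data))
            event, data = None, []
    if data:
        # The stream ended without a trailing blank line.
        yield event, json.loads("\n".join(data))

def collect_text(lines):
    """Concatenate all text_delta fragments up to message_stop."""
    parts = []
    for event, payload in iter_sse_events(lines):
        if event == "content_block_delta" and payload["delta"].get("type") == "text_delta":
            parts.append(payload["delta"]["text"])
        elif event == "message_stop":
            break
    return "".join(parts)

# Illustrative stream, not captured gateway output.
SAMPLE = """\
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"1, 2,"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" 3."}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_stop
data: {"type":"message_stop"}
"""

print(collect_text(SAMPLE.splitlines()))
```

In a real client the `lines` argument would come from the HTTP response body, read line by line as it arrives.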

Usage object

The usage object reports token consumption:

{
  "input_tokens": 19,
  "output_tokens": 9,
  "cache_read_input_tokens": 0,
  "cache_creation_input_tokens": 0
}

Cache token fields are present for models that support prompt caching.
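Since the cache fields may be absent, client code should read them with a default. The sketch below totals the tokens in a `usage` object. It assumes, as in Anthropic's own API, that the cache counters are reported separately from `input_tokens`; the gateway's accounting is not confirmed here, and `total_tokens` is a hypothetical helper name.

```python
def total_tokens(usage: dict) -> int:
    """Sum input, cache, and output tokens, treating missing cache fields as 0.

    Assumption: cache counters are not already included in input_tokens.
    """
    return (
        usage.get("input_tokens", 0)
        + usage.get("cache_read_input_tokens", 0)
        + usage.get("cache_creation_input_tokens", 0)
        + usage.get("output_tokens", 0)
    )

# Works whether or not the cache fields are present.
with_cache = {
    "input_tokens": 19,
    "output_tokens": 9,
    "cache_read_input_tokens": 0,
    "cache_creation_input_tokens": 0,
}
without_cache = {"input_tokens": 19, "output_tokens": 9}
print(total_tokens(with_cache), total_tokens(without_cache))
```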

Using the Anthropic SDK

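Point the Anthropic SDK’s base URL option (`base_url` in Python, `baseURL` in TypeScript) at the gateway and pass your `sk-ru-…` key as the API key (`api_key` in Python, `apiKey` in TypeScript). No other changes are required.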

Python

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.ru-llm.relay2.xyz",
    api_key="sk-ru-YOUR_KEY",
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)

TypeScript / Node.js

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.ru-llm.relay2.xyz',
  apiKey: 'sk-ru-YOUR_KEY',
});

const message = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 256,
  messages: [{ role: 'user', content: 'Hello!' }],
});
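// content is a union of block types, so narrow to a text block before reading .text
const first = message.content[0];
if (first.type === 'text') {
  console.log(first.text);
}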