Anthropic-compatible API

POST /v1/messages

Creates a message completion. Fully compatible with the Anthropic Messages API.

Request headers

| Header | Description |
| --- | --- |
| x-api-key: sk-ru-… | Required. Your ru-llm API key. (Also accepted as Authorization: Bearer sk-ru-….) |
| anthropic-version: 2023-06-01 | Required. API version string. |
| Content-Type: application/json | Required. |
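
As a minimal sketch of the header rules above, the helper below builds the three required headers and supports either authentication form. The function name `build_headers` and the `use_bearer` flag are illustrative, not part of the API.

```python
def build_headers(api_key: str, use_bearer: bool = False) -> dict[str, str]:
    """Return the required headers for POST /v1/messages.

    `build_headers` and `use_bearer` are hypothetical names for this sketch.
    """
    headers = {
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    }
    if use_bearer:
        # Alternative form: Authorization: Bearer sk-ru-...
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Default form: x-api-key: sk-ru-...
        headers["x-api-key"] = api_key
    return headers
```

Only one of the two credential headers is set per request in this sketch; whether the gateway accepts both at once is not stated here.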

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID (e.g. claude-3-5-sonnet-20241022, gpt-4o). See Models. |
| messages | array | Yes | Array of {role, content} objects. Roles: user, assistant. |
| max_tokens | integer | Yes | Maximum tokens to generate. |
| system | string | No | System prompt. |
| stream | boolean | No | Set to true for SSE streaming. Default: false. |
| temperature | number | No | Sampling temperature 0–1. |
| top_p | number | No | Nucleus sampling. |
| stop_sequences | array | No | Stop sequences. |
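
The field list above can be turned into a small request builder. This is a sketch under stated assumptions: `build_message_request` is a hypothetical helper, and its client-side checks (non-empty messages, role names, temperature range) only mirror what the table documents; the gateway's own validation may differ.

```python
from typing import Any, Optional

ALLOWED_ROLES = {"user", "assistant"}


def build_message_request(
    model: str,
    messages: list[dict[str, Any]],
    max_tokens: int,
    *,
    system: Optional[str] = None,
    stream: bool = False,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stop_sequences: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble a /v1/messages request body from the documented fields."""
    # Assumption: an empty messages array is not useful, so reject it early.
    if not messages:
        raise ValueError("messages must contain at least one entry")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    # Required fields are always sent.
    body: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }
    # Optional fields are sent only when the caller sets them.
    optional = {
        "system": system,
        "temperature": temperature,
        "top_p": top_p,
        "stop_sequences": stop_sequences,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    if stream:
        body["stream"] = True
    return body
```

Serializing the result with `json.dumps` gives a payload of the same shape as the `-d` argument in the curl examples.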

Non-streaming example

curl https://api.ru-llm.relay2.xyz/v1/messages \
  -H "x-api-key: sk-ru-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Response:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "The capital of France is Paris."}
  ],
  "model": "claude-3-5-sonnet-20241022",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 19,
    "output_tokens": 9
  }
}

Streaming example

curl https://api.ru-llm.relay2.xyz/v1/messages \
  -H "x-api-key: sk-ru-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Count to three."}],
    "stream": true
  }'

Response stream (SSE):

event: message_start
data: {"type":"message_start","message":{"id":"msg_abc","type":"message","role":"assistant","content":[],"model":"claude-3-5-sonnet-20241022","stop_reason":null,"usage":{"input_tokens":14,"output_tokens":1}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"1, 2, 3."}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":9}}

event: message_stop
data: {"type":"message_stop"}

The streaming event sequence is: message_start → content_block_start → one or more content_block_delta → content_block_stop → message_delta → message_stop.
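
To illustrate how a client might consume this event sequence, here is a minimal sketch that parses SSE lines and joins the text deltas. It assumes each event carries exactly one `data:` line, as in the sample stream; `iter_sse_events` and `collect_text` are hypothetical helper names, and a production client would more likely rely on the SDK's streaming support.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (event_name, payload) pairs from raw SSE lines.

    Simplification: one `data:` line per event, as in the sample stream.
    """
    event_name: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = json.loads(line[len("data:"):].strip())
            # Fall back to the payload's own "type" if no event line preceded it.
            yield event_name or payload.get("type", ""), payload
            event_name = None
        # Blank lines (event separators) and anything else are ignored.


def collect_text(lines: Iterable[str]) -> tuple[str, Optional[str]]:
    """Join all text deltas and return (text, stop_reason)."""
    parts: list[str] = []
    stop_reason: Optional[str] = None
    for event, payload in iter_sse_events(lines):
        if event == "content_block_delta":
            delta = payload["delta"]
            if delta.get("type") == "text_delta":
                parts.append(delta["text"])
        elif event == "message_delta":
            stop_reason = payload["delta"].get("stop_reason")
        elif event == "message_stop":
            break
    return "".join(parts), stop_reason


SAMPLE_STREAM = [
    "event: content_block_start",
    'data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}',
    "",
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"1, 2, 3."}}',
    "",
    "event: content_block_stop",
    'data: {"type":"content_block_stop","index":0}',
    "",
    "event: message_delta",
    'data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":9}}',
    "",
    "event: message_stop",
    'data: {"type":"message_stop"}',
]

print(collect_text(SAMPLE_STREAM))  # ('1, 2, 3.', 'end_turn')
```

The final text only becomes complete once message_stop arrives, so the sketch returns the joined text together with the stop_reason taken from message_delta.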

Usage object

The usage object reports token consumption:

{
  "input_tokens": 19,
  "output_tokens": 9,
  "cache_read_input_tokens": 0,
  "cache_creation_input_tokens": 0
}

Cache token fields are present for models that support prompt caching.
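
Because the cache fields may be absent for some models, client code that reads usage can default them to zero. The sketch below uses hypothetical helpers (`normalize_usage`, `total_tokens`) and assumes the four counters are disjoint, as in Anthropic's own accounting; whether the gateway counts or bills them that way is not stated here.

```python
CACHE_FIELDS = ("cache_read_input_tokens", "cache_creation_input_tokens")


def normalize_usage(usage: dict[str, int]) -> dict[str, int]:
    """Return a usage dict that always has all four counters."""
    normalized = {
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }
    for field in CACHE_FIELDS:
        # Models without prompt caching may omit these fields entirely.
        normalized[field] = usage.get(field, 0)
    return normalized


def total_tokens(usage: dict[str, int]) -> int:
    """Sum all counters (assumes they do not overlap)."""
    return sum(normalize_usage(usage).values())


print(total_tokens({"input_tokens": 19, "output_tokens": 9}))  # 28
```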


Using the Anthropic SDK

Point the Anthropic SDK’s base_url at the gateway and use your sk-ru-… key as api_key. No other changes are required.

Python

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.ru-llm.relay2.xyz",
    api_key="sk-ru-YOUR_KEY",
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)

TypeScript / Node.js

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.ru-llm.relay2.xyz',
  apiKey: 'sk-ru-YOUR_KEY',
});

const message = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 256,
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(message.content[0].text);