Anthropic-compatible API
POST /v1/messages
Creates a message completion. Fully compatible with the Anthropic Messages API.
Request headers
| Header | Description |
|---|---|
| x-api-key: sk-ru-… | Required. Your ru-llm API key. (Also accepted as Authorization: Bearer sk-ru-….) |
| anthropic-version: 2023-06-01 | Required. API version string. |
| Content-Type: application/json | Required. |
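
As a minimal sketch of the header table, the snippet below builds the three required headers and switches between the two documented ways of sending the key. The function name `build_headers` and the `use_bearer` flag are hypothetical helpers for illustration, not part of any SDK.

```python
def build_headers(api_key, use_bearer=False):
    """Return the request headers listed in the table above.

    `use_bearer` is an illustrative flag: when True the key is sent as
    `Authorization: Bearer ...` instead of `x-api-key`.
    """
    headers = {
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    }
    if use_bearer:
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["x-api-key"] = api_key
    return headers


print(build_headers("sk-ru-YOUR_KEY"))
```

Sending only one of the two key headers keeps the request unambiguous; the table does not say how the gateway behaves if both are present.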
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g. claude-3-5-sonnet-20241022, gpt-4o). See Models. |
| messages | array | Yes | Array of {role, content} objects. Roles: user, assistant. |
| max_tokens | integer | Yes | Maximum tokens to generate. |
| system | string | No | System prompt. |
| stream | boolean | No | Set to true for SSE streaming. Default: false. |
| temperature | number | No | Sampling temperature 0–1. |
| top_p | number | No | Nucleus sampling. |
| stop_sequences | array | No | Stop sequences. |
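
The field table can be turned into a small client-side check before a request is sent. The sketch below is one possible way to do that: `build_body` is a hypothetical helper, and the rule that `messages` must be non-empty is an assumption rather than something the table states. Whether the gateway itself rejects such requests is not specified here.

```python
import json

ALLOWED_ROLES = {"user", "assistant"}  # roles listed in the table


def build_body(model, messages, max_tokens, **optional):
    """Assemble a request body from required fields plus any optional ones."""
    if not messages:  # assumption: an empty conversation is not useful
        raise ValueError("messages must contain at least one entry")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    temperature = optional.get("temperature")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    body = {"model": model, "messages": messages, "max_tokens": max_tokens}
    # Optional fields (system, stream, top_p, ...) are included only when set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = build_body(
    "claude-3-5-sonnet-20241022",
    [{"role": "user", "content": "What is the capital of France?"}],
    256,
    system="You are a helpful assistant.",
    temperature=0.2,
)
print(json.dumps(body, indent=2))
```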
Non-streaming example
curl https://api.ru-llm.relay2.xyz/v1/messages \
  -H "x-api-key: sk-ru-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Response:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "The capital of France is Paris."}
  ],
  "model": "claude-3-5-sonnet-20241022",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 19,
    "output_tokens": 9
  }
}

Streaming example
curl https://api.ru-llm.relay2.xyz/v1/messages \
  -H "x-api-key: sk-ru-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Count to three."}],
    "stream": true
  }'

Response stream (SSE):
event: message_start
data: {"type":"message_start","message":{"id":"msg_abc","type":"message","role":"assistant","content":[],"model":"claude-3-5-sonnet-20241022","stop_reason":null,"usage":{"input_tokens":14,"output_tokens":1}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"1, 2, 3."}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":9}}

event: message_stop
data: {"type":"message_stop"}

The streaming event sequence is: message_start → content_block_start → one or more content_block_delta → content_block_stop → message_delta → message_stop.
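
To illustrate how a client might consume this sequence, here is a minimal sketch that parses SSE lines and concatenates the `text_delta` fragments. The helper names `iter_sse_events` and `collect_text` are hypothetical, the sample input is abridged from the stream shown above, and a production client would read the lines from the HTTP response rather than from a string.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:  # a blank line terminates one event
            yield event, json.loads("\n".join(data))
            event, data = None, []
    if data:  # flush a final event that has no trailing blank line
        yield event, json.loads("\n".join(data))


def collect_text(events):
    """Join text deltas and capture the stop reason from message_delta."""
    parts, stop_reason = [], None
    for name, payload in events:
        delta = payload.get("delta", {})
        if name == "content_block_delta" and delta.get("type") == "text_delta":
            parts.append(delta["text"])
        elif name == "message_delta":
            stop_reason = delta.get("stop_reason")
        elif name == "message_stop":
            break
    return "".join(parts), stop_reason


SAMPLE = '''\
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"1, 2, 3."}}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":9}}

event: message_stop
data: {"type":"message_stop"}
'''

text, stop_reason = collect_text(iter_sse_events(SAMPLE.splitlines()))
print(text, stop_reason)
```

Event types the sketch does not recognize are skipped, so it keeps working if the stream carries events beyond the six listed here.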
Usage object
The usage object reports token consumption:
{
  "input_tokens": 19,
  "output_tokens": 9,
  "cache_read_input_tokens": 0,
  "cache_creation_input_tokens": 0
}

Cache token fields are present for models that support prompt caching.
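
Because the cache fields appear only for some models, client code should not assume they exist. The sketch below reads the usage object defensively; `summarize_usage` and the derived `total_tokens` key (simply input plus output) are illustrative names and not fields returned by the API.

```python
USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
)


def summarize_usage(usage):
    """Return every usage field as an int, treating missing or null as 0."""
    summary = {name: usage.get(name) or 0 for name in USAGE_FIELDS}
    # Illustrative convenience value: not part of the API response.
    summary["total_tokens"] = summary["input_tokens"] + summary["output_tokens"]
    return summary


print(summarize_usage({"input_tokens": 19, "output_tokens": 9}))
```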
Using the Anthropic SDK
Point the Anthropic SDK’s base_url at the gateway and use your sk-ru-… key as api_key. No other changes are required.
Python
import anthropic
client = anthropic.Anthropic(
    base_url="https://api.ru-llm.relay2.xyz",
    api_key="sk-ru-YOUR_KEY",
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)

TypeScript / Node.js
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
  baseURL: 'https://api.ru-llm.relay2.xyz',
  apiKey: 'sk-ru-YOUR_KEY',
});

const message = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 256,
  messages: [{ role: 'user', content: 'Hello!' }],
});

// content is a union of block types; narrow to a text block before reading .text
const first = message.content[0];
if (first.type === 'text') {
  console.log(first.text);
}