Chat Completions

Generate chat completions using the OpenAI-compatible API.

Endpoint

POST /v1/chat/completions

Request Body

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "max_tokens": 100,
  "temperature": 0.7,
  "stream": false
}

Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| messages | array | Yes | Conversation messages |
| max_tokens | integer | No | Maximum tokens to generate |
| temperature | float | No | Sampling temperature (0-2) |
| top_p | float | No | Nucleus sampling (0-1) |
| stream | boolean | No | Enable streaming response |
| stop | string/array | No | Stop sequences |

Message Format

Supported roles: system (sets the assistant's behavior and instructions), user (end-user input), and assistant (prior model replies).
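
For multi-turn conversations, include earlier assistant replies in the messages array in order. A minimal sketch (the conversation content here is illustrative only):

```python
# Multi-turn conversation: prior assistant replies are included in order.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "And what is its population?"},
]
```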

Response
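
A non-streaming request returns a single chat completion object. Assuming the standard OpenAI-compatible schema, the generated reply is in choices[0].message.content.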

Streaming

Enable streaming to receive tokens in real time as they are generated by setting "stream": true in the request body.

Streaming Response
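
When "stream": true is set, an OpenAI-compatible server sends the completion incrementally, typically as server-sent events: one JSON chunk per data: line, terminated by data: [DONE], with each chunk carrying a partial reply in choices[0].delta. The sketch below consumes such a stream with the requests library; the base URL and the exact chunk fields are assumptions based on the OpenAI streaming format and may differ for your deployment.

```python
import json
import requests

# Assumed base URL; replace with your server's address.
url = "http://localhost:8000/v1/chat/completions"

payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 100,
    "stream": True,
}

# Assumes OpenAI-style SSE: "data: {...}" lines ending with "data: [DONE]".
with requests.post(url, json=payload, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        line = line.decode("utf-8")
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        # Each chunk carries an incremental piece of the reply in choices[0].delta.
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            print(delta, end="", flush=True)
```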

Examples

Python
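
A minimal sketch using the requests library to send the request body shown above; the base URL (and any authentication header your deployment requires) is an assumption:

```python
import requests

# Assumed base URL; replace with your server's address.
url = "http://localhost:8000/v1/chat/completions"

payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "max_tokens": 100,
    "temperature": 0.7,
    "stream": False,
}

response = requests.post(url, json=payload)
response.raise_for_status()

# Assuming the standard OpenAI response schema, the reply text is here:
print(response.json()["choices"][0]["message"]["content"])
```

Because the API is OpenAI-compatible, the official openai Python client can also be used by pointing its base_url at your server.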

JavaScript

curl
