πŸ—¨οΈ LLM Chat API

Introduction

Our OpenAI-compatible chat API gives you access to three state-of-the-art AI models: Spark, Radiance, and Supernova. All of them can be used for a wide range of applications, and they excel at tasks such as common-sense reasoning, world knowledge, reading comprehension, code-related tasks, and much more. Try it yourself in our AI API playground.

API Usage

Our OpenAI-compatible API takes a list of messages as input and provides a generated message from the AI (assistant) as output. If you haven't done so already, you'll need to create an API key to authenticate your requests.

// npm install --save openai or yarn add openai
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://apigateway.avangenio.net",
});

const completion = await openai.chat.completions.create({
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "How many days are in a year?" },
  ],
  model: "radiance",
});

console.log(completion.choices[0].message.content);

API Response

The API returns a JSON object containing the generated message in the choices array, along with metadata such as the model used and token usage:

{
  "id": "cmpl-8a9ba025b8a744e881636351a26e4642",
  "object": "chat.completion",
  "created": 1697721484,
  "model": "radiance",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "There are typically 365 days in a year. However, in a leap year, which occurs every four years, there are 366 days. Leap years are used to account for the extra fraction of a day that is not included in a regular year."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 30,
    "total_tokens": 90,
    "completion_tokens": 60
  }
}

If you want only the response message, use the following code:

completion.choices[0].message.content;

API Parameters

The endpoint POST https://apigateway.avangenio.net/v1/chat/completions accepts the following parameters:

| Parameter | Required | Type | Default | Description |
| --- | --- | --- | --- | --- |
| messages | yes | array | - | A list of message objects representing the ongoing conversation, each containing a role (system, user, assistant) and content. See an example here. |
| model | yes | string | - | The model identifier to use. We support spark, radiance, and supernova. |
| max_tokens | no | integer | infinity | Set the output token limit, ensuring cost-effectiveness. |
| n | no | integer | 1 | The number of chat completion choices to generate. |
| stop | no | string or array | null | A sequence of tokens that instructs the API to stop generating when it appears in the output. |
| stream | no | boolean | false | When enabled, incremental message updates are sent as data-only server-sent events, with tokens delivered as they become available. The end of the stream is indicated by data: [DONE]. |
| frequency_penalty | no | number | 0 | Ranges between -2.0 and 2.0. Positive values decrease the likelihood of repeating identical text by penalizing tokens based on how frequently they appeared in the previous text. |
| presence_penalty | no | number | 0 | Ranges between -2.0 and 2.0. Positive values increase the likelihood of using different words by penalizing tokens that have already appeared, which raises the probability of introducing new topics. |
| temperature | no | number | 1 | Ranges between 0.0 and 2.0. Higher values, such as 1.5, introduce randomness into the output, while lower values, such as 0.3, make it more focused and deterministic. |
| top_p | no | number | 1 | Top-p, also known as nucleus sampling, shapes token selection. A top_p of 1 considers all tokens, while a lower value like 0.2 restricts sampling to high-probability tokens for a more focused outcome. |
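As a sketch of the stream parameter described in the table, a streaming request is identical to a normal one apart from the stream: true flag. The chunk shape shown in the comments follows the OpenAI SDK's streaming conventions and is an assumption here, not a gateway-specific guarantee:

```javascript
// Request body for a streaming completion; identical to a regular
// request except for the stream flag.
const request = {
  model: "radiance",
  messages: [{ role: "user", content: "How many days are in a year?" }],
  stream: true,
};

// With the OpenAI SDK client created as in the earlier example, the call
// returns an async iterable; each chunk carries an incremental delta
// rather than a complete message:
//
//   const stream = await openai.chat.completions.create(request);
//   for await (const chunk of stream) {
//     process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
//   }
//
// The stream ends when the server sends `data: [DONE]`.

console.log(JSON.stringify(request));
```

Streaming is useful for chat interfaces where showing tokens as they arrive feels much more responsive than waiting for the full completion.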

Tips

  • Conversations usually start with a system message, followed by alternating messages from the user and the assistant.
  • The system message determines the assistant's behavior and can be customized, but it is not mandatory.
  • User messages make requests or comments, while assistant messages store previous responses or demonstrate desired behaviors.
  • It can be helpful to include the conversation history when user instructions reference previous messages, as models do not remember past requests.
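The last two tips can be sketched as plain data: keep an array of message objects yourself and resend the whole history with every request. The addTurn helper below is a hypothetical convenience for this page, not part of the SDK:

```javascript
// The model is stateless: to let it resolve references to earlier turns,
// resend the full history with every request.
const history = [
  { role: "system", content: "You are a helpful assistant." },
];

// Hypothetical helper that appends one turn to the history.
function addTurn(role, content) {
  history.push({ role, content });
}

addTurn("user", "How many days are in a year?");
// Store the assistant's previous answer back into the history...
addTurn("assistant", "There are typically 365 days in a year.");
// ...so that a follow-up like this one can refer to it.
addTurn("user", "What about a leap year?");

// `history` is now the messages array to send in the next request.
console.log(history.length); // 4
```

Because each request is billed on prompt tokens, long conversations may eventually need truncation or summarization of older turns.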