POST /api/request-logs/create

The Async logging endpoint allows you to directly log an LLM inference to Keywords AI, instead of using Keywords AI as a proxy with the chat completion endpoint.
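For example, a minimal request body using the required fields described below, plus a couple of optional metrics, might look like this (the model name, token counts, and timing values are illustrative):

{
  "model": "gpt-4o-mini",
  "prompt_messages": [
    {
      "role": "user",
      "content": "Hi"
    }
  ],
  "completion_message": {
    "role": "assistant",
    "content": "Hi, how can I assist you today?"
  },
  "prompt_tokens": 1,
  "completion_tokens": 9,
  "generation_time": 0.4
}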

model
string
required

Model used for the LLM inference. Default is an empty string. See the list of supported models here

prompt_messages
array
required

An array of prompt messages. Default is an empty list.

"prompt_messages": [
  {
    "role": "user",
    "content": "Hi"
  },
  # optional function call
  {
    "role": "tool",
    "tool_call_id": "your tool call id",
    "content": "...." # tool call content
  }
],
completion_message
dict
required

Completion message in JSON format. Default is an empty dictionary.

"completion_message": {
    "role": "assistant",
    "content": "Hi, how can I assist you today?"
},
cost
float
default:
0

Cost of the inference in US dollars.

completion_tokens
integer

Number of tokens in the completion.

completion_unit_price
number

The unit price per completion token. Pass this parameter in if you want to log your self-hosted / fine-tuned model.

customer_params
object

Parameters related to the customer making the request. Default is an empty dictionary.
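For instance, a hypothetical fragment (the keys shown are illustrative; check the customer parameters reference for the supported fields):

"customer_params": {
  "customer_identifier": "customer_123",
  "name": "Jane Doe"
},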

custom_identifier
string

You can use this parameter to send an extra custom tag with your request. It helps you find LLM logs faster than the metadata parameter because it is indexed. It appears in Logs as the Custom ID field.
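For example (the tag value here is arbitrary):

"custom_identifier": "daily-summary-job-42",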

error_message
text

Error message if the LLM inference failed. Default is an empty string.

full_request
object

The full request object. Default is an empty dictionary. This parameter is optional and is helpful for logging generation configurations such as temperature, presence_penalty, etc.

completion_message and tool_calls will be automatically extracted from full_request

{
  "full_request": {
    "temperature": 0.5,
    "top_p": 0.5
    // ... other parameters
  }
}

frequency_penalty
number

Specify how much to penalize new tokens based on their existing frequency in the text so far. Decreases the model's likelihood of repeating the same line verbatim.

generation_time
float
default:
0

Total generation time. Generation time = TTFT (Time To First Token) + TPOT (Time Per Output Token) × number of output tokens. Do not confuse this with ttft.
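As a worked example, a response with a TTFT of 0.25 s, a TPOT of 0.03 s, and 60 output tokens (all illustrative values) would be logged as:

"ttft": 0.25,
"generation_time": 2.05, // 0.25 + 0.03 * 60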

keywordsai_api_controls
object

Use this parameter to control the behavior of the Keywords AI API. Default is an empty dictionary.

metadata
dict

You can add any key-value pair to this metadata field for your reference.
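For example (the keys and values are entirely up to you):

"metadata": {
  "session_id": "abc123",
  "experiment": "prompt-v2"
},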

presence_penalty
number

Specify how much to penalize new tokens based on whether they appear in the text so far. Increases the model's likelihood of talking about new topics.

prompt_tokens
integer

Number of tokens in the prompt.

prompt_unit_price
number

The unit price per prompt token. Pass this parameter in if you want to log your self-hosted / fine-tuned model.
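For instance, a self-hosted model might be logged with both unit prices. The model name below is hypothetical, and the per-token pricing unit is an assumption; verify the expected unit for your account:

"model": "my-fine-tuned-llama",
"prompt_unit_price": 0.000002,    // assumed USD per prompt token
"completion_unit_price": 0.000006 // assumed USD per completion token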

response_format
object

The format of the response, for example an OpenAI-style response_format object such as { "type": "json_object" } for JSON mode.

stream
boolean

Whether the LLM inference was streamed. Default is false.

status_code
integer
default:
200

The status code of the LLM inference. Default is 200 (OK). See supported status codes here.

stop
array[string]

The stop sequences used for the generation.

temperature
number
default:
1

Controls randomness in the output, in the range 0-2; a higher temperature yields a more random response.

tools
array

A list of tools the model may call. Currently, only functions are supported as a tool.

tool_choice
object

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message.
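For example, following the OpenAI-style function-tool schema (the get_weather tool is hypothetical):

"tools": [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": {
          "city": { "type": "string" }
        },
        "required": ["city"]
      }
    }
  }
],
"tool_choice": {
  "type": "function",
  "function": { "name": "get_weather" }
}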

top_p
number
default:
1

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.

ttft
float
default:
0

Time to first token. The time it takes for the model to generate the first token after receiving a request.

usage
object

Usage details for the LLM inference. Currently, only Prompt Caching is supported.
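One plausible shape, mirroring the OpenAI-style usage object (the exact keys here are an assumption; consult the Prompt Caching docs for the supported fields):

"usage": {
  "prompt_tokens_details": {
    "cached_tokens": 1024
  }
}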

warnings
string

Any warnings that occurred during the LLM inference. You can pass a warning message here. Default is an empty string.