Create log

POST https://api.keywordsai.co/api/request-logs/
curl --request POST \
  --url https://api.keywordsai.co/api/request-logs/ \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "input": {},
  "output": {},
  "log_type": "<string>",
  "model": "<string>",
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {},
    "cache_creation_prompt_tokens": 123
  },
  "cost": 123,
  "latency": 123,
  "time_to_first_token": 123,
  "tokens_per_second": 123,
  "metadata": {},
  "customer_identifier": "<string>",
  "customer_params": {
    "customer_identifier": "<string>",
    "name": "<string>",
    "email": "<string>"
  },
  "thread_identifier": "<string>",
  "custom_identifier": "<string>",
  "group_identifier": "<string>",
  "trace_unique_id": "<string>",
  "span_workflow_name": "<string>",
  "span_name": "<string>",
  "span_parent_id": "<string>",
  "tools": [
    {
      "type": "<string>",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {}
      }
    }
  ],
  "tool_choice": {},
  "response_format": {},
  "temperature": 123,
  "top_p": 123,
  "frequency_penalty": 123,
  "presence_penalty": 123,
  "max_tokens": 123,
  "stop": {},
  "status_code": 123,
  "error_message": "<string>",
  "warnings": {},
  "status": "<string>",
  "stream": true,
  "prompt_id": "<string>",
  "prompt_name": "<string>",
  "is_custom_prompt": true,
  "timestamp": "<string>",
  "start_time": "<string>",
  "full_request": {},
  "full_response": {},
  "prompt_unit_price": 123,
  "completion_unit_price": 123,
  "keywordsai_api_controls": {
    "block": true
  },
  "positive_feedback": true
}
'
This guide shows you how to log any type of LLM request to Keywords AI using the universal input/output design that supports all span types.

Input/Output

Keywords AI uses universal input and output fields across all span types.
  • Chat completions: Messages arrays
  • Embeddings: Text strings or arrays
  • Transcriptions: Audio metadata → text
  • Speech: Text → audio
  • Workflows/Tasks: Any custom data structure
  • Agent operations: Complex nested objects
How it works:
  1. You provide input and output fields in any structure (string, object, array, etc.)
  2. Set log_type to indicate span type ("chat", "embedding", "workflow", etc.)
  3. Keywords AI automatically extracts type-specific fields for backward compatibility
  4. Your data is stored efficiently and retrieved with both universal and type-specific fields
For complete log_type specifications, see log types.
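
For example, a workflow span can carry arbitrary structured data in both directions (the field values below are illustrative):

{
  "log_type": "workflow",
  "input": {
    "query": "Help with order #12345",
    "context": {"user_id": "123"}
  },
  "output": {
    "resolution": "refund_issued",
    "steps_completed": 3
  }
}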

Legacy field support

For backward compatibility, Keywords AI still supports legacy fields:
prompt_messages
array
Legacy field. Use input instead.
completion_message
object
Legacy field. Use output instead.
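
As a sketch, the legacy fields map to the universal fields like this (message content is illustrative):

// Legacy payload
{
  "prompt_messages": [{"role": "user", "content": "Hello"}],
  "completion_message": {"role": "assistant", "content": "Hi! How can I help?"}
}

// Equivalent universal payload
{
  "log_type": "chat",
  "input": [{"role": "user", "content": "Hello"}],
  "output": {"role": "assistant", "content": "Hi! How can I help?"}
}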

Request body

Core fields

input
string | object | array
Universal input field for the span. Structure depends on log_type:
  • Chat: JSON string of messages array or messages array directly
  • Embedding: Text string or array of strings
  • Workflow/Task: Any JSON-serializable structure
  • Transcription: Audio file reference or metadata object
  • Speech: Text string or TTS configuration object
See the Span Types section below for complete specifications.
"input": "[{\"role\":\"system\",\"content\":\"You are helpful.\"},{\"role\":\"user\",\"content\":\"Hello\"}]"
"input": "Keywords AI is an LLM observability platform"
"input": "{\"query\":\"Help with order #12345\",\"context\":{\"user_id\":\"123\"}}"
output
string | object | array
Universal output field for the span. Structure depends on log_type:
  • Chat: JSON string of completion message or message object directly
  • Embedding: Array of vector embeddings
  • Workflow/Task: Any JSON-serializable result structure
  • Transcription: Transcribed text string
  • Speech: Audio file reference or base64 audio data
"output": "{\"role\":\"assistant\",\"content\":\"Hello! How can I help you?\"}"
"output": "[0.123, -0.456, 0.789, ...]"
log_type
string
default:"chat"
Type of span being logged. Determines how input and output are parsed. Supported types:
  • "chat" - Chat completion requests (default)
  • "completion" - Legacy completion requests
  • "response" - OpenAI Response API
  • "embedding" - Embedding generation
  • "transcription" - Speech-to-text
  • "speech" - Text-to-speech
  • "workflow" or "agent" - Workflow/agent execution
  • "task" or "tool" - Task/tool execution
  • "function" - Function call
  • "generation" - Generation span
  • "handoff" - Agent handoff
  • "guardrail" - Safety check
  • "custom" - Custom span type
If not specified, defaults to "chat". For chat types, the system automatically extracts prompt_messages and completion_message from input and output for backward compatibility. For complete specifications of each type, see log types.
model
string
The model used for the inference. Optional but recommended for chat/completion/embedding types.
"model": "gpt-4o-mini"

Telemetry

Performance metrics and cost tracking for monitoring LLM efficiency.
usage
object
Token usage information for the request.
prompt_tokens
integer
Number of tokens in the prompt/input.
completion_tokens
integer
Number of tokens in the completion/output.
total_tokens
integer
Total tokens (prompt + completion).
prompt_tokens_details
object
Detailed breakdown of prompt tokens (e.g., cached tokens).
cache_creation_prompt_tokens
integer
For Anthropic models: tokens used to create the cache.
{
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 85,
    "total_tokens": 235,
    "prompt_tokens_details": {
      "cached_tokens": 10
    }
  }
}
cost
float
Cost of the inference in US dollars. If not provided, will be calculated automatically based on model pricing.
latency
float
Total request latency in seconds. Previously called generation_time; both field names are accepted for backward compatibility.
time_to_first_token
float
Time to first token (TTFT) in seconds. Useful for streaming responses and voice AI applications.
Previously called ttft. Both field names are supported.
tokens_per_second
float
Generation speed in tokens per second.
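
Putting the telemetry fields together, a single log might report (numbers are illustrative):

{
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 85,
    "total_tokens": 235
  },
  "cost": 0.00045,
  "latency": 2.35,
  "time_to_first_token": 0.42,
  "tokens_per_second": 36.2
}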

Metadata

Custom tracking and identification parameters for advanced analytics and filtering.
metadata
object
You can add any key-value pair to this metadata field for your reference. Useful for custom analytics and filtering.
{
  "metadata": {
    "language": "en",
    "environment": "production",
    "version": "v1.0.0",
    "feature": "chat_support",
    "user_tier": "premium"
  }
}
customer_identifier
string
An identifier for the customer that invoked this request. Helps with visualizing user activities. See customer identifier details.
"customer_identifier": "user_123"
customer_params
object
Extended customer information (alternative to individual customer fields).
customer_identifier
string
Customer identifier.
name
string
Customer name.
email
string
Customer email.
{
  "customer_params": {
    "customer_identifier": "customer_123",
    "name": "John Doe",
    "email": "[email protected]"
  }
}
thread_identifier
string
A unique identifier for the conversation thread. Useful for multi-turn conversations.
custom_identifier
string
Same functionality as metadata, but indexed for faster querying.
"custom_identifier": "ticket_12345"
group_identifier
string
Group identifier. Use to group related logs together.
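
The identifier fields can be combined on one log, for example (all values are illustrative):

{
  "customer_identifier": "user_123",
  "thread_identifier": "thread_67890",
  "custom_identifier": "ticket_12345",
  "group_identifier": "onboarding_flow"
}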

Workflow & tracing

Parameters for distributed tracing and workflow tracking.
trace_unique_id
string
Unique identifier for the trace. Used to link multiple spans together in distributed tracing.
span_workflow_name
string
Name of the workflow this span belongs to.
span_name
string
Name of this specific span/task within the workflow.
span_parent_id
string
ID of the parent span. Used to build the trace hierarchy.
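
For example, a task span nested inside a workflow trace might be logged like this (the IDs and names are illustrative):

{
  "log_type": "task",
  "trace_unique_id": "trace_abc123",
  "span_workflow_name": "customer_support",
  "span_name": "classify_intent",
  "span_parent_id": "span_root_001"
}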

Advanced parameters

Tool calls and function calling

tools
array
A list of tools the model may call. Currently, only functions are supported as a tool.
type
string
required
The type of the tool. Currently, only function is supported.
function
object
required
name
string
required
The name of the function.
description
string
A description of what the function does.
parameters
object
The parameters the function accepts.
"tools": [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]
tool_choice
string | object
Controls which (if any) tool is called by the model. Can be "none", "auto", or an object specifying a specific tool.
"tool_choice": {
    "type": "function",
    "function": {
        "name": "get_current_weather"
    }
}

Response configuration

response_format
object
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs.
  • Text: { "type": "text" } - Default response format
  • JSON Schema: { "type": "json_schema", "json_schema": {...} } - Structured outputs
  • JSON Object: { "type": "json_object" } - Legacy JSON format
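
A structured-outputs configuration, following the OpenAI json_schema shape, might look like this (the schema contents are illustrative):

"response_format": {
  "type": "json_schema",
  "json_schema": {
    "name": "weather_report",
    "strict": true,
    "schema": {
      "type": "object",
      "properties": {
        "city": {"type": "string"},
        "temperature_c": {"type": "number"}
      },
      "required": ["city", "temperature_c"],
      "additionalProperties": false
    }
  }
}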

Model configuration

temperature
number
default:1
Controls randomness in the output (0-2). Higher values produce more random responses.
top_p
number
default:1
Nucleus sampling parameter. Alternative to temperature.
frequency_penalty
number
Penalizes tokens based on their frequency in the text so far.
presence_penalty
number
Penalizes tokens based on whether they appear in the text so far.
max_tokens
integer
Maximum number of tokens to generate.
stop
array[string]
Stop sequences where generation will stop.
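
For example, the sampling parameters of the original request can be logged together (values are illustrative):

{
  "temperature": 0.7,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "max_tokens": 1024,
  "stop": ["\n\n"]
}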

Error handling and status

status_code
integer
default:200
The HTTP status code for the request. Default is 200 (success).
All valid HTTP status codes are supported: 200, 201, 400, 401, 403, 404, 429, 500, 502, 503, 504, etc.
error_message
string
Error message if the request failed. Default is empty string.
warnings
string | object
Any warnings that occurred during the request.
status
string
Request status. Common values: "success", "error".
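
A failed request could be logged like this (values are illustrative):

{
  "status_code": 429,
  "status": "error",
  "error_message": "Rate limit exceeded. Please retry after 60 seconds."
}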

Additional configuration

stream
boolean
default:false
Whether the response was streamed.
prompt_id
string
ID of the prompt template used. See Prompts documentation.
prompt_name
string
Name of the prompt template.
is_custom_prompt
boolean
default:false
Whether the prompt is a custom prompt. Set to true if using custom prompt_id.
timestamp
string
ISO 8601 timestamp when the request completed.
"timestamp": "2025-01-01T10:30:00Z"
start_time
string
ISO 8601 timestamp when the request started.
full_request
object
The full request object. Useful for logging additional configuration parameters.
Tool calls and other nested objects will be automatically extracted from full_request.
full_response
object
The full response object from the model provider.
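
For example, provider-specific parameters and the raw provider response can be preserved alongside the structured fields (contents are illustrative):

{
  "full_request": {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "seed": 42
  },
  "full_response": {
    "id": "chatcmpl-abc123",
    "choices": [{"message": {"role": "assistant", "content": "Hi!"}}]
  }
}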

Pricing configuration

prompt_unit_price
number
Custom price per 1M prompt tokens. Used for self-hosted or fine-tuned models.
"prompt_unit_price": 0.0042  // $0.0042 per 1M tokens
completion_unit_price
number
Custom price per 1M completion tokens. Used for self-hosted or fine-tuned models.
"completion_unit_price": 0.0042  // $0.0042 per 1M tokens

API controls

keywordsai_api_controls
object
Control the behavior of the Keywords AI logging API.
block
boolean
default:true
If false, the server immediately returns initialization status without waiting for log completion.
{
  "keywordsai_api_controls": {
    "block": true
  }
}
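
To log without waiting for the write to complete, set block to false and the API returns immediately:

{
  "keywordsai_api_controls": {
    "block": false
  }
}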
positive_feedback
boolean
Whether the user liked the output. true means positive feedback.