POST https://api.keywordsai.co/api/datasets/
Create dataset
curl --request POST \
  --url https://api.keywordsai.co/api/datasets/ \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "<string>",
  "description": "<string>",
  "sampling": 123,
  "start_time": "<string>",
  "end_time": "<string>",
  "is_empty": true,
  "initial_log_filters": {}
}
'
Creates a new dataset, either populated from existing logs or empty for manual population. To populate it, specify a time range, optional filters, and a sampling rate that together select which logs to include. To add logs manually later, create an empty dataset instead.

Authentication

All endpoints require API key authentication. Send the key as a bearer token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
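
As a minimal sketch of the header format above, the helper below builds the request headers from an API key. The function name `build_headers` and the environment variable `KEYWORDSAI_API_KEY` are assumptions made for illustration, not names defined by the API.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return headers for an authenticated JSON request (hypothetical helper)."""
    if not api_key:
        raise ValueError("an API key is required")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# KEYWORDSAI_API_KEY is an assumed variable name; fall back to the placeholder.
headers = build_headers(os.environ.get("KEYWORDSAI_API_KEY", "YOUR_API_KEY"))
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.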

Parameters

name
string
required
The name of the dataset.
description
string
A description of the dataset.
sampling
integer
default: 100
Percentage of logs to include (0-100).
start_time
string
ISO 8601 timestamp marking the start of the time range used to select logs. Required when is_empty is false or omitted; ignored when is_empty is true.
end_time
string
ISO 8601 timestamp marking the end of the time range used to select logs. Required when is_empty is false or omitted; ignored when is_empty is true.
is_empty
boolean
default: false
If true, creates an empty dataset. In that case start_time and end_time are ignored.
initial_log_filters
object
Filters used to select logs for the dataset. Each key names a log field and maps to an operator and a value, for example:
{
  "id": {
    "operator": "in",
    "value": ["log_id_1", "log_id_2"]
  }
}
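
The parameter rules above (required name, sampling between 0 and 100, time range required unless is_empty is true) can be checked client-side before sending a request. The following is a sketch under stated assumptions: `build_dataset_payload` and `log_id_filter` are hypothetical helper names, and dropping sampling and filters from the body of an empty dataset is a choice made here, not documented behavior.

```python
from typing import Any, Optional


def log_id_filter(log_ids: list[str]) -> dict[str, Any]:
    """Build an `id` filter with the `in` operator, as in the example above."""
    return {"id": {"operator": "in", "value": list(log_ids)}}


def build_dataset_payload(
    name: str,
    description: Optional[str] = None,
    sampling: int = 100,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    is_empty: bool = False,
    initial_log_filters: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Validate the documented constraints and return a request body."""
    if not name:
        raise ValueError("name is required")
    if not 0 <= sampling <= 100:
        raise ValueError("sampling must be between 0 and 100")

    payload: dict[str, Any] = {"name": name, "is_empty": is_empty}
    if description is not None:
        payload["description"] = description

    if is_empty:
        # start_time and end_time are ignored for empty datasets, so omit them.
        # Omitting sampling and filters as well is an assumption of this sketch.
        return payload

    if not start_time or not end_time:
        raise ValueError("start_time and end_time are required unless is_empty is true")

    payload.update(sampling=sampling, start_time=start_time, end_time=end_time)
    if initial_log_filters:
        payload["initial_log_filters"] = initial_log_filters
    return payload
```

For example, `build_dataset_payload("Scratch", is_empty=True)` produces a body with only name and is_empty, while omitting the time range on a populated dataset raises `ValueError` before any request is made.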

Request Examples

import requests

url = "https://api.keywordsai.co/api/datasets/"

# Sample 40% of the July logs whose status_code equals 200.
payload = {
    "name": "Support Conversations - July",
    "description": "Sampled support chats for July",
    "sampling": 40,
    "start_time": "2025-07-01T00:00:00Z",
    "end_time": "2025-07-31T23:59:59Z",
    "initial_log_filters": {
        "status_code": {
            "operator": "eq",
            "value": 200
        }
    }
}
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# json= serializes the payload to a JSON request body.
response = requests.post(url, headers=headers, json=payload)

print(response.text)
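
The example above covers the populated case. For the empty case, a request might look like the sketch below, which uses only the standard library and builds the request without sending it. The helper name `empty_dataset_request` and the dataset name are illustrative assumptions.

```python
import json
import urllib.request

API_URL = "https://api.keywordsai.co/api/datasets/"


def empty_dataset_request(name: str, api_key: str, description: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST request that creates an empty dataset."""
    # is_empty is true, so start_time and end_time are left out of the body.
    body = {"name": name, "is_empty": True}
    if description:
        body["description"] = description
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = empty_dataset_request("Manual test cases", "YOUR_API_KEY")
# To send it: with urllib.request.urlopen(req) as resp: print(resp.read().decode())
```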

Response

{
  "id": "6d0b2c7e-3a6a-4c09-9c7e-1f2d9e2d3f0a",
  "name": "Support Conversations - July",
  "description": "Sampled support chats for July",
  "type": "sampling",
  "status": "ready",
  "log_count": 250,
  "created_at": "2025-07-26T00:00:00Z"
}
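
To work with the response in code, one option is to parse it into a typed object. The sketch below assumes the response always carries the seven fields shown above; the `Dataset` class and `parse_dataset` function are hypothetical names, not part of any official SDK.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Dataset:
    id: str
    name: str
    description: str
    type: str
    status: str
    log_count: int
    created_at: datetime


def parse_dataset(body: str) -> Dataset:
    """Parse the JSON response body into a Dataset (fields as in the sample above)."""
    data = json.loads(body)
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    return Dataset(
        id=data["id"],
        name=data["name"],
        description=data.get("description", ""),
        type=data["type"],
        status=data["status"],
        log_count=data["log_count"],
        created_at=created,
    )
```

With the `requests` example earlier on this page, this would be called as `parse_dataset(response.text)`.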