Speech to text

You can use Keywords AI’s unified LLM API to call a speech-to-text model that turns audio into text.

Keywords AI now supports whisper-1 from OpenAI.

Integration steps:

Get your OpenAI API key

Go to the OpenAI API platform to get your OpenAI API key.

Add your OpenAI API key in Providers

Add your OpenAI API key on the Keywords AI credentials page.

Call your speech-to-text model

from openai import OpenAI

client = OpenAI(
    base_url="https://api.keywordsai.co/api/",
    api_key="YOUR_KEYWORDSAI_API_KEY",
)

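# Open the audio in binary mode; the API expects a file object, not a path.
with open("/path/to/file/audio.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        extra_body={"customer_identifier": "customer_11"},  # All Keywords AI parameters are supported
    )

# With the default json response format, the transcript is in response.text.
print(response.text)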

OpenAI parameters

file

required

The audio file object (not the file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

model

string

required

ID of the model to use. Only whisper-1 is currently available.

language

string

The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.

prompt

string

An optional text to guide the model’s style or continue a previous audio segment. The prompt should match the audio language.

response_format

string

default:"json"

The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.

temperature

number

default:0

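The sampling temperature, between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are hit.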

timestamp_granularities

array

default:"segment"

The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word and segment. Note: segment timestamps add no latency, but generating word timestamps incurs additional latency.
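
The parameters above can be combined into a single request. The sketch below is a minimal, hypothetical helper (build_transcription_params is not part of the OpenAI SDK or Keywords AI) that assembles keyword arguments for client.audio.transcriptions.create and enforces the rule that timestamp granularities require the verbose_json response format. The two-letter check on language is only a rough approximation of ISO-639-1 validation.

```python
ALLOWED_GRANULARITIES = {"word", "segment"}


def build_transcription_params(language=None, prompt=None, granularities=None):
    """Assemble keyword arguments for client.audio.transcriptions.create.

    Hypothetical helper for illustration; only the parameter names come
    from the parameter list above.
    """
    params = {"model": "whisper-1"}

    if language is not None:
        # Rough ISO-639-1 shape check: two lowercase letters, e.g. "en".
        if not (len(language) == 2 and language.isalpha() and language.islower()):
            raise ValueError(f"expected an ISO-639-1 code, got {language!r}")
        params["language"] = language

    if prompt is not None:
        params["prompt"] = prompt

    if granularities:
        unknown = set(granularities) - ALLOWED_GRANULARITIES
        if unknown:
            raise ValueError(f"unsupported granularities: {sorted(unknown)}")
        # Timestamps are only populated with the verbose_json format.
        params["response_format"] = "verbose_json"
        params["timestamp_granularities"] = list(granularities)

    return params
```

With the client from the integration steps, the result could be passed along as client.audio.transcriptions.create(file=audio_file, **build_transcription_params(language="en", granularities=["word"])).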

Keywords AI parameters

See how to make a standard Keywords AI API call in the Quick Start guide.

Generation parameters

customer_credentials

object

You can pass in your customer’s credentials for supported providers so that their credits are used when our proxy calls models from those providers.
See details here

disable_log

boolean

When set to true, only the request and performance metrics are recorded; input and output messages are omitted from the log.
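
Keywords AI parameters are not arguments of the OpenAI SDK, so they travel in extra_body, as customer_identifier does in the integration example. The following is a minimal sketch, assuming disable_log is passed the same way; with_keywordsai_params is a hypothetical helper name, not part of either library.

```python
def with_keywordsai_params(openai_params, **keywordsai_params):
    """Return request kwargs with Keywords AI parameters under extra_body.

    Hypothetical helper: OpenAI parameters stay at the top level and
    everything else is grouped into the extra_body dictionary.
    """
    request = dict(openai_params)  # copy so the caller's dict is not mutated
    # Drop parameters left as None so they are not sent at all.
    extra = {k: v for k, v in keywordsai_params.items() if v is not None}
    if extra:
        request["extra_body"] = extra
    return request


request = with_keywordsai_params({"model": "whisper-1"}, disable_log=True)
print(request)
```

The resulting dictionary could then be unpacked into client.audio.transcriptions.create(file=audio_file, **request).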

Observability parameters

metadata

dict

You can add any key-value pair to this metadata field for your reference. Check the details of metadata here.

Contact team@keywordsai.co if you need extra parameter support for your use case.

customer_identifier

string

Use this as a tag to identify the user associated with the API call. See the details of customer identifier here.

customer_email

string

This is the email address of the user associated with the API call. You can add your corresponding user’s email address to the request.

You can also edit customers’ emails on the platform. Check the details of user editing here.

thread_identifier

string

View logs as a conversation thread. Pass the same thread_identifier with every request whose logs should appear in the same thread.

request_breakdown

boolean

default:false

Setting this to true returns a summary of the request under request_breakdown in the response body. If streaming is on, the metrics are streamed as the last chunk.

Properties

{
  "id": "chatcmpl-7476cf3f-fcc9-4902-a548-a12489856d8a",
  //... main part of the response body ...
  "request_breakdown": {
    "prompt_tokens": 6,
    "completion_tokens": 9,
    "cost": 4.8e-5,
    "prompt_messages": [
      {
        "role": "user",
        "content": "How are you doing today?"
      }
    ],
    "completion_message": {
      "content": " I'm doing well, thanks for asking!",
      "role": "assistant"
    },
    "model": "claude-2",
    "cached": false,
    "timestamp": "2024-02-20T01:23:39.329729Z",
    "status_code": 200,
    "stream": false,
    "latency": 1.8415491580963135,
    "scores": {},
    "category": "Questions",
    "metadata": {},
    "routing_time": 0.18612787732854486,
    "full_request": {
      "messages": [
        {
          "role": "user",
          "content": "How are you doing today?"
        }
      ],
      "model": "claude-2",
      "logprobs": true
    },
    "sentiment_score": 0
  }
}
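
As a rough illustration of how the observability parameters and the breakdown fit together, the sketch below lists the parameters from this section as they might be sent in extra_body, then reads a few metrics from a trimmed copy of the sample body above. The placeholder values and the summarize_breakdown helper are hypothetical. The sample comes from a chat completion, so the fields returned for a transcription request may differ, and the sketch assumes the response body has already been decoded into a dictionary.

```python
import json

# Observability parameters from this section, as they might be sent in
# extra_body. All values are placeholders.
observability_params = {
    "customer_identifier": "customer_11",
    "thread_identifier": "thread_1",
    "metadata": {"feature": "voice_notes"},
    "request_breakdown": True,
}

# A trimmed copy of the sample response body shown above.
sample_body = json.loads("""
{
  "id": "chatcmpl-7476cf3f-fcc9-4902-a548-a12489856d8a",
  "request_breakdown": {
    "prompt_tokens": 6,
    "completion_tokens": 9,
    "cost": 4.8e-5,
    "model": "claude-2",
    "cached": false,
    "status_code": 200,
    "latency": 1.8415491580963135
  }
}
""")


def summarize_breakdown(body):
    """Pick a few metrics out of request_breakdown, if it is present."""
    breakdown = body.get("request_breakdown")
    if breakdown is None:
        return None  # the breakdown was not requested or not returned
    return {
        "model": breakdown["model"],
        "total_tokens": breakdown["prompt_tokens"] + breakdown["completion_tokens"],
        "cost": breakdown["cost"],
        "latency": round(breakdown["latency"], 3),
    }


print(summarize_breakdown(sample_body))
```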
