POST /api/audio/transcription

You can use Keywords AI’s unified LLM API to call a speech-to-text model and turn audio into text.

Keywords AI now supports whisper-1 from OpenAI.

Integration steps:

1

Get your OpenAI API key

Go to the OpenAI API platform to get your OpenAI API key.

2

Add your credentials on Keywords AI credentials page

Add your OpenAI API key on the Keywords AI credentials page.

3

Call your speech-to-text model

OpenAI parameters

file
file
required

The audio file object (not the file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

model
string
required

ID of the model to use. Only whisper-1 is currently available.

language
string

The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.

prompt
string

An optional text to guide the model’s style or continue a previous audio segment. The prompt should match the audio language.

response_format
string
default: "json"

The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.

temperature
number
default: 0

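The sampling temperature, between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are hit.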

timestamp_granularities
array
default: "segment"

The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word or segment. Note: segment timestamps add no latency, but generating word timestamps incurs additional latency.
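The parameters above can be assembled into a multipart form request. The following is a minimal sketch, not an official client. It assumes that the base URL is https://api.keywordsai.co, that the request is authenticated with a Bearer token, and that array values are sent as repeated timestamp_granularities[] form fields (the encoding used in OpenAI's own curl examples). Only the path /api/audio/transcription and the parameter names come from this page. The helper names build_form_fields, check_audio_path, and transcribe are hypothetical.

```python
from pathlib import Path

# Values documented in the parameter list above.
SUPPORTED_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
GRANULARITIES = {"word", "segment"}

# Assumption: the host is not stated on this page, only the path is.
API_URL = "https://api.keywordsai.co/api/audio/transcription"


def check_audio_path(path):
    """Return the lowercase file extension, or raise if the format is unsupported."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {ext!r}")
    return ext


def build_form_fields(model="whisper-1", language=None, prompt=None,
                      response_format="json", temperature=0,
                      timestamp_granularities=None):
    """Validate the documented parameters and return them as form fields."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    fields = [
        ("model", model),
        ("response_format", response_format),
        ("temperature", str(temperature)),
    ]
    if language:
        fields.append(("language", language))  # ISO-639-1 code, e.g. "en"
    if prompt:
        fields.append(("prompt", prompt))
    if timestamp_granularities:
        if response_format != "verbose_json":
            raise ValueError("timestamp_granularities requires response_format='verbose_json'")
        unknown = set(timestamp_granularities) - GRANULARITIES
        if unknown:
            raise ValueError(f"unknown granularities: {sorted(unknown)}")
        # Assumption: arrays are encoded as repeated "name[]" form fields.
        fields += [("timestamp_granularities[]", g) for g in timestamp_granularities]
    return fields


def transcribe(path, api_key, **params):
    """Send the audio file; requires the third-party 'requests' package."""
    import requests

    check_audio_path(path)
    with open(path, "rb") as audio:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
            data=build_form_fields(**params),
            files={"file": audio},  # the file object itself, not the file name
            timeout=60,
        )
    response.raise_for_status()
    return response
```

A call such as transcribe("meeting.mp3", api_key, response_format="verbose_json", timestamp_granularities=["word"]) would request word-level timestamps. Validating before sending keeps a mismatched response_format from costing a round trip.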

Keywords AI parameters

See how to make a standard Keywords AI API call in the Quick Start guide.

Generation parameters

customer_credentials
object

You can pass in your customer’s credentials for supported providers and use their credits when our proxy is calling models from those providers.
See details here

disable_log
boolean

When set to true, only the request and performance metrics are recorded. Input and output messages are omitted from the log.

Observability parameters

metadata
dict

You can add any key-value pair to this metadata field for your reference. Check the details of metadata here.

Contact team@keywordsai.co if you need extra parameter support for your use case.

customer_identifier
string

Use this as a tag to identify the user associated with the API call. See the details of customer identifier here.

customer_email
string

This is the email address of the user associated with the API call. You can add your corresponding user’s email address to the request.

You can also edit customers’ emails on the platform. Check the details of user editing here.

thread_identifier
string

Groups logs into a conversation thread. Pass the same thread_identifier with every related request to see their logs in the same thread.

request_breakdown
boolean
default: false

When set to true, a summary of the response is returned in the response body. If streaming is on, the metrics are streamed as the last chunk.
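The Keywords AI parameters listed above can be collected in one place before a request is built. The sketch below only gathers and type-checks the values. This page does not state how they are attached to a multipart transcription request, so that step is left out. The helper name build_keywordsai_params and the sample values are hypothetical.

```python
def build_keywordsai_params(customer_identifier=None, customer_email=None,
                            thread_identifier=None, metadata=None,
                            disable_log=None, request_breakdown=None):
    """Return only the Keywords AI parameters that were explicitly set."""
    # metadata is documented as a dict of arbitrary key-value pairs.
    if metadata is not None and not isinstance(metadata, dict):
        raise TypeError("metadata must be a dict of key-value pairs")
    # disable_log and request_breakdown are documented as booleans.
    for name, flag in (("disable_log", disable_log),
                       ("request_breakdown", request_breakdown)):
        if flag is not None and not isinstance(flag, bool):
            raise TypeError(f"{name} must be a boolean")
    candidates = {
        "customer_identifier": customer_identifier,
        "customer_email": customer_email,
        "thread_identifier": thread_identifier,
        "metadata": metadata,
        "disable_log": disable_log,
        "request_breakdown": request_breakdown,
    }
    # Drop unset values so the API applies its own defaults.
    return {key: value for key, value in candidates.items() if value is not None}


params = build_keywordsai_params(
    customer_identifier="user_123",        # hypothetical identifier
    thread_identifier="support_thread_1",  # hypothetical thread name
    metadata={"plan": "pro"},
    disable_log=True,
)
```

Leaving unset parameters out, rather than sending explicit nulls, avoids overriding defaults such as request_breakdown being false.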