Whisper V3 Turbo

A low-latency Whisper V3 Turbo deployment optimized for shorter audio clips

Example usage

The model accepts a single URL pointing to an audio file, such as an .mp3 or .wav file. The audio should contain clearly audible speech. This example transcribes a ten-second snippet of a recitation of the Gettysburg Address.

The JSON output includes the auto-detected language code and the transcription segments, each with start and end timestamps and its transcribed text.

Input
import requests
import os

# Model ID for production deployment
model_id = ""
# Read secrets from environment variables
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Call model endpoint
resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "url": "https://www2.cs.uic.edu/~i101/SoundFiles/gettysburg10.wav",
    }
)

print(resp.content.decode("utf-8"))
JSON output
{
    "segments": [
        {
            "start": 0,
            "end": 9.8,
            "text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal."
        }
    ],
    "language_code": "en"
}
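
Once the response arrives, the segments can be turned into timestamped lines and a single transcript string. The following is a minimal sketch, assuming the response has exactly the shape shown in the JSON output above (a "segments" list and a "language_code" field). The helper name summarize_transcription and the SAMPLE_RESPONSE constant are hypothetical and exist only for illustration; they are not part of any Baseten or Whisper API.

```python
import json
from typing import List, Tuple

# Sample payload copied from the JSON output above (assumed response shape).
SAMPLE_RESPONSE = """
{
    "segments": [
        {
            "start": 0,
            "end": 9.8,
            "text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal."
        }
    ],
    "language_code": "en"
}
"""


def summarize_transcription(payload: dict) -> Tuple[str, str, List[str]]:
    """Hypothetical helper: return (language_code, full_text, timestamped_lines)."""
    # Fall back to empty values if a field is missing from the payload.
    segments = payload.get("segments", [])
    # Join the per-segment text into one transcript string.
    full_text = " ".join(seg["text"].strip() for seg in segments)
    # Format each segment with its start and end time in seconds.
    lines = [
        f"[{seg['start']:.1f}s - {seg['end']:.1f}s] {seg['text'].strip()}"
        for seg in segments
    ]
    return payload.get("language_code", ""), full_text, lines


if __name__ == "__main__":
    language, text, lines = summarize_transcription(json.loads(SAMPLE_RESPONSE))
    print(f"Detected language: {language}")
    for line in lines:
        print(line)
```

With a live call, the same helper could be applied to resp.json() in place of the sample string, ideally after checking the HTTP status with resp.raise_for_status().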
