Whisper V3 Turbo

A low-latency Whisper V3 Turbo deployment optimized for shorter audio clips

Model details

Developed by: OpenAI
Model family: Whisper
Use case: transcription
Version: V3
Variant: Turbo
Hardware: H100 MIG 40GB
License: MIT

Example usage

The model accepts a single URL pointing to an audio file, such as an .mp3 or .wav file. The audio should contain clearly audible speech. This example transcribes a ten-second snippet of a recitation of the Gettysburg Address.

The JSON output includes the auto-detected language code and the transcription segments, each with start and end timestamps and the transcribed text.

Input

import requests
import os

# Model ID for production deployment
model_id = ""
# Read secrets from environment variables
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Call model endpoint
resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "url": "https://www2.cs.uic.edu/~i101/SoundFiles/gettysburg10.wav",
    },
)

print(resp.content.decode("utf-8"))
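The example leaves model_id empty, so running it unchanged would target a malformed endpoint URL. A minimal sketch of a guard follows. The helper name build_transcription_request is hypothetical and not part of any Baseten SDK. It assembles the same URL, header, and payload as the example and fails early when the model ID or API key is missing.

```python
def build_transcription_request(model_id: str, api_key: str, audio_url: str) -> dict:
    """Assemble keyword arguments for requests.post (hypothetical helper)."""
    if not model_id:
        raise ValueError("model_id is empty; set it to your deployment's model ID")
    if not api_key:
        raise ValueError("api_key is empty; set BASETEN_API_KEY")
    return {
        # Same endpoint pattern, header, and payload shape as the example request
        "url": f"https://model-{model_id}.api.baseten.co/production/predict",
        "headers": {"Authorization": f"Api-Key {api_key}"},
        "json": {"url": audio_url},
    }
```

The returned dictionary can be unpacked into the call, for example requests.post(**build_transcription_request(model_id, baseten_api_key, audio_url)).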

JSON output

{
    "segments": [
        {
            "start": 0,
            "end": 9.8,
            "text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal."
        }
    ],
    "language_code": "en"
}
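To work with this output programmatically, the response body can be parsed and its segments combined. The sketch below assumes the response always has the shape shown above (a "segments" list whose items carry "start", "end", and "text") and that the timestamps are in seconds. The function names are illustrative, and the sample string is a shortened stand-in for a real response.

```python
import json


def format_segments(response_text: str) -> list[str]:
    """Render each segment as '[start-end] text' (timestamps assumed to be seconds)."""
    data = json.loads(response_text)
    return [
        f"[{seg['start']:.1f}s-{seg['end']:.1f}s] {seg['text'].strip()}"
        for seg in data.get("segments", [])
    ]


def full_text(response_text: str) -> str:
    """Join the text of all segments into a single transcript string."""
    data = json.loads(response_text)
    return " ".join(seg["text"].strip() for seg in data.get("segments", []))


# Shortened stand-in for a response body, used only for illustration
sample = json.dumps({
    "segments": [{"start": 0, "end": 9.8, "text": "Four score and seven years ago"}],
    "language_code": "en",
})

for line in format_segments(sample):
    print(line)
```

With the request example above, the same functions could be called on resp.text once the request has succeeded.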