API Documentation
Complete guide to integrating ThunderScribe's transcription API into your applications
API Access Required
API access is not available for all users. You need to apply for access.
To request API access, please contact our support team at [email protected] with your use case and requirements.
Quick Start
Get started with our transcription API in minutes
Transcription endpoint: https://thunderscribe.ai/api/v3/audio/transcriptions
Upload endpoint (for large files): https://thunderscribe.ai/api/v1/upload
OpenAI Compatible: Use the OpenAI Python SDK with our API endpoint for seamless integration.
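A minimal sketch of the OpenAI SDK route, assuming the SDK's `base_url` should point at `https://thunderscribe.ai/api/v3` (the SDK appends `/audio/transcriptions` itself) and that the model name is `thunderscribe`. The helper names `client_options` and `transcribe_with_openai_sdk` are hypothetical, and the exact base URL should be confirmed with support.

```python
# Assumed base URL: the transcription endpoint minus "/audio/transcriptions".
BASE_URL = "https://thunderscribe.ai/api/v3"


def client_options(api_token: str) -> dict:
    """Keyword arguments for the OpenAI client (hypothetical helper)."""
    return {"api_key": api_token, "base_url": BASE_URL}


def transcribe_with_openai_sdk(api_token: str, file_path: str):
    """Send one audio file through the OpenAI Python SDK."""
    # Imported lazily so the helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(**client_options(api_token))
    with open(file_path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            model="thunderscribe",  # assumed model name
            file=audio_file,
            language="en",
            response_format="json",
        )
```

Calling `transcribe_with_openai_sdk("YOUR_API_TOKEN", "audio.mp3")` would issue the same POST as the direct `requests` examples, provided the assumed base URL is right.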
Authentication
Get your API token to authenticate your requests
Getting your API Token:
- Visit: https://thunderscribe.ai/dashboard?profileOpen=true
- Copy your API Token from the profile section
- Include it in the Authorization header as shown below
Include this token in your request headers:
Authorization: Bearer YOUR_API_TOKEN
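The header can be built by a small helper so the token never has to be hard-coded. This is a sketch only: the environment variable name `THUNDERSCRIBE_API_TOKEN` and both function names are hypothetical.

```python
import os


def auth_headers(api_token: str) -> dict:
    """Build the Authorization header in the Bearer format shown above."""
    token = api_token.strip()
    if not token:
        raise ValueError("API token is empty")
    return {"Authorization": f"Bearer {token}"}


def token_from_env(var_name: str = "THUNDERSCRIBE_API_TOKEN") -> str:
    """Read the token from an environment variable (the name is an assumption)."""
    return os.environ.get(var_name, "")
```

For example, `auth_headers(token_from_env())` can be passed as the `headers` argument of any request.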
API Examples
Multiple ways to interact with our transcription API
Python with requests (Direct Upload)
import requests

api_token = "YOUR_API_TOKEN"
file_path = "audio.mp3"

# Open the file in a context manager so the handle is closed after the upload.
with open(file_path, "rb") as audio_file:
    response = requests.post(
        "https://thunderscribe.ai/api/v3/audio/transcriptions",
        headers={
            "Authorization": f"Bearer {api_token}",
        },
        files={"file": audio_file},
        data={
            "language": "en",  # Optional
            "model": "base",  # Optional
            "response_format": "json",  # text, json, verbose_json, srt, vtt
            "temperature": 0.0,  # Optional
            "prompt": "Custom prompt",  # Optional
            "diarization": True,  # Optional speaker diarization
            "split_segments_by_speaker": True,  # Optional
            "max_speakers": 5,  # Optional
            "min_speakers": 2,  # Optional
            "num_speakers": 3,  # Optional
            "timestamp_granularities": ["segment", "word"],  # Optional
            "folder_id": "folder_uuid",  # Optional
        },
    )

print(response.json())
Python with requests (Large File Upload)
import requests

api_token = "YOUR_API_TOKEN"
file_path = "large_audio.mp3"

# First upload the file; the context manager closes the handle afterwards.
with open(file_path, "rb") as audio_file:
    upload_response = requests.post(
        "https://thunderscribe.ai/api/v1/upload",
        headers={
            "Authorization": f"Bearer {api_token}",
        },
        files={"file": audio_file},
    )

file_id = upload_response.json()["file_id"]

# Then transcribe using file_id instead of sending the file again.
transcribe_response = requests.post(
    "https://thunderscribe.ai/api/v3/audio/transcriptions",
    headers={
        "Authorization": f"Bearer {api_token}",
    },
    data={
        "file_id": file_id,
        "language": "en",
        "response_format": "json",
        "diarization": True,
    },
)

print(transcribe_response.json())
Chunked Upload Workflow
How to handle large file uploads with chunking
When to Use Chunked Upload:
- Files larger than 10MB
- Unreliable network connections
- Need for upload progress tracking
- Resume capability for interrupted uploads
Chunked Upload Process:
- First Chunk: Send the first chunk with the `chunked: true` header. The server returns a `file_hash` for subsequent chunks.
- Subsequent Chunks: Send each chunk with the `file-hash` header received from the first chunk.
- Final Chunk: Send the last chunk with the `last-chunk: true` header. The server returns the final `file_id`.
- Transcribe: Use the `file_id` to start transcription via the `/api/v1/audio/transcriptions` endpoint.
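The upload steps above can be sketched as follows. Several details are assumptions rather than documented behavior: the 5 MiB chunk size, sending each chunk as a multipart `file` field, repeating `chunked: true` on every chunk, and reading `file_hash` and `file_id` from the JSON body. The helper names are hypothetical.

```python
from typing import BinaryIO, Callable, Iterator, Optional, Tuple

UPLOAD_URL = "https://thunderscribe.ai/api/v1/upload"
CHUNK_SIZE = 5 * 1024 * 1024  # assumed; no chunk size is documented


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[bytes, bool]]:
    """Yield (chunk, is_last) pairs, reading one chunk ahead to spot the end."""
    current = stream.read(chunk_size)
    while current:
        upcoming = stream.read(chunk_size)
        yield current, not upcoming
        current = upcoming


def chunk_headers(api_token: str, file_hash: Optional[str], is_last: bool) -> dict:
    """Headers for one chunk, following the three-step process above."""
    headers = {"Authorization": f"Bearer {api_token}", "chunked": "true"}
    if file_hash is not None:
        headers["file-hash"] = file_hash  # returned by the first chunk
    if is_last:
        headers["last-chunk"] = "true"
    return headers


def upload_chunked(stream: BinaryIO, api_token: str,
                   post: Optional[Callable] = None,
                   chunk_size: int = CHUNK_SIZE) -> str:
    """Upload a stream chunk by chunk and return the final file_id."""
    if post is None:
        import requests  # imported lazily so `post` can be swapped out
        post = requests.post
    file_hash = None
    body: dict = {}
    for chunk, is_last in iter_chunks(stream, chunk_size):
        response = post(
            UPLOAD_URL,
            headers=chunk_headers(api_token, file_hash, is_last),
            files={"file": chunk},  # assumed multipart field name
        )
        response.raise_for_status()
        body = response.json()
        file_hash = body.get("file_hash", file_hash)
    if "file_id" not in body:
        raise ValueError("upload finished without a file_id (empty file or unexpected response)")
    return body["file_id"]
```

A typical call would be `with open("large_audio.mp3", "rb") as f: file_id = upload_chunked(f, "YOUR_API_TOKEN")`, after which `file_id` is passed to the transcription request. The `post` parameter exists only so the upload loop can be exercised without network access.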
Parameters
Available parameters for the transcription API
Audio/Transcription Parameters
Parameter | Type | Required | Description |
---|---|---|---|
file | file | Required* | The audio file to transcribe (*or use file_id) |
file_id | string | Required* | ID of previously uploaded file (*or use file) |
model | string | Optional | Model version (currently only `thunderscribe`) |
language | string | Required* | Language code (e.g., 'en', 'es', 'fr'). |
response_format | string | Optional | Response format: text, json, verbose_json, srt, vtt (default: text) |
temperature | float | Optional | Sampling temperature (0.0 to 1.0) |
prompt | string | Optional | Custom prompt to guide the model |
timestamp_granularities | array | Optional | Array of ["segment", "word"] for timestamp levels |
folder_id | string | Optional | Folder ID for organization (Default `API`) |
Diarization Parameters
Parameter | Type | Required | Description |
---|---|---|---|
diarization | boolean | Optional | Enable speaker diarization (true/false) |
split_segments_by_speaker | boolean | Optional | Split segments by speaker (default: true) |
max_speakers | integer | Optional | Maximum number of speakers to detect |
min_speakers | integer | Optional | Minimum number of speakers to detect |
num_speakers | integer | Optional | Exact number of speakers (if known) |
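A sketch of how the diarization fields might be assembled before a request. Two assumptions are made here that the table does not state: booleans are sent as the lowercase strings `"true"` and `"false"`, and `num_speakers`, when given, is used instead of the min/max range. The function name is hypothetical.

```python
from typing import Optional


def diarization_fields(num_speakers: Optional[int] = None,
                       min_speakers: Optional[int] = None,
                       max_speakers: Optional[int] = None,
                       split_segments_by_speaker: bool = True) -> dict:
    """Build form fields that enable diarization with optional speaker hints."""
    if (min_speakers is not None and max_speakers is not None
            and min_speakers > max_speakers):
        raise ValueError("min_speakers must not exceed max_speakers")
    fields = {
        "diarization": "true",  # assumed boolean encoding
        "split_segments_by_speaker": "true" if split_segments_by_speaker else "false",
    }
    if num_speakers is not None:
        # Assumption: an exact count makes the range hints unnecessary.
        fields["num_speakers"] = str(num_speakers)
    else:
        if min_speakers is not None:
            fields["min_speakers"] = str(min_speakers)
        if max_speakers is not None:
            fields["max_speakers"] = str(max_speakers)
    return fields
```

The returned dictionary can be merged into the `data` argument of the transcription request, for example `data={"language": "en", **diarization_fields(min_speakers=2, max_speakers=5)}`.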
Upload Parameters (for chunked uploads)
Header | Type | Required | Description |
---|---|---|---|
chunked | string | Optional | Set to "true" for chunked uploads |
file-hash | string | Optional | File hash for chunk continuation |
last-chunk | string | Optional | Set to "true" for the final chunk |
Response Format
Expected response structure from the API
Text Response (response_format: "text")
{
"text": "This is the complete transcribed text from the audio file."
}
JSON Response (response_format: "json" or "verbose_json")
{
"task": "transcribe",
"language": "en",
"duration": 125.5,
"text": "This is the complete transcribed text.",
"segments": [
{
"start": 0.0,
"end": 5.2,
"text": "This is the transcribed text.",
"speaker": "Speaker 1",
"words": [
{
"start": 0.0,
"end": 0.5,
"word": "This"
},
{
"start": 0.5,
"end": 1.0,
"word": "is"
}
]
}
],
"diarization_segments": [
{
"start": 0.0,
"end": 5.2,
"speaker": "Speaker 1"
}
]
}
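As an illustration of consuming this structure, the sketch below merges consecutive segments from the same speaker into turns. It relies only on the `segments`, `speaker`, and `text` fields shown above; the function name and the `"Unknown"` fallback label are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def speaker_turns(response: dict) -> List[Tuple[str, str]]:
    """Merge consecutive same-speaker segments into (speaker, text) turns."""
    segments = response.get("segments", [])
    turns = []
    # groupby only groups adjacent items, which is exactly what a turn is.
    for speaker, group in groupby(segments, key=lambda s: s.get("speaker", "Unknown")):
        text = " ".join(segment["text"].strip() for segment in group)
        turns.append((speaker, text))
    return turns
```

Passing `response.json()` from a diarized request to `speaker_turns` gives a list that can be printed as a simple dialogue transcript.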
SRT/VTT Response (response_format: "srt" or "vtt")
1
00:00:00,000 --> 00:00:05,200
This is the transcribed text.

2
00:00:05,200 --> 00:00:10,400
This is the next segment of text.
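To show how the SRT cues relate to the `start` and `end` values in the JSON response, here is a client-side sketch that renders segments as SRT. It illustrates the format only and is not a description of how the server produces its output; both function names are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style used in SRT cues."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, segment in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(segment['start'])} --> {srt_timestamp(segment['end'])}\n"
            f"{segment['text'].strip()}\n"
        )
    return "\n".join(cues)
```

This can be handy when a request was made with `response_format` set to `json` and subtitles are needed later without a second API call.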
Upload Response (/api/v1/upload)
{
"file_id": "uuid-string-here",
"file_hash": "hash-string-here"
}
Response Fields
- task: Type of task performed (transcribe)
- language: Detected or specified language code
- duration: Duration of audio file in seconds
- text: Complete transcribed text
- segments: Array of text segments with timestamps and speaker information
- diarization_segments: Array of speaker diarization segments (if enabled)
- words: Word-level timestamps (if timestamp_granularities includes "word")
Error Handling
Common error responses and how to handle them
401 Unauthorized
{
"error": "Invalid or missing API token",
"message": "Please provide a valid Authorization header"
}
400 Bad Request
{
"error": "Invalid file format",
"message": "Supported formats"
}
429 Too Many Requests
{
"error": "Rate limit exceeded",
"message": "Please wait before making another request"
}
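One way to handle these responses is to retry on 429 with exponential backoff and raise immediately on other errors. The sketch below assumes error bodies always carry the `error` and `message` keys shown above; the retry count, delays, and all names are illustrative choices, not documented limits.

```python
import time
from typing import Callable


class TranscriptionError(Exception):
    """Raised for any 4xx/5xx response; keeps the status and JSON body."""

    def __init__(self, status_code: int, payload: dict):
        super().__init__(f"{status_code}: {payload.get('error')} - {payload.get('message')}")
        self.status_code = status_code
        self.payload = payload


def post_with_retry(send: Callable, max_attempts: int = 4,
                    base_delay: float = 1.0, sleep: Callable = time.sleep) -> dict:
    """Call send() until it succeeds, backing off only on 429 responses.

    `send` is a zero-argument callable returning an object with
    `.status_code` and `.json()`, such as a requests.Response.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code == 429 and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            continue
        if response.status_code >= 400:
            raise TranscriptionError(response.status_code, response.json())
        return response.json()
```

With `requests`, `send` could be `lambda: requests.post(url, headers=headers, data=data)`. When a file object is uploaded, it needs to be rewound with `seek(0)` before each retry.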
Rate Limits & File Constraints
API usage limits and file size constraints
File Limits
- Maximum file size: 5GB
- Maximum duration: 2 hours
- Chunked upload for files > 10MB
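These limits can be checked on the client before any bytes are sent. The sketch below assumes binary units (1 MB = 1024 × 1024 bytes), which this page does not specify, and uses hypothetical names throughout. Duration is not checked because it cannot be read from the file size alone.

```python
MB = 1024 * 1024  # assumed binary units
GB = 1024 * MB
MAX_FILE_BYTES = 5 * GB
CHUNKED_THRESHOLD_BYTES = 10 * MB


def choose_upload_strategy(size_bytes: int) -> str:
    """Return "direct" or "chunked" for a file size, or raise if out of range."""
    if size_bytes <= 0:
        raise ValueError("file is empty")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 5GB maximum")
    return "chunked" if size_bytes > CHUNKED_THRESHOLD_BYTES else "direct"
```

In practice the size would come from `os.path.getsize(file_path)`.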
Best Practices
- Use chunked upload for large files (>10MB)
- Implement proper error handling
- Monitor upload progress for large files
Rate Limits
Rate limits vary by plan and are enforced per API key. Contact support for specific limits for your account.
Support
Need help? We're here to assist you