Best Audio Datasets & Databases
Easily explore, compare & preview top Audio Datasets via Datarade.
46 Results
Mixed Speech Data | 5,000 Hours | Code-switching | Audio Data | Speech Recognition Data | AI Datasets
by
Nexdata
Audio Data and 800TB of Annotated Imagery Data. ... The audio data is rich in content and accurate in transcription.
Available for 29 countries
50K Hours
5 years of historical data
98% sentence/word
Starts at
$20,000 / purchase
Free sample preview
TagX Data collection for AI/ML training | LLM data | Data collection for AI development & model finetuning | Text, image, audio, and document data
by
TagX
Whether you need raw data or a processed dataset, we can deliver the data in your preferred format, including ... We provide In-field data collection for speech, image, text, and survey data.
Available for 249 countries
10K images/document
99%
Starts at
$1,000 / month
Shaip - Multilingual Conversational AI Training Data (Text & Audio)
by
ShAIp
We offered audio data collection and transcription services based on their requirements while fully customizing ...
Available for 215 countries
20K Hours of Audio
95% Match Rate
Available Pricing:
One-off purchase
Norwegian audio dataset for speech recognition 20 hours (1/5)
by
StageZero
resell the data. ... - Maximum four hours of speech per person in the dataset.
Available for 1 country
20 hours
Starts at
€2,500 / purchase
Multi-lingual audio recognition service dataset
by
Overtone
Overtone's APIs allow customers to use our state of the art machine learning and large language models ... We can ingest various types of audio content (including speech, video) and generate text output (STT)
Available for 157 countries
Pricing available upon request
AI-Machine Learning Sound / Audio / Snippet Recordings Database
by
SoundPrint
Snippets database has sound / audio / sonic recordings across all kinds of venues (restaurants, bars, ...
Available for 249 countries
2 years of historical data
Pricing available upon request
Deeply Korean Read Speech Corpus - Audio AI & ML Training Data
by
Deeply
The Read Speech dataset consists of 289.9 hours of audio clips of reading the scripts with 3 text sentiments ... The dataset also includes metadata such as a script(speech-to-text aligned), speaker, age, sex, noise
Available for 1 country
190K records
99% Validity
Pricing available upon request
Data Collection by EPIC Translations: Copywriting, Text & Audio Data for AI & ML Training
Text Data Collection 6. Audio Data Collection 7. Chatbot Training Data 8. Copywriting 9. ... Data Entry 11. Data Mining 12.
Available for 215 countries
50K sentences
12 weeks of historical data
100% match rate
Pricing available upon request
10% Datarade discount
10% revenue share
Speech Synthesis Data | 400 Hours | TTS Data | Audio Data | AI Training Data | AI Datasets
by
Nexdata
Audio Data and 800TB of Annotated Imagery Data. ... Speech Synthesis speech data is recorded by native speaker, with authentic accent and sweet sound.
Available for 62 countries
400 hours
5 years of historical data
95% sentence accuracy
Starts at
$20,000 / purchase
Free sample preview
Data Collection by Shaip: Text, Audio, Image, Video for AI & ML Training
by
ShAIp
data across multiple data types, including text, audio, speech, image & video data to manage complex ... multi-lingual text data (Business Card Dataset, Document Dataset, Menu Dataset, Receipt Dataset, Ticket
Available for 213 countries
10 years of historical data
95% match rate
Available Pricing:
One-off purchase
More Audio Data Products
Discover related audio data products.
400 hours
95% sentence accuracy
62 countries covered
Speech Synthesis speech data is recorded by native speaker, with authentic accent and sweet sound. The phoneme coverage is balanced. Professional phonetician...
100K hours per month
99.5% word accuracy
119 countries covered
Nexdata is equipped with professional recording equipment and has resources pool of 70+ countries and regions, and provide various types of speech recognitio...
160K Records
236 countries covered
1 year of historical data
The world’s largest noise complaint dataset with over 160K reports including labeled noise sources. Ideal for AI training in acoustic event detection and urb...
20K voice memos
240 countries covered
We help clients source, curate, and transcribe data for AI and machine learning models. Our services include customized audio data collection and transcripti...
10M Hours
95% Precision
236 countries covered
Starter dataset for AI teams with sampled noise (from 10M+ hours of measurements), mobility, and POI data. Ideal for rapid prototyping and AI research. CSV o...
35B Data Points
95% Precision
236 countries covered
Combines 10M+ hours of noise data with mobility and POI visitation data. Ideal for AI models combining environmental, mobility, and behavioral signals. CSV o...
190K records
99% Validity
South Korea covered
Pairs of Korean speakers reading a script with 3 distinct text sentiments, with 3 distinct voice sentiments, are recorded. The recordings took place in 3 dif...
65K Hours
98% sentence/word
103 countries covered
Off-the-shelf Scripted Monologues Speech Datasets cover 100+ languages. All the Machine Learning (ML) Data are collected from native speakers, with signed au...
20 hours
Norway covered
The fourth part of 20 hours of Norwegian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise and high-qual...
10M hours
236 countries covered
2 years of historical data
Silencio provides the world’s largest real-world street and venue noise-level dataset, combining over 35 billion datapoints with AI-powered interpolation. Fu...
10M Measurements
95% Precision
237 countries covered
Street noise-level data from any city. Analyze noise exposure across 200+ countries for risk modeling, real estate, AI-training and health studies. Real meas...
35B Data Points
95% Accuracy
236 countries covered
Interpolated noise dataset built on 10M+ hours of real-world acoustic data combined with AI-generated predictions. Ideal for map generation, AI training, and...
35B Data Points
3.7% Horizontal Accuracy (Meters)
236 countries covered
Time-series dataset based on 10M+ hours of measured dBA data. Includes hourly, daily, and seasonal noise patterns. Ideal for AI models focused on forecasting...
160K Records
236 countries covered
2 years of historical data
Contains user-submitted noise complaints recorded via mobile devices. Each entry captures the time, location, type of noise source, and the emotional respons...