Best Audio Datasets & Databases
Easily explore, compare & preview top Audio Datasets via Datarade.
47 Results
Mixed Speech Data | 5,000 Hours | Code-switching | Audio Data | Speech Recognition Data | AI Datasets
by
Nexdata
Audio Data and 800TB of Annotated Imagery Data. ... The audio data is rich in content and accurate in transcription.
Available for 29 countries
50K Hours
5 years of historical data
98% sentence/word
Starts at
$20,000 / purchase
Free sample preview
WebAutomation Off the Shelf Datasets | Audio Data for AI & ML Training | 600+ Hours of Recording | Speech Recognition, Natural Language Processing
We offer a comprehensive collection of audio data, amounting to over 600 hours of high-quality recordings ... High-Quality Recordings: We prioritize the quality of our audio data, ensuring clear and professional
Available for 64 countries
600 Hours of Recording
Pricing available upon request
TagX Data collection for AI/ ML training | LLM data | Data collection for AI development & model finetuning | Text, image, audio, and document data
by
TagX
Whether you need raw data or a processed dataset, we can deliver the data in your preferred format, including ... We provide In-field data collection for speech, image, text, and survey data.
Available for 249 countries
10K images/document
99%
Starts at
$1,000 / month
Shaip - Multilingual Conversational AI Training Data (Text & Audio)
by
ShAIp
We offered audio data collection and transcription services based on their requirements while fully customizing ...
Available for 215 countries
20K Hours of Audio
95% Match Rate
Available Pricing:
One-off purchase
Multi-lingual audio recognition service dataset
by
Overtone
Overtone's APIs allow customers to use our state-of-the-art machine learning and large language models ... We can ingest various types of audio content (including speech and video) and generate text output (STT)
Available for 157 countries
Pricing available upon request
AI-Machine Learning Sound / Audio / Snippet Recordings Database
by
SoundPrint
Snippets database has sound / audio / sonic recordings across all kinds of venues (restaurants, bars, ...
Available for 249 countries
2 years of historical data
Pricing available upon request
Norwegian audio dataset for speech recognition 20 hours (1/5)
by
StageZero
resell the data. ... - Maximum four hours of speech per person in the dataset.
Available for 1 country
20 hours
Starts at
€2,500 / purchase
Deeply Korean Read Speech Corpus - Audio AI & ML Training Data
by
Deeply
The Read Speech dataset consists of 289.9 hours of audio clips of speakers reading scripts with 3 text sentiments ... The dataset also includes metadata such as a script (speech-to-text aligned), speaker, age, sex, noise
Available for 1 country
190K records
99% Validity
Pricing available upon request
Data Collection by EPIC Translations: Copywriting, Text & Audio Data for AI & ML Training
Text Data Collection
6. Audio Data Collection
7. Chatbot Training Data
8. Copywriting
9. ... Data Entry
11. Data Mining
Available for 215 countries
50K sentences
12 weeks of historical data
100% match rate
Pricing available upon request
10% Datarade discount
10% revenue share
Speech Synthesis Data | 400 Hours | TTS Data | Audio Data | AI Training Data | AI Datasets
by
Nexdata
Audio Data and 800TB of Annotated Imagery Data. ... Speech synthesis data is recorded by native speakers, with authentic accents and pleasant voices.
Available for 62 countries
400 hours
5 years of historical data
95% sentence accuracy
Starts at
$20,000 / purchase
Free sample preview
More Audio Data Products
Discover related audio data products.
15K Hours
98% sentence/word
85 countries covered
Nexdata has off-the-shelf 15,000 hours Machine Learning (ML) Data of 8kHz conversational speech, covering 100+ countries including English, German, French, S...
65K Hours
98% sentence/word
103 countries covered
Off-the-shelf Scripted Monologues Speech Datasets cover 100+ languages. All the Machine Learning (ML) Data are collected from native speakers, with signed au...
15K Hours
98% sentence/word
83 countries covered
The Natural Language Processing (NLP) Data of in-car speech covers 20+ languages, including read, wake-up word, command word, code-switching, multimodal and n...
160K Records
236 countries covered
2 years of historical data
Contains user-submitted noise complaints recorded via mobile devices. Each entry captures the time, location, type of noise source, and the emotional respons...
10M Measurements
95% Precision
237 countries covered
Street noise-level data from any city. Analyze noise exposure across 200+ countries for risk modeling, real estate, AI-training and health studies. Real meas...
10M Records
95% Precision
236 countries covered
Real-world venue noise-level data (restaurants, nightlife, gyms, etc.) based on 10M+ hours of measured dBA data. Ideal for AI training in acoustic classifica...
95% match rate
213 countries covered
10 years of historical data
Custom Data Collection Services by ShAIp - Any subject. Any scenario, be it Text, Audio, Image or Video.
1K hours per month
99.5% word accuracy
116 countries covered
Nexdata provides multi-language, multi-timbre, multi-domain and multi-style speech synthesis data collection services for Deep Learning Data.
190K records
99% Validity
South Korea covered
Pairs of Korean speakers reading a script with 3 distinct text sentiments, with 3 distinct voice sentiments, are recorded. The recordings took place in 3 dif...
20K Hours
98% accuracy
72 countries covered
Off-the-shelf 20,000 hours Unscripted Call Center Telephony Speech Data, covering 30+ languages including English, German, French, Spanish, Italian, Portugue...
10M hours
236 countries covered
2 years of historical data
Silencio provides the world’s largest real-world street and venue noise-level dataset, combining over 35 billion datapoints with AI-powered interpolation. Fu...
10M Hours
95% Precision
236 countries covered
Starter dataset for AI teams with sampled noise (from 10M+ hours of measurements), mobility, and POI data. Ideal for rapid prototyping and AI research. CSV o...
35B Data Points
95% Precision
236 countries covered
Combines 10M+ hours of noise data with mobility and POI visitation data. Ideal for AI models combining environmental, mobility, and behavioral signals. CSV o...
35B Data Points
95% Accuracy
236 countries covered
Interpolated noise dataset built on 10M+ hours of real-world acoustic data combined with AI-generated predictions. Ideal for map generation, AI training, and...
35B Data Points
3.7% Horizontal Accuracy (Meters)
236 countries covered
Time-series dataset based on 10M+ hours of measured dBA data. Includes hourly, daily, and seasonal noise patterns. Ideal for AI models focused on forecasting...