Best Speech Datasets & Databases
Easily explore, compare & preview top Speech Datasets via Datarade.
Recommended Speech Data Products
31 Results
Nexdata | Multilingual Code-switching Speech Data | 5,000 Hours | Audio Data | Speech Recognition Data | AI Training Data
by Nexdata
Accuracy rate: 97% ... About Nexdata: Nexdata owns off-the-shelf 1,000,000 hours of speech recognition data, 800TB of Annotated Imagery Data, about 2 billion pieces of Natural Language Processing (NLP) Data ...
Available for 29 countries
50K Hours
5 years of historical data
98% sentence/word
Starts at $5,000 / purchase
Free sample preview
WebAutomation Off the Shelf Datasets | Audio Data for AI & ML Training | 600+ Hours of Recording | Speech Recognition, Natural Language Processing
recordings that capture the intricacies of human speech. ... We offer a comprehensive collection of audio data, amounting to over 600 hours of high-quality recordings
Available for 64 countries
600 Hours of Recording
Pricing available upon request
Way With Words' Afrikaans Speech Collection Dataset
by WayWithWords
Thank you for your interest in Way With Words’ off-the-shelf Speech Collection Dataset in South African
Available for 1 country
50 Hours
99% Accurate
Available Pricing: One-off purchase, Usage-based
Free sample preview
FactSquared Stock Sentiment Speech Analytics Data USA
by FactSquared
and more than 250 other factors, indexed to speaker’s historical speech patterns. ... FactSquared Analyze offers unique data-driven insights into what public figures are – and aren’t – saying
Available for 1 country
Pricing available upon request
Deeply Korean Read Speech Corpus - Audio AI & ML Training Data
by Deeply
The dataset also includes metadata such as a script (speech-to-text aligned), speaker, age, sex, noise ... The Read Speech dataset consists of 289.9 hours of audio clips of speakers reading scripts with 3 text sentiments
Available for 1 country
190K records
99% Validity
Pricing available upon request
Bulgarian audio dataset for speech recognition 10 hours (4/4)
by StageZero
the data. ... Speech is recorded and transcribed on separate tracks.
Available for 1 country
10 hours
Starts at €1,250 / purchase
Nexdata | Multilingual Read Speech Data | 65,000 Hours | Generative AI Audio Data | Speech Recognition Data | Machine Learning (ML) Data
by Nexdata
Off-the-shelf read speech data cover 100+ languages. ... About Nexdata
Nexdata owns off-the-shelf 1,000,000 hours of speech recognition data, 800TB of Annotated
Available for 102 countries
65K Hours
5 years of historical data
98% sentence/word
Starts at $5,000 / purchase
Free sample preview
Way With Words' seSotho Speech Collection Dataset
by WayWithWords
Thank you for your interest in Way With Words’ off-the-shelf Speech Collection Dataset in South African
Available for 1 country
50 Hours
99% Accurate
Available Pricing: One-off purchase, Usage-based
Free sample preview
Bulgarian audio dataset for speech recognition 20 hours (3/4)
by StageZero
the data. ... Speech is recorded and transcribed on separate tracks.
Available for 1 country
20 hours
Starts at €2,500 / purchase
Nexdata | Multilingual Conversational Speech Data | 8kHz Telephone | 15,000 Hours | Audio Data | Speech Recognition Data | Machine Learning (ML) Data
by Nexdata
Nexdata has 15,000 hours of off-the-shelf Machine Learning (ML) Data of 8kHz conversational speech, covering ... (NLP) Data.
Available for 77 countries
15K Hours
5 years of historical data
98% sentence/word
Starts at $5,000 / purchase
Free sample preview
More Speech Data Products
Discover related speech data products.
100K hours per month
99.5% word accuracy
124 countries covered
Nexdata provides high-quality Speech Data services for speech cleaning, speech transcription, phoneme annotation, etc., with word accuracy of 99.5% and phoneme...
100K hours per month
99.5% word accuracy
122 countries covered
Nexdata is equipped with professional recording equipment and has a resource pool covering 70+ countries and regions, and provides various types of speech recognitio...
40K Hours
98% sentence/word
55 countries covered
The speech data is collected from native English speakers in 40 countries, covering a variety of pronunciation habits and characteristics. The script is design...
400 hours
95% sentence accuracy
61 countries covered
The AI Training Data is recorded by native speakers, with authentic accents and a sweet sound. The phoneme coverage is balanced. Professional phonetician partici...
15K Hours
98% sentence/word
77 countries covered
Nexdata has 15,000 hours of off-the-shelf Machine Learning (ML) Data of 8kHz conversational speech, covering 100+ countries including English, German, French, S...
15K Hours
98% sentence/word
83 countries covered
The Natural Language Processing (NLP) Data of in-car speech covers 20+ languages, including read, wake-up word, command word, code-switching, multimodal and n...
40K Hours
98% sentence/word
55 countries covered
The speech data is collected from native English speakers in 40 countries, covering a variety of pronunciation habits and characteristics. The script is design...
USA covered
FactSquared Analyze offers unique data-driven insights into what public figures are -- and aren’t -- saying in their public comments on market-moving topics.
190K records
99% Validity
South Korea covered
Pairs of Korean speakers reading a script with 3 distinct text sentiments, with 3 distinct voice sentiments, are recorded. The recordings took place in 3 dif...
50 Hours
99% Accurate
South Africa covered
50 hours of simulated, unscripted agent-caller dialogue.
Domains include: Insurance, Retail, Debt Collection, Travel.
50+ participants from KwaZulu-Natal, ...
50 Hours
99% Accurate
South Africa covered
50 hours of simulated, unscripted agent-caller dialogue.
Domains include: Insurance, Retail, Debt Collection, Travel.
49 participants from Limpopo, North-W...
50 Hours
99% Accurate
South Africa covered
50 hours of simulated, unscripted agent-caller dialogue.
Domains include: Insurance, Retail, Debt Collection, Travel.
63 participants from all South Africa...
10 hours
Bulgaria covered
Fourth dataset of 10 hours of Bulgarian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise and high-quali...
20 hours
Bulgaria covered
The third dataset of 20 hours of Bulgarian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise and high-qu...
20 hours
Bulgaria covered
The second dataset of 20 hours of Bulgarian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise and high-q...
20 hours
Bulgaria covered
The first dataset of 20 hours of Bulgarian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise and high-qu...
10 hours
Lithuania covered
The fifth dataset, consisting of 10 hours of Lithuanian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise an...
20 hours
Lithuania covered
Fourth dataset of 20 hours of Lithuanian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise and high-qual...