Audio conversations segmented and transcribed for speech recognition

StageZero
Verified Data Provider (no reviews yet)
Volume: 100 hours
Available formats: .json and .csv files
Coverage: 6 countries

Data Dictionary

[Sample] Norwegian sample.csv

Attribute                  Type       Example
speaker2                   String     speaker2
342506                     Integer    464875
10703                      Integer    14527
104888                     Integer    23244
3278                       Integer    726
HeiKari hallo hallo du?    String     Ja hei!

Product Attributes

Attribute    Type
Speaker 1    Text
Speaker 2    Text
Noise        Text
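
As a rough illustration of how rows like those in the data dictionary might be loaded, here is a minimal Python sketch. It rests on several assumptions that the listing does not confirm: that the sample file is comma-delimited, that it has no header row (the attribute names above look like first-row values), and that the transcript is the last column. The listing does not say what the four integer columns mean, so the field names `n1` to `n4` are hypothetical placeholders.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Segment:
    # Field names are hypothetical: the listing does not document them.
    label: str  # e.g. "speaker2"
    n1: int     # meaning of the four integer columns is undocumented
    n2: int
    n3: int
    n4: int
    text: str   # transcript of the segment


def parse_segments(stream):
    """Yield one Segment per non-empty CSV row (assumes no header row)."""
    for row in csv.reader(stream):
        if not row:
            continue
        label, a, b, c, d, *rest = row
        # Re-join the tail in case an unquoted transcript contains commas.
        yield Segment(label, int(a), int(b), int(c), int(d), ",".join(rest))


# The two rows shown in the data dictionary (attribute and example columns).
sample = io.StringIO(
    "speaker2,342506,10703,104888,3278,HeiKari hallo hallo du?\n"
    "speaker2,464875,14527,23244,726,Ja hei!\n"
)
segments = list(parse_segments(sample))
```

For a real file, `parse_segments(open(path, newline="", encoding="utf-8"))` would be the equivalent call, with the delimiter and encoding adjusted to whatever the delivered files actually use.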

Description

- Dataset of 100 hours of conversational speech segmented by speaker and transcribed for training speech recognition models. Perfect for training a speaker diarization and/or general speech recognition model. Languages: English, Bulgarian, German, Lithuanian, Norwegian
- 100 hours of Norwegian Bokmål speech data from 100 speakers, recorded in two-person conversations.
- Each recording is segmented into speech, noise, and music.
- Speech segments are transcribed for speech recognition.
- Speakers are identified throughout the entire dataset by a unique ID.
- Roughly 50% female and 50% male speakers.
- Conversations cover a variety of general topics.
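
Because speakers carry a unique ID across the whole dataset, one common way to use such data is a speaker-disjoint train/test split, so that a model is never evaluated on a voice it was trained on. The sketch below is one possible approach and is not something the provider describes; the speaker IDs, transcripts, and the `split_for` helper are all hypothetical.

```python
import hashlib


def split_for(speaker_id: str, test_fraction: float = 0.1) -> str:
    """Deterministically assign a speaker to "train" or "test".

    Hashing the ID keeps the assignment stable across runs and machines.
    """
    digest = hashlib.sha256(speaker_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "test" if bucket < test_fraction else "train"


# Hypothetical (speaker_id, transcript) pairs, for illustration only.
utterances = [
    ("spk_001", "Hei!"),
    ("spk_002", "Ja hei!"),
    ("spk_001", "Hallo du?"),
]

splits = {"train": [], "test": []}
for speaker_id, text in utterances:
    splits[split_for(speaker_id)].append((speaker_id, text))
```

Since each conversation involves two speakers, a stricter split might group by conversation as well, so that both sides of a dialogue land in the same set.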

Geography

Europe (6)
Bulgaria
Denmark
Germany
Lithuania
Norway
United Kingdom

Volume

100 hours

Pricing

Free sample available
StageZero has not published pricing information for this product yet. You can request detailed pricing information below.

Suitable Company Sizes

Small Business
Medium-sized Business
Enterprise

Delivery

Format
.json
.csv

Use Cases

Speech recognition
Speaker recognition
Dictation
Voice recognition
Speaker diarization

Related Products

- 65K hours, 98% sentence/word, 94 countries covered. Off-the-shelf read speech data cover 100+ languages. All the Machine Learning (ML) Data are collected from native speakers, with signed authorization agreeme...
- 30K images, 100% quality, 249 countries covered. We provide face detection dataset, including data collection, metadata preparation, and annotation services for all face analysis applications. We provide hi...
- 10K images, 100% delivery on time, 23 countries covered. 5,000+ high quality human full body images with multiple attributes ready for AI & ML models
- 1B monthly records, USA covered. Website visit data with URLs, categories, timestamps, and anonymized unique identifiers.

Frequently asked questions

What is Audio conversations segmented and transcribed for speech recognition?

Dataset of 100 hours of conversational speech segmented by speaker and transcribed for training speech recognition models. Perfect for training a speaker diarization and/or general speech recognition model. Languages: English, Bulgarian, German, Lithuanian, Norwegian

What is Audio conversations segmented and transcribed for speech recognition used for?

This product has 5 key use cases. StageZero recommends using the data for speech recognition, speaker recognition, dictation, voice recognition, and speaker diarization. Global businesses and organizations buy AI & ML training data from StageZero to fuel their analytics and enrichment.

Who can use Audio conversations segmented and transcribed for speech recognition?

This product is best suited to medium-sized businesses and enterprises looking for AI & ML training data. Get in touch with StageZero to see what their data can do for your business and to find out which integrations they provide.

Which countries does Audio conversations segmented and transcribed for speech recognition cover?

This product covers 6 countries: Bulgaria, Denmark, Germany, Lithuania, Norway, and the United Kingdom. StageZero is headquartered in Finland.

How much does Audio conversations segmented and transcribed for speech recognition cost?

Pricing for Audio conversations segmented and transcribed for speech recognition is available on request. Contact StageZero to get a quote and arrange custom pricing based on your data requirements.

How can I get Audio conversations segmented and transcribed for speech recognition?

Depending on your data requirements and subscription budget, StageZero can deliver this product in .json and .csv formats.

What is the data quality of Audio conversations segmented and transcribed for speech recognition?

You can compare and assess the data quality of StageZero using Datarade’s data marketplace. StageZero appears on selected Datarade top lists ranking the best data providers, including “Who’s New on Datarade? May Edition”.

What are similar products to Audio conversations segmented and transcribed for speech recognition?

This audio data product has 3 related products. These alternatives include “Nexdata Multilingual Read Speech Data 65,000 Hours Audio AI & ML Training Data Audio Data Speech Recognition Data Machine Learning (ML) Data”, “30000 Images+ Face Detection Dataset Facial Features Metadata Face Recognition & Analysis”, and “Pixta AI Imagery Data Global 5,000 Stock Images Annotation and Labelling Services Provided Full-body human images for AI & ML”. You can compare the best AI & ML training data providers and products via Datarade’s data marketplace and get the right data for your use case.
