Audio conversations segmented and transcribed for speech recognition

#	speaker2	342506	10703	104888	3278	HeiKari hallo hallo du?

Volume: 100 hours

Avail. Formats: .json and .csv

File	Coverage	Countries
[Sample] Norwegian sample.csv

Attribute	Type	Example
speaker2	String	speaker2
342506	Integer	464875
10703	Integer	14527
104888	Integer	23244
3278	Integer	726
HeiKari hallo hallo du?	String	Ja hei!
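
The attribute names in the table above look like the values of the first record, which suggests the sample file has no header row. The following is a minimal sketch of reading such a file with Python's standard csv module, under several assumptions not confirmed by the listing: the delimiter is a comma, every record has six fields, and the field names (speaker, field_a to field_d, transcript) are hypothetical labels, since the listing does not say what the four integers mean. The second inline record is assembled from the Example column; whether those values belong to a single record is also an assumption.

```python
import csv
import io
from typing import NamedTuple


class Segment(NamedTuple):
    # Hypothetical field names: the listing does not document the columns.
    speaker: str
    field_a: int
    field_b: int
    field_c: int
    field_d: int
    transcript: str


def read_segments(handle, delimiter=","):
    """Read header-less rows of the shape (text, int, int, int, int, text)."""
    segments = []
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) != 6:
            # Skip rows that do not match the assumed six-column layout.
            continue
        speaker, a, b, c, d, text = row
        segments.append(Segment(speaker, int(a), int(b), int(c), int(d), text))
    return segments


# Inline stand-in for the sample file, built from values shown in the table.
sample = io.StringIO(
    "speaker2,342506,10703,104888,3278,HeiKari hallo hallo du?\n"
    "speaker2,464875,14527,23244,726,Ja hei!\n"
)
segments = read_segments(sample)
for segment in segments:
    print(segment.speaker, segment.transcript)
```

Reading without a header keeps the first record as data instead of consuming it as column names. If the real file is tab-separated, pass delimiter="\t".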

Product Attributes

Attribute	Type	Example	Mapping
Speaker 1	Text
Speaker 2	Text
Noise	Text

- Dataset of 100 hours of conversational speech segmented by speaker and transcribed for training speech recognition models. Perfect for training a speaker diarization and/or general speech recognition model. Languages: English, Bulgarian, German, Lithuanian, Norwegian

- 100 hours of Norwegian Bokmål speech data from 100 people, recorded as conversations between two people.
- Each recording is segmented by: speech, noise, and music.
- Speech segments are transcribed for speech recognition.
- Speakers are identified throughout the entire dataset using a unique id.
- Roughly 50% female and 50% male speakers.
- Conversations cover varying general topics.

Europe (6): Bulgaria, Denmark, Germany, Lithuania, Norway, United Kingdom

Free sample available

StageZero has not published pricing information for this product yet. You can request detailed pricing information below.

Use Cases: Speech Recognition, speaker recognition, dictation, voice recognition, speaker diarization

Categories: AI & ML Training Data, Machine Learning (ML) Data

What is Audio conversations segmented and transcribed for speech recognition?

Dataset of 100 hours of conversational speech segmented by speaker and transcribed for training speech recognition models. Perfect for training a speaker diarization and/or general speech recognition model. Languages: English, Bulgarian, German, Lithuanian, Norwegian

What is Audio conversations segmented and transcribed for speech recognition used for?

This product has 5 key use cases. StageZero recommends using the data for Speech Recognition, speaker recognition, dictation, voice recognition, and speaker diarization. Global businesses and organizations buy AI & ML Training Data from StageZero to fuel their analytics and enrichment.

Who can use Audio conversations segmented and transcribed for speech recognition?

This product is best suited if you’re a Medium-sized Business or Enterprise looking for AI & ML Training Data. Get in touch with StageZero to see what their data can do for your business and find out which integrations they provide.

Which countries does Audio conversations segmented and transcribed for speech recognition cover?

This product includes data covering 6 countries like Germany, United Kingdom, Norway, Denmark, and Bulgaria. StageZero is headquartered in Finland.

How much does Audio conversations segmented and transcribed for speech recognition cost?

Pricing information for Audio conversations segmented and transcribed for speech recognition is available by getting in contact with StageZero. Connect with StageZero to get a quote and arrange custom pricing models based on your data requirements.

How can I get Audio conversations segmented and transcribed for speech recognition?

Depending on your data requirements and subscription budget, StageZero can deliver this product in .json and .csv format.

What is the data quality of Audio conversations segmented and transcribed for speech recognition?

You can compare and assess the data quality of StageZero using Datarade’s data marketplace. StageZero appears on selected Datarade top lists ranking the best data providers, including Who’s New on Datarade? May Edition.

What are similar products to Audio conversations segmented and transcribed for speech recognition?

This Audio Data has 3 related products. These alternatives include "Nexdata Multilingual Read Speech Data 65,000 Hours Audio AI & ML Training Data Audio Data Speech Recognition Data Machine Learning (ML) Data", "30000 Images+ Face Detection Dataset Facial Features Metadata Face Recognition & Analysis", and "Pixta AI Imagery Data Global 5,000 Stock Images Annotation and Labelling Services Provided Full-body human images for AI & ML". You can compare the best AI & ML Training Data providers and products via Datarade’s data marketplace and get the right data for your use case.

StageZero
Ethical data labeling as a service!
