Best Natural Language Processing (NLP) Datasets & Databases
Easily explore, compare & preview top Natural Language Processing (NLP) Datasets via Datarade.
Refine your data search
Recommended Natural Language Processing (NLP) Data Products
50+ Results
Nexdata | Audio Annotation Services | AI-assisted Labeling | Speech Data | AI Training Data | Natural Language Processing (NLP) Data
by
Nexdata
Language Processing (NLP) Data, etc. ... Overview
We provide various types of Natural Language Processing (NLP) Data services, including:
Available for 129 countries
100K hours per month
5 years of historical data
99.5% word accuracy
Starts at
$5,000 / purchase
Free sample preview
WebAutomation Off the Shelf Datasets | Audio Data for AI & ML Training | 600+ Hours of Recording | Speech Recognition, Natural Language Processing
language processing, voice assistants, and more. ...
Available for 64 countries
600 Hours of Recording
Pricing available upon request
Nexdata | Large Language Model Data | SFT Data | Pre-training Data | LLM Data | Text AI & ML Training Data | Natural Language Processing (NLP) Data
by
Nexdata
Imagery Data, about 2 billion pieces of Natural Language Processing (NLP) Data. ... Nexdata has a vast collection of unlabeled text data, Natural Language Processing (NLP) Data, multilingual
Available for 92 countries
800 TB
5 years of historical data
90% Accuracy
Starts at
$5,000 / purchase
Free sample preview
Scraping | data parsing | and processing services
by
AnaChart
data solutions through our expertise in web scraping, data parsing, and quality assurance processing ... AnaChart have developed expertise in web scraping, processing services, as well as testing for quality
Available for 6 countries
Pricing available upon request
Free sample preview
Kieli NLP Data - Fully-labelled dataset of Arabic language for Machine Learning & AI platforms
by
Kieli
language processing techniques. ... Kieli is a professional data analytic company dedicated to solving human language challenges using natural
Available for 242 countries
Pricing available upon request
TAUS Language Translation Data | Parallel translation for E-Commerce, various language pairs
by
TAUS
Data is available in parallel format and new language pairs can be created quickly:
French - Dutch ... Based on that, we’ve applied TAUS proprietary Matching Data technology to extract the data from the TAUS
Available for 11 countries
1M words per language pair
1 year of historical data
100% words
Starts at
€5,000 / purchase
Bitext NLP Labeling for Gen AI Data Annotation and Labeling (DAL) projects
by
bitext
English, Spanish, French, German, Italian, Portuguese, Arabic, Chinese, Japanese, Korean…)
Multiple NLP ... Bitext, we offer advanced linguistic tools designed for automated pre-labeling of datasets to help scale Data
Available for 240 countries
Pricing available upon request
Free sample preview
Nexdata | Multilingual Native & Accented English Speech Data | 40,000 Hours | Audio Data | Speech Recognition Data | Natural Language Processing (NLP) Data
by
Nexdata
Imagery Data, about 2 billion pieces of Natural Language Processing (NLP) Data. ... Language : American English, British English, Canadian English, Australian English, French English, German
Available for 52 countries
40K Hours
10 years of historical data
98% sentence/word
Starts at
$5,000 / purchase
Free sample preview
Coresignal | Clean Data | Company Data | AI-Enriched Datasets | Global / 35M+ Records / Updated Weekly
by
Coresignal
value, making data processing much faster and easier. ... After cleaning, this data is also enriched by leveraging a carefully instructed large language model
Available for 248 countries
35 million records
Available Pricing:
One-off purchase
Monthly License
Yearly License
Usage-based
Free sample preview
Knuckle Head Data Annotation and Labelling Services (NLP Data for English, French, Spanish, Italian, Portuguese, Japanese, Indian)
by
Knuckle Head
We have been working on several projects for Data Annotation, Data-Collection and data labeling services ... Image Annotation and Labeling
Face Recognition and Emotions
Audio / Video Annotation
Medical Annotation
Data
Available for 191 countries
Pricing available upon request
More Natural Language Processing (NLP) Data Products
Discover related natural language processing (nlp) data products.
50 Hours
99% Accurate
South Africa covered
50 hours of simulated, unscripted agent-caller dialogue.
Domains include: Insurance, Retail, Debt Collection, Travel.
46 participants from Western Cape, No...
4.5M images
100% Image attachment
249 countries covered
Wirestock's AI/ML Image Training Data, 4.5M Files with Metadata: This data product offers a vast collection of images and associated metadata, ideal for trai...
15K Hours
98% sentence/word
79 countries covered
The Natural Language Processing (NLP) Data of in-car speech covers 20+ languages, including read, wake-up word, command word, code-switching, multimodal and n...
50 Hours
99% Accurate
South Africa covered
50 hours of simulated, unscripted agent-caller dialogue.
Domains include: Insurance, Retail, Debt Collection, Travel.
49 participants from Limpopo, North-W...
10K recordings
95% accuracy
64 countries covered
Authentic and spoofed faces recorded with different mobile phone cameras, showcasing both men and women, with and without glasses, under indoor and outdoor l...
50 Hours
99% Accurate
South Africa covered
50 hours of simulated, unscripted agent-caller dialogue.
Domains include: Insurance, Retail, Debt Collection, Travel.
63 participants from all South Africa...
20K Hours of Audio
95% Match Rate
215 countries covered
We help the client source, curate, & transcribe the right set of data required to train AI/ML model, with utmost precision. We offered audio data collection ...
99% accuracy
240 countries covered
Conversational AI training data generated for specific custom use cases. We have a large pool of customer support agents all over the world to generate AI vo...
USA covered
FactSquared Transcribe provides automated, full-text, searchable, indexed feeds of audio and video content.
USA covered
FactSquared Analyze offers unique data-driven insights into what public figures are -- and aren’t -- saying in their public comments on market-moving topics.
191 countries covered
We have been working on several projects for Data Annotation, Data-Collection and data labeling services since September 2019. The volume of our annotators a...
249 countries covered
2 years of historical data
Snippets database has sound / audio / sonic recordings across all kinds of venues (restaurants, bars, arenas, churches, movie theaters, retail and many more)...
6 countries covered
AnaChart have developed expertise in web scraping, processing services, as well as testing for quality assurance. AnaChart offers services in these areas ...
26M records
249 countries covered
45 months of historical data
Easily find and get job postings from any industry and location. Job postings API allows you to use a wide selection of filters to discover job listings you'...
240 countries covered
At Bitext, we offer advanced linguistic tools designed for automated pre-labeling of datasets to help scale Data Annotation and Labeling (DAL) projects.
350K calls per month
63 countries covered
1 year of historical data
Access a vast collection of transcribed customer call records tailored to your needs. Ideal for in-depth analysis of customer interactions and behavior trend...
10 hours
Bulgaria covered
Fourth dataset of 10 hours of Bulgarian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise and high-quali...
20 hours
Bulgaria covered
The second dataset of 20 hours of Bulgarian dialogue (two people, separate tracks) about general topics. The dataset is high quality with no noise and high-q...