Best Natural Language Processing (NLP) Datasets & Databases
Easily explore, compare & preview top Natural Language Processing (NLP) Datasets via Datarade.
27 Natural Language Processing (NLP) Datasets

Portuguese Language Datasets | 300K Translations | Natural Language Processing (NLP) Data | Dictionary Display | Translation | EU & LATAM Coverage
Available in
and 4 more countries

Audio Annotation Services | AI-assisted Labeling | Speech Data | AI Training Data | Natural Language Processing (NLP) Data
Available in
and 109 more countries

British English Language Datasets | 150+ Years of Research | Natural Language Processing (NLP) Data | LLMs | TTS | Dictionary Display | EU Coverage
Available in
and 10 more countries

Native & Accented English Speech Data | 40,000 Hours | Audio Data | Speech Recognition Data | Natural Language Processing (NLP) Data
Available in
and 49 more countries

Machine Learning (ML) Data | 800M+ B2B Profiles | AI-Ready for Deep Learning (DL), NLP & LLM Training
Available in
and 245 more countries

Global English Speech with Accent Conversational Dataset — Multi-Region Validated Speech with Gender, Age & Metadata for AI & NLP Training
Available in
and 245 more countries

LATAM Data Suite | 1.8M+ Sentences | Natural Language Processing (NLP) Data | TTS | Dictionary Display | Translation Data | LATAM Coverage
Available in
and 16 more countries

Foundation Model Data Collection and Data Annotation | Large Language Model (LLM) Data | SFT Data | Red Teaming Services
Available in
and 109 more countries

Nordic B2B Profiles Data | B2B Marketing Data | 10M Verified Leads for Norway, Sweden & Finland (100+ Attributes)
Available in
and 3 more countries

Location Intelligence Data Suite | Comprehensive view of where and how active businesses operate | Global
Available in
and 245 more countries
More Natural Language Processing (NLP) Data Products
Discover related Natural Language Processing (NLP) data products.

8kHz Conversational Speech Data | 15,000 Hours | Audio Data | Speech Recognition Data | Machine Learning (ML) Data
Free sample preview
API available
Starts at
$20,000 / purchase

Global English Speech with Accent Conversational Dataset — Multi-Region Validated Speech with Gender, Age & Metadata for AI & NLP Training
Free sample preview
Starts at
$27 / hour (reduced from $30)

Foundation Model Data Collection and Data Annotation | Large Language Model (LLM) Data | SFT Data | Red Teaming Services
Free sample preview
API available
Starts at
$20,000 / purchase

Machine Learning (ML) Data | 800M+ B2B Profiles | AI-Ready for Deep Learning (DL), NLP & LLM Training
Free sample preview
API available
Starts at
$900 / month (reduced from $1,000)

Latin American English Accent Speech Dataset — Authentic Local Speaker Conversations
Free sample preview
API available
Starts at
$19.80 / hour (reduced from $22)

Chinese Language Datasets | 583K Translations | 178K Words | NLP | Dictionary Display | Translations Data | APAC Coverage | Mandarin | Cantonese
Free sample preview
API available
Pricing available upon request

Shaip - Multilingual Conversational AI Training Data (Text & Audio)
API available
Pricing available upon request

Norwegian audio dataset for speech recognition 20 hours (1/5)
Starts at
€2,500 / purchase

Call Center Audio Recordings (100,000+ Hours, High-Quality) in Multiple Languages | Available now (off-the-shelf)
Free sample preview
Pricing available upon request

Multilingual Full Duplex Conversational Speech Data | 2 Million Hours | Audio AI & ML Training Data
API available
Starts at
$20,000 / purchase

Knuckle Head Data Annotation and Labelling Services (NLP Data for English, French, Spanish, Italian, Portuguese, Japanese, Indian)
Pricing available upon request

AI Training Dataset | UK Financial Services | Sentiment & Thematic Labels | 2.5m+ Annotated Reviews from Smart Money People
Free sample preview
Pricing available upon request

VLM GUI Agent Data | 1 Million Sets | Multimodal Data | COT Data | AI & ML Training Data
API available
Starts at
$20,000 / purchase

EMEA Data Suite | 3.3M Translations | 1.9M Words | 23 Languages | Natural Language Processing (NLP) Data | Translation Data | TTS | EMEA Coverage
Free sample preview
API available
Pricing available upon request

Accented English Speech Dataset | 1.5K+ recordings | Scripted Monologues | Global Coverage
Free sample preview
Starts at
$1,990 / purchase