Best NLP Datasets for ML projects

NLP datasets are curated collections of text data designed for Natural Language Processing (NLP) tasks. They include text corpora, sentiment analysis datasets, language translation data, and more. Researchers, data scientists, and developers use them to train and evaluate NLP models, build chatbots, and develop language processing algorithms.

67 results

Portuguese Language Datasets | 300K Translations | Natural Language Processing (NLP) Data | Dictionary Display | Translation | EU & LATAM Coverage

by Oxford Languages
Available in
Brazil
Portugal
Angola
Macao
Mozambique
and 4 more countries

Audio Annotation Services | AI-assisted Labeling | Speech Data | AI Training Data | Natural Language Processing (NLP) Data

by Nexdata
Language Name
Available in
USA
UK
Germany
France
Italy
and 110 more countries

Machine Learning (ML) Data | 800M+ B2B Profiles | AI-Ready for Deep Learning (DL), NLP & LLM Training

by Xverum
5.0
Available in
USA
UK
Germany
France
Italy
and 245 more countries

Global English Speech with Accent Conversational Dataset — Multi-Region Validated Speech with Gender, Age & Metadata for AI & NLP Training

by FileMarket
Available in
USA
UK
Germany
France
Italy
and 245 more countries

AI & ML Training Data | Artificial Intelligence (AI) | Machine Learning (ML) Datasets | Deep Learning Datasets | Easy to Integrate | Free Sample

by APISCRAPY
4.9
Available in
USA
UK
Germany
France
Italy
and 56 more countries

Purchase Intent Data | Contact Level Interest Data | 320M+ B2B & B2C Contacts | 21,000 Interest Categories | Daily Leads

by Allforce
5.0
Company Name
Company Phone Number
Company Employee Count
Company Annual Revenue
Company Email Address
and 17 more attributes
Available in
USA

Indeed Data – US Company & Job Postings Indeed Data with Salaries, Hiring Activity & Matchable Google Maps for HR Analytics & Business Development

by Canaria Inc.
5.0
Available in
USA

Knuckle Head Data Annotation and Labelling Services (NLP Data for English, French, Spanish, Italian, Portuguese, Japanese, Indian)

by Knuckle Head
Available in
USA
UK
Germany
France
Italy
and 186 more countries

British English Language Datasets | 150+ Years of Research | Natural Language Processing (NLP) Data | LLMs | TTS | Dictionary Display | EU Coverage

by Oxford Languages
Available in
UK
India
Australia
Nigeria
South Africa
and 10 more countries

In-Cabin Speech Data | 15,000 Hours | AI Training Data | Speech Recognition Data | Audio Data | Natural Language Processing (NLP) Data

by Nexdata
Language Name
Available in
USA
UK
Germany
France
Italy
and 73 more countries