Best Large Language Model (LLM) Datasets & Databases
Easily explore, compare & preview top Large Language Model (LLM) Datasets via Datarade.
73 Large Language Model (LLM) Datasets

Unsupervised Speech Data | 1 Million Hours | Spontaneous Speech | LLM | Pre-training | Large Language Model (LLM) Data
Available in more than 42 countries

FileMarket | 20,000 photos | AI Training Data | Large Language Model (LLM) Data | Machine Learning (ML) Data | Deep Learning (DL) Data
Available in more than 244 countries

Large Language Model (LLM) Data | Machine Learning (ML) Data | AI Training Data (RAG) for 1M+ Global Grocery, Restaurant, and Retail Stores
Attributes: ZIP Code, Latitude, State Abbreviation, City Name, URL, and 6 more
Available in more than 245 countries

Large Language Model (LLM) Noise Level Data | Noise Complaints | CCPA, GDPR Compliant | 160k Data Points | 100% Traceable Consent
Attributes: Latitude, Longitude, Country Code Alpha-2
Available in more than 231 countries

TagX | 10000+ Multilingual Image Dataset | Text Detection | Global coverage | LLM data | LLM finetuning
Available in more than 97 countries

Machine Learning (ML) Data | 800M+ B2B Profiles | AI-Ready for Deep Learning (DL), NLP & LLM Training
Available in more than 245 countries

Foundation Model Data Collection and Data Annotation | Large Language Model (LLM) Data | SFT Data | Red Teaming Services
Available in more than 109 countries

Large Language Model (LLM) Training Data | 180+ Countries | AI-Enhanced Ground Truth Based | 10M+ Hours of Measurements | 100% Traceable Consent
Attributes: Latitude, Longitude, Country Code Alpha-2
Available in more than 245 countries

TagX Data collection for AI/ ML training | LLM data | Data collection for AI development & model finetuning | Text, image, audio, and document data
Attributes: Product Name
Available in more than 244 countries

French Language Datasets | 150+ Years of Research | AI | NLP | LLMs | Dictionary Display | Translation Data | EU, Africa, Canada Coverage
Available in more than 32 countries
More Large Language Model (LLM) Data Products
Discover related Large Language Model (LLM) data products.

African English Accent Conversational Dataset — Gender, Age, City Metadata with Validated Speech Samples
Free sample preview
Starts at $19.80 / hour (list price $22)

FileMarket | Biometric Data | Human Palm Image Dataset: 20,000 Photos for Machine Learning (ML) Data and AI Model Training
Free sample preview
Pricing available upon request

American English Language Datasets | 150+ Years of Research | Textual Data | NLP | LLMs | TTS | Dictionary Display | Game | US English Coverage
Free sample preview
API available
Pricing available upon request

Foundation Model Data Collection and Data Annotation | Large Language Model (LLM) Data | SFT Data | Red Teaming Services
Free sample preview
API available
Starts at
$20,000 / purchase

Venue Noise-Level Dataset for AI — Real Acoustic Profiles from 10M+ POI Measurements
Free sample preview
Starts at $2,250 / month (list price $2,500)

AI Training Data- Spontaneous Conversations On-Demand - Accent & Dialect Focus
Free sample preview
Starts at
$50 / Hour

Noise-Level Time-Series Dataset — Over 10M Hours of Temporal Acoustic Data for AI Forecasting
Free sample preview
Starts at $4,500 / month (list price $5,000)

Large Language Model (LLM) Training Data | 180+ Countries | AI-Enhanced Ground Truth Based | 10M+ Hours of Measurements | 100% Traceable Consent
Free sample preview
Pricing available upon request

Machine Learning (ML) Data | 800M+ B2B Profiles | AI-Ready for Deep Learning (DL), NLP & LLM Training
Free sample preview
API available
Starts at $900 / month (list price $1,000)

Global Call Center & Conversational Audio Dataset — Multilingual, Validated, with Demographics + Custom Collection Available
Free sample preview
Starts at $6.30 / hour (list price $7)

Call Center Audio Recordings (100,000+ Hours, High-Quality) in Multiple Languages | Available now (off-the-shelf)
Free sample preview
Pricing available upon request

FileMarket | 20,000 photos | AI Training Data | Large Language Model (LLM) Data | Machine Learning (ML) Data | Deep Learning (DL) Data
Free sample preview
Pricing available upon request

AI Training Audio Data - Scripted Conversations On-Demand - Accent & Dialect Focus
Free sample preview
Starts at
$50 / Hour

AI Training - Text-To-Speech Dataset - Very Diverse Languages and Dialects
Free sample preview
Pricing available upon request

AI Training - Spontaneous Conversations Dataset - ANY LANGUAGE
Free sample preview
Starts at
$50 / Hour

EMEA Data Suite | 3.3M Translations | 1.9M Words | 23 Languages | Natural Language Processing (NLP) Data | Translation Data | TTS | EMEA Coverage
Free sample preview
API available
Pricing available upon request

German Language Datasets | 393K Translations | NLP | Dictionary Display | Machine Learning (ML) Data | Translations | EU Coverage
Free sample preview
API available
Pricing available upon request