Nexdata | Multilingual Parallel Corpus Data | 200 Million Pairs | Text AI Training Data | Natural Language Processing Data | Translation Data

Nexdata
Verified Data Provider
Volume: 200 million pairs
Data Quality: 90% accuracy
Available File Formats: .bin, .json, and .xml
Coverage: 109 countries
History: 10 years

Data Dictionary

[Sample] Nexdata-Multilingual Parallel Corpus Data.csv
Attribute | Type | Example | Mapping
Dataset Name | String | 1,340,000 Groups English-Korean Parallel Corpus Data |
(attribute name not given) | String | English-Korean | Language Name
Format | String | TXT |
Samples | String | https://www.nexdata.ai/dataset/154?source=Datarade |
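
The Dataset Name example above packs a group count and a language pair into one string. A minimal Python sketch of splitting it apart follows. The pattern is an assumption inferred from the single example in the data dictionary, and `CatalogEntry` and `parse_dataset_name` are hypothetical names, not part of any Nexdata tooling. Names that do not follow the assumed pattern return `None` rather than a guess.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed naming pattern, inferred from one example only:
# "<count> Groups <Source>-<Target> Parallel Corpus Data"
_NAME_RE = re.compile(
    r"^(?P<count>[\d,]+)\s+Groups\s+"
    r"(?P<src>[A-Za-z ]+?)-(?P<tgt>[A-Za-z ]+?)\s+Parallel Corpus Data$"
)


@dataclass(frozen=True)
class CatalogEntry:  # hypothetical record type for illustration
    group_count: int
    source_language: str
    target_language: str


def parse_dataset_name(name: str) -> Optional[CatalogEntry]:
    """Return the parsed entry, or None if the name does not fit the assumed pattern."""
    match = _NAME_RE.match(name.strip())
    if match is None:
        return None
    return CatalogEntry(
        group_count=int(match.group("count").replace(",", "")),
        source_language=match.group("src").strip(),
        target_language=match.group("tgt").strip(),
    )


entry = parse_dataset_name("1,340,000 Groups English-Korean Parallel Corpus Data")
print(entry)
```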

Description

1. Overview
Off-the-shelf parallel corpus data (Translation Data) covering many fields, including spoken language, travel, medical treatment, news, and finance. Data cleaning, desensitization, and quality inspection have been carried out.

2. Specifications
Storage format: TXT
Data content: Parallel Corpus Data
Data size: 200 million pairs
Language: 20 languages
Application scenario: machine translation
Accuracy rate: 90%

3. About Nexdata
Nexdata owns 1,000,000 hours of off-the-shelf speech recognition data, 800TB of Annotated Imagery Data, and about 2 billion pieces of Natural Language Processing (NLP) Data. This ready-to-go Translation Data supports instant delivery and helps quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/nlu?source=Datarade
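
The listing gives TXT as the storage format but does not describe the line layout. As a rough sketch only, the Python below assumes one pair per line with source and target separated by a tab; the real files may differ, so check a sample first. `iter_pairs` is a hypothetical helper, and the two sentence pairs are made-up illustrations, not taken from the dataset.

```python
import io
from typing import Iterable, Iterator, Tuple


def iter_pairs(lines: Iterable[str], sep: str = "\t") -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs, skipping blank or malformed lines.

    Assumes one pair per line, separated by `sep` (an assumption, not
    something the listing documents).
    """
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(sep)
        if len(parts) != 2:
            continue  # skip lines without exactly one separator
        src, tgt = parts[0].strip(), parts[1].strip()
        if src and tgt:
            yield src, tgt


# Made-up sample text standing in for a real TXT file.
sample = io.StringIO(
    "Hello\t안녕하세요\n"
    "\n"
    "broken line without separator\n"
    "Thank you\t감사합니다\n"
)
pairs = list(iter_pairs(sample))
print(pairs)
```

For a real file, the same function can be fed an open file handle, for example `open(path, encoding="utf-8")`, so the corpus is streamed line by line instead of loaded into memory at once.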

Country Coverage

Africa (5)
Algeria
Egypt
Morocco
South Africa
Tunisia
Asia (44)
Afghanistan
Armenia
Azerbaijan
Bahrain
Bangladesh
Cambodia
China
Cyprus
Georgia
Hong Kong
India
Indonesia
Iran (Islamic Republic of)
Iraq
Israel
Japan
Jordan
Kazakhstan
Korea (Republic of)
Kuwait
Kyrgyzstan
Lao People's Democratic Republic
Lebanon
Macao
Malaysia
Maldives
Mongolia
Myanmar
Oman
Pakistan
Palestine, State of
Philippines
Qatar
Saudi Arabia
Singapore
Sri Lanka
Syrian Arab Republic
Taiwan
Tajikistan
Thailand
Turkey
United Arab Emirates
Uzbekistan
Vietnam
Europe (39)
Albania
Austria
Belarus
Belgium
Bosnia and Herzegovina
Bulgaria
Croatia
Czech Republic
Denmark
Estonia
Finland
France
Germany
Greece
Hungary
Iceland
Ireland
Italy
Latvia
Lithuania
Luxembourg
Macedonia (the former Yugoslav Republic of)
Malta
Moldova (Republic of)
Montenegro
Netherlands
Norway
Poland
Portugal
Romania
Russian Federation
Serbia
Slovakia
Slovenia
Spain
Sweden
Switzerland
Ukraine
United Kingdom
North America (3)
Canada
Mexico
United States of America
Oceania (2)
Australia
New Zealand
South America (16)
Argentina
Bolivia (Plurinational State of)
Brazil
Chile
Colombia
Cuba
Dominica
Dominican Republic
Ecuador
Grenada
Jamaica
Paraguay
Peru
Puerto Rico
Uruguay
Venezuela (Bolivarian Republic of)

History

10 years of historical data

Volume

200 million pairs

Pricing

Free sample available
License | Starts at
One-off purchase | $5,000 / purchase
Monthly License | Not available
Yearly License | Not available
Usage-based | Not available

Suitable Company Sizes

Small Business
Medium-sized Business
Enterprise

Quality

90% accuracy (self-reported by the provider)

Delivery

Methods
S3 Bucket
SFTP
Email
UI Export
REST API
SOAP API
Streaming API
Feed API
Frequency
secondly
minutely
hourly
daily
weekly
monthly
quarterly
yearly
real-time
on-demand
Format
.bin
.json
.xml
.csv
.xls
.sql
.txt

Frequently asked questions

What is Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data?

Off-the-shelf parallel corpus data (Translation Data) covering many fields, including spoken language, travel, medical treatment, news, and finance. Data cleaning, desensitization, and quality inspection have been carried out.

What is Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data used for?

This product has 3 key use cases. Nexdata recommends using the data for Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning. Global businesses and organizations buy Natural Language Processing (NLP) Data from Nexdata to fuel their analytics and enrichment.

Who can use Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data?

This product is best suited if you’re a Medium-sized Business or Enterprise looking for Natural Language Processing (NLP) Data. Get in touch with Nexdata to see what their data can do for your business and find out which integrations they provide.

How far back does the data in Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data go?

This product has 10 years of historical coverage. It can be delivered on a secondly, minutely, hourly, daily, weekly, monthly, quarterly, yearly, real-time, and on-demand basis.

Which countries does Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data cover?

This product includes data covering 109 countries, including the USA, China, Japan, Germany, and India. Nexdata is headquartered in the United States of America.

How much does Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data cost?

Pricing for Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data starts at USD 5,000 per purchase. Connect with Nexdata to get a quote and arrange custom pricing models based on your data requirements.

How can I get Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data?

Businesses can buy Natural Language Processing (NLP) Data from Nexdata and get the data via S3 Bucket, SFTP, Email, UI Export, REST API, SOAP API, Streaming API, and Feed API. Depending on your data requirements and subscription budget, Nexdata can deliver this product in .bin, .json, .xml, .csv, .xls, .sql, and .txt formats.

What is the data quality of Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data?

Nexdata has reported that this product has the following quality and accuracy assurances: 90% Accuracy. You can compare and assess the data quality of Nexdata using Datarade’s data marketplace.

What are similar products to Nexdata Multilingual Parallel Corpus Data 200 Million Pairs Text AI Training Data Natural Language Processing Data Translation Data?

This product has 3 related products. These alternatives include Nexdata Text Annotation Services AI-assisted Labeling Text Labeling for AI & ML Text Data Natural Language Processing (NLP) Data, WebAutomation Off the Shelf Datasets Audio Data for AI & ML Training 600+ Hours of Recording Speech Recognition, Natural Language Processing, and AI & ML Training Data 800M Profiles for LLMs, Generative AI, NLP & Predictive Models. You can compare the best Natural Language Processing (NLP) Data providers and products via Datarade’s data marketplace and get the right data for your use case.

Nexdata

Sharpen Your AI with Better Data

Verified Provider
100% Response rate