
TAUS: Parallel text, E-commerce - Product descriptions, various language pairs
A dataset by TAUS
Starts at
€5,000 / purchase
Textual data English | Textual data Japanese |
---|---|
We ordered some new books from English. | あの本はあなたのですか |
Volume
200,000 | sentences per language pair |
1 million | words per language pair |
Use Cases
Geography
Asia
(2)
China
Japan
Europe
(6)
France
Germany
Italy
Netherlands
Poland
Spain
North America
(2)
Canada
United States of America
Oceania
(1)
Australia
Categories
Data Attributes
Attribute | Example | Description |
---|---|---|
Textual data English | We ordered some new books from English. | Example of sentence |
Textual data Japanese | あの本はあなたのですか | Example of sentence |
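For readers who want to consume a delivered file, the sketch below shows one way to read sentence pairs from a `.csv` export. It is a minimal illustration, not the provider's documented schema: the column names are assumed to match the attribute names in the table above, the helper `read_parallel_pairs` is hypothetical, and the sample rows are made up for the example.

```python
import csv
import io

# Hypothetical column names, taken from the Data Attributes table above;
# the real export may label its columns differently.
SOURCE_COL = "Textual data English"
TARGET_COL = "Textual data Japanese"


def read_parallel_pairs(handle, source_col=SOURCE_COL, target_col=TARGET_COL):
    """Yield (source, target) sentence pairs, skipping rows with an empty side."""
    for row in csv.DictReader(handle):
        source = (row.get(source_col) or "").strip()
        target = (row.get(target_col) or "").strip()
        if source and target:
            yield source, target


# Tiny in-memory stand-in for a delivered .csv file (illustrative data only).
sample = io.StringIO(
    "Textual data English,Textual data Japanese\n"
    "Is that book yours?,あの本はあなたのですか\n"
    "Missing translation,\n"
)
pairs = list(read_parallel_pairs(sample))
```

Rows with a missing source or target sentence are dropped, since an unaligned line is of little use as parallel training data. For a real file, the same function could be given a handle from `open(path, newline="", encoding="utf-8")`.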
History
1 year of past data available
Product Description
Our e-commerce corpora cover everything from technology items, office equipment, home furniture and pet accessories to clothing and beauty products. They also include gear for hobbies such as photography, (motor)biking, fishing, skiing and gaming, and even special collector's items such as stamps, coins, paintings, books, miniature cars, vinyl records and Panini sticker albums. These are just some examples of the products covered.
These corpora were created in collaboration with eBay Inc, which provided bilingual query data sets including a representative selection of key product descriptions. Using those data sets as queries, we applied TAUS's proprietary Matching Data technology to extract the data from the TAUS Data Cloud, a large industry-shared repository of parallel corpora. According to eBay Inc's linguistic assessment, our corpora were found to be 'of good quality and appropriate to consume as aligned corpora to that provided in the eBay sample'.
Data is available in parallel format for the following language pairs, and new language pairs can be created quickly:
French - Dutch
German - Polish
German - Italian
English - Italian
English - Japanese
German (Germany) - Spanish (Spain)
German (Germany) - French (France)
Other languages are available on demand.
Suitable Company Sizes
Small Business
Medium-sized Business
Enterprise
Pricing
Free sample available
License | Starts at |
---|---|
One-off purchase | €5,000 / purchase |
Monthly license | Not available |
Yearly license | Not available |
Usage-based | Not available |
Revenue share | Not available |
Quality
Self-reported by the provider
Delivery
Methods
S3 Bucket
SFTP
Email
UI Export
REST API
SOAP API
Streaming API
Feed API
Frequency
secondly
minutely
hourly
daily
weekly
monthly
quarterly
yearly
real-time
on-demand
Format
.bin
.json
.xml
.csv
.xls
.sql
.txt