TAUS Data Products: APIs & Datasets
TAUS Pricing & Cost
We set our prices based on factors such as the rarity of the language pair, the locale the data is requested from, and the volume of data requested. Contact us with a description of your data needs to get a pricing quote.
TAUS Reviews
There are still only a few reviews and ratings for TAUS at the moment. Have you worked with TAUS? You can help other data professionals better understand TAUS’s data products and services by leaving a review now.
TAUS Competitors & Alternatives
About TAUS
TAUS in a Nutshell
TAUS was founded in 2005 as a think tank with a mission to automate and innovate translation. Ideas transformed into actions. TAUS became the language data network offering the largest industry-shared repository of data, deep know-how in language engineering and a network of data contributors and annotators around the globe. Our mission today is to empower global enterprises and their service and technology providers with data solutions that help them to communicate in all languages,
Country Coverage
Data Offering
TAUS offers text, speech, and image data. You can buy domain-specific and crowdsourced datasets from the Data Library, or buy and sell data on the new Data Marketplace. TAUS also offers cleaning, anonymization, quality review, annotation, and other data preparation services performed by our own global community of data contributors and annotators.
Use Cases
Our use cases include:
Bilingual or monolingual language data collection in any given domain and language pair
Low-resource language data generation
NER (Named Entity Recognition) tagging
Text, speech, or image data annotation
Speech, text, or image data collection
Domain-specific dataset generation based on the client’s own sample dataset
Data Sources & Collection
TAUS has several sources of data:
1- A large repository of legacy data from TAUS member uploads, with more than 35 billion words in over 600 language pairs.
2- Data Marketplace: a language data monetization and acquisition platform for trading data.
3- Data Library: tailor-made datasets matching the sample data you provide. You can also buy from the library of ready-made corpora.
4- HLP Community: data generation and annotation in the requested locales by our community of data contributors.
Key Differentiators
TAUS has more than 15 years of experience in thought leadership and innovation in the language data space. We have a proven ability to mobilize a large community of data contributors and annotators around the globe. With an in-house NLP team, we can match your sample dataset to create a tailor-made dataset for you. In addition to our vast library of ready-made datasets, we provide on-demand data solutions for your projects.
Data Privacy
The TAUS IT environment has been implemented with IT security and data protection as the highest priority. Our infrastructure is hosted on AWS, which is fully GDPR compliant. We have performed final revisions and corrections in order to deliver a complete IT Security Framework. This framework is based on “General IT security” policies and best practices, which also include the GDPR provisions on personal data protection and processing limitations.
Frequently asked questions about TAUS
What does TAUS do?
TAUS is the language data network offering the largest repository of language data, deep know-how in language engineering, and a network of data contributors around the globe. Our mission is to empower global enterprises and their technology providers with data solutions.
How much does TAUS cost?
TAUS’s APIs and datasets range in cost from €5,000 to €100,000 per purchase. TAUS offers free samples for individual data requirements. Talk to a member of the TAUS team to receive custom pricing options, information about data subscription fees, and quotes for TAUS’s data offering tailored to your use case.
What kind of data does TAUS have?
AI Training Data, Ecommerce Data, Court Data, Natural Language Processing (NLP) Data, and 2 others
What data does TAUS offer?
TAUS offers text, speech, and image data. You can buy domain-specific and crowdsourced datasets from the Data Library, or buy and sell data on the new Data Marketplace. TAUS also offers cleaning, anonymization, quality review, annotation, and other data preparation services performed by our own global community of data contributors and annotators.
How does TAUS collect data?
TAUS has several sources of data:
1- A large repository of legacy data from TAUS member uploads, with more than 35 billion words in over 600 language pairs.
2- Data Marketplace: a language data monetization and acquisition platform for trading data.
3- Data Library: tailor-made datasets matching the sample data you provide. You can also buy from the library of ready-made corpora.
4- HLP Community: data generation and annotation in the requested locales by our community of data contributors.
What’s TAUS’s data privacy policy?
The TAUS IT environment has been implemented with IT security and data protection as the highest priority. Our infrastructure is hosted on AWS, which is fully GDPR compliant. We have performed final revisions and corrections in order to deliver a complete IT Security Framework. This framework is based on “General IT security” policies and best practices, which also include the GDPR provisions on personal data protection and processing limitations.
What are the best use cases for TAUS’s data?
Our use cases include:
Bilingual or monolingual language data collection in any given domain and language pair
Low-resource language data generation
NER (Named Entity Recognition) tagging
Text, speech, or image data annotation
Speech, text, or image data collection
Domain-specific dataset generation based on the client’s own sample dataset