What is Alternative Data?
Alternative data (or non-traditional data) is collected from sources that are usually unavailable to the general public and that, at first glance, may not have a direct connection to a specified use case. In most cases, alternative data aims to supplement an existing understanding of a certain economic development with correlated data that is rather exclusive or at least only available to a limited number of parties. It is mainly used in financial analysis to gauge a company’s performance before the company itself makes public announcements. Using alternative data can give an investor a competitive advantage over others and thereby a chance to beat average market returns. The difference between the investor’s return and the market return is called alpha.
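As a minimal sketch of the arithmetic behind alpha as described above, the snippet below subtracts a benchmark return from a portfolio return. The function name and the example figures are illustrative assumptions, not real market data. Practitioners often use risk-adjusted variants of alpha, which this simple difference does not cover.

```python
def simple_alpha(portfolio_return: float, benchmark_return: float) -> float:
    """Return the excess of a portfolio's return over a benchmark's return.

    Both inputs are decimal fractions for the same period (0.12 means 12%).
    This is the plain difference, not a risk-adjusted alpha.
    """
    return portfolio_return - benchmark_return


# Hypothetical figures: the portfolio gained 12%, the market gained 8%.
alpha = simple_alpha(0.12, 0.08)
print(f"alpha = {alpha:.2%}")  # prints "alpha = 4.00%"
```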
Typical data sources for alternative data are scraped websites, satellite images, point of sale data and many more. Most of the sources used for alternative data are used for other primary use cases. The main challenge for interested parties is, therefore, identifying the potential of such data sources for alternative use cases.
How to use Alternative Data and for what use cases?
In general, the main use case of alternative data is to gain insights into the economic development of a company or to predict market trends before this information is released to the broad public. This informational head start enables beneficiaries to make relevant business decisions before the general population can.
In today’s information society, news about companies, stocks or other financial products spreads quickly through social media and news outlets. This is especially prominent when quarterly or yearly results are released, which often leads to sharp movements in the respective stock or in related financial instruments. Institutional investors such as hedge funds therefore attempt to strengthen or weaken their own position in a stock prior to these events based on alternative data sources.
Many enterprises like large retailers require up-to-date information about their competitors in order to drive advanced business processes like pricing strategies, procurement orders or branch network modifications. As most competitors will protect their key performance indicators from being released prematurely, this information has to be inferred through alternative data sources that can act as a proxy.
One of the industries that has been employing data-driven processes basically from its inception is the insurance industry. Most global insurers utilize hundreds or even thousands of data sources to assess risks within their underwriting process. While historical statistical data can feed most insurance processes to a certain degree, further reducing the uncertainty of policies requires the use of alternative data.
As market research firms act as an agent for their customers in finding exclusive relevant information or forecasts, they sometimes have to rely on alternative data sets in order to produce more accurate results than their competitors, which may base their predictions on traditional sources like industry-specific performance indicators or consumer surveys.
What are typical Alternative Data attributes?
First of all, it must be emphasized that there is no such thing as a typical alternative data attribute, which is inherent in the name. Every attribute that can be correlated to a required outcome/prediction can potentially serve as alternative data. Therefore, this section will give a non-exhaustive list of known data attributes that already serve as alternative data sources for various use cases.
Many different types of information available on the internet can be used as alternative data, like social media posts or pricing data from online platforms. They may be used to infer sentiment towards a company or current sales performance of a specific product or brand.
Point-of-sale data as well as credit card purchase records can further indicate particularly strong or weak sales periods of certain products and, as such, the performance of the companies behind them.
Utilizing fulfillment records of ports, freight airports, and truck toll stations can give detailed insights into supply chain processes and oftentimes into the underlying product quantities being shipped. Such information can then be used to predict export numbers of a certain product or brand.
The sales numbers of car brands are usually reported through their quarterly results. Tapping into newly signed car insurance policies through a partnered insurance company can give accurate predictions for these numbers, which allows investors to buy or sell shares accordingly before the quarterly results are publicly announced.
Monitoring satellite images of agricultural areas can provide indicators of crop yield, which is dependent on factors like precipitation, droughts and forest fires. Predicting agricultural yields allows predicting the market supply of specific crops and of products based on them. This is, for example, relevant for financial products that cover cocoa, corn, ethanol, sugar, and others, but it is also relevant for the stocks of companies that are largely dependent on raw materials. Furthermore, analyzing agricultural yield may allow inferences about the future performance of fertilizer-producing companies.
While the economic growth of a company is usually reported periodically, an indication can sometimes be inferred through changes in headcount. Therefore, scraping and analyzing job posting boards may serve to correlate the number of a company’s newly created job postings with its growth in terms of headcount.
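The first processing step for such a job posting signal could look roughly like the sketch below, which counts new postings per company and month and then derives the month-over-month change. The record layout, company names and dates are invented for illustration; a real scraper would also need to deduplicate reposted listings, which is not shown here.

```python
from collections import Counter
from datetime import date

# Hypothetical scraped records: (company, date the posting first appeared).
postings = [
    ("Acme Corp", date(2023, 1, 5)),
    ("Acme Corp", date(2023, 1, 20)),
    ("Acme Corp", date(2023, 2, 3)),
    ("Globex", date(2023, 1, 11)),
    ("Acme Corp", date(2023, 2, 14)),
    ("Acme Corp", date(2023, 2, 27)),
]


def monthly_posting_counts(records, company):
    """Count new postings of one company per (year, month), in time order."""
    counts = Counter(
        (posted.year, posted.month)
        for name, posted in records
        if name == company
    )
    return dict(sorted(counts.items()))


def month_over_month_change(counts):
    """Absolute change in posting counts between consecutive observed months."""
    months = list(counts)
    return {
        current: counts[current] - counts[previous]
        for previous, current in zip(months, months[1:])
    }


acme_counts = monthly_posting_counts(postings, "Acme Corp")
acme_change = month_over_month_change(acme_counts)
print(acme_counts)
print(acme_change)
```

A rising count is only a candidate indicator. Whether it actually tracks headcount growth has to be checked against reported figures.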
The prices of commodities like oil, gold, and silver usually depend on changes in global demand as well as on production adjustments by large suppliers. For some production and storage facilities, low-altitude drone imagery can serve as a method to indicate their fill level or capacity. Applying this strategy to outdoor oil storage facilities can, for example, help predict the short-term supply of crude oil on the financial markets.
By offering WiFi hotspots, companies can measure the number of people around certain points of interest. This information can be used to estimate the amount of footfall traffic that specific retailers receive. Aggregating several of those sources for retail brands on a national or international level can indicate the overall revenue performance trend of each brand.
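A rough sketch of the aggregation step described above: visitor counts from individual locations are summed per brand and week, and the week-over-week change serves as a simple trend indicator. The tuple layout, brand names and numbers are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical hotspot readings: (brand, store_id, week, visitors).
readings = [
    ("BrandA", "s1", 1, 1000),
    ("BrandA", "s2", 1, 1500),
    ("BrandA", "s1", 2, 1100),
    ("BrandA", "s2", 2, 1650),
    ("BrandB", "s9", 1, 800),
    ("BrandB", "s9", 2, 720),
]


def footfall_by_brand_week(rows):
    """Sum visitor counts across all stores for each brand and week."""
    totals = defaultdict(lambda: defaultdict(int))
    for brand, _store, week, visitors in rows:
        totals[brand][week] += visitors
    return {brand: dict(weeks) for brand, weeks in totals.items()}


def week_over_week_change(weekly):
    """Relative change in footfall between consecutive observed weeks."""
    weeks = sorted(weekly)
    return {
        current: (weekly[current] - weekly[previous]) / weekly[previous]
        for previous, current in zip(weeks, weeks[1:])
    }


totals = footfall_by_brand_week(readings)
trends = {brand: week_over_week_change(weeks) for brand, weeks in totals.items()}
for brand, change in trends.items():
    print(brand, {week: f"{value:+.1%}" for week, value in change.items()})
```

Footfall is a proxy for revenue, not revenue itself, so the trend is best read alongside other sources such as point-of-sale data.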
How is Alternative Data typically collected?
As already established in the previous sections, there are no exhaustive lists of data types or attributes that can serve as alternative data. Many companies may produce alternative data as a byproduct of one of their primary functions or processes. It follows that there is also a practically endless number of ways to collect alternative data. The following list shows some common data types that can serve as alternative data, together with likely sources for them:
- Web data (search trends, social media sentiment, email receipts): Provided by web service operators or external scrapers.
- Mobile data (app usage, telephone history): Available from app analytics SDKs and telecommunication companies.
- Satellite data (satellite images, weather forecast, traffic images): Available through specialized satellite networks.
- CCTV or drone imagery (traffic images, visitor counts, behavior): Provided by traffic monitoring companies or specialized drone operators.
- Sensor data (WiFi tracking data, Bluetooth beacons data, but also sound, air-quality and seismic sensor data): Provided by WiFi hotspot operators, weather stations or specialized footfall traffic measurement companies.
- Traffic data (license plate recognition and toll-road data, but also ship and plane movements): Collected by toll stations, ports, airports and satellite networks.
- Credit-card data (transaction history): Collected by credit-card providers.
- Banking records (transaction history, loans): Available through banking integrations.
- Corporate data (sales, finance, and human resources data): Collected individually by many companies for other primary use cases.
How to assess the quality of Alternative Data?
Most companies that are willing to use alternative data apply rigorous testing methods to ensure that the data fulfills its purpose before procuring it. This is why many alternative data providers offer testing periods that are long enough for the buyers’ technical teams to process the data and to run test cases on it.
As the purpose of alternative data in many scenarios is to predict a desired outcome, testing periods are usually used to solidify an understanding of the data source. That includes proving the correlation and, if possible, the causality between the alternative data source and its effect on a financial instrument or a similar outcome.
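One simple check during such a testing period is to measure how strongly the alternative data series correlates with the outcome a few periods later. The sketch below computes a Pearson correlation at a chosen lag using only the standard library. The two series are synthetic and constructed so that the outcome follows the signal by exactly one period; real data will be far noisier. A high correlation alone does not establish causality.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(signal, outcome, lag):
    """Correlate signal[t] with outcome[t + lag] for a non-negative lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(signal, outcome)
    return pearson(signal[:-lag], outcome[lag:])


# Synthetic example: outcome[t + 1] = 2 * signal[t] + 1.
signal = [10, 12, 9, 15, 14, 18, 17, 20]
outcome = [5, 21, 25, 19, 31, 29, 37, 35]

for lag in range(3):
    print(lag, round(lagged_correlation(signal, outcome, lag), 3))
```

Comparing several lags shows whether the data leads the outcome at all and by how much, which is what makes it useful ahead of a public announcement.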
How is Alternative Data typically priced?
Alternative data is usually priced through a yearly license fee that is individually agreed on between a buyer and provider. Prices may differ drastically from case to case based on update frequencies, preprocessing services and exclusivity clauses.
While some providers now offer monthly subscriptions to “alternative data”, those offerings are better described as core financial or industry-specific data or as access to structured web data. As monthly subscriptions are usually targeted at a broader audience, data sets offered through such pricing models cannot provide exclusivity and as such are not actual alternative data.
What is the #1 issue when buying Alternative Data?
The number one issue of every alternative data set is its exclusivity. Most parties that employ alternative data want to be its only user and thus hold a sole competitive advantage over other market participants. This motivation of data buyers runs against the incentive of alternative data providers, who want to maximize their revenue and are therefore willing to sell to as many market participants as possible.
Buyers have to make sure that the alternative data providers are transparent with their existing customer relations. In order to establish an exclusive contract for a specific alternative data set, buyers can usually provide a higher monetary incentive to make up for the difference in revenue that a provider could have gotten from multiple non-exclusive contracts.
Another major issue that is inherent in buying alternative data is assessing its potential impact on an intended use case. As most alternative data sources are not specifically made to feed a particular business process or financial decision, it is key to understand the causality between the two. Moreover, most alternative data sets are delivered in their original format and therefore require extensive cleansing and preparation before they can be ingested into the target system.
What to ask Alternative Data providers?
- What are the raw data sources and how is the data collected?
- How often does the data set receive updates?
- How many other parties are using this dataset?
- Is an exclusive contract possible?
- Which financial products does this data set correlate with?
- Is there a way to test the product for an agreed timespan?