Alternative data (finance)

Alternative data (in finance) refers to non-traditional data used to obtain insight into the investment process. These data sets are often used by hedge fund managers and other institutional investment professionals within an investment company. An alternative data set contains information about a particular company that is published by sources outside of the company and that can provide unique and timely insights into investment opportunities.

Alternative data sets are often categorized as big data, which means that they may be very large and complex and often cannot be handled by software traditionally used for storing or handling data, such as Microsoft Excel. An alternative data set can be compiled from various sources such as financial transactions, sensors, mobile devices, satellites, public records, and the internet. Alternative data can be contrasted with the data traditionally used by investment companies, such as investor presentations, SEC filings, and press releases. These examples of "traditional data" are produced directly by the company itself.

Because alternative data sets originate as a by-product of a company's operations, they are often less readily accessible and less structured than traditional sources of data. For this reason, alternative data is also known as "data exhaust". The company that produces alternative data generally overlooks the value of the data to institutional investors. During the last decade, many data brokers, aggregators, and other intermediaries began specializing in providing alternative data to investors and analysts.

Types
Examples of alternative data include:
 * Geolocation (foot traffic)
 * Credit card transactions
 * Email receipts
 * Point-of-sale transactions
 * Website usage
 * Mobile app or app store analytics
 * Crowdsourcing
 * Obscure city hall records
 * Satellite images
 * Social media posts
 * Online browsing activity
 * Shipping container receipts
 * Product reviews
 * Price trackers
 * Shipping trackers
 * Internet activity and quality data



Uses
Fundamental and quantitative institutional investors use alternative data to find new sources of alpha. The field is still in the early phases of development, but a fund can choose among several approaches depending on its resources and risk tolerance.

Extracting value from alternative data can be extremely challenging. The analytics, systems, and technologies for processing such data are relatively new, and most institutional investors do not have the capabilities to integrate alternative data into their investment decision process. However, with the right tools and strategy, a fund can mitigate costs while creating an enduring competitive advantage.

Most alternative data research projects are lengthy and resource intensive; therefore, due diligence is required before working with a data set. The due diligence should include approval from the compliance team, validation of the processes that create and deliver the data set, and identification of investment insights that can add to the investment process.

However, the use of alternative data is not restricted to investing; it is also used in economics and politics as well as in retail and e-commerce. For example, in-depth analysis of alternative data can help predict geopolitical risk, and social media sites provide a wealth of data for consumer sentiment analysis.

Methodology
Alternative data can be accessed via:


 * Web scraping (or web harvesting), in which programmers design an algorithm that searches websites for specific data on a desired topic
 * Acquisition of raw data
 * Third-party licensing
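
As an illustration of the web scraping approach listed above, the following minimal sketch extracts prices from a page using only the Python standard library. The HTML snippet, the `price` class name, and the names `PriceParser` and `extract_prices` are hypothetical and chosen for this example; a real scraper would download pages from a site whose terms and conditions permit automated access.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text of every <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            # Drop the currency symbol and thousands separators.
            self.prices.append(float(text.lstrip("$").replace(",", "")))


def extract_prices(html):
    """Return all prices found in the given HTML string."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


# Hypothetical page content used in place of a live download.
SAMPLE_HTML = """
<ul>
  <li>Widget A <span class="price">$19.99</span></li>
  <li>Widget B <span class="price">$1,249.00</span></li>
</ul>
"""

prices = extract_prices(SAMPLE_HTML)
```

Running the same extraction on a schedule and storing each result with a timestamp would turn the scraped values into a time series that can be analyzed.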

Analysis
In finance, alternative data is often evaluated on the following criteria:


 * Scarcity: how rare the data is, given the information overload within financial markets
 * Granularity: the level of detail and aggregation of data (including time)
 * History: the trajectory of data
 * Structure: the form of the data (CSV, JSON, etc.)
 * Coverage: the stocks or geographical locations that data can be linked with
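
Some of the criteria above can be estimated directly from the data. The sketch below is one possible way to do so, assuming the data set is a list of `(ticker, observation_date)` records: it reports history as the span between the first and last observation, granularity as the most common gap between observation dates, and coverage as the number of distinct tickers. The function name `profile_dataset`, the record layout, and the sample tickers are assumptions made for this example.

```python
from collections import Counter
from datetime import date


def profile_dataset(rows):
    """Summarize a list of (ticker, observation_date) records."""
    if not rows:
        raise ValueError("rows must not be empty")
    dates = sorted({d for _, d in rows})
    tickers = {t for t, _ in rows}
    # Count the gaps, in days, between consecutive observation dates.
    gaps = Counter((b - a).days for a, b in zip(dates, dates[1:]))
    return {
        "history_days": (dates[-1] - dates[0]).days,
        "granularity_days": gaps.most_common(1)[0][0] if gaps else None,
        "coverage": len(tickers),
    }


# Hypothetical weekly observations for two made-up tickers.
SAMPLE_ROWS = [
    ("AAA", date(2023, 1, 2)),
    ("AAA", date(2023, 1, 9)),
    ("BBB", date(2023, 1, 9)),
    ("BBB", date(2023, 1, 16)),
    ("AAA", date(2023, 1, 23)),
]

profile = profile_dataset(SAMPLE_ROWS)
```

Scarcity and structure are not covered by this sketch, since they depend on how the data is distributed and delivered rather than on its contents.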

Best practices
While compliance and internal regulation are widely practiced in the alternative data field, there exists a need for an industry-wide best practices standard. Such a standard should address personally identifiable information (PII) obfuscation and access scheme requirements, among other issues. Compliance professionals and decision makers can benefit from proactively creating internal guidelines for data operations. Publications such as NIST Special Publication 800-122 provide guidelines for protecting PII and are useful when developing internal best practices. The Investment Data Standards Organization (IDSO) was established to develop, maintain, and promote industry-wide standards and best practices for the alternative data industry.
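
One common way to obfuscate PII before a data set is shared is to replace identifying fields with keyed hashes. The sketch below illustrates that idea with HMAC-SHA256 from the Python standard library; it is an example of pseudonymization under stated assumptions, not a method prescribed by any of the publications or organizations mentioned above. The field names, the sample record, and the key are hypothetical, and a real key would be stored securely rather than in source code.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Return a keyed SHA-256 digest of a PII value as a hex string."""
    # Normalize so that the same person always maps to the same pseudonym.
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def obfuscate_record(record, pii_fields, secret_key):
    """Copy a record, replacing the named PII fields with pseudonyms."""
    return {
        key: pseudonymize(val, secret_key) if key in pii_fields else val
        for key, val in record.items()
    }


# Hypothetical key and record, for illustration only.
SECRET_KEY = b"example-key-do-not-use"
record = {"email": "Jane.Doe@example.com", "merchant": "Acme", "amount": 42.5}

safe = obfuscate_record(record, {"email"}, SECRET_KEY)
```

Because the same input always yields the same pseudonym, records belonging to one person can still be linked for analysis. For the same reason, keyed hashing alone does not make data anonymous, and whoever holds the key must be covered by the access scheme.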

Web scraping
Legal aspects surrounding web scraping of alternative data have yet to be defined. Current best practices address the following issues when determining legal compliance of web crawling operations:
 * Review of the terms and conditions associated with the websites crawled
 * Control over the potential interference with crawled websites

Web scraped data refers to data harvested from public websites. With 4 billion webpages and 1.2 million terabytes of data on the internet, there is a mountain of information that can be valuable to investors when analyzing corporate performance. Companies that specialize in this type of data collection, such as Thinknum Alternative Data, write programs that access targeted websites and collect and store the scraped information on a periodic basis. In some cases, web scraping uses public APIs to access the data behind those pages directly, without visiting the actual website. Types of web scraped data include:
 * Job listings: A company that is increasing hiring and headcount is likely experiencing growth.
 * Company ratings: Sites like Glassdoor allow employees to rate their company; increasing ratings, especially in conjunction with increasing job listings, can be another growth indicator.
 * Online retail data: High product rankings on online retailers suggest strong sales for those product manufacturers. On the flip side, heavy discounting of products suggests weak sales.
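
The job listings indicator mentioned above can be turned into a simple numeric signal. As a rough sketch, assuming a scraper has already recorded the number of open listings for a company at regular intervals, the following computes the period-over-period percentage change. The function name `listing_growth` and the sample counts are hypothetical, and a rising count is only a suggestive indicator, not proof of growth.

```python
def listing_growth(counts):
    """Period-over-period percentage change in open job listings.

    Returns None for any period whose previous count is zero, because the
    percentage change is undefined in that case.
    """
    growth = []
    for prev, curr in zip(counts, counts[1:]):
        if prev == 0:
            growth.append(None)
        else:
            growth.append((curr - prev) * 100.0 / prev)
    return growth


# Hypothetical monthly counts of open listings for one company.
monthly_listings = [100, 110, 121, 121]

growth = listing_growth(monthly_listings)
```

Here the first two months each show 10% growth and the last month is flat. An analyst might compare such a series against company ratings or reported headcount before drawing any conclusion.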

The Standards Board for Alternative Investments (SBAI) is the global standard-setting agency for the alternative investment industry and guardian of the Alternative Investment Standards. The agency is supported by approximately 200 alternative investment managers and institutional investors, who collectively manage $3.5 trillion. The SBAI has published the Standardised Trial Data License Agreement, which addresses issues that investment managers face when trialing new data, such as alternative data and big data. Thomas Deinet, Executive Director of the SBAI, said: "This Trial Data Licence Agreement template highlights a number of very important issues, including personal data protection, which has become a hot topic in light of the overhaul of data protection regulation in many jurisdictions. It also includes key protections for managers in areas such as prevention of insider trading and 'right to use data'. It is crucial that managers and data vendors fully understand all risks when selling and using new data."