Industrial big data

Industrial big data refers to the large volume of diverse time series generated at high speed by industrial equipment, in what is known as the Internet of things. The term emerged in 2012 along with the concept of "Industry 4.0", and relates to "big data", a term popular in information technology marketing, in that data created by industrial equipment might hold more potential business value. Industrial big data takes advantage of industrial Internet technology. It uses raw data to support management decision making, so as to reduce maintenance costs and improve customer service. See intelligent maintenance system for further reference.

Definition
Big data refers to data generated in high volume, high variety, and high velocity that requires new processing technologies to enable better decision making, knowledge discovery and process optimization. Sometimes the feature of veracity is also added to emphasize the quality and integrity of the data. For industrial big data, however, there should be two more "V's". One is visibility, which refers to the discovery of unexpected insights into existing assets and/or processes, thereby turning invisible knowledge into visible value. The other "V" is value.


 * Background: General "Big Data" analytics often focuses on mining relationships and capturing phenomena. "Industrial Big Data" analytics, by contrast, is more interested in finding the physical root cause behind features extracted from the phenomena. Effective "Industrial Big Data" analytics therefore requires more domain know-how than general "Big Data" analytics.
 * Broken: Compared to "Big Data" analytics, "Industrial Big Data" analytics favors the "completeness" of data over its "volume": to construct an accurate data-driven analytical system, it is necessary to prepare data from different working conditions. Because of communication issues and multiple sources, data from the system might be discrete and unsynchronized. Pre-processing is therefore an important step before the actual analysis, to make sure that the data are complete, continuous and synchronized.
 * Bad-Quality: The focus of "Big Data" analytics is mining and discovering, which means that the volume of the data might compensate for its low quality. For "Industrial Big Data", however, variables usually have clear physical meanings, so data integrity is of vital importance to the development of the analytical system. Low-quality data or incorrect recordings will alter the relationships between variables and can have a catastrophic impact on estimation accuracy.
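
The pre-processing step described under "Broken" can be illustrated with a minimal sketch. It assumes two hypothetical sensor streams (the names `temperature` and `vibration`, the timestamps and the values are invented for illustration) that report at irregular, unsynchronized times. Each stream is aligned to a common time grid by taking the most recent sample, and values older than an assumed `max_age` are marked as gaps rather than carried forward. This is only one possible alignment strategy, not a method prescribed by the sources.

```python
from bisect import bisect_right

def resample_to_grid(samples, grid, max_age):
    """Align irregular (timestamp, value) samples to a regular time grid.

    For each grid time, use the most recent sample at or before it.
    If that sample is older than max_age, emit None to mark a gap
    instead of silently reusing a stale value.
    """
    samples = sorted(samples)
    times = [t for t, _ in samples]
    aligned = []
    for g in grid:
        i = bisect_right(times, g) - 1  # last sample with time <= g
        if i >= 0 and g - times[i] <= max_age:
            aligned.append(samples[i][1])
        else:
            aligned.append(None)
    return aligned

def synchronize(streams, start, stop, step, max_age):
    """Put several named streams on one shared grid, one row per grid time."""
    grid = list(range(start, stop, step))
    columns = {name: resample_to_grid(s, grid, max_age)
               for name, s in streams.items()}
    rows = []
    for k, g in enumerate(grid):
        row = {"t": g}
        for name, values in columns.items():
            row[name] = values[k]
        rows.append(row)
    return rows

def complete_rows(rows):
    """Keep only rows in which every stream has a valid value."""
    return [r for r in rows if all(v is not None for v in r.values())]

# Hypothetical data: timestamps in seconds, arbitrary units.
streams = {
    "temperature": [(0, 70.1), (9, 70.4), (21, 71.0), (58, 73.2)],
    "vibration": [(2, 0.30), (11, 0.31), (19, 0.35), (31, 0.33),
                  (42, 0.36), (49, 0.40)],
}
rows = synchronize(streams, start=0, stop=60, step=10, max_age=15)
usable = complete_rows(rows)
```

Rows with a gap in any stream are dropped here. A real system might interpolate or flag them instead, depending on the physical meaning of each variable.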

Data acquisition, storage and management
As data from automated industrial equipment are generated at extraordinary speed and volume, the infrastructure for storing and managing these data becomes the first challenge any industry will face. Unlike traditional business intelligence, which mostly focuses on internal structured data and processes that information in regularly occurring cycles, an "Industrial Big Data" analytical system requires near real-time analytics and visualization of the results.

The first step is to collect the right data. As the automation level of modern equipment rises, data are generated by an increasing number of sensors. Recognizing which parameters are related to equipment status is important for reducing the amount of data that must be collected and for increasing the efficiency and effectiveness of data analytics.
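
As a rough illustration of identifying status-related parameters, the sketch below ranks sensor channels by the absolute Pearson correlation between their readings and a health indicator, and keeps those above a threshold. The channel names, the data, the health indicator and the threshold of 0.8 are all hypothetical, and correlation is only one of many possible relevance criteria; the sources do not prescribe a specific method.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def rank_parameters(readings, health, threshold):
    """Score each channel by |correlation| with the health indicator.

    Returns the channels at or above the threshold, strongest first,
    together with the full score table.
    """
    scores = {name: abs(pearson(values, health))
              for name, values in readings.items()}
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    kept = [name for name, score in ordered if score >= threshold]
    return kept, scores

# Hypothetical example: health declines from 1.0 (new) towards 0.0 (failed).
health = [1.0, 0.9, 0.8, 0.6, 0.4, 0.2]
readings = {
    "bearing_temp": [60, 62, 65, 70, 76, 83],
    "vibration": [0.20, 0.25, 0.30, 0.45, 0.60, 0.80],
    "ambient_humidity": [40, 45, 38, 44, 39, 43],
    "firmware_version": [3, 3, 3, 3, 3, 3],
}
kept, scores = rank_parameters(readings, health, threshold=0.8)
```

In practice, domain knowledge about the equipment would normally guide or override such a purely statistical ranking.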

The next step is to build a data management system that can handle large amounts of data and perform analytics in near real-time. To enable rapid decision making, data storage, management and processing need to be more tightly integrated. General Electric has built a prototype data storage infrastructure for a fleet of gas turbines. The resulting system, based on in-memory data grids (IMDG), was shown to handle challenging high-velocity and high-volume data flows while performing near real-time analytics on the data. Its developers believe that the technology has demonstrated a viable path to realize batch "Industrial Big Data" management infrastructure. As memory becomes cheaper, such systems will become central and fundamental to future industry.
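
The idea of keeping recent data in memory and analyzing each reading as it arrives can be sketched in a single process as follows. This is not an in-memory data grid and does not reflect the General Electric system; it is a minimal illustration in which the class name `RollingMonitor`, the window size, the factor `k` and the readings are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, pstdev

class RollingMonitor:
    """Keep the most recent readings in memory and flag outliers on arrival."""

    def __init__(self, window, k):
        self.buffer = deque(maxlen=window)  # oldest reading is dropped automatically
        self.k = k

    def update(self, value):
        """Return True if value lies more than k standard deviations from
        the mean of the full window that preceded it."""
        flagged = False
        if len(self.buffer) == self.buffer.maxlen:
            mu = mean(self.buffer)
            sigma = pstdev(self.buffer)
            flagged = sigma > 0 and abs(value - mu) > self.k * sigma
        self.buffer.append(value)
        return flagged

# Hypothetical stream of exhaust temperature readings.
readings = [500, 502, 499, 501, 500, 498, 560, 501]
monitor = RollingMonitor(window=5, k=3)
flags = [monitor.update(r) for r in readings]
```

Because the window is bounded, memory use stays constant regardless of how long the stream runs, which is the property that makes this style of processing suitable for continuous data flows.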

Cyber-physical systems
Cyber-physical systems are the core technology of industrial big data. They are systems that require seamless integration between computational models and physical components. Unlike traditional operation technology, "Industrial Big Data" requires that decisions be informed from a much wider scope, a central part of which is equipment status. Improved processes will further increase productivity and reduce costs. This aligns with the mission of "Industrial Big Data", which is to reveal insights from the large amount of raw data and turn that information into value. It combines the power of information technology and operation technology to create an information-transparent environment that supports decisions for users at different levels.

Sample repositories
Every unit in an industrial system generates a vast amount of data every moment. Billions of data samples are generated by every single machine per day in a manufacturing line. As an example, a Boeing 787 generates over half a terabyte of data per flight. The volume of data generated by a group of units in an industrial system is clearly far beyond the capability of traditional methods, so handling, managing and processing it is a challenge.

Over the last several years, researchers and companies have actively participated in collecting, organizing and analyzing huge industrial data sets. Some of these data sets are currently available to the public for research purposes.

The NASA data repository is one of the best-known data repositories for industrial big data. The data sets it provides may be used for predictive analysis, fault detection, prognostics, and similar tasks.

Sample industrial big data analytics use cases
Leveraging machine learning and predictive analytics algorithms, industrial big data can help to create value in various use case scenarios, such as predictive maintenance (predicting and preventing machine or component failures, e.g. in manufacturing machines, airplanes, automotive vehicles, trains, wind turbines, oil pipelines, etc.), product quality prediction in early stages of the production process and product quality optimization (e.g. in the steel industry), prediction and prevention of critical situations in continuous production processes (e.g. in the chemical industry), prediction of product lifetime (e.g. car engines, wind turbine components, batteries, etc.), assembly plan prediction for new 3D product designs (e.g. for truck engine components, white goods like washers and dryers, etc.), energy demand prediction, demand forecasting, price forecasting, and many other use cases (see, e.g., the Industrial Data Science Conference, IDS 2017 and IDS 2019).
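
As a simple illustration of the predictive maintenance and lifetime prediction use cases, the sketch below fits a straight line to a degradation indicator by least squares and extrapolates it to an assumed failure threshold to estimate the remaining useful life. The wear values, the threshold and the assumption of linear degradation are hypothetical; real applications typically use richer models and domain-specific indicators.

```python
def fit_line(t, y):
    """Ordinary least-squares fit of y = slope * t + intercept."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    slope = (sum((a - mt) * (b - my) for a, b in zip(t, y))
             / sum((a - mt) ** 2 for a in t))
    return slope, my - slope * mt

def remaining_useful_life(t, y, failure_threshold):
    """Extrapolate the fitted trend to the failure threshold.

    Returns the estimated time left after the last observation,
    or None if the indicator shows no upward trend.
    """
    slope, intercept = fit_line(t, y)
    if slope <= 0:
        return None
    t_fail = (failure_threshold - intercept) / slope
    return max(0.0, t_fail - t[-1])

# Hypothetical wear measurements (mm) taken at operating hours.
hours = [0, 100, 200, 300, 400]
wear = [1.0, 1.5, 2.0, 2.5, 3.0]
rul = remaining_useful_life(hours, wear, failure_threshold=5.0)
```

Under these assumptions the estimate is expressed in the same unit as the time axis, here operating hours, and it should be recomputed as new measurements arrive.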