Fault detection and isolation

Fault detection, isolation, and recovery (FDIR) is a subfield of control engineering concerned with monitoring a system, identifying when a fault has occurred, and pinpointing the type of fault and its location. Two approaches can be distinguished: direct pattern recognition of sensor readings that indicate a fault, and analysis of the discrepancy between the sensor readings and the expected values derived from some model. In the latter case, a fault is typically said to be detected when the discrepancy, or residual, rises above a certain threshold. It is then the task of fault isolation to categorize the type of fault and its location in the machinery. Fault detection and isolation (FDI) techniques can be broadly classified into two categories: model-based FDI and signal processing based FDI.
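The residual test described above can be illustrated with a minimal sketch. The function name, the threshold value, and the data are all hypothetical; a real system would derive the expected values from its own model and tune the threshold to its noise level.

```python
import numpy as np

def detect_fault(measured, expected, threshold):
    """Return the index of the first sample whose residual exceeds the threshold, or None."""
    residual = np.abs(np.asarray(measured, dtype=float) - np.asarray(expected, dtype=float))
    above = np.flatnonzero(residual > threshold)
    return int(above[0]) if above.size else None

# Hypothetical data: the model predicts a constant 1.0, and the sensor
# reading starts to drift away from it at sample index 6.
expected = np.ones(10)
measured = np.array([1.0, 1.02, 0.98, 1.01, 0.99, 1.0, 1.3, 1.5, 1.7, 1.9])

first_fault = detect_fault(measured, expected, threshold=0.2)
```

Here `first_fault` is the index at which the residual first rises above the threshold; isolating the type and location of the fault would be a separate step.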

Model-based FDI


In model-based FDI techniques, some model of the system is used to decide whether a fault has occurred. The system model may be mathematical or knowledge based. Model-based FDI techniques include the observer-based approach, the parity-space approach, and parameter identification based methods. Another family of model-based FDI schemes is known as set-membership methods. These methods guarantee the detection of a fault under certain conditions. The main difference is that, instead of finding the most likely model, these techniques discard the models that are not compatible with the data.
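The set-membership idea of discarding incompatible models can be sketched as follows. This is a simplified illustration, not a specific published algorithm: it assumes a static model y = g·u with a bounded measurement error, a finite list of candidate gains, and made-up data. Under those assumptions, a fault is declared when no candidate model remains consistent with the measurements.

```python
import numpy as np

def consistent_gains(u, y, candidate_gains, noise_bound):
    """Keep only the gains g for which |y - g*u| <= noise_bound at every sample."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    return [g for g in candidate_gains
            if np.all(np.abs(y - g * u) <= noise_bound)]

# Hypothetical set of healthy models and hypothetical input data.
candidates = [1.8, 1.9, 2.0, 2.1, 2.2]
u = np.array([1.0, 2.0, 3.0, 4.0])

# Healthy behaviour: gain 2.0 plus small bounded noise.
y_healthy = 2.0 * u + np.array([0.05, -0.05, 0.02, -0.03])
# Assumed fault: the gain drops to 1.2, outside the candidate set.
y_faulty = 1.2 * u

healthy_set = consistent_gains(u, y_healthy, candidates, noise_bound=0.1)
faulty_set = consistent_gains(u, y_faulty, candidates, noise_bound=0.1)
fault_detected = len(faulty_set) == 0
```

The detection guarantee only holds if the noise bound is respected and the healthy system really lies in the candidate set, which mirrors the "certain conditions" mentioned above.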

The example shown in the figure on the right illustrates a model-based FDI technique for an aircraft elevator reactive controller through the use of a truth table and a state chart. The truth table defines how the controller reacts to detected faults, and the state chart defines how the controller switches between the different modes of operation (passive, active, standby, off, and isolated) of each actuator. For example, if a fault is detected in hydraulic system 1, then the truth table sends an event to the state chart that the left inner actuator should be turned off. One of the benefits of this model-based FDI technique is that this reactive controller can also be connected to a continuous-time model of the actuator hydraulics, allowing the study of switching transients.
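The split between a truth table and a state chart can be sketched in a few lines. Only the rule for hydraulic system 1 and the five mode names come from the description above; the identifiers, the event name, and the transition table are hypothetical placeholders, not the actual controller logic.

```python
from enum import Enum

class Mode(Enum):
    PASSIVE = "passive"
    ACTIVE = "active"
    STANDBY = "standby"
    OFF = "off"
    ISOLATED = "isolated"

# Truth table: detected fault -> events sent to the state chart.
# Only the hydraulic-system-1 rule is taken from the text; names are hypothetical.
TRUTH_TABLE = {
    "hydraulic_system_1": [("left_inner_actuator", "turn_off")],
}

# State chart: (current mode, event) -> next mode. These transitions are assumed.
TRANSITIONS = {
    (Mode.ACTIVE, "turn_off"): Mode.OFF,
    (Mode.STANDBY, "turn_off"): Mode.OFF,
    (Mode.PASSIVE, "turn_off"): Mode.OFF,
}

def react(modes, fault):
    """Apply the truth-table events for one detected fault to the actuator modes."""
    updated = dict(modes)
    for actuator, event in TRUTH_TABLE.get(fault, []):
        current = updated[actuator]
        # Unknown (mode, event) pairs leave the mode unchanged.
        updated[actuator] = TRANSITIONS.get((current, event), current)
    return updated

modes = {"left_inner_actuator": Mode.ACTIVE}
after_fault = react(modes, "hydraulic_system_1")
```

Keeping the fault-to-event mapping separate from the mode transitions means either table can be revised without touching the other.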

Signal processing based FDI
In signal processing based FDI, mathematical or statistical operations are performed on the measurements, or a neural network is trained on the measurements, to extract information about the fault.

A good example of signal processing based FDI is time domain reflectometry, where a signal is sent down a cable or electrical line and the reflected signal is compared mathematically to the original signal to identify faults. Spread Spectrum Time Domain Reflectometry, for instance, involves sending a spread spectrum signal down a wire line to detect wire faults. Several clustering methods have also been proposed to identify novel faults and to segment a given signal into normal and faulty segments.
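One common way to compare the reflected signal with the original is cross-correlation: the lag of the correlation peak gives the round-trip delay, and hence the distance to the fault. The sketch below is a simplified illustration with assumed values. The probe is a random ±1 sequence standing in for a spread spectrum code, the received trace contains only an attenuated echo plus noise, and the sample rate and propagation velocity are hypothetical.

```python
import numpy as np

def locate_reflection(sent, received, sample_rate_hz, velocity_m_per_s):
    """Estimate the distance to a reflection from the lag of peak cross-correlation."""
    corr = np.correlate(received, sent, mode="full")
    # In "full" mode, output index 0 corresponds to a lag of -(len(sent) - 1).
    lag = int(np.argmax(np.abs(corr))) - (len(sent) - 1)
    round_trip_s = lag / sample_rate_hz
    # The signal travels to the fault and back, so halve the round trip.
    return velocity_m_per_s * round_trip_s / 2.0, lag

rng = np.random.default_rng(0)
sent = rng.choice([-1.0, 1.0], size=64)          # pseudo-noise probe (assumed)
delay = 40                                       # true echo delay in samples

received = np.zeros(256)
received[delay:delay + sent.size] = 0.3 * sent   # attenuated echo
received += 0.05 * rng.standard_normal(received.size)

distance_m, lag = locate_reflection(
    sent, received, sample_rate_hz=1e9, velocity_m_per_s=2e8
)
```

The absolute value of the correlation is used because the echo may be inverted, depending on the kind of fault.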

Machine fault diagnosis
Machine fault diagnosis is a field of mechanical engineering concerned with finding faults arising in machines. A particularly well developed part of it applies specifically to rotating machinery, one of the most common types encountered. To identify the most probable faults leading to failure, many methods are used for data collection, including vibration monitoring, thermal imaging, and oil particle analysis. These data are then processed with methods such as spectral analysis, wavelet analysis, the short-time Fourier transform, the Gabor expansion, the Wigner-Ville distribution (WVD), cepstrum, bispectrum, the correlation method, high resolution spectral analysis, waveform analysis (in the time domain, since spectral analysis usually concerns only the frequency distribution and not phase information), and others. The results of this analysis are used in a root cause failure analysis to determine the original cause of the fault. For example, if a bearing fault is diagnosed, it is likely that the bearing was not itself damaged at installation, but that another installation error (e.g., misalignment) led to the bearing damage. Diagnosing the bearing's damaged state is not enough for precision maintenance purposes. The root cause needs to be identified and remedied. If this is not done, the replacement bearing will soon wear out for the same reason, and the machine will suffer more damage and remain dangerous. The cause may already be visible in the spectral analysis of the collected data, but this is not always the case.

The most common technique for detecting faults is time-frequency analysis. The rotational speed of a rotating machine (often known as the RPM) is not constant, especially during the start-up and shutdown stages. Even when the machine is running in the steady state, the rotational speed varies around a steady-state mean value, and this variation depends on load and other factors. Since sound and vibration signals obtained from a rotating machine are strongly related to its rotational speed, they are time-variant signals by nature. These time-variant features carry the machine fault signatures. Consequently, how these features are extracted and interpreted is important to research and industrial applications.

The most common method used in signal analysis is the Fourier transform, usually computed with the fast Fourier transform (FFT). The Fourier transform and its inverse offer two perspectives from which to study a signal: the time domain and the frequency domain. The FFT-based spectrum of a time signal shows which frequency components it contains. By studying these components and their magnitude or phase relations, we can obtain various types of information, such as harmonics, sidebands, beat frequency, and bearing fault frequency. However, the FFT is only suitable for signals whose frequency content does not change over time, whereas, as mentioned above, the frequency content of the sound and vibration signals obtained from a rotating machine is very much time-dependent. For this reason, FFT-based spectra cannot show how the frequency content develops over time. More specifically, if the RPM of a machine is increasing or decreasing during its start-up or shutdown period, each harmonic occupies a much wider band in the FFT spectrum than it would in the steady state. Hence, in such a case, the harmonics are not so distinguishable in the spectrum.
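The smearing effect can be reproduced with a small numerical experiment. The sketch below compares a steady 50 Hz tone with a linear run-up from 20 Hz to 80 Hz. The signals, the sampling rate, and the `occupied_bandwidth` measure are assumptions made for illustration only.

```python
import numpy as np

fs = 1000.0                    # sampling rate in Hz (assumed)
t = np.arange(2000) / fs       # 2 s record

# Steady-state machine: a single 50 Hz component.
steady = np.sin(2 * np.pi * 50.0 * t)

# Run-up: instantaneous frequency rises linearly from 20 Hz to 80 Hz.
# The phase is the integral of the instantaneous frequency.
f0, f1 = 20.0, 80.0
runup = np.sin(2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / t[-1] * t**2))

def occupied_bandwidth(x, fs, fraction=0.9):
    """Width in Hz of the smallest set of FFT bins holding `fraction` of the energy."""
    power = np.abs(np.fft.rfft(x)) ** 2
    sorted_power = np.sort(power)[::-1]
    n_bins = int(np.searchsorted(np.cumsum(sorted_power), fraction * power.sum())) + 1
    return n_bins * fs / len(x)

bw_steady = occupied_bandwidth(steady, fs)
bw_runup = occupied_bandwidth(runup, fs)
```

The steady tone concentrates its energy in a very narrow band, while the run-up spreads it over tens of hertz, and the spectrum alone does not reveal when each frequency occurred.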

Time-frequency approaches to machine fault diagnosis can be divided into two broad categories: linear methods and quadratic methods. The difference is that linear transforms can be inverted to reconstruct the time signal, which makes them more suitable for signal processing tasks such as noise reduction and time-varying filtering. Quadratic methods describe the energy distribution of a signal in the joint time-frequency domain, which is useful for the analysis, classification, and detection of signal features. However, phase information is lost in the quadratic time-frequency representation, and the time histories cannot be reconstructed with this method.

The short-time Fourier transform (STFT) and the Gabor transform are two algorithms commonly used as linear time-frequency methods. If linear time-frequency analysis is regarded as the evolution of the conventional FFT, then quadratic time-frequency analysis is the counterpart of the power spectrum. Quadratic algorithms include the Gabor spectrogram, Cohen's class, and the adaptive spectrogram. The main advantage of time-frequency analysis is that it reveals the patterns of frequency change, which usually represent the nature of the signal. Once such a pattern is identified, the machine fault associated with it can be identified. Another important use of time-frequency analysis is filtering out a particular frequency component with a time-varying filter.
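As an illustration of how a linear time-frequency method exposes a pattern of frequency change, the sketch below implements a basic STFT with NumPy and tracks the dominant frequency of a simulated run-up frame by frame. The window length, hop size, sampling rate, and test signal are arbitrary assumptions, and a production analysis would more likely rely on an established signal processing library.

```python
import numpy as np

def stft_magnitude(x, fs, frame_len=256, hop=128):
    """Return (times, freqs, magnitude) of a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    starts = np.arange(0, len(x) - frame_len + 1, hop)
    # One row per frame: windowed slice of the signal.
    frames = np.stack([x[s:s + frame_len] * window for s in starts])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    times = (starts + frame_len / 2) / fs          # frame centres in seconds
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    return times, freqs, magnitude

fs = 1000.0                    # sampling rate in Hz (assumed)
t = np.arange(2000) / fs       # 2 s record

# Simulated run-up: instantaneous frequency rises linearly from 20 Hz to 80 Hz.
f0, f1 = 20.0, 80.0
runup = np.sin(2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / t[-1] * t**2))

times, freqs, mag = stft_magnitude(runup, fs)
ridge = freqs[np.argmax(mag, axis=1)]   # dominant frequency in each frame
```

The `ridge` array rises from the low end of the sweep to the high end, which is the kind of frequency-change pattern that a single FFT spectrum cannot show. The choice of `frame_len` trades time resolution against frequency resolution.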

Robust fault diagnosis
In practice, model uncertainties and measurement noise can complicate fault detection and isolation.

As a result, meeting industrial needs with fault diagnostics in a cost-effective way, that is, reducing maintenance costs without investing more than the cost of the failures being avoided, requires an effective scheme for applying them. This is the subject of maintenance, repair and operations; the different strategies include:
 * Condition-based maintenance
 * Planned preventive maintenance
 * Preventive maintenance
 * Corrective maintenance (does not use diagnostics)
 * Integrated vehicle health management

Machine learning techniques for fault detection and diagnosis
In fault detection and diagnosis, mathematical classification models, which belong to the supervised learning methods, are trained on the training set of a labeled dataset to accurately identify redundancies, faults, and anomalous samples. Over the past decades, various classification and preprocessing models have been developed and proposed in this research area. The k-nearest-neighbors algorithm (kNN) is one of the oldest techniques used to solve fault detection and diagnosis problems. Despite the simple logic of this instance-based algorithm, it suffers from high dimensionality and long processing time when it is used on large datasets. Since kNN cannot automatically extract features to overcome the curse of dimensionality, it is often accompanied by data preprocessing techniques such as principal component analysis (PCA), linear discriminant analysis (LDA), or canonical correlation analysis (CCA) to reach a better performance. In many industrial cases, the effectiveness of kNN has been compared with that of other methods, especially more complex classification models such as support vector machines (SVMs), which are widely used in this field. Thanks to their nonlinear mapping using kernel methods, SVMs generalize impressively well, even with small training data. However, general SVMs do not perform automatic feature extraction themselves and, just like kNN, are often coupled with a data preprocessing technique. Another drawback of SVMs is that their performance is highly sensitive to the initial parameters, particularly to the kernel method, so a parameter tuning process must first be conducted for each signal dataset. The resulting slow training phase is a limitation of SVMs when they are used for fault detection and diagnosis.
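The pairing of kNN with a dimensionality-reduction step can be sketched with NumPy alone. The example below is only an illustration on synthetic, made-up data: 20 features of which just two separate the "normal" and "faulty" classes, a PCA computed by singular value decomposition, and a brute-force kNN with majority voting. All names and parameter values are assumptions.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature means and the top principal directions of X (rows are samples)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the principal directions."""
    return (X - mean) @ components.T

def knn_predict(X_train, y_train, X_test, k=3):
    """Label each test sample by majority vote among its k nearest training samples."""
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic data: class 0 = normal, class 1 = faulty (shifted in two features).
rng = np.random.default_rng(0)
n = 100
X_normal = rng.normal(0.0, 1.0, size=(n, 20))
X_faulty = rng.normal(0.0, 1.0, size=(n, 20))
X_faulty[:, :2] += 6.0
X = np.vstack([X_normal, X_faulty])
y = np.array([0] * n + [1] * n)

# Random train/test split; PCA is fitted on the training set only.
idx = rng.permutation(2 * n)
train, test = idx[:150], idx[150:]
mean, comps = pca_fit(X[train], n_components=2)
Z_train = pca_transform(X[train], mean, comps)
Z_test = pca_transform(X[test], mean, comps)

accuracy = float(np.mean(knn_predict(Z_train, y[train], Z_test, k=5) == y[test]))
```

Reducing 20 features to 2 shrinks the distance computations that dominate kNN's processing time. Fitting the PCA on the training samples only avoids leaking information from the test set.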
Artificial neural networks (ANNs) are among the most mature and widely used mathematical classification algorithms in fault detection and diagnosis. ANNs are well known for their ability to learn the complex relations that are generally inherent in fault detection and diagnosis problems, and they are easy to operate. Another advantage of ANNs is that they perform automatic feature extraction by allocating negligible weights to irrelevant features, which spares the system a separate feature extractor. However, ANNs tend to over-fit the training set, which results in poor accuracy on the validation set. Hence, regularization terms and prior knowledge are often added to the ANN model to avoid over-fitting and achieve higher performance. Moreover, properly determining the size of the hidden layer requires exhaustive parameter tuning to avoid poor approximation and generalization capabilities. In general, different SVM and ANN models (e.g., back-propagation neural networks and multi-layer perceptrons) have performed successfully in fault detection and diagnosis for applications such as gearboxes, machinery parts (e.g., mechanical bearings), compressors, wind and gas turbines, and steel plates.

Deep learning techniques for fault detection and diagnosis


With the research advances in ANNs and the advent of deep learning algorithms using deep and complex layers, novel classification models have been developed for fault detection and diagnosis. Most shallow learning models extract a few feature values from signals, which reduces the dimensionality of the original signal. With convolutional neural networks, the continuous wavelet transform scalogram can be classified directly into normal and faulty classes. Such a technique avoids omitting any important fault information and results in better fault detection and diagnosis performance. In addition, by transforming signals into image representations, 2D convolutional neural networks can be used to identify faulty signals from vibration image features.

Deep belief networks, restricted Boltzmann machines, and autoencoders are other deep neural network architectures that have been used successfully in this field of research. Compared with traditional machine learning, deep learning models can learn more complex structures from datasets thanks to their deep architecture. However, they need larger samples and longer processing time to achieve higher accuracy.

Fault recovery
Fault recovery in FDIR is the action taken after a failure has been detected and isolated to return the system to a stable state. Some examples of fault recovery actions are:


 * Switch-off of faulty equipment
 * Switch-over from faulty equipment to redundant equipment
 * Change of state of the complete system into a safe mode with limited functionality
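How a recovery action might be selected among these three options can be sketched as a simple decision rule. The `Unit` fields and the priority order (prefer a redundant unit, then switch off non-essential equipment, otherwise fall back to safe mode) are hypothetical assumptions for illustration; real FDIR designs define their own policies.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Unit:
    name: str
    essential: bool
    redundant_backup: Optional[str] = None   # name of a spare unit, if any

def choose_recovery(unit: Unit) -> str:
    """Pick one of the three recovery actions for a unit isolated as faulty."""
    if unit.redundant_backup is not None:
        # A spare exists: hand over its function.
        return f"switch over from {unit.name} to {unit.redundant_backup}"
    if not unit.essential:
        # The system can run without it: simply power it down.
        return f"switch off {unit.name}"
    # Essential and no spare: degrade the whole system.
    return "enter safe mode"

# Hypothetical equipment list.
gyro = Unit("gyro_A", essential=True, redundant_backup="gyro_B")
camera = Unit("camera", essential=False)
computer = Unit("main_computer", essential=True)

actions = [choose_recovery(u) for u in (gyro, camera, computer)]
```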