Failure analysis

Failure analysis is the process of collecting and analyzing data to determine the cause of a failure, often with the goal of determining corrective actions or liability. According to Bloch and Geitner, “machinery failures reveal a reaction chain of cause and effect… usually a deficiency commonly referred to as the symptom…”. Failure analysis can save money, lives, and resources if done correctly and acted upon. It is an important discipline in many branches of the manufacturing industry, such as the electronics industry, where it is a vital tool in the development of new products and the improvement of existing ones. The process relies on collecting failed components for subsequent examination of the cause or causes of failure using a wide array of methods, especially microscopy and spectroscopy. Nondestructive testing (NDT) methods (such as industrial computed tomography scanning) are valuable because they leave the failed products unaltered, so inspection sometimes starts with these methods.

Forensic investigation
Forensic inquiry into the failed process or product is the starting point of failure analysis. Such inquiry is conducted using scientific analytical methods such as electrical and mechanical measurements, or by analyzing failure data such as product reject reports or examples of previous failures of the same kind. The methods of forensic engineering are especially valuable in tracing product defects and flaws, which may include fatigue cracks and brittle cracks produced by stress corrosion cracking or environmental stress cracking. Witness statements can be valuable for reconstructing the likely sequence of events and hence the chain of cause and effect. Human factors can also be assessed once the cause of the failure is determined. Several methods help prevent product failures from occurring in the first place, including failure mode and effects analysis (FMEA) and fault tree analysis (FTA), which can be used during prototyping to analyze potential failures before a product is marketed.
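
As a rough illustration of how FMEA and FTA turn engineering judgment into numbers, the sketch below ranks failure modes by the commonly used risk priority number (severity × occurrence × detection) and combines independent event probabilities through AND and OR gates. The failure modes, scores, and probabilities are invented for illustration and do not come from any real product.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible) to 10 (hazardous)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost always detected) to 10 (undetectable)

    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def and_gate(probabilities):
    """Output event requires every independent input event to occur."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical FMEA worksheet entries (assumed scores).
modes = [
    FailureMode("connector oxidation", severity=6, occurrence=4, detection=7),
    FailureMode("fatigue crack in bracket", severity=9, occurrence=2, detection=5),
    FailureMode("firmware lock-up", severity=5, occurrence=3, detection=3),
]
ranked = sorted(modes, key=FailureMode.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.name}: RPN = {mode.rpn()}")

# Hypothetical fault tree: power is lost if both the primary and backup
# supplies fail (AND gate), or if the wiring breaks (OR gate at the top).
p_top = or_gate([and_gate([0.01, 0.05]), 0.001])
print(f"P(top event) = {p_top:.6f}")
```

The gate formulas assume the input events are statistically independent; shared causes such as a common power rail would need a more careful model.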

Several of the techniques used in failure analysis are also used in the analysis of no fault found (NFF), a term used in the field of maintenance to describe a situation where an originally reported mode of failure cannot be duplicated by the evaluating technician, so the potential defect cannot be fixed.

NFF can be attributed to oxidation, defective connections of electrical components, temporary shorts or opens in circuits, software bugs, and temporary environmental factors, but also to operator error. A large number of devices that are reported as NFF during the first troubleshooting session often return to the failure analysis lab with the same NFF symptoms or a permanent mode of failure.

The term failure analysis also applies to other fields such as business management and military strategy.

Failure analysis engineers
A failure analysis engineer often plays a lead role in the analysis of failures, whether a component or product fails in service or during manufacturing or production processing. In either case, the cause of failure must be determined to prevent recurrence and/or to improve the performance of the device, component, or structure. Structural engineers and mechanical engineers commonly fill the role, and graduates of more specialized fields such as materials engineering can also enter it. Knowledge of metallurgy and chemistry is useful, along with an understanding of the properties and strength of materials. A failure analysis engineer may be hired to help prevent further failures or to address liability issues. The median salary of a failure analysis engineer with experience in the field is $81,647. The job requires good communication skills and the ability to work with others. The person hired usually has a bachelor's degree in engineering, and professional certifications are also available.

Methods of analysis
The failure analysis of many different products involves the use of the following tools and techniques:

Microscopes

 * Optical microscope
 * Scanning acoustic microscope (SAM)
 * Scanning electron microscope (SEM)
 * Atomic force microscope (AFM)
 * Stereomicroscope
 * Photon emission microscopy (PEM)
 * X-ray microscope
 * Infra-red microscope
 * Scanning SQUID microscope
 * USB microscope

Sample preparation

 * Jet-etcher
 * Plasma etcher
 * Metallography
 * Back side thinning tools
   * Mechanical back-side thinning
   * Laser chemical back-side etching

Spectroscopic analysis

 * Transmission line pulse spectroscopy (TLPS)
 * Auger electron spectroscopy
 * Deep-level transient spectroscopy (DLTS)

Device modification

 * Focused ion beam etching (FIB)

Surface analysis

 * Dye penetrant inspection
 * Other surface analysis tools

Electron microscopy

 * Scanning electron microscope (SEM)
 * Electron beam induced current (EBIC) in SEM
 * Charge-induced voltage alteration (CIVA) in SEM
 * Voltage contrast in SEM
 * Electron backscatter diffraction (EBSD) in SEM
 * Energy-dispersive X-ray spectroscopy (EDS) in SEM
 * Transmission electron microscope (TEM)
 * Computer-controlled scanning electron microscope (CCSEM)

Laser signal injection microscopy (LSIM)

 * Photo carrier stimulation
   * Static
     * Optical beam induced current (OBIC)
     * Light-induced voltage alteration (LIVA)
   * Dynamic
     * Laser-assisted device alteration (LADA)
 * Thermal laser stimulation (TLS)
   * Static
     * Optical-beam-induced resistance change (OBIRCH)
     * Thermally induced voltage alteration (TIVA)
     * External induced voltage alteration (XIVA)
     * Seebeck effect imaging (SEI)
   * Dynamic
     * Soft defect localization (SDL)

Semiconductor probing

 * Mechanical probe station
 * Electron beam prober
 * Laser voltage prober
 * Time-resolved photon emission prober (TRPE)
 * Nanoprobing

Software-based fault location techniques

 * CAD Navigation
 * Automatic test pattern generation (ATPG)
 * Chip bonder

People on the Case
Mr. Brahimi is an American Bridge Fluor consultant and has a master's degree in materials engineering.

Mr. Aguilar is the Branch Chief of the Caltrans Structural Materials Testing Branch, with 30 years’ experience as an engineer.

Mr. Christensen is a Caltrans consultant with 32 years of experience in metallurgy and failure analysis.

Steps
Visual observation, a non-destructive examination, revealed signs of brittleness: the rods showed no permanent plastic deformation before they broke. The cracks that marked the final breaking point of the shear key rods were visible. The engineers suspected that hydrogen was involved in producing the cracks.

Scanning electron microscopy, in which the cracked surfaces are scanned under high magnification, gave a better understanding of the fracture. Final fracture occurred once the crack reached a critical size and the rod could no longer hold under load.
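
The idea of a critical crack size can be sketched with the standard linear elastic fracture mechanics relation K_I = Y·σ·√(πa), where fracture is expected once K_I reaches the material's fracture toughness K_IC. The function names, the geometry factor Y = 1.12 (a textbook value for a shallow edge crack), and the numerical inputs below are assumptions chosen for illustration; they are not measurements from the rods in this case.

```python
import math


def stress_intensity(stress_mpa, crack_m, geometry_factor=1.12):
    """Mode I stress intensity K_I = Y * sigma * sqrt(pi * a), in MPa*sqrt(m)."""
    return geometry_factor * stress_mpa * math.sqrt(math.pi * crack_m)


def critical_crack_size(toughness, stress_mpa, geometry_factor=1.12):
    """Crack depth a_c (m) at which K_I equals the fracture toughness K_IC."""
    return (toughness / (geometry_factor * stress_mpa)) ** 2 / math.pi


# Illustrative values only (hypothetical, not from the case study).
K_IC = 30.0    # MPa*sqrt(m), assumed fracture toughness
SIGMA = 700.0  # MPa, assumed sustained tensile stress

a_c = critical_crack_size(K_IC, SIGMA)
print(f"critical crack depth: {a_c * 1000:.2f} mm")
```

Because the critical size scales with (K_IC/σ)², doubling the stress or halving the toughness shrinks the tolerable crack to a quarter of its size, which is why a slowly growing crack in a highly loaded, low-toughness part can lead to sudden fracture.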

Microstructural examination of cross-sections revealed more information about the internal structure and bonding of the metal.

Hardness testing used two methods, Rockwell C hardness and Knoop microhardness, and revealed that the steel had not been heat treated correctly.

Tensile testing showed that the yield strength, tensile strength, and elongation were sufficient to meet the requirements. Multiple specimens were taken, and the tests were performed by Anamet Inc.
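
A tensile test report is usually reduced to a few numbers that are compared against specified minimums. The sketch below shows that arithmetic using engineering stress (force divided by original cross-sectional area) and percent elongation; the helper names, specimen dimensions, forces, and minimum values are all hypothetical and are not taken from the rod tests described here.

```python
import math


def engineering_stress(force_n, area_mm2):
    """Engineering stress in MPa (N/mm^2), based on the original cross-section."""
    return force_n / area_mm2


def percent_elongation(gauge_mm, final_mm):
    """Permanent elongation of the gauge length after fracture, in percent."""
    return (final_mm - gauge_mm) / gauge_mm * 100.0


def meets_requirements(measured, minimums):
    """Compare each measured property with its specified minimum."""
    return {key: measured[key] >= minimums[key] for key in minimums}


# Hypothetical round specimen, 12.5 mm diameter and 50 mm gauge length.
area = math.pi * 12.5 ** 2 / 4
measured = {
    "yield_mpa": engineering_stress(110_000, area),    # force at yield (assumed)
    "tensile_mpa": engineering_stress(130_000, area),  # maximum force (assumed)
    "elongation_pct": percent_elongation(50.0, 57.5),
}
# Hypothetical specification minimums.
minimums = {"yield_mpa": 800.0, "tensile_mpa": 1000.0, "elongation_pct": 14.0}

result = meets_requirements(measured, minimums)
print(result)
```

As the case illustrates, passing these checks does not rule out failure: strength and elongation say little about toughness or susceptibility to hydrogen.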

Charpy V-notch impact testing, performed by Anamet Inc. on different samples taken from the rod, measured the toughness of the steel.

Chemical analysis, the final test, was also performed by Anamet Inc. and showed that the composition met the requirements for that steel.

Conclusion of the Case Study
The rods failed from hydrogen embrittlement; the material was susceptible to it because of the high tensile load and the hydrogen already present in the material. The failure was not caused by inadequate strength, since the rods met the strength requirements. Although they met the requirements, their structure was inhomogeneous, which caused variations in strength and low toughness.

This study shows a few of the many ways failure analysis can be done. It starts with a nondestructive form of observation, much like the examination of a crime scene. Samples are then taken from the original piece for use in different examinations. Finally, destructive testing is done to determine the toughness and other properties of the material and to find exactly what went wrong.

Failure of failure analysis
The Nimitz Freeway in Oakland was a bridge that collapsed during an earthquake, even after a program to strengthen it had been carried out. Engineers who were asked for their assessment gave differing views. Some did not blame the program or the department, such as James Rogers, who said that in an earthquake there is “a good chance the Embarcadero would do the same thing the Nimitz did.” Others said that more could have been done to prevent the disaster. Priestly said that “neither of the department’s projects to strengthen roadways addressed the problems of weakness…” in the bridge's joints. The program has come under fire for making “the failure more serious”.

From a design engineer's point of view
A product needs to work even in the harshest scenarios. This is especially important for products used in costly structures and machines such as buildings or aircraft. If these parts fail, they can cause serious damage and/or safety problems. A product is designed from the start “...to minimize the hazards associated with this ‘worst case scenario.’ Discerning the worst case scenario requires a complete understanding of the product, its loading and its service environment. Prior to the product entering service, a prototype will often undergo laboratory testing which proves the product withstands the worst case scenario as expected.” Some of the tests performed on jet engines today are very intensive, checking whether the engine can withstand:


 * ingestion of debris, dust, sand, etc.;
 * ingestion of hail, snow, ice, etc.;
 * ingestion of excessive amounts of water.

These tests must be more severe than what the product will experience in use. The engines are pushed to their limits to ensure that the product will function as intended regardless of conditions. On both the design side and the investigation side, failure analysis is about preventing damage and maintaining safety.