Physics of failure

Physics of failure is a technique under the practice of reliability design that leverages the knowledge and understanding of the processes and mechanisms that induce failure to predict reliability and improve product performance.

Other definitions of Physics of Failure include:
 * A science-based approach to reliability that uses modeling and simulation to design-in reliability. It helps to understand system performance and reduce decision risk during design and after the equipment is fielded. This approach models the root causes of failure such as fatigue, fracture, wear, and corrosion.
 * An approach to the design and development of reliable product to prevent failure, based on the knowledge of root cause failure mechanisms. The Physics of Failure (PoF) concept is based on the understanding of the relationships between requirements and the physical characteristics of the product and their variation in the manufacturing processes, and the reaction of product elements and materials to loads (stressors) and interaction under loads and their influence on the fitness for use with respect to the use conditions and time.

Overview
The concept of Physics of Failure, also known as Reliability Physics, involves the use of degradation algorithms that describe how physical, chemical, mechanical, thermal, or electrical mechanisms evolve over time and eventually induce failure. While the concept of Physics of Failure is common in many structural fields, the specific branding evolved from an attempt to better predict the reliability of early generation electronic parts and systems.

The beginning
Within the electronics industry, the major driver for the implementation of Physics of Failure was the poor performance of military weapon systems during World War II. During the subsequent decade, the United States Department of Defense funded extensive efforts to improve the reliability of electronics in particular, with the initial efforts focused on after-the-fact or statistical methodology. Unfortunately, the rapid evolution of electronics, with new designs, new materials, and new manufacturing processes, tended to quickly negate approaches and predictions derived from older technology. In addition, the statistical approach tended to lead to expensive and time-consuming testing. The need for different approaches led to the birth of Physics of Failure at the Rome Air Development Center (RADC). Under the auspices of the RADC, the first Physics of Failure in Electronics Symposium was held in September 1962. The goal of the program was to relate the fundamental physical and chemical behavior of materials to reliability parameters.

Early history – integrated circuits
The initial focus of physics of failure techniques tended to be limited to degradation mechanisms in integrated circuits. This was primarily because the rapid evolution of the technology created a need to capture and predict performance several generations ahead of existing product.

One of the first major successes under predictive physics of failure was a formula developed by James Black of Motorola to describe the behavior of electromigration. Electromigration occurs when collisions of electrons cause metal atoms in a conductor to dislodge and move downstream of current flow (proportional to current density). Black used this knowledge, in combination with experimental findings, to describe the mean time to failure (MTTF) due to electromigration as
 * $$\text{MTTF}=A(J^{-n})e^{\frac{E_\text{a}}{kT}}$$

where A is a constant based on the cross-sectional area of the interconnect, J is the current density, Ea is the activation energy (e.g. 0.7 eV for grain boundary diffusion in aluminum), k is the Boltzmann constant, T is the temperature and n is a scaling factor (usually set to 2 according to Black).
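Because the constant A is rarely known in advance, Black's equation is often easier to apply as a ratio between two operating conditions, in which A cancels. The following is a minimal sketch of that ratio, assuming the same interconnect at both conditions; the function name and the default values (the 0.7 eV and n = 2 quoted above) are illustrative choices, not part of Black's work.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf_ratio(j_use, j_stress, t_use_k, t_stress_k, ea_ev=0.7, n=2.0):
    """Return MTTF(use) / MTTF(stress) from Black's equation.

    The constant A cancels in the ratio, so only the current densities
    (any consistent unit) and absolute temperatures (kelvin) are needed.
    """
    current_term = (j_use / j_stress) ** (-n)
    thermal_term = math.exp(
        ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k)
    )
    return current_term * thermal_term


# Hypothetical comparison: half the current density and a cooler conductor
# in the field than in the stress test.
ratio = black_mttf_ratio(j_use=1.0, j_stress=2.0, t_use_k=358.15, t_stress_k=398.15)
print(f"MTTF(use) / MTTF(stress) = {ratio:.1f}")
```

With n = 2, halving the current density at a fixed temperature multiplies the predicted MTTF by four; the temperature term then scales that result further.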

Physics of failure is typically designed to predict wearout, or an increasing failure rate, but this initial success by Black focused on predicting behavior during operational life, or a constant failure rate. This is because electromigration in traces can be designed out by following design rules, while electromigration at vias is primarily an interfacial effect, which tends to be defect or process-driven.

Leveraging this success, additional physics-of-failure based algorithms have been derived for the three other major degradation mechanisms (time dependent dielectric breakdown [TDDB], hot carrier injection [HCI], and negative bias temperature instability [NBTI]) in modern integrated circuits (equations shown below). More recent work has attempted to aggregate these discrete algorithms into a system-level prediction.

TDDB:
 * $$\tau=\tau_0(T)\exp\left[\frac{G(T)}{\epsilon_\text{ox} }\right]$$

where τ0(T) = 5.4 exp(−Ea / kT), G(T) = 120 + 5.8/kT, and εox is the permittivity.

HCI:
 * $$\lambda_\text{HCI}=A_3\exp\left(\frac{-\beta}{V_\text{D} }\right)\exp\left(\frac{-E_\text{a} }{kT}\right)$$

where λHCI is the failure rate of HCI, A3 is an empirical fitting parameter, β is an empirical fitting parameter, VD is the drain voltage, Ea is the activation energy of HCI, typically −0.2 to −0.1 eV, k is the Boltzmann constant, and T is absolute temperature.
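One consequence of the negative activation energy in the HCI expression is that the predicted failure rate rises as temperature falls. The sketch below compares two operating points as a ratio, so the fitting parameter A3 cancels; the value used for β is purely hypothetical and the function name is an illustrative choice.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def hci_rate_ratio(vd_a, t_a_k, vd_b, t_b_k, beta, ea_ev):
    """Return lambda_HCI(a) / lambda_HCI(b); the prefactor A3 cancels."""

    def log_rate(vd, t_k):
        # Natural log of exp(-beta / VD) * exp(-Ea / kT)
        return -beta / vd - ea_ev / (BOLTZMANN_EV_PER_K * t_k)

    return math.exp(log_rate(vd_a, t_a_k) - log_rate(vd_b, t_b_k))


# Same drain voltage, cold (-40 C) versus hot (125 C) operation,
# with Ea = -0.15 eV (inside the quoted range) and a made-up beta.
cold_vs_hot = hci_rate_ratio(1.2, 233.15, 1.2, 398.15, beta=30.0, ea_ev=-0.15)
print(f"lambda(cold) / lambda(hot) = {cold_vs_hot:.1f}")
```

A ratio greater than one here reflects the temperature dependence described above; raising the drain voltage at a fixed temperature also increases the predicted rate for any positive β.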

NBTI:
 * $$\lambda=A\,\epsilon_\text{ox}^{m}\,V_\text{T}\,\mu_\text{p}\exp\left(\frac{-E_\text{a} }{kT}\right)$$

where A is determined empirically by normalizing the above equation, m = 2.9, VT is the thermal voltage, μp is the surface mobility constant, Ea is the activation energy of NBTI, k is the Boltzmann constant, and T is the absolute temperature.

Next stage – electronic packaging
The resources and successes with integrated circuits, and a review of some of the drivers of field failures, subsequently motivated the reliability physics community to initiate physics of failure investigations into package-level degradation mechanisms. An extensive amount of work was performed to develop algorithms that could accurately predict the reliability of interconnects. Specific interconnects of interest resided at 1st level (wire bonds, solder bumps, die attach), 2nd level (solder joints), and 3rd level (plated through holes).

Just as the integrated circuit community had four major successes with physics of failure at the die level, the component packaging community had four major successes arise from its work in the 1970s and 1980s. These were:

Peck : Predicts time to failure of wire bond / bond pad connections when exposed to elevated temperature / humidity
 * $$\text{TTF} = A_0 (RH)^{-2.7} f(V) \exp\left(\frac{E_a}{k_\text{B} T}\right)$$

where A0 is a constant, RH is the relative humidity, f(V) is a voltage function (often cited as voltage squared), Ea is the activation energy, kB is the Boltzmann constant, and T is absolute temperature.
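In practice, Peck's relationship is commonly used to translate a temperature and humidity test result into an expected field life. The following sketch computes that acceleration factor under the assumption that the same voltage is applied in the test and in the field, so that both A0 and f(V) cancel. The activation energy is left as a required input because it depends on the failure mechanism; the 0.8 eV in the usage line is only an illustrative value.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def peck_acceleration_factor(rh_use, rh_test, t_use_k, t_test_k, ea_ev,
                             rh_exponent=-2.7):
    """Return TTF(use) / TTF(test) from Peck's relationship.

    Assumes equal voltage at both conditions, so A0 and f(V) cancel.
    Relative humidity is in percent, temperature in kelvin.
    """
    humidity_term = (rh_use / rh_test) ** rh_exponent
    thermal_term = math.exp(
        ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_test_k)
    )
    return humidity_term * thermal_term


# Illustrative only: an 85 C / 85 %RH chamber versus a 40 C / 60 %RH field
# condition, with an assumed activation energy of 0.8 eV.
af = peck_acceleration_factor(60.0, 85.0, 313.15, 358.15, ea_ev=0.8)
print(f"Estimated TTF(use) / TTF(test): {af:.1f}")
```

Multiplying the hours survived in the chamber by this factor gives a first-order estimate of the equivalent field exposure, valid only while the same failure mechanism dominates at both conditions.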

Engelmaier : Predicts time to failure of solder joints exposed to temperature cycling
 * $$N_\text{f}(50\%)=\frac{1}{2}\left[\frac{2\epsilon'_\text{f}}{\Delta D}\right]^{\frac{-1}{c} }\quad\Delta D(\text{leadless})=\left[\frac{F L_\text{D} \Delta(\alpha \Delta T)}{h}\right]$$

where Nf(50%) is the number of cycles at which 50% of the joints have failed, ε′f is a fatigue ductility coefficient, c is a time and temperature dependent constant, F is an empirical constant, LD is the distance from the neutral point, α is the coefficient of thermal expansion, ΔT is the change in temperature, and h is solder joint thickness.
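The two Engelmaier expressions can be chained: the geometry and thermal mismatch give ΔD, and ΔD gives the cycle count. The sketch below does this for the leadless case, treating the mismatch term Δ(αΔT) as a single precomputed input and c as a supplied value rather than calculating it from dwell time and temperature. All numbers in the usage example are hypothetical.

```python
def engelmaier_delta_d(f_const, l_d, delta_alpha_dt, h):
    """Return Delta D for a leadless joint: F * L_D * Delta(alpha*DeltaT) / h.

    l_d and h must share the same length unit; delta_alpha_dt is the
    dimensionless expansion mismatch between component and board.
    """
    return f_const * l_d * delta_alpha_dt / h


def engelmaier_cycles(delta_d, eps_f, c):
    """Return N_f(50%) = 0.5 * (2 * eps_f / Delta D) ** (-1 / c).

    c is negative in this model, so the exponent -1/c is positive.
    """
    return 0.5 * (2.0 * eps_f / delta_d) ** (-1.0 / c)


# Hypothetical inputs: 5 mm from the neutral point, 0.1 mm joint height,
# an 11 ppm/K mismatch over a 100 K swing, eps_f = 0.325 and c = -0.442.
delta_d = engelmaier_delta_d(f_const=1.0, l_d=5.0, delta_alpha_dt=11e-6 * 100.0, h=0.1)
cycles = engelmaier_cycles(delta_d, eps_f=0.325, c=-0.442)
print(f"Delta D = {delta_d:.4f}, N_f(50%) = {cycles:.0f} cycles")
```

Because h appears in the denominator of ΔD, a taller solder joint lowers ΔD and raises the predicted cycle count, while a larger component (greater LD) or a wider temperature swing does the opposite.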

Steinberg : Predicts time to failure of solder joints exposed to vibration
 * $$Z_0=\frac{9.8\times 3\sqrt{\pi/2\times \text{PSD}\times f_n\times Q} }{f_n^2}\quad Z_\text{c}=\frac{0.00022B}{chr\sqrt{L} }$$

where Z0 is the maximum displacement, PSD is the power spectral density (g²/Hz), fn is the natural frequency of the circuit card assembly (CCA), Q is the transmissibility (assumed to be the square root of the natural frequency), Zc is the critical displacement (20 million cycles to failure), B is the length of the PCB edge parallel to the component located at the center of the board, c is a component packaging constant, h is PCB thickness, r is a relative position factor, and L is component length.
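A typical use of the Steinberg expressions is to compare the predicted board displacement with the critical displacement for a given component. The sketch below evaluates both quantities. The text above does not state length units; Steinberg's constants are usually quoted for inches, which is treated here as an assumption. The input values are hypothetical.

```python
import math


def steinberg_max_displacement(psd, fn, q=None):
    """Return Z0 = 9.8 * 3 * sqrt(pi/2 * PSD * fn * Q) / fn**2.

    psd is in g^2/Hz and fn in Hz. If q is omitted, the transmissibility
    is taken as sqrt(fn), as described in the text.
    """
    if q is None:
        q = math.sqrt(fn)
    g_rms = math.sqrt(math.pi / 2.0 * psd * fn * q)
    return 9.8 * 3.0 * g_rms / fn ** 2


def steinberg_critical_displacement(b, c, h, r, length):
    """Return Zc = 0.00022 * B / (c * h * r * sqrt(L))."""
    return 0.00022 * b / (c * h * r * math.sqrt(length))


# Hypothetical board: 200 Hz natural frequency, 0.04 g^2/Hz input,
# 8 in edge, 0.062 in thick, with a 1 in component at the center (r = 1).
z0 = steinberg_max_displacement(psd=0.04, fn=200.0)
zc = steinberg_critical_displacement(b=8.0, c=1.0, h=0.062, r=1.0, length=1.0)
print(f"Z0 = {z0:.5f}, Zc = {zc:.5f}, Z0/Zc = {z0 / zc:.2f}")
```

A ratio Z0/Zc below one suggests the joint should survive more than the 20 million cycles associated with Zc; a ratio above one flags the component for closer analysis.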

IPC-TR-579 : Predicts time to failure of plated through holes exposed to temperature cycling
 * $$\sigma=\frac{(\alpha_\text{E}-\alpha_\text{Cu})\Delta T A_\text{E} E_\text{E} E_\text{Cu} }{A_\text{E} E_\text{E}+A_\text{Cu}E_\text{Cu} },\quad \text{for }\sigma\le S_Y$$
 * $$N_\text{f}^{-0.6}D_\text{f}^{0.75}+0.9\frac{S_\text{u}}{E} \left[ \frac{\exp(D_\text{f})}{0.36}\right]^{0.1785\log\frac{10^5}{N_\text{f}} }-\Delta\epsilon=0$$

where α is the coefficient of thermal expansion (CTE), ΔT is the change in temperature, E is the elastic modulus, A is the effective cross-sectional area (a function of board thickness h, hole diameter d, and plating thickness t), the subscripts E and Cu label the corresponding board and copper properties, respectively, SY is the yield strength, Su is the ultimate tensile strength and Df is the ductility of the plated copper, and Δε is the strain range.
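The second IPC-TR-579 expression is implicit in Nf, so it has to be solved numerically. The sketch below evaluates the stress expression and then finds Nf by bisection, relying on the fact that the left-hand side decreases steadily as Nf grows. It assumes that "log" means the base-10 logarithm and takes the areas and the strain range Δε as direct inputs; the function names and material values are hypothetical.

```python
import math


def pth_stress(alpha_e, alpha_cu, delta_t, a_e, e_e, a_cu, e_cu):
    """Return sigma from the first expression (valid while sigma <= S_Y)."""
    numerator = (alpha_e - alpha_cu) * delta_t * a_e * e_e * e_cu
    return numerator / (a_e * e_e + a_cu * e_cu)


def fatigue_residual(n_f, d_f, s_u, e_mod, delta_eps):
    """Left-hand side of the second expression; zero at the solution."""
    base = math.exp(d_f) / 0.36
    exponent = 0.1785 * math.log10(1.0e5 / n_f)  # assumes log is base 10
    return n_f ** -0.6 * d_f ** 0.75 + 0.9 * (s_u / e_mod) * base ** exponent - delta_eps


def cycles_to_failure(d_f, s_u, e_mod, delta_eps, n_lo=1.0, n_hi=1.0e12):
    """Solve the second expression for N_f by bisection on log10(N_f)."""
    lo, hi = math.log10(n_lo), math.log10(n_hi)
    if (fatigue_residual(10.0 ** lo, d_f, s_u, e_mod, delta_eps) < 0.0
            or fatigue_residual(10.0 ** hi, d_f, s_u, e_mod, delta_eps) > 0.0):
        raise ValueError("no root inside the search bracket")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if fatigue_residual(10.0 ** mid, d_f, s_u, e_mod, delta_eps) > 0.0:
            lo = mid  # residual still positive: the root lies at higher N_f
        else:
            hi = mid
    return 10.0 ** (0.5 * (lo + hi))


# Hypothetical plated copper: D_f = 0.3, S_u = 40 ksi, E = 12,000 ksi,
# subjected to a strain range of 1%.
n_f = cycles_to_failure(d_f=0.3, s_u=4.0e4, e_mod=1.2e7, delta_eps=0.01)
print(f"Predicted cycles to failure: {n_f:.0f}")
```

Bisection was chosen over a faster root finder because it needs no derivative and cannot leave the bracket; the function raises an error rather than returning a misleading value when the requested strain range has no solution between the chosen bounds.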

Each of the equations above combines knowledge of the degradation mechanisms with test experience to produce first-order equations that allow the design or reliability engineer to predict time to failure behavior from information on the design architecture, materials, and environment.

Recent work
More recent work in physics of failure has focused on predicting the time to failure of new materials (e.g., lead-free solder, high-K dielectrics), on software programs, on using the algorithms for prognostic purposes, and on integrating physics of failure predictions into system-level reliability calculations.

Limitations
There are some limitations to the use of physics of failure in design assessments and reliability prediction. The first is that physics of failure algorithms typically assume a 'perfect design'. Attempting to understand the influence of defects can be challenging and often leads to Physics of Failure (PoF) predictions limited to end-of-life behavior (as opposed to infant mortality or useful operating life). In addition, some companies have so many use environments (personal computers, for example) that performing a PoF assessment for each potential combination of temperature, vibration, humidity, power cycling, and so on would be onerous and potentially of limited value.