Reliability-centered maintenance

Reliability-centered maintenance (RCM) is a concept of maintenance planning to ensure that systems continue to do what their users require in their present operating context. Successful implementation of RCM leads to increased cost effectiveness, reliability and machine uptime, and to a greater understanding of the level of risk that the organization is managing.

Context
RCM is generally used to achieve improvements in fields such as the establishment of safe minimum levels of maintenance, changes to operating procedures and strategies, and the establishment of capital maintenance regimes and plans.

John Moubray characterized RCM as a process to establish the safe minimum levels of maintenance. This description echoed statements in the Nowlan and Heap report from United Airlines.

It is defined by the technical standard SAE JA1011, Evaluation Criteria for RCM Processes, which sets out the minimum criteria that any process should meet before it can be called RCM. This starts with the seven questions below, worked through in the order that they are listed:


 * 1. What is the item supposed to do, and what are its associated performance standards?
 * 2. In what ways can it fail to provide the required functions?
 * 3. What are the events that cause each failure?
 * 4. What happens when each failure occurs?
 * 5. In what way does each failure matter?
 * 6. What systematic task can be performed proactively to prevent, or to diminish to a satisfactory degree, the consequences of the failure?
 * 7. What must be done if a suitable preventive task cannot be found?

Reliability-centered maintenance is an engineering framework that enables the definition of a complete maintenance regimen. It regards maintenance as the means to maintain the functions a user may require of machinery in a defined operating context. As a discipline, it enables machinery stakeholders to monitor, assess, predict and generally understand the working of their physical assets. This is embodied in the initial part of the RCM process, which is to identify the operating context of the machinery and write a Failure Mode, Effects and Criticality Analysis (FMECA). The second part of the analysis is to apply the "RCM logic", which helps determine the appropriate maintenance tasks for the failure modes identified in the FMECA. Once the logic is complete for all elements in the FMECA, the resulting list of maintenance tasks is "packaged": the periodicities of the tasks are rationalised so that they can be called up in work packages. It is important not to destroy the applicability of the maintenance in this phase. Lastly, RCM is kept live throughout the in-service life of the machinery: the effectiveness of the maintenance is kept under constant review and adjusted in light of the experience gained.
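
The packaging step can be illustrated with a minimal sketch. It assumes one simple rule that is not prescribed by the text above: each task is assigned to the longest standard package interval that does not exceed the interval found by the analysis, so that no task is performed less often than the analysis requires. The function name, the task names and all intervals (in days) are hypothetical.

```python
from bisect import bisect_right


def package_tasks(task_intervals, package_intervals):
    """Group tasks into work packages without lengthening any task interval.

    task_intervals: dict mapping task name -> analysed interval in days.
    package_intervals: iterable of standard package intervals in days.
    Assumption: rounding an interval down keeps the task applicable,
    whereas rounding it up might not.
    """
    slots = sorted(package_intervals)
    packages = {slot: [] for slot in slots}
    for name, interval in task_intervals.items():
        # Index of the longest package interval that is <= the task interval.
        idx = bisect_right(slots, interval) - 1
        if idx < 0:
            raise ValueError(f"no package is frequent enough for {name!r}")
        packages[slots[idx]].append(name)
    return packages


# Hypothetical task list produced by the RCM logic.
tasks = {
    "vibration check": 45,
    "oil sample": 100,
    "filter change": 200,
    "relief valve test": 400,
}
packages = package_tasks(tasks, [30, 90, 180, 365])
for interval, names in packages.items():
    print(interval, names)
```

Rounding down costs some extra maintenance effort. Whether that trade-off is acceptable, or whether some intervals can safely be extended, is a judgement for the analyst and is not captured by this sketch.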

RCM can be used to create a cost-effective maintenance strategy to address dominant causes of equipment failure. It is a systematic approach to defining a routine maintenance program composed of cost-effective tasks that preserve important functions.

The important functions of a piece of equipment to preserve with routine maintenance are identified, their dominant failure modes and causes are determined, and the consequences of failure are ascertained. Levels of criticality are assigned to the consequences of failure. Some functions are not critical and are left to "run to failure", while other functions must be preserved at all costs. Maintenance tasks are selected that address the dominant failure causes. This process directly addresses maintenance-preventable failures. Failures caused by unlikely events, unpredictable acts of nature and the like will usually receive no action, provided their risk (the combination of severity and frequency) is trivial, or at least tolerable. When the risk of such failures is very high, RCM encourages (and sometimes mandates) the user to consider changing something that will reduce the risk to a tolerable level.
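
The screening described above can be sketched in a few lines. The sketch assumes that risk is scored as the product of a severity rank and a frequency rank, each on a hypothetical scale of 1 to 5, and that a single threshold separates tolerable from intolerable risk. The scales, the threshold, the class and function names and the example failure modes are all illustrative assumptions, not part of any RCM standard.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # hypothetical rank, 1 (minor) to 5 (catastrophic)
    frequency: int   # hypothetical rank, 1 (rare) to 5 (frequent)
    maintenance_preventable: bool

    @property
    def risk(self) -> int:
        # Assumed scoring: risk combines severity and frequency as a product.
        return self.severity * self.frequency


def screen(mode: FailureMode, tolerable_risk: int = 6) -> str:
    """Return the kind of response suggested for one failure mode."""
    if mode.risk <= tolerable_risk:
        return "no action (run to failure)"
    if mode.maintenance_preventable:
        return "select a maintenance task"
    return "consider changing the system"


modes = [
    FailureMode("indicator lamp burns out", 1, 3, True),
    FailureMode("pump bearing seizes", 4, 3, True),
    FailureMode("flood damages switchgear", 5, 2, False),
]
for mode in modes:
    print(f"{mode.name}: risk {mode.risk} -> {screen(mode)}")
```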

The result is a maintenance program that focuses scarce economic resources on those items that would cause the most disruption if they were to fail.

RCM emphasizes the use of predictive maintenance (PdM) techniques in addition to traditional preventive measures.

Background
The term "reliability-centered maintenance" was coined by Tom Matteson, Stanley Nowlan and Howard Heap of United Airlines (UAL) to describe a process used to determine the optimum maintenance requirements for aircraft. Matteson left United Airlines to pursue a consulting career a few months before the publication of the final Nowlan-Heap report and received no authorial credit for the work. The US Department of Defense (DOD) sponsored the authoring of both a textbook (by UAL) and an evaluation report (by RAND Corporation) on reliability-centered maintenance, both published in 1978. They brought RCM concepts to the attention of a wider audience.

The first generation of jet aircraft had a crash rate that would be considered highly alarming today, and both the Federal Aviation Administration (FAA) and the airlines' senior management felt strong pressure to improve matters. In the early 1960s, with FAA approval the airlines began to conduct a series of intensive engineering studies on in-service aircraft. The studies proved that the fundamental assumption of design engineers and maintenance planners—that every aircraft and every major component thereof (such as its engines) had a specific "lifetime" of reliable service, after which it had to be replaced (or overhauled) in order to prevent failures—was wrong in nearly every specific example in a complex modern jet airliner.

This was one of many astounding discoveries that have revolutionized the managerial discipline of physical asset management and have been at the base of many developments since this seminal work was published. Among the paradigm shifts inspired by RCM were:
 * an understanding that the vast majority of failures are not necessarily linked to the age of the asset
 * changing from efforts to predict life expectancies to trying to manage the process of failure
 * an understanding of the difference between the requirements of assets from a user perspective, and the design reliability of the asset
 * an understanding of the importance of managing assets on condition (often referred to as condition monitoring, condition based maintenance and predictive maintenance)
 * an understanding of four basic routine maintenance tasks
 * linking levels of tolerable risk to maintenance strategy development

Later, RCM was defined in the standard SAE JA1011, Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes. This sets out the minimum criteria that a process must meet to be defined as RCM. The standard is a watershed event in the ongoing evolution of the discipline of physical asset management. Prior to the development of the standard, many processes were labeled as RCM even though they were not true to the intentions and principles of the original report that defined the term publicly.

Basic features
The RCM process described in the DOD/UAL report recognized three principal risks from equipment failures: threats
 * to safety,
 * to operations, and
 * to the maintenance budget.

Modern RCM gives threats to the environment a separate classification, though most forms manage them in the same way as threats to safety.

RCM offers five principal options among the risk management strategies:
 * Predictive maintenance tasks,
 * Preventive Restoration or Preventive Replacement maintenance tasks,
 * Detective maintenance tasks,
 * Run-to-Failure, and
 * One-time changes to the "system" (changes to hardware design, to operations, or to other things).

RCM also offers specific criteria to use when selecting a risk management strategy for a system that presents a specific risk when it fails. Some are technical in nature (can the proposed task detect the condition it needs to detect? does the equipment actually wear out, with use?). Others are goal-oriented (is it reasonably likely that the proposed task-and-task-frequency will reduce the risk to a tolerable level?). The criteria are often presented in the form of a decision-logic diagram, though this is not intrinsic to the nature of the process.
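
A decision-logic diagram of this kind can be approximated in code. The sketch below is an illustration only: the order in which the options are tried, the boolean inputs and all names are assumptions made for the example and are not taken from the DOD/UAL report or from SAE JA1011. It maps answers to the technical and goal-oriented questions onto the five options listed above.

```python
from enum import Enum


class Strategy(Enum):
    PREDICTIVE = "predictive maintenance task"
    PREVENTIVE = "preventive restoration or replacement task"
    DETECTIVE = "detective maintenance task"
    RUN_TO_FAILURE = "run-to-failure"
    ONE_TIME_CHANGE = "one-time change to the system"


def select_strategy(detectable_warning, wears_out_with_use, hidden_failure,
                    consequences_tolerable, task_reduces_risk=True):
    """Pick one of the five options for a single failure mode.

    detectable_warning: can a task detect the condition it needs to detect?
    wears_out_with_use: does the equipment actually wear out with use?
    hidden_failure: would the failure go unnoticed in normal operation?
    consequences_tolerable: is the risk tolerable with no task at all?
    task_reduces_risk: would a proposed task, at its proposed frequency,
        reduce the risk to a tolerable level? (goal-oriented criterion)
    """
    if task_reduces_risk:
        # Technical criteria, tried in an assumed order of preference.
        if detectable_warning:
            return Strategy.PREDICTIVE
        if wears_out_with_use:
            return Strategy.PREVENTIVE
        if hidden_failure:
            return Strategy.DETECTIVE
    # No suitable task was found: accept the failure or change something.
    if consequences_tolerable:
        return Strategy.RUN_TO_FAILURE
    return Strategy.ONE_TIME_CHANGE


print(select_strategy(True, True, False, False).value)
print(select_strategy(False, False, False, False).value)
```

In practice each question is answered per failure mode from the FMECA, and a real decision diagram usually also branches on the type of consequence (safety, environmental, operational or economic), which this sketch omits.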

In use
After being created by the commercial aviation industry, RCM was adopted by the U.S. military (beginning in the mid-1970s) and by the U.S. commercial nuclear power industry (in the 1980s).

Starting in the late 1980s, an independent initiative led by John Moubray corrected some early flaws in the process and adapted it for use in wider industry. Moubray was also responsible for popularizing the method and for introducing it to much of the industrial community outside the aviation industry. In the two decades since this approach (called RCM2 by its author) was first released, industry has undergone massive change, with advances in lean thinking and efficiency methods. During this period, many methods sprang up that reduced the rigour of the RCM approach. The result was the propagation of methods that called themselves RCM yet had little in common with the original concepts. In some cases these were misleading and inefficient, while in other cases they were even dangerous. Since each initiative is sponsored by one or more consulting firms eager to help clients use it, there is still considerable disagreement about their relative dangers (or merits).

The RCM standard (SAE JA1011, available from http://www.sae.org) provides the minimum criteria that processes must comply with if they are to be called RCM.

Although a voluntary standard, it provides a reference for companies looking to implement RCM to ensure they are getting a process, software package or service that is in line with the original report.

The Walt Disney Company introduced RCM to its parks in 1997 in an effort led by Paul Pressler and the consultants McKinsey & Company, laying off a large number of maintenance workers and saving large amounts of money. Some people blamed the new cost-conscious maintenance culture for some of the incidents at Disneyland Resort that occurred in the following years.