High reliability organization

A high reliability organization (HRO) is an organization that has succeeded in avoiding catastrophes in an environment where normal accidents can be expected due to risk factors and complexity.

Important case studies in HRO research include both studies of disasters (e.g., the Three Mile Island nuclear incident, the Challenger and Columbia disasters, the Bhopal chemical leak, the Chernobyl disaster, the Tenerife air crash, the Mann Gulch forest fire, and the Black Hawk friendly fire incident in Iraq) and studies of HROs such as the air traffic control system, naval aircraft carriers, and nuclear power operations.

History
HRO theory is derived from normal accident theory, which led a group of researchers at the University of California, Berkeley (Todd LaPorte, Gene Rochlin, and Karlene Roberts) to study how organizations working with complex and hazardous systems operated error-free. They researched three organizations: United States nuclear aircraft carriers (in partnership with Rear Admiral (ret.) Tom Mercer on the USS Carl Vinson), the Federal Aviation Administration's Air Traffic Control system (and commercial aviation more generally), and nuclear power operations (Pacific Gas and Electric's Diablo Canyon reactor).

The result of this initial work was a set of defining characteristics that HROs hold in common:


 * 1) "Hypercomplexity" – extreme variety of components, systems, and levels.
 * 2) Tight coupling – reciprocal interdependence across many units and levels.
 * 3) Distinguishable hierarchy – multiple levels, each with its own elaborate control and regulating mechanisms.
 * 4) Large numbers of decision makers in complex communication networks – characterized by a thorough network of peer-reviewed control and informational systems.
 * 5) Discernible degree of accountability that reinforces organizational commitment to high-quality work – strict adherence to a set performance standard, in which deviation results in additional training or corrective action.
 * 6) High frequency of immediate feedback about decisions.
 * 7) Compressed time factors – cycles of major activities are measured in seconds.
 * 8) More than one critical outcome that must happen simultaneously – simultaneity signifies both the complexity of operations and the inability to withdraw or modify operational decisions.

While many organizations display some of these characteristics, HROs display them all simultaneously.

Normal accident and HRO theorists agree that interactive complexity and tight coupling can, theoretically, lead to a system accident. However, they hold different opinions on whether such system accidents are inevitable or manageable. HRO theorists hold that serious accidents in high-risk, hazardous operations can be prevented through a combination of organizational design, culture, management, and human choice. Theorists of both schools place considerable emphasis on human interaction with the system as either a cause (Normal Accident Theory, NAT) or a means of prevention (HRO) of a system accident.

High reliability organization theory and HROs are often contrasted with Charles Perrow's Normal Accident Theory (see Sagan for a comparison of HRO and NAT). NAT represents Perrow's attempt to translate his understanding of the disaster at the Three Mile Island nuclear facility into a more general formulation of accidents and disasters. Perrow's 1984 book also included chapters on petrochemical plants, aviation accidents, naval accidents, "earth-based system" accidents (dam breaks, earthquakes), and "exotic" accidents (genetic engineering, military operations, and space flight). At Three Mile Island the technology was tightly coupled due to time-dependent processes, invariant sequences, and limited slack. The technological deficiencies were a result of unforeseen concatenations that ultimately resulted in the conjoined collapse of a complex system. Perrow hypothesized that, regardless of the effectiveness of management and operations, accidents in systems characterized by tight coupling and interactive complexity will be normal or inevitable, as they often cannot be foreseen or prevented. This view, described by some theorists as boldly technologically deterministic, contrasts with the view of HRO proponents, who argued that high-risk, high-hazard organizations can function safely despite the hazards of complex systems.

Despite their differences, NAT and HRO theory share a focus on the social and organizational underpinnings of system safety and accident causation/prevention. As research continued, a body of knowledge emerged based on studies of a variety of organizations. For example, a fire incident command system, Loma Linda Hospital's Pediatric Intensive Care Unit, and the California Independent System Operator were all studied as examples of HROs.

Although they may seem diverse, these organizations have a number of similarities. First, they operate in rigid social and political environments. Second, their technologies are high-risk and present the potential for error. Third, the severity and scale of possible consequences from errors or mistakes preclude learning through experimentation. Finally, these organizations all use complex processes to manage complex technologies and complex work to avoid failure. HROs share many properties with other high-performing organizations, including highly trained personnel, continuous training, effective reward systems, frequent process audits, and continuous improvement efforts. Yet other properties are more distinctive: an organization-wide sense of vulnerability; a widely distributed sense of responsibility and accountability for reliability; concern about misperception, misconception, and misunderstanding that is generalized across a wide set of tasks, operations, and assumptions; pessimism about possible failures; and redundancy and a variety of checks and counter-checks as a precaution against potential mistakes.

Defining high reliability and specifying what constitutes an HRO has presented some challenges. Roberts initially proposed that high reliability organizations are a subset of hazardous organizations that have enjoyed a record of high safety over long periods of time. Specifically, she argued that: “One can identify this subset by answering the question, ‘how many times could this organization have failed resulting in catastrophic consequences that it did not?’ If the answer is on the order of tens of thousands of times the organization is ‘high reliability’” (p. 160). More recent definitions have built on this starting point but emphasized the dynamic nature of producing reliability (i.e., constantly seeking to improve reliability and intervening both to prevent errors and failures and to cope and recover quickly should errors become manifest). Some researchers view HROs as reliability-seeking rather than reliability-achieving. Reliability-seeking organizations are not distinguished by their absolute error or accident rate, but rather by their “effective management of innately risky technologies through organizational control of both hazard and probability” (p. 14). Consequently, the phrase "high reliability" has come to mean that high risk and high effectiveness can co-exist for organizations that must perform well under trying conditions, and that it takes intensive effort to do so.

While the early research focused on high-risk industries, others expressed interest in HROs and sought to emulate their success. A key turning point was Karl Weick, Kathleen M. Sutcliffe, and David Obstfeld's reconceptualization of the literature on high reliability. These researchers systematically reviewed the case study literature on HROs and illustrated how the infrastructure of high reliability was grounded in processes of collective mindfulness, which are indicated by a preoccupation with failure, reluctance to simplify interpretations, sensitivity to operations, commitment to resilience, and deference to expertise. In other words, HROs are distinctive because of their efforts to organize in ways that increase the quality of attention across the organization, thereby enhancing people's alertness to and awareness of details so that they can detect subtle ways in which contexts vary and call for contingent responding (i.e., collective mindfulness). This construct was elaborated and refined as mindful organizing in Weick and Sutcliffe's 2001 and 2007 editions of their book Managing the Unexpected. Mindful organizing forms a basis for individuals to interact continuously as they develop, refine, and update a shared understanding of the situation they face and their capabilities to act on that understanding. Mindful organizing proactively triggers actions that forestall and contain errors and crises, and it requires leaders and employees to pay close attention to shaping the social and relational infrastructure of the organization. They establish a set of interrelated organizing processes and practices, which jointly contribute to the system's (e.g., team, unit, organization) overall safety culture.

Characteristics
Successful organizations in high-risk industries continually "reinvent" themselves. For example, when an incident command team realizes what they thought was a garage fire has now changed into a hazardous material incident, they completely restructure their response organization.

There are five characteristics of HROs that have been identified as responsible for the "mindfulness" that keeps them working well when facing unexpected situations.


 * Preoccupation with failure: HROs treat anomalies as symptoms of a problem with the system. The latent organizational weaknesses that contribute to small errors can also contribute to larger problems, so errors are reported promptly so that problems can be found and fixed.
 * Reluctance to simplify interpretations: HROs take deliberate steps to comprehensively understand the work environment as well as a specific situation. They are cognizant that the operating environment is very complex, so they look across system boundaries to determine the path of problems (where they started, where they may end up) and value a diversity of experience and opinions.
 * Sensitivity to operations: HROs are continuously sensitive to unexpected changes in conditions. They monitor the systems’ safety and security barriers and controls to ensure they remain in place and operate as intended. Situational awareness is extremely important to HROs.
 * Commitment to resilience: HROs develop the capability to detect, contain, and recover from errors. Errors will happen, but HROs are not paralyzed by them.
 * Deference to expertise: HROs follow a typical communication hierarchy during routine operations, but defer to the person with the expertise to solve the problem during upset conditions. During a crisis, decisions are made at the front line and authority migrates to the person who can solve the problem, regardless of their hierarchical rank.

Although the original research and early application of HRO theory occurred in high-risk industries, research now covers a wide variety of applications and settings. Health care has been the largest practitioner area for the past several years. The application of Crew Resource Management is another area of focus for leaders in HROs, one that requires competent behavioral systems measurement and intervention.

Wildfires create complex and very dynamic mega-crisis situations across the globe every year. U.S. wildland firefighters, often organized using the Incident Command System into flexible inter-agency incident management teams, are not only called upon to "bring order to chaos" in today's mega-fires; they are also requested for "all-hazard events" like hurricanes, floods, and earthquakes. The U.S. Wildland Fire Lessons Learned Center has been providing education and training to the wildland fire community on high reliability since 2002.

HRO behaviors can be developed into high-functioning skills of anticipation and resilience. Learning organizations that strive for high performance in the things they can plan for can become HROs that are better able to manage unexpected events, which by definition cannot be planned for.