User:Avi Harel/Resilience engineering

Resilience engineering is an emerging discipline that plays a key role in Safety Engineering. It is formed by integrating concepts and techniques from three disciplines:
 * Systems Engineering, focusing primarily on providing functionality and performance
 * Software Engineering, focusing on the design, development, implementation and maintenance of the system software
 * Cognitive Engineering, focusing on the operator's performance and on the operator's role in resolving complex situations.

The integration of the disciplines is described in the interactive guide for resilience assurance.

Resilience-Oriented Systems Engineering (ROSE)
The methods for resilience assurance are integrated into the traditional cycle of proactive and reactive system development.

Methodology
The iterative approach to system design is described here ... The interactive guide proposes an iterative approach to resilience assurance that combines two leading approaches (learning cycles ...):
 * The proactive approach, intended to prevent failure by design
 * The reactive approach, intended to assure learning from incidents.

Content
The requirements specification includes lists of hazards, defense add-ons, interaction styles, resilience modules ... and a description of the operational rules.

The top-level design is based on a resilience-oriented system architecture ...

The unit design aims to assure three things: error prevention, by specialized control and supervision stations; detection of component faults and of operators' slips and mistakes, by component-level add-ons; and detection of unexpected activity, based on the operational rules.
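
Detection of unexpected activity from the operational rules can be illustrated with a minimal sketch. It assumes that each operational rule can be written as a predicate over a snapshot of the system state; the rule names, the state fields and the `find_violations` helper are hypothetical and are not taken from the interactive guide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A snapshot of the system state (hypothetical representation).
State = Dict[str, object]


@dataclass(frozen=True)
class OperationalRule:
    """An operational rule: a name plus a predicate that must hold in normal operation."""
    name: str
    holds: Callable[[State], bool]


def find_violations(rules: List[OperationalRule], state: State) -> List[str]:
    """Return the names of all rules that the given state violates."""
    return [rule.name for rule in rules if not rule.holds(state)]


# Hypothetical rules for a vehicle-like system (illustration only).
RULES = [
    OperationalRule(
        "door_closed_while_moving",
        lambda s: not (s["moving"] and s["door_open"]),
    ),
    OperationalRule(
        "speed_within_limit",
        lambda s: s["speed"] <= s["speed_limit"],
    ),
]

normal = {"moving": True, "door_open": False, "speed": 40, "speed_limit": 60}
unexpected = {"moving": True, "door_open": True, "speed": 75, "speed_limit": 60}

print(find_violations(RULES, normal))      # no rule is violated
print(find_violations(RULES, unexpected))  # both rules are violated
```

Keeping the rules as data, separate from the checking code, lets the same list drive both run-time detection and the unit tests that check whether rule violations are identified.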

Architecture
A key feature of proactive resilience assurance is a resilience-oriented architecture, which extends the functional unit with special add-ons.
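
One way to picture a functional unit extended by an add-on is a wrapper that leaves the unit's logic untouched and adds fault detection around it. The sketch below is an assumption made for illustration, not the architecture defined in the guide: `FaultDetectionAddOn`, `sensor_scaling` and the range limits are hypothetical names and values.

```python
from typing import Callable, List, Optional


class FaultDetectionAddOn:
    """Wraps a functional unit and records faults instead of passing them on silently."""

    def __init__(self, unit: Callable[[float], float], low: float, high: float):
        self.unit = unit              # the functional unit being extended
        self.low = low                # lowest output considered plausible
        self.high = high              # highest output considered plausible
        self.faults: List[str] = []   # fault reports collected so far

    def __call__(self, value: float) -> Optional[float]:
        try:
            result = self.unit(value)
        except Exception as exc:
            # The unit itself failed: report it as a component fault.
            self.faults.append(f"exception: {exc!r}")
            return None
        if not (self.low <= result <= self.high):
            # The unit returned a value outside the plausible range.
            self.faults.append(f"out of range: {result}")
            return None
        return result


def sensor_scaling(raw: float) -> float:
    """Hypothetical functional unit: scales a raw reading."""
    return 100.0 / raw


guarded = FaultDetectionAddOn(sensor_scaling, low=0.0, high=50.0)

print(guarded(4.0))    # within range: the result is passed through
print(guarded(0.0))    # division by zero inside the unit: reported, None returned
print(guarded(1.0))    # result above the plausible range: reported, None returned
print(guarded.faults)
```

The functional unit stays unaware of the add-on, so the same unit can be tested with and without it.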

Key features

 * Iterative assurance

Subjects

 * Alarm generation ... - towards zero missed alarms and a minimum of improper alarms
 * Alarm perception ... - the operators' awareness of new operational risks
 * Recovery - the system activity during the transition from exceptional or unpredictable situations to normal operation
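
The alarm-generation targets above can be made measurable with a small sketch. It assumes a log in which each observation records whether a hazard was actually present and whether an alarm was raised, and it interprets an "improper alarm" as an alarm raised with no hazard present. The `Observation` type, the `alarm_rates` function and the sample log are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Observation:
    hazard_present: bool  # was there really a hazardous condition?
    alarm_raised: bool    # did the system raise an alarm?


def alarm_rates(observations: Iterable[Observation]) -> Tuple[float, float]:
    """Return (missed-alarm rate, improper-alarm rate).

    Missed-alarm rate: share of hazards for which no alarm was raised.
    Improper-alarm rate: share of raised alarms with no hazard present.
    A rate is reported as 0.0 when its denominator is empty.
    """
    obs = list(observations)
    hazards = [o for o in obs if o.hazard_present]
    alarms = [o for o in obs if o.alarm_raised]
    missed = sum(1 for o in hazards if not o.alarm_raised)
    improper = sum(1 for o in alarms if not o.hazard_present)
    missed_rate = missed / len(hazards) if hazards else 0.0
    improper_rate = improper / len(alarms) if alarms else 0.0
    return missed_rate, improper_rate


# Hypothetical log: four hazards (one missed) and four alarms (one improper).
log = [
    Observation(True, True),
    Observation(True, False),   # missed alarm
    Observation(False, True),   # improper alarm
    Observation(False, False),
    Observation(True, True),
    Observation(True, True),
]

print(alarm_rates(log))  # (0.25, 0.25)
```

Tracking the two rates separately matters because lowering one often raises the other: a more sensitive alarm misses fewer hazards but tends to fire improperly more often.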

Goals

 * Unit testing - verify that the system can identify any component fault and any rule violation
 * Integration - test the system architecture, to ensure that all the resilience features work as intended.
 * Verification - ensure that operators can tackle all the expected situations involved in failure modes, including those related to exceptional situations.
 * Validation - graying out black swans: ensure that the system captures some unpredictable situations and events, such as unexpected operational errors, and identify instances in which the system's resilience does not meet the stakeholders' expectations.