Behaviorally anchored rating scales

Behaviorally anchored rating scales (BARS) are scales used to rate performance. BARS are normally presented vertically, with between five and nine scale points. The method is an appraisal technique that aims to combine the benefits of narratives, critical incidents, and quantified ratings by anchoring a quantified scale with specific narrative examples of good, moderate, and poor performance.

Background
BARS were developed in response to dissatisfaction with the subjectivity involved in using traditional rating scales such as the graphic rating scale. A review of BARS concluded that the strength of this rating format may lie primarily in the performance dimensions that are gathered rather than in the distinction between behavioral and numerical scale anchors.

Benefits of BARS
BARS are rating scales that add behavioral scale anchors to traditional rating scales (e.g., graphic rating scales). In comparison to other rating scales, BARS are intended to facilitate more accurate ratings of the target person's behavior or performance. However, although BARS are often regarded as a superior performance appraisal method, they may still suffer from unreliability, leniency bias, and a lack of discriminant validity between performance dimensions.

Developing BARS
BARS are developed using data collected through the critical incident technique, or through the use of comprehensive data about the tasks performed by a job incumbent, such as might be collected through a task analysis. Constructing BARS involves the basic steps outlined below.
 * 1) Examples of effective and ineffective behavior related to the job are collected from people with knowledge of the job using the critical incident technique. Alternatively, data may be collected through the careful examination of data from a recent task analysis.
 * 2) These data are then converted into performance dimensions. To do so, examples of behavior (such as critical incidents) are sorted into homogeneous groups using the Q-sort technique. A definition is then written for each group of behaviors, establishing it as a performance dimension.
 * 3) A group of subject matter experts (SMEs) is asked to re-translate the behavioral examples back into their respective performance dimensions. At this stage, behaviors that fail to reach a set level of SME agreement (the threshold is often 50–75%) are discarded, while behaviors that were re-translated into their respective performance dimensions with a high level of SME agreement are retained. The re-translation process helps to ensure that behaviors are readily identifiable with their respective performance dimensions.
 * 4) The retained behaviors are then scaled by having SMEs rate the effectiveness of each behavior. These ratings are usually done on a 5- to 9-point Likert-type scale.
 * 5) Behaviors with a low standard deviation (for example, less than 1.50) are retained while behaviors with a higher standard deviation are discarded. This step helps to ensure SME agreement about the rating of each behavior.
 * 6) Finally, the behaviors for each performance dimension that meet both the re-translation and standard deviation criteria are used as scale anchors.
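
The screening in steps 3 to 6 can be illustrated with a minimal Python sketch. It is not a standard implementation: the function names, the record layout, the sample behaviors, and the 60% agreement cutoff are all assumptions made for illustration (the cutoff is simply one value inside the 50–75% range mentioned above), and the use of the sample standard deviation rather than the population standard deviation is likewise an assumption.

```python
from statistics import mean, stdev


def retranslation_agreement(assignments, intended):
    """Share of SMEs who sorted the behavior into its intended dimension."""
    return sum(1 for a in assignments if a == intended) / len(assignments)


def select_anchors(behaviors, min_agreement=0.6, max_sd=1.5):
    """Keep behaviors that pass both screens and group them by dimension.

    min_agreement and max_sd are illustrative defaults, not fixed rules.
    """
    anchors = {}
    for b in behaviors:
        # Step 3: discard behaviors with low re-translation agreement.
        agreement = retranslation_agreement(b["assignments"], b["dimension"])
        if agreement < min_agreement:
            continue
        # Step 5: discard behaviors whose effectiveness ratings vary too much
        # (sample standard deviation, by assumption).
        if stdev(b["ratings"]) >= max_sd:
            continue
        # Step 6: the mean rating becomes the anchor's position on the scale.
        scale_value = round(mean(b["ratings"]), 2)
        anchors.setdefault(b["dimension"], []).append((scale_value, b["text"]))
    for dimension in anchors:
        anchors[dimension].sort(reverse=True)  # most effective anchor first
    return anchors


# Hypothetical data: five SMEs each sort and rate every behavior (7-point scale).
behaviors = [
    {"text": "Greets customers within a minute of arrival",
     "dimension": "customer service",
     "assignments": ["customer service"] * 5,
     "ratings": [6, 7, 6, 7, 6]},
    {"text": "Ignores a waiting customer to finish a personal call",
     "dimension": "customer service",
     "assignments": ["customer service"] * 4 + ["teamwork"],
     "ratings": [1, 2, 1, 1, 2]},
    {"text": "Restocks shelves when asked",
     "dimension": "teamwork",
     "assignments": ["teamwork", "customer service", "initiative",
                     "teamwork", "initiative"],
     "ratings": [4, 4, 5, 4, 4]},
    {"text": "Covers a colleague's shift without notice",
     "dimension": "teamwork",
     "assignments": ["teamwork"] * 5,
     "ratings": [2, 7, 4, 6, 1]},
]

anchors = select_anchors(behaviors)
for dimension, items in anchors.items():
    print(dimension)
    for scale_value, text in items:
        print(f"  {scale_value}: {text}")
```

In this made-up data set, the third behavior is dropped because only two of five SMEs re-translated it into its intended dimension, and the fourth is dropped because its ratings are too dispersed, so only the two customer service behaviors survive as anchors.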