Guttman scale

In the analysis of multivariate observations designed to assess subjects with respect to an attribute, a Guttman scale (named after Louis Guttman) is a single (unidimensional) ordinal scale for the assessment of the attribute, from which the original observations may be reproduced. The discovery of a Guttman scale in data depends on their multivariate distribution conforming to a particular structure (see below). Hence, a Guttman scale is a hypothesis about the structure of the data, formulated with respect to a specified attribute and a specified population; it cannot be constructed for an arbitrary set of observations. Contrary to a widespread belief, a Guttman scale is not limited to dichotomous variables and does not necessarily determine an order among the variables. If the variables are all dichotomous, however, they are indeed ordered by their sensitivity in recording the assessed attribute, as illustrated by Example 1.

Deterministic model
Example 1: Dichotomous variables

A Guttman scale may be hypothesized for the following five questions that concern the attribute "acceptance of social contact with immigrants" (based on the Bogardus social distance scale), presented to a suitable population:


 * 1) Would you accept immigrants as residents in your country? (No=0; Yes=1)
 * 2) Would you accept immigrants as residents in your town? (No=0; Yes=1)
 * 3) Would you accept immigrants as residents in your neighborhood? (No=0; Yes=1)
 * 4) Would you accept immigrants as next-door neighbors? (No=0; Yes=1)
 * 5) Would you accept an immigrant as your child's spouse? (No=0; Yes=1)

A positive response by a particular respondent to any question in this list suggests positive responses by that respondent to all preceding questions in the list. Hence one could expect to obtain only the responses listed in the shaded part (columns 1–5) of Table 1.

'''Table 1. Hypothesized responses to the five social distance variables form a Guttman scale (a cumulative scale)'''



Every row in the shaded part of Table 1 (columns 1–5) is the response profile of any number (≥ 0) of respondents. Every profile in this table indicates acceptance of immigrants in all senses indicated by the previous profile, plus an additional sense in which immigrants are accepted. If, in a large number of observations, only the profiles listed in Table 1 are observed, then the Guttman scale hypothesis is supported, and the values of the scale (last column of Table 1) have the following properties:


 * 1) They assess the strength of the attribute "acceptance of social contact with immigrants";
 * 2) They reproduce the original observations. (For example, a respondent's scale score of 2 implies that that respondent responded positively to questions 1 and 2 and negatively to questions 3, 4, and 5.)
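
The reproducibility property can be illustrated with a minimal Python sketch. It assumes a perfect (deterministic) cumulative scale over the five questions above, with questions indexed in the order listed. The function name `cumulative_profile` is a hypothetical helper introduced only for this illustration.

```python
def cumulative_profile(score: int, n_items: int = 5) -> tuple[int, ...]:
    """Return the response profile implied by a scale score.

    Assumes a perfect cumulative (Guttman) scale: a respondent with
    score s answers "Yes" (1) to the first s questions and "No" (0)
    to the remaining ones.
    """
    if not 0 <= score <= n_items:
        raise ValueError("score must lie between 0 and n_items")
    return tuple(1 if item < score else 0 for item in range(n_items))


# The n_items + 1 profiles admissible under the hypothesis,
# from no acceptance at all to acceptance in every sense.
admissible = [cumulative_profile(s) for s in range(6)]

for profile in admissible:
    # For dichotomous items the scale score equals the number of "Yes" answers.
    print(profile, "scale score =", sum(profile))
```

Under these assumptions, `cumulative_profile(2)` yields positive responses to questions 1 and 2 only, matching the example given in property 2.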

A Guttman scale, if supported by data, is useful for efficiently assessing subjects (respondents, testees or any collection of investigated objects) on a one-dimensional scale with respect to the specified attribute. Typically, Guttman scales are found with respect to attributes that are narrowly defined.

Other scaling techniques (e.g., the Likert scale) produce a single scale by summing respondents' scores, a procedure that assumes, often without justification, that all observed variables have equal weights. A Guttman scale, by contrast, avoids weighting the observed variables, thus 'respecting' the data for what they are. If a Guttman scale is confirmed, the measurement of the attribute is intrinsically one-dimensional; the unidimensionality is not forced by summation or averaging. This feature renders it appropriate for the construction of replicable scientific theories and meaningful measurements, as explicated in facet theory.

Ordinal variables
Given a data set of N subjects observed with respect to n ordinal variables, each having any finite number (≥2) of numerical categories ordered by increasing strength of a pre-specified attribute, let a_ij be the score obtained by subject i on variable j, and define the list of scores that subject i obtained on the n variables, a_i = a_i1...a_in, to be the profile of subject i. (The number of categories may differ between variables. The order of the variables in the profiles is not important but should be fixed.)

Define:

Two profiles, a_s and a_t, are equal, denoted a_s = a_t, iff a_sj = a_tj for all j = 1...n.

Profile a_s is greater than profile a_t, denoted a_s > a_t, iff a_sj ≥ a_tj for all j = 1...n and a_sj' > a_tj' for at least one variable, j'.

Profiles a_s and a_t are comparable, denoted a_s S a_t, iff a_s = a_t; or a_s > a_t; or a_t > a_s.

Profiles a_s and a_t are incomparable, denoted a_s $ a_t, iff they are not comparable (that is, for at least one variable, j', a_sj' > a_tj', and for at least one other variable, j'', a_tj'' > a_sj'').

For data sets where the categories of all variables are similarly ordered numerically (from high to low or from low to high) with respect to a given attribute, a Guttman scale is defined simply thus:

Definition: A Guttman scale is a data set in which all profile-pairs are comparable.
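
The definitions above translate directly into a short check. The following Python sketch assumes that profiles are given as equal-length tuples of numerical scores, all ordered in the same direction. The function names are hypothetical, chosen for this illustration.

```python
from itertools import combinations


def dominates(a, b):
    """True if profile a is at least as high as profile b on every variable."""
    return all(x >= y for x, y in zip(a, b))


def comparable(a, b):
    """True if a = b, a > b, or b > a in the sense defined above.

    This holds exactly when one profile is at least as high as
    the other on every variable.
    """
    return dominates(a, b) or dominates(b, a)


def is_guttman_scale(profiles):
    """True if every pair of observed profiles is comparable."""
    distinct = set(map(tuple, profiles))
    return all(comparable(a, b) for a, b in combinations(distinct, 2))


# Two small illustrative data sets (not taken from the article's tables).
print(is_guttman_scale([(0, 0), (1, 0), (1, 1), (1, 0)]))  # every pair comparable
print(is_guttman_scale([(1, 0), (0, 1)]))                  # an incomparable pair
```

Repeated profiles are collapsed before checking, since equal profiles are comparable by definition.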

Example: Non-dichotomous variables
Consider the following four variables that assess arithmetic skills among a population P of pupils:

V1: Can pupil (p) perform addition of numbers? No=1; Yes, but only of two-digit numbers=2; Yes=3.

V2: Does pupil (p) know the (1-10) multiplication table? No=1; Yes=2.

V3: Can pupil (p) perform multiplication of numbers? No=1; Yes, but only of two-digit numbers=2; Yes=3.

V4: Can pupil (p) perform long division? No=1; Yes=2.

Data collected for the above four variables among a population of school children may be hypothesized to exhibit the Guttman scale shown below in Table 2:

'''Table 2. Data of the four ordinal arithmetic skill variables are hypothesized to form a Guttman scale'''

The set of profiles hypothesized to occur (shaded part in Table 2) illustrates the defining feature of the Guttman scale, namely, that any pair of profiles is comparable. Here too, if the hypothesis is confirmed, a single scale-score reproduces a subject's responses in all the variables observed.

Any ordered set of numbers could serve as the scale. In this illustration, the sum of profile scores was chosen. According to facet theory, such a summation may be justified only for data that conform to a Guttman scale.
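
The link between comparability and summation can be sketched in Python. The profiles below are hypothetical: they respect the category ranges of V1 to V4, but they are an assumption made for illustration and are not claimed to be the contents of Table 2. When all profiles are comparable, a strictly greater profile has a strictly greater sum, so the sum identifies the profile; for incomparable profiles this can fail.

```python
from itertools import combinations

# Hypothetical profiles (V1, V2, V3, V4); an illustrative assumption only.
hypothetical_chain = [
    (1, 1, 1, 1),
    (2, 1, 1, 1),
    (3, 1, 1, 1),
    (3, 2, 1, 1),
    (3, 2, 2, 1),
    (3, 2, 3, 1),
    (3, 2, 3, 2),
]


def comparable(a, b):
    """True if one profile is at least as high as the other on every variable."""
    return (all(x >= y for x, y in zip(a, b))
            or all(y >= x for x, y in zip(a, b)))


def sum_scores_identify_profiles(profiles):
    """True if no two distinct profiles share the same sum score."""
    distinct = set(profiles)
    return len({sum(p) for p in distinct}) == len(distinct)


all_comparable = all(comparable(a, b)
                     for a, b in combinations(hypothetical_chain, 2))

# Lookup from sum score back to the full profile.
score_to_profile = {sum(p): p for p in hypothetical_chain}

# Two incomparable profiles with equal sums: the sum no longer
# tells which of the two a subject produced.
non_scale = [(2, 2, 1, 1), (3, 1, 1, 1)]

print(all_comparable, sum_scores_identify_profiles(hypothetical_chain))
print(comparable(*non_scale), sum_scores_identify_profiles(non_scale))
```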

Reproducibility
In practice, perfect ("deterministic") Guttman scales are rare, but approximate ones have been found in specific populations with respect to attributes such as religious practices, narrowly defined domains of knowledge, specific skills, and ownership of household appliances. When data do not conform to a Guttman scale, they may either represent a Guttman scale with noise (and be treated stochastically), or they may have a more complex structure that requires multiple scaling to identify the scales intrinsic to them.

The extent to which a data set conforms to a Guttman scale can be estimated from the coefficient of reproducibility, of which there are several versions, depending on statistical assumptions and limitations. Guttman's original definition of the reproducibility coefficient, CR, is simply 1 minus the ratio of the number of errors to the number of entries in the data set. To ensure that there is a range of responses (which is not the case if all respondents endorse only one item), the coefficient of scalability is also used.
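
The formula CR = 1 − (number of errors) / (number of entries) can be sketched in Python for dichotomous data. How errors are counted is the part that varies between versions. The sketch below assumes one simple convention: items are ordered from most to least endorsed, each respondent's ideal pattern is the cumulative pattern implied by their total score, and every cell that deviates from that pattern counts as one error. Other conventions can give different counts for the same data. The function name and the sample data are hypothetical.

```python
def coefficient_of_reproducibility(data):
    """Compute CR = 1 - errors / entries for rows of 0/1 responses.

    Assumed error convention: compare each row with the cumulative
    pattern implied by the row's total score, after ordering items
    from most to least endorsed.
    """
    n_subjects = len(data)
    n_items = len(data[0])

    # Endorsement count per item, then item indices sorted by popularity.
    totals = [sum(row[j] for row in data) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])

    errors = 0
    for row in data:
        score = sum(row)
        for rank, j in enumerate(order):
            predicted = 1 if rank < score else 0
            if row[j] != predicted:
                errors += 1

    return 1 - errors / (n_subjects * n_items)


# Hypothetical responses: the last row deviates from the cumulative pattern.
responses = [
    (1, 1, 0),
    (1, 0, 0),
    (0, 0, 0),
    (1, 1, 1),
    (0, 1, 0),
]
print(coefficient_of_reproducibility(responses))
```

In this sample the last respondent contributes two deviating cells out of 15 entries, so the sketch returns 1 − 2/15.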

The beginnings of item response theory are found in Guttman scaling; in contrast to classical test theory, item response theory acknowledges that items in questionnaires do not all have the same level of difficulty. Non-deterministic (i.e., stochastic) models, such as the Mokken scale and the Rasch model, have been developed. The Guttman scale has been generalized to the theory and procedures of "multiple scaling", which identifies the minimum number of scales needed for satisfactory reproducibility.

As a procedure that ties substantive contents with logical aspects of data, the Guttman scale heralded the advent of facet theory, developed by Louis Guttman and his associates.

Guttman scale in qualitative variables
Guttman's original definition of a scale allows also for the exploratory scaling analysis of qualitative variables (nominal variables, or ordinal variables that do not necessarily belong to a pre-specified common attribute). This definition of Guttman scale relies on the prior definition of a simple function.

For a totally ordered set X, say, 1,2,...,m, and another finite set, Y, with k elements (k ≤ m), a function from X to Y is a simple function if X can be partitioned into k intervals that are in one-to-one correspondence with the values of Y.
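
A minimal Python sketch of this notion follows. It assumes the function is given as the list of its values at x = 1,...,m in order, and it takes Y to be the set of values the function actually assumes. Under those assumptions, the function is simple exactly when each value occupies a single unbroken run. The name `is_simple_function` is hypothetical.

```python
def is_simple_function(values):
    """True if each distinct value occupies one contiguous interval.

    `values` lists f(1), f(2), ..., f(m) in the order of X. The values
    may be unordered labels; only their positions along X matter.
    """
    finished = set()       # values whose run has already started
    previous = object()    # sentinel that equals no real value
    for value in values:
        if value != previous:
            if value in finished:
                return False  # the value reappears after a gap
            finished.add(value)
            previous = value
    return True


print(is_simple_function(["a", "a", "b", "c", "c"]))  # three intervals
print(is_simple_function(["a", "b", "a"]))            # "a" is split in two
```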

A Guttman scale may then be defined for a data set of n variables, with the jth variable having kj (qualitative, not necessarily ordered) categories, thus:

Definition: A Guttman scale is a data set for which there exist an ordinal variable, X, with a finite number m of categories, say, 1,...,m, with m ≥ max_j(k_j), and a permutation of subjects' profiles such that each variable in the data set is a simple function of X.
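
For very small data sets, this definition can be explored by brute force. The Python sketch below is an illustration under stated assumptions, not an established procedure: it takes X to be the position of each distinct profile in a candidate ordering, tries every ordering, and accepts the first one in which each variable's categories form contiguous runs. The search is factorial in the number of distinct profiles, and all names and sample data are hypothetical.

```python
from itertools import permutations


def has_single_run_per_value(column):
    """True if every category in the column occupies one contiguous run."""
    started = set()
    previous = object()  # sentinel that equals no real category
    for value in column:
        if value != previous:
            if value in started:
                return False
            started.add(value)
            previous = value
    return True


def find_scale_order(profiles):
    """Return an ordering of the distinct profiles under which every
    variable is a simple function of position, or None if none exists.

    Exhaustive search; suitable only for a handful of distinct profiles.
    """
    distinct = sorted(set(map(tuple, profiles)))
    if not distinct:
        return []
    n_vars = len(distinct[0])
    for order in permutations(distinct):
        if all(has_single_run_per_value([p[j] for p in order])
               for j in range(n_vars)):
            return list(order)
    return None


# Hypothetical qualitative data: (colour, level).
scalable = [("red", "low"), ("red", "high"), ("blue", "high")]
crossed = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]

print(find_scale_order(scalable))  # an admissible ordering
print(find_scale_order(crossed))   # None: no ordering works
```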

Despite its seeming elegance and appeal for exploratory research, this definition has not been sufficiently studied or applied.