Stanford–Binet Intelligence Scales

The Stanford–Binet Intelligence Scales (or more commonly the Stanford–Binet) is an individually administered intelligence test that was revised from the original Binet–Simon Scale developed by Alfred Binet and Théodore Simon. It is in its fifth edition (SB5), which was released in 2003.

It is a cognitive-ability and intelligence test that is used to diagnose developmental or intellectual deficiencies in young children, in contrast to the Wechsler Adult Intelligence Scale (WAIS). The test measures five weighted factors and consists of both verbal and nonverbal subtests. The five factors being tested are knowledge, quantitative reasoning, visual-spatial processing, working memory, and fluid reasoning.

The development of the Stanford–Binet initiated the modern field of intelligence testing and was one of the first examples of an adaptive test. The test originated in France and was later revised in the United States. It was initially created by the French psychologist Alfred Binet and the French psychiatrist Théodore Simon. After the French government introduced a law mandating universal education, they began developing a method of identifying "slow" children so that those children could be placed in special education programs instead of being labelled sick and sent to the asylum. As Binet and Simon indicated, case studies might be more detailed and helpful, but the time required to test many people would be excessive. In 1916, at Stanford University, the psychologist Lewis Terman released a revised examination that became known as the Stanford–Binet test.

Development
As discussed by Fancher and Rutherford (2012), the Stanford–Binet is a modified version of the Binet–Simon Intelligence Scale. The Binet–Simon scale was created by the French psychologist Alfred Binet and the French psychiatrist Théodore Simon. Due to the introduction of compulsory education at that time, questions were raised about children unfit for regular education, and a proposal was made to build boarding schools in asylums for them. Binet and Simon were part of a learned society that opposed the proposal and advocated the creation of remedial tracks in regular schools. They created the first intelligence test to objectively measure the intellectual functioning of primary school children. Binet and Simon believed that intelligence is malleable and that intelligence tests would help target children in need of extra attention to advance their intelligence.

To create their test, Binet and Simon first created a baseline of intelligence. A wide range of children were tested on a broad spectrum of measures in an effort to discover a clear indicator of intelligence. Failing to find a single identifier of intelligence, Binet and Simon instead compared children in each category by age. The children's highest levels of achievement were sorted by age, and common levels of achievement were considered the normal level for that age. Because this testing method merely compares a person's ability to the common ability level of others their age, the general practices of the test can easily be transferred to test different populations, even if the measures used are changed.

One of the first intelligence tests, the Binet–Simon test quickly gained support in the psychological community, many of whose members further spread it to the public. Lewis M. Terman, a psychologist at Stanford University, was one of the first to create a version of the test for people in the United States, naming the first localized version the Stanford Revision of the Binet–Simon Intelligence Scale (1916) and the second version the Stanford–Binet Intelligence Scale (1937). Terman used the test not only to help identify children with learning difficulties but also to find children and adults who had above-average levels of intelligence. In creating his version, Terman also tested additional methods for his Stanford revision, publishing his first official version as The Measurement of Intelligence: An Explanation of and a Complete Guide for the Use of the Stanford Revision and Extension of the Binet–Simon Intelligence Scale (Fancher & Rutherford, 2012; Becker, 2003).

The original tests in the 1905 form include:

 * 1) "Le Regard"
 * 2) Prehension Provoked by a Tactile Stimulus
 * 3) Prehension Provoked by a Visual Perception
 * 4) Recognition of Food
 * 5) Quest of Food Complicated by a Slight Mechanical Difficulty
 * 6) Execution of Simple Commands and Imitation of Simple Gestures
 * 7) Verbal Knowledge of Objects
 * 8) Verbal Knowledge of Pictures
 * 9) Naming of Designated Objects
 * 10) Immediate Comparison of Two Lines of Unequal Lengths
 * 11) Repetition of Three Figures
 * 12) Comparison of Two Weights
 * 13) Suggestibility
 * 14) Verbal Definition of Known Objects
 * 15) Repetition of Sentences of Fifteen Words
 * 16) Comparison of Known Objects from Memory
 * 17) Exercise of Memory on Pictures
 * 18) Drawing a Design from Memory
 * 19) Immediate Repetition of Figures
 * 20) Resemblances of Several Known Objects Given from Memory
 * 21) Comparison of Lengths
 * 22) Five Weights to be Placed in Order
 * 23) Gap in Weights
 * 24) Exercise upon Rhymes
 * 25) Verbal Gaps to be Filled
 * 26) Synthesis of Three Words in One Sentence
 * 27) Reply to an Abstract Question
 * 28) Reversal of the Hands of a Clock
 * 29) Paper Cutting
 * 30) Definitions of Abstract Terms

Historical use
One hindrance to widespread understanding of the test is its use of a variety of different measures. In an effort to simplify the information gained from the Binet–Simon test into a more comprehensible form, the German psychologist William Stern created the well-known Intelligence Quotient (IQ). Dividing the mental age at which a child scored by the child's chronological age gives a ratio that expresses the rate of mental progress as IQ. Terman quickly adopted the idea for his Stanford revision, with the adjustment of multiplying the ratios by 100 to make them easier to read.
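The ratio described above can be written as a short calculation. The following is a minimal sketch rather than a scoring procedure from any test manual; the function name `ratio_iq` and the example ages are hypothetical and chosen only for illustration.

```python
def ratio_iq(mental_age_months: float, chronological_age_months: float) -> float:
    """Ratio IQ: mental age divided by chronological age, multiplied by 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / chronological_age_months


# Hypothetical example: a child aged 8 years (96 months) who performs at the
# level typical of 10-year-olds (120 months).
print(ratio_iq(120, 96))  # 125.0
```

A child whose mental age equals their chronological age scores exactly 100 on this scale, which is why 100 serves as the reference point.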

As discussed by Leslie (2000), Terman was one of the main forces in spreading intelligence testing in the United States (Becker, 2003). He promoted the use of the Stanford–Binet in schools across the United States, where it saw a high rate of acceptance. Terman's work also drew the attention of the U.S. government, which recruited him to apply the ideas from his Stanford–Binet test to military recruitment during World War I. With over 1.7 million military recruits taking a version of the test and the government's acceptance of it, the Stanford–Binet saw an increase in awareness and acceptance (Fancher & Rutherford, 2012).

Given the perceived importance of intelligence and with new ways to measure intelligence, many influential individuals, including Terman, began promoting controversial ideas to increase the nation's overall intelligence. These ideas included things such as discouraging individuals with low IQ from having children and granting important positions based on high IQ scores. While there was significant opposition, many institutions proceeded to adjust students' education based on their IQ scores, often with a heavy influence on future career possibilities (Leslie, 2000).

Revisions of the Stanford–Binet Intelligence Scale
Since the first publication in 1916, there have been four additional revised editions of the Stanford–Binet Intelligence Scales, the first of which was developed by Lewis Terman. The second edition followed over twenty years later. Maud Merrill had been accepted into Stanford's education program shortly before Terman became the head of the psychology department. She completed both her master's degree and Ph.D. under Terman and quickly became a colleague of his as they started the revisions for the second edition together. The sample comprised 3,200 examinees, aged one and a half to eighteen years, drawn from different geographic regions and socioeconomic levels in an attempt to obtain a broader normative sample (Roid & Barram, 2004). Compared to the 1916 edition, this edition incorporated more objective scoring methods, placed less emphasis on recall memory, and included a greater range of nonverbal abilities (Roid & Barram, 2004).

When Terman died in 1956, the revisions for the third edition were well underway, and Merrill was able to publish the final revision in 1960 (Roid & Barram, 2004). The deviation IQ made its first appearance in the third edition; however, the mental age scale and ratio IQ were not eliminated. Terman and Merrill attempted to calculate IQs with a uniform standard deviation while still maintaining the mental age scale. To do so, they included a formula in the manual to convert ratio IQs, whose means varied between age ranges and whose standard deviations were nonuniform, to IQs with a mean of 100 and a uniform standard deviation of 16. However, it was later demonstrated that very high scores occurred with much greater frequency than a normal curve with a standard deviation of 16 would predict, and that scores in the gifted range were much higher than those yielded by essentially every other major test. The modified ratio IQs, referred to as "deviation IQs" in the manual of the third edition of the Stanford–Binet (Terman & Merrill, 1960), were therefore deemed not directly comparable to scores on "true" deviation IQ tests, such as the Wechsler Intelligence Scales and the later versions of the Stanford–Binet, because those tests compare the performance of examinees to their own age group on a normal distribution (Ruf, 2003). While new features were added, there were no newly created items included in this revision. Instead, any items from the 1937 form that showed no substantial change in difficulty from the 1930s to the 1950s were either eliminated or adjusted (Roid & Barram, 2004).
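To illustrate what a "true" deviation IQ means in the passage above, the sketch below places a raw score on an IQ scale by comparing it with the scores of same-age peers. It is not the conversion formula from the 1960 manual, which is not reproduced here. The function names, the tiny peer sample, and the target standard deviation of 15 in `rescale_iq` are assumptions made for illustration (15 is the value commonly used by Wechsler-type scales).

```python
from statistics import mean, pstdev


def deviation_iq(raw_score, age_group_scores, scale_mean=100.0, scale_sd=16.0):
    """Express a raw score as an IQ relative to the examinee's own age group."""
    mu = mean(age_group_scores)
    sigma = pstdev(age_group_scores)
    if sigma == 0:
        raise ValueError("age-group scores must vary")
    z = (raw_score - mu) / sigma  # distance from the age-group mean in SD units
    return scale_mean + scale_sd * z


def rescale_iq(iq, from_sd=16.0, to_sd=15.0, scale_mean=100.0):
    """Re-express an IQ on a scale with a different standard deviation."""
    return scale_mean + (iq - scale_mean) * to_sd / from_sd


# Hypothetical raw scores for one age group (mean 50, standard deviation 6).
peers = [44, 44, 56, 56]
print(deviation_iq(59, peers))  # 1.5 SD above the age-group mean
print(rescale_iq(148.0))        # 145.0
```

The rescaling only lines up scores that were derived the same way. It does not make the modified ratio IQs of the third edition comparable to true deviation IQs, because the underlying score distributions differ.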

Robert Thorndike was asked to take over after Merrill's retirement. With the help of Elizabeth Hagen and Jerome Sattler, Thorndike produced the fourth edition of the Stanford–Binet Intelligence Scale in 1986. This edition covers the ages two through twenty-three and differs considerably from its predecessors (Graham & Naglieri, 2003). It was the first to use fifteen subtests with point scales in place of the previous age scale format. In an attempt to broaden its coverage of cognitive ability, the subtests were grouped into four area scores, which improved flexibility for administration and interpretation (Youngstrom, Glutting, & Watkins, 2003). The fourth edition is known for assessing children who may be referred for gifted programs. It includes a broad range of abilities and so provides more challenging items for those in their early adolescent years, whereas other intelligence tests of the time did not provide difficult enough items for older children (Laurent, Swerdlik, & Ryburn, 1992).

Gale Roid published the most recent edition of the Stanford–Binet Intelligence Scale. Roid attended Harvard University where he was a research assistant to David McClelland. McClelland is well known for his studies on the need for achievement. While the fifth edition incorporates some of the classical traditions of these scales, there were several significant changes made.

Timeline

 * April 1905: Development of Binet–Simon Test announced at a conference in Rome
 * June 1905: Binet–Simon Intelligence Test introduced
 * 1908 and 1911: New Versions of Binet–Simon Intelligence Test
 * 1916: Stanford–Binet First Edition by Terman
 * 1937: Second Edition by Terman and Merrill
 * 1960: Third Edition by Merrill (form L-M)
 * 1973: Third Edition by Merrill (1937 norms were re-normed)
 * 1986: Fourth Edition by Thorndike, Hagen, and Sattler
 * 2003: Fifth Edition by Roid

Stanford–Binet Intelligence Scale: Fifth Edition
Just as when Binet first developed the IQ test, the Stanford–Binet Intelligence Scale: Fifth Edition (SB5) is based in the schooling process to assess intelligence. It continuously and efficiently assesses all levels of ability in individuals across a broader age range. It is also capable of measuring multiple dimensions of abilities (Ruf, 2003).

The SB5 can be administered to individuals as young as two years of age. This revision includes ten subtests spanning both verbal and nonverbal domains. The scale also incorporates five factors, which are directly related to the Cattell–Horn–Carroll (CHC) hierarchical model of cognitive abilities. These factors are fluid reasoning, knowledge, quantitative reasoning, visual-spatial processing, and working memory (Bain & Allin, 2005). Many of the familiar picture absurdities, vocabulary, memory for sentences, and verbal absurdities items remain from the previous editions (Janzen, Obrzut, & Marusiak, 2003), but with more modern artwork and item content for the revised fifth edition.

For every verbal subtest that is used, there is a nonverbal counterpart across all factors. These nonverbal tasks consist of making movement responses such as pointing or assembling manipulatives (Bain & Allin, 2005). These counterparts have been included to address language-reduced assessments in multicultural societies. Depending on age and ability, administration can range from fifteen minutes to an hour and fifteen minutes.

The fifth edition incorporated a new scoring system, which can provide a wide range of information such as four intelligence score composites, five factor indices, and ten subtest scores. Additional scoring information includes percentile ranks, age equivalents, and a change-sensitive score (Janzen, Obrzut, & Marusiak, 2003). Extended IQ scores and gifted composite scores are available with the SB5 in order to optimize the assessment for gifted programs (Ruf, 2003). To reduce errors and increase diagnostic precision, scores are now obtained electronically by computer.

The standardization sample for the SB5 included 4,800 participants varying in age, sex, race/ethnicity, geographic region, and socioeconomic level (Bain & Allin, 2005).

Reliability
Several reliability tests have been performed on the SB5, including split-half reliability, standard error of measurement, plotting of test information curves, test-retest stability, and inter-scorer agreement. On average, IQ scores for this scale have been found to be quite stable across time (Janzen, Obrzut, & Marusiak, 2003). Internal consistency was tested by split-half reliability and was reported to be substantial and comparable to other cognitive batteries (Bain & Allin, 2005). The median interscorer correlation was .90 (Janzen, Obrzut, & Marusiak, 2003). The SB5 has also been found to have great precision at advanced levels of performance, meaning that the test is especially useful in testing children for giftedness (Bain & Allin, 2005). Retesting showed only small effects of practice and of familiarity with testing procedures, and these proved to be insignificant. Because the mean score differences on retest are small, the SB5 can be readministered after a six-month interval rather than one year (Bain & Allin, 2005).
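As an illustration of the split-half approach mentioned above, the sketch below splits a set of item scores into odd- and even-numbered items, correlates the two half-test totals, and applies the Spearman–Brown correction to estimate full-length reliability. The odd/even split, the correction step, and the small data set are assumptions made for illustration; the source does not describe how the SB5 analyses were carried out.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both lists must vary")
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    item_scores holds one list of item scores per examinee.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return 2 * r_half / (1 + r_half)


# Hypothetical pass/fail (1/0) results for five examinees on six items.
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(scores), 2))
```

The correction is needed because each half is only half as long as the full test, and shorter tests are less reliable; correlating the halves alone would understate the reliability of the whole.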

Validity
Content validity was supported by the professional judgments Roid received concerning fairness of items and item content as well as items concerning the assessment of giftedness (Bain & Allin, 2005). An examination of age trends supported construct validity, along with empirical justification of a more substantial g loading for the SB5 compared to previous editions. The SB5 scores have also been appreciated for allowing a variety of comparisons, especially within or across factors and between verbal and nonverbal domains (Bain & Allin, 2005).

Score classification
The test publisher includes suggested score classifications in the test manual.

The classifications of scores used in the Fifth Edition differ from those used in earlier versions of the test.

Recent use
Since its inception, the Stanford–Binet has been revised several times. The test is in its fifth edition, called the Stanford–Binet Intelligence Scales, Fifth Edition, or SB5. According to the publisher's website, "The SB5 was normed on a stratified random sample of 4,800 individuals that matches the 2000 U.S. Census". When the Stanford–Binet is administered to large numbers of individuals selected at random from different parts of the United States, the scores approximate a normal distribution. Over successive revisions, the way the tests are presented has changed substantially. The test has improved through the introduction of a more parallel form and more demonstrative standards. For one, a non-verbal IQ component is now included, whereas in the past there was only a verbal component, and the test has evolved to have equally balanced verbal and non-verbal content. It is also more animated than other tests, providing test-takers with more colourful artwork, toys and manipulatives. This allows the test to cover a wider age range, and it is purportedly useful in assessing the intellectual capabilities of people ranging from young children to young adults. However, the test has been criticized for not allowing comparison between people of different age categories, since each category gets a different set of tests. Furthermore, very young children tend to do poorly on the test because they lack the ability to concentrate long enough to finish it.
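Because scores approximate a normal distribution, a score can be translated into an approximate percentile rank. The sketch below assumes a mean of 100 and a standard deviation of 15, which is an assumption not stated in the text above; the function name `percentile_rank` is likewise hypothetical. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist


def percentile_rank(iq, mean=100.0, sd=15.0):
    """Approximate percentage of the population scoring at or below iq,
    assuming scores follow a normal distribution."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(iq)


for score in (85, 100, 115, 130):
    print(score, round(percentile_rank(score), 1))
```

Under the same assumption, scores on scales with different standard deviations can be compared by percentile: a score two standard deviations above the mean lands at the same percentile whether the scale uses 15 or 16 points per standard deviation.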

Uses for the test include clinical and neuropsychological assessment, educational placement, compensation evaluations, career assessment, adult neuropsychological treatment, forensics, and research on aptitude. Various high-IQ societies also accept this test for admission into their ranks; for example, the Triple Nine Society accepts a minimum qualifying score of 151 for Form L or M, 149 for Form L-M if taken in 1986 or earlier, 149 for SB-IV, and 146 for SB-V; in all cases the applicant must have been at least 16 years old at the date of the test. Intertel accepts a score of 135 on SB5 and 137 on Form L-M.