Later, Alfred Binet and physician Théodore Simon collaborated in studying mental retardation in French school children. Between 1905 and 1908, their research at a boys' school in Grange-aux-Belles led them to develop the Binet-Simon tests, which used increasingly difficult questions to measure attention, memory, and verbal skill. Binet warned that such test scores should not be interpreted literally, because intelligence is plastic and because the test has an inherent margin of error (Fancher, 1985).
In 1916, Stanford psychologist Lewis Terman released the "Stanford Revision of the Binet-Simon Scale", or "Stanford-Binet" for short. Helped by graduate students and validation experiments, he removed some Binet-Simon test items and added new ones. The test soon became so popular that Robert Yerkes, the president of the American Psychological Association, decided to use it in developing the Army Alpha and Army Beta tests to classify recruits. Thus, a high-scoring recruit might earn an A-grade (high officer material), whereas a low-scoring recruit with an E-grade would be rejected for military service (Fancher, 1985).
Low variation in the scores of individuals tested multiple times indicates that the test has high reliability, although its validity is hotly debated (see below). The Stanford-Binet 5 tests five factors: Fluid Reasoning, Knowledge, Quantitative Reasoning, Visual-Spatial Processing, and Working Memory. Each factor is tested in two separate domains, verbal and nonverbal, so that individuals with deafness, limited English, or communication disorders can be assessed accurately. Examples of test items include verbal analogies to test Verbal Fluid Reasoning and picture absurdities to test Nonverbal Knowledge. According to its publisher, the Stanford-Binet 5 accurately assesses low-end functioning, normal intelligence, and the highest levels of giftedness (Riverside Publishing, 2004).
Students with exceptional scores on this test may be deemed bright, moderately gifted, highly gifted, extremely gifted, or profoundly gifted (contrast these with obsolete terms for low scores). These terms correspond to successive standard deviations of IQ score above the mean (100): bright is 1σ (one standard deviation), moderately gifted is 2σ, and so on. Mensa currently requires a score of 132 on the Stanford-Binet. Since the test has a standard deviation of 16, this corresponds to 2σ above the mean in a normalized population (100 + 2 × 16 = 132).
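The arithmetic behind these thresholds can be checked with a short Python sketch. It uses the mean (100) and standard deviation (16) given above; the function names are illustrative only, and the share of the population falling below a score assumes an idealized normal distribution of scores.

```python
from statistics import NormalDist

MEAN = 100  # mean IQ score, as stated in the text
SD = 16     # Stanford-Binet standard deviation, as stated in the text


def z_score(score, mean=MEAN, sd=SD):
    """Return how many standard deviations a score lies from the mean."""
    return (score - mean) / sd


def fraction_below(score, mean=MEAN, sd=SD):
    """Return the share of the population expected to score below `score`.

    Assumes scores are normally distributed, which is an idealization.
    """
    return NormalDist(mu=mean, sigma=sd).cdf(score)


print(z_score(132))                   # 2.0
print(round(fraction_below(132), 3))  # 0.977
```

Under these assumptions, a score of 132 lies two standard deviations above the mean, with roughly 97.7% of the population scoring below it.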
As Brown & French point out, "IQ tests serve one function exceptionally well, they predict academic success or failure ... they are composed of items that are representative of the kinds of problems that traditionally dominate school curricula" (1979: 255), and thus they predict only that kind of school performance. Further, "children with the same current status on an IQ test item may vary quite widely in terms of their cognitive potential" (ibid.: 258).
The validity of standardised tests such as the Stanford-Binet for testing general intelligence (and indeed the whole concept of general intelligence) has been disputed by a number of commentators. A notable critic, though not an intelligence researcher, is Stephen Jay Gould, particularly in his book The Mismeasure of Man. According to Gould, Binet originally devised his test to be administered one-on-one by an examiner in order to detect problem areas, rather than as a means of linearly ranking the general intelligence of students.