"May I end up next to Judas Iscariot, Brutus, and Cassius in the devil's mouth
at the center of hell if I ever fail to present my most honest assessment and
best judgment of evidence for empirical truth" (p. 39). So swears one Stephen
Jay Gould, justifiably worried that his activist background may have tarnished
his reputation for scholarship. Critical examination of the new edition of
The Mismeasure of Man shows that, indeed, Gould's resort to character
assassination and misrepresentation of evidence have caught up with him.
Hailed in the popular media as the definitive deconstruction of the 'myth'
that science is an objective enterprise, the original The Mismeasure of
Man was in fact an ad hominem attack on eminent scholars, past and
present, who have scientifically studied race, intelligence, and brain size.
Despite the masses of empirical research using state-of-the-art technology
published in highly prestigious journals that refute the obscurantist arguments
Gould first served up in 1981, all the chapters of the initial edition have now
been unapologetically regurgitated. Gould's failure not only to conduct any
empirical research of his own but to even acknowledge the existence of any and
all contradictory data speaks for itself. Revealed political truth may abhor
revision but science thrives on it. Scientist that he is, Gould may yet regret
agreeing to produce this 'revision'.
Rather than being appropriately revised, the original edition of The
Mismeasure of Man has merely been expanded. Gould includes a 30-page preface
on why he wrote the original and why the renewed interest in race, behavior, and
evolution required that he 'revise' it after 15 years, although he also
maintains (p. 35) that his 1981 arguments needed no modification. Gould's 1996
book also contains five end chapters, including essays on J. F. Blumenbach, the
19th-century German anthropologist who developed the first scientific system of
racial hierarchy, and Gould's own previously published reviews of Herrnstein and
Murray's (1994) The Bell Curve.
After carefully reading the book, I charge Gould with several counts of
scholarly malfeasance. First, he omits mention of remarkable new discoveries
made with magnetic resonance imaging (MRI), which show that brain size and IQ
correlate at about 0.40. These results are as replicable as one will find in the
social and behavioral sciences and utterly destroy many of Gould's arguments.
Second, despite published refutations, Gould repeats verbatim his defamations of
character against long-deceased individuals. Third, Gould fails to respond to
the numerous empirical studies that show a consistent pattern of race
differences in IQ, brain size, crime, and other factors that have appeared since
his first edition went to press.
In the opening chapters, Gould charges 19th-century scientists with 'juggling' and 'finagling' brain size data in order to place Northern Europeans at the apex of civilization, lower orders trailing behind in a great chain of being. He argues that, in effect, Paul Broca, Francis Galton, and Samuel George Morton all erred in the same direction and by similar magnitudes. Implausibly, Gould asks us to believe that Broca 'leaned' on his autopsy scales when measuring wet brains by just enough to produce the same differences that Morton caused by 'over-packing' empty skulls using filler, as did Galton's extra-loose grip on calipers while measuring heads!
Later in the book, Gould attempts to discredit such 20th century luminaries as H. H. Goddard, Lewis Terman, R. M. Yerkes, Charles Spearman, Cyril Burt, Hans Eysenck and Arthur Jensen who, Gould claims, mean-spiritedly set out to measure IQ and fabricate its heritability. Gould specifically charges psychometricians with the sin of reification, that is, treating hypothetical constructs as though they were real entities. His major target is the general factor of intelligence (known as g). Contrary to Gould, every major study shows that different IQ tests tend to be significantly intercorrelated (Carroll, 1993) and that g is the 'active ingredient' in IQ predictions (Brody, 1992).
Gould's omission of recent, devastatingly contradictory evidence constitutes at best shoddy and at worst dishonest scholarship. Even before Gould's (1981) first edition, Van Valen (1974) had reviewed the literature and estimated an overall correlation of 0.30 between brain size and intelligence. Gould (1981) neglected even to mention Van Valen's review. The 1990s have been called the 'Decade of the Brain' for good reason. Remarkable discoveries made using MRI confirm many of the relationships described by the 19th-century visionaries defamed by Gould. Neither Gould nor his publisher shows any scruples in releasing these chapters without the required revisions. Since Gould chose to withhold this evidence from his extensive readership, allow me to reveal it. (For more detail, see the review by Rushton & Ankney, 1996.)
The published research that most clearly shows the correlation between brain size and intelligence employed MRI, which creates, in vivo, a three-dimensional image of the brain. An overall correlation of 0.44 was found between MRI-measured brain size and IQ in 8 separate studies with a total sample size of 381 non-clinical adults. This correlation is about as strong as the relationship between socioeconomic status of origin and IQ. In seven MRI studies of clinical adults (N = 312) the overall correlation was 0.24; in 15 studies using external head measurements with adults (N = 6,437) the overall correlation was 0.15; and in 17 studies using external head measurements with children and adolescents (N = 45,056) the overall correlation was 0.21. The correlation of head size and brain size with the g factor itself, which Gould would have you believe is a mere artifact, is even larger, at 0.60! (Jensen, 1994; Wickett et al., 1996).
Further, the brain-size/IQ correlation is predictive from birth. The National Collaborative Perinatal Study analyzed data from 17,000 White babies and 19,000 Black babies followed from birth to 7 years (Broman et al., 1987). Head perimeters were measured at birth for all children. At age 7, head perimeters were remeasured and IQ assessed. For both the Black and the White children, head perimeter measured at birth significantly predicted head perimeter at 7 years, and head perimeter at both ages predicted IQ!
The first of these MRI studies were published in the late 1980s and early 1990s in leading, refereed, mainstream journals like Intelligence (Willerman et al., 1991) and the American Journal of Psychiatry (Andreasen et al., 1993). I know Gould is aware of them because my colleagues and I routinely sent him copies as they appeared and asked him what he thought! For the record, let it be known that Gould did not reply to the missives regarding the published scientific data that destroyed the central thesis of his first edition.
Further evidence of Gould's method is the way the 1996 edition deletes the very section of the 1981 edition that discussed the brain-size/IQ relation. In the 1981 edition (pp. 108-111), Gould cited Jensen's (1980) Bias in Mental Testing (pp. 361-362) in order to pooh-pooh Jensen's report of a 0.30 correlation between brain size and IQ and a table from Hooton (1939) which showed that average head sizes differed by SES. Gould (1996) gives no reason for making this selective cut, which would have appeared on page 140 of the new edition. I can only infer that when Gould read Jensen's (1982) review of his book, which he mentions doing in the introduction, he realized that Jensen's citation of the 0.30 correlation between brain size and IQ was based on Van Valen's (1974) review and so could no longer be dismissed as just Jensen. I submit that Gould realized that repeating this section verbatim, given the weight of the new evidence, would destroy his entire thesis. Rather than revise his arguments in light of the truth, Gould chose to repeat them without change and to withhold any evidence to the contrary. Both Gould and his publisher owe it to their readers to explain why this supposedly 'new' edition studiously avoids any mention of all the new evidence.
Is it reasonable to expect that brain size and cognitive ability are related? Yes! Haug (1987, p. 135) found a correlation of 0.479 (N = 81, P < 0.001) between number of cortical neurons (based on a partial count of representative areas of the brain) and brain size in humans. His sample included both men and women. The regression relating the two measures is: number of cortical neurons (in billions) = 5.583 + 0.006 × (brain volume in cm³). According to this equation, a person with a brain size of 1,400 cm³ has, on average, 600 million fewer cortical neurons than an individual with a brain size of 1,500 cm³. The difference between the low end of the normal distribution (1,000 cm³) and the high end (1,700 cm³) works out to be 4.2 billion neurons. Measured against the larger brain, that amounts to 27% fewer neurons for a 41% smaller brain size. The best estimate is that the human brain contains about 100 billion (10¹¹) neurons classifiable into perhaps as many as 10,000 different types, resulting in 100,000 billion synapses (Kandel, 1991). Even storing information at the low average rate of one bit per synapse, which would require two levels of synaptic activity (high or low, on or off), the structure as a whole would generate 10¹⁴ bits of information. Contemporary supercomputers, by comparison, typically have a memory of about 10⁹ bits.
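The arithmetic in the preceding paragraph can be checked with a short sketch. It assumes only the two regression coefficients quoted from Haug (1987); the function and variable names are introduced here for illustration and come from neither Haug nor this review. The 27% and 41% figures are reproduced when each difference is expressed as a share of the value for the larger brain.

```python
# Coefficients as quoted in the text from Haug (1987):
# cortical neurons (billions) = 5.583 + 0.006 * brain volume (cm^3)
INTERCEPT_BILLIONS = 5.583
SLOPE_BILLIONS_PER_CM3 = 0.006


def cortical_neurons_billions(volume_cm3: float) -> float:
    """Predicted cortical neuron count in billions (illustrative helper)."""
    return INTERCEPT_BILLIONS + SLOPE_BILLIONS_PER_CM3 * volume_cm3


# Gap between brains of 1,400 and 1,500 cm^3.
gap_100_cm3 = cortical_neurons_billions(1500) - cortical_neurons_billions(1400)

# Gap across the range 1,000 to 1,700 cm^3.
low = cortical_neurons_billions(1000)
high = cortical_neurons_billions(1700)
gap_range = high - low

# Each gap expressed as a share of the larger brain's value.
neuron_share = gap_range / high
volume_share = (1700 - 1000) / 1700

print(f"{gap_100_cm3:.1f} billion, {gap_range:.1f} billion")
print(f"{neuron_share:.0%} of neurons, {volume_share:.0%} of volume")
```

Because the regression has a positive intercept, predicted neuron count rises less than proportionally with volume, which is why the two percentages differ.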
Gould's faults extend well beyond sins of omission to include sins of commission. The 'new' edition repeats the same false accusations that have been well refuted since 1981. Thus, Gould leaves unmodified his denigration of Sir Francis Galton as a 'dotty Victorian eccentric' (p. 108) despite having been called to account for painting a thoroughly tendentious portrait by University of Cambridge statistician A. W. F. Edwards (1983) in the London Review of Books. Edwards rightly excoriated Gould, as the author of a book full of references to correlation, regression (including multiple regression), principal components analysis, and factor analysis, for failing to inform his readers that this whole statistical methodology is derived from Galton's pioneering work on the bivariate normal distribution and linear regression.
Gould also repeats verbatim his (1981) claim that S. G. Morton (1799-1851), one of the giants of 19th-century American science, 'unconsciously' doctored his results on cranial capacity so as to prove Caucasian racial superiority, despite the fact that when J. S. Michael (1988) remeasured a random sample of the Morton collection he found that very few errors had been made, and that these were not in the direction that Gould had asserted. Instead, the errors were in Gould's own work! Michael concluded that Morton's research "was conducted with integrity...(while)...Gould is mistaken" (p. 353).
Other refutations of Gould's original edition of The Mismeasure of Man appeared in the 1987 and 1988 issues of the American Psychologist. Gould claimed to have detected "conscious skullduggery" in Goddard's (1912) study of the heritability of feeblemindedness in the Kallikak family and alleged that Goddard's photographs had been 'phonied' by inserting heavy lines to give the eyes and mouth a 'depraved', 'sinister', and 'diabolical' appearance. However, not only was such retouching common during the period, and thus no sign of evil intent (Fancher, 1987), but the retouched photographs actually strike judges (when empirically tested) as appearing kind (Glenn & Ellis, 1988).
Similarly, Gould repeats his trashing of Sir Cyril Burt's reputation, citing the initial verdict against him by Hearnshaw (1977) and avoiding any mention of the new evidence that has since come to light. Recall that Burt (1883-1971) was the distinguished British educational psychologist who reported a heritability for IQ of 77% for identical twins reared apart. Subsequently, he was widely accused of fabricating his data. However, five separate studies of identical twins raised apart have now corroborated Burt's finding (Jensen, 1992; see also Bouchard et al., 1990; Pedersen et al., 1992). The average heritability from these studies is 75%, almost the same as Burt's supposedly 'faked' heritability of 77%. Moreover, two independently written, meticulously thorough books, one by Robert B. Joynson (1988) and the other by Ronald Fletcher (1991), have vindicated Burt and described how he was railroaded by those on both sides of the Atlantic dedicated to destroying hereditarian findings.
Gould's most inflammatory allegation consists of blaming IQ testers for magnifying the toll of those lost in the Holocaust (p. 263). Here he has followed the lead of Leon Kamin's (1974) The Science and Politics of IQ. The Kamin-Gould thesis is that early IQ testers claimed their research proved that Jews as a group scored low on their tests and that this finding was then conveniently used to support passage of the restrictive Immigration Act of 1924 which then denied entry to hapless Jewish refugees in the 1930s. Gould goes so far as to claim (1996, pp. 195-198; 255-258) that Henry H. Goddard (in 1917) and Carl C. Brigham (in 1923) labeled four-fifths of Jewish immigrants as "feeble-minded ... morons".
The facts are very different. Goddard wanted to find out if the Binet test was as effective at identifying 'high-grade defectives' (the term then used for those with mental ages between eight and twelve) among immigrants as it was among native-born Americans. By 1913, Goddard had translated the Binet test into English and arranged, over a two-and-a-half-month period, for it to be given to a subset of Jewish, Hungarian, Italian, and Russian immigrants "preselected as being neither 'obviously feeble-minded' nor 'obviously normal'" (Goddard, 1917, p. 244, emphasis added). Among this "unrepresentative" group (178 subjects in all), the tests successfully categorized 83% of the Jews, 80% of the Hungarians, 79% of the Italians, and 87% of the Russians. Goddard (1917) explicitly did not assert that 80% of Russians, Jews, or any immigrant group in general were feeble-minded, nor that the figures were representative of all immigrants from those nations. Nor did he claim that the feeblemindedness he was measuring was due to heredity. The vast majority of the many immigrants going through Ellis Island were never given mental tests. Nor was a random sample of any national group of immigrants ever tested. The only study by Goddard involving the testing of immigrants begins with the following sentence: "This is not a study of immigrants in general but of six small highly selected groups..." (1917, p. 243).
Gould's account of Brigham's (1923) A Study of American Intelligence is also misleading. Brigham examined the First World War intelligence tests given to 15,543 White officers, 93,955 White recruits, and 23,596 'Negro' recruits. The White recruits were subdivided into 81,465 native born ('Nordic' in origin) and 12,492 foreign born (categorized by country of origin as being primarily 'Nordic', 'Alpine', or 'Mediterranean'). Brigham found that U.S.-born White officers averaged a 'mental age' of about 17.3, U.S.-born White draftees about 13.3 years, foreign-born English speaking Nordics about 13.4 years, foreign-born non-English speaking Nordics about 12.6 years, foreign-born Alpines about 11.7 years, foreign-born Mediterraneans about 11.5 years, and Negroes about 10.7 years. Brigham made only passing reference to Jewish IQ (pp. 187-190) noting that no separate scores existed for them. But, by assuming that the proportions from the U.S. Census of 1910 were generalizable to his army recruits (implying that 50 percent of his Russian-born sample was Jewish, and that the Jewish subset scored about the same as other Russians), Brigham concluded that their mean mental age could be estimated at about 11.5 years. Brigham concluded that these data, taken at face value, did "tend to disprove the popular belief that the Jew is highly intelligent" (p. 190), but he immediately qualified this by noting that the standard deviation of the Russian sample was the highest of any immigrant group and that talent searches in New York and California schools often found high ability among Jewish children. Nonetheless, he did remark, somewhat snidely, that "the able Jew is popularly recognized not only because of his ability, but because he is able and a Jew" (p. 190).
For all their faults, the true story of the early IQ testers is a far cry from Gould's attempt to label them as unindicted co-conspirators in genocide. What is especially vexing about Gould's account is that he repeats it despite widely disseminated refutations. Historian of psychology Franz Samelson (1975, 1982) began the process of setting the record straight with his review of Kamin's book in the journal Social Forces. Perhaps the most incisive of these refutations appeared in a paper by Mark Snyderman and the late Richard Herrnstein in the 1983 issue of the American Psychologist. Snyderman and Herrnstein fully corroborated Samelson's conclusions, pointing out that the testing community in general did not view its findings as favoring restrictive immigration policies like those in the 1924 Act. As far as Snyderman and Herrnstein could ascertain from the records and publications of the time, Congress took virtually no notice of intelligence testing. None of the major contemporary figures in testing were called to testify, nor were any of their writings inserted into the legislative record.
In his 1991 book In Search of Human Nature, the eminent historian Carl N. Degler took Gould to task for ignoring contradictory information. Degler pointed out, for example, that it was the evidence of high IQs in Jews and Chinese in California that led Lewis Terman to strengthen his view that the low Black IQ was heritable. Degler also pointed out that although the comparatively high scores of Orientals did not prevent them from being excluded from immigration, such scores would embarrass any attempt to make IQ the basis for ethnic bias in immigration. Again, in 1992, the noted columnist Daniel Seligman debunked Gould's anti-testing propaganda in his book A Question of Intelligence. Most revealing of Gould's scholarship, perhaps, is that Herrnstein and Murray (1994) also highlighted the issue in a special boxed section on page 5 of The Bell Curve, a book that Gould reviewed (twice!). Did Gould overlook these refutations? Why did he not respond to them in his 'revision'?
The early IQ testers were far more aware of the effects of environmental and cultural background on their test takers than Gould would have you believe. They clearly stated that many high-IQ groups had been excluded from the draft sample, including those in occupations exempted from the draft as being vital to the war effort. Gould acknowledges these facts (p. 252) but puts on the spin that if Yerkes (1921) knew of flaws in his massive monograph Psychological Examining in the United States Army, from which Brigham (1923) drew his data, this only made the conclusions even more obviously biased than they otherwise would have been.
Eighty years of theoretical and applied progress, unrivalled in virtually any other field of psychology, have done nothing to diminish the fervor of Gould's anti-psychometric zealotry. In his review of The Bell Curve, Gould (1996, pp. 370-376) charges Herrnstein and Murray (1994) with 'disingenuousness'. First, Gould alleges disingenuousness of content, for he claims that The Bell Curve is really about race, but pretends to be about IQ. Second, he alleges there is disingenuousness of argument, for The Bell Curve fails to report openly the strength of statistical relationships. Finally, he claims there is disingenuousness of political program, for The Bell Curve attempts to justify cutting social programs while claiming to be in the tradition of Jeffersonian democracy.
Gould withholds from his readers that The Bell Curve is mainly an empirical work about the causes of social stratification and that it reached its conclusions only after fully analyzing a 12-year longitudinal study of 12,486 youths (3,022 of whom were African American) which showed that most 17-year-olds with high IQs (Blacks as well as Whites) went on to occupational success by their late 20s and early 30s whereas many of those with low IQs (both Black and White) went on to welfare dependency. The average IQ for African Americans was found to be lower than those for Latino, White, Asian, and Jewish Americans (85, 89, 103, 106, and 115, respectively, pp. 273-278). Failure to mention these data fosters the false belief that IQ tests are not predictive and are biased in favor of North Europeans.
In an afterword to the softcover edition of The Bell Curve, Charles Murray (1996) chides Gould and his reviews for being hopelessly out of date regarding the evidence for the biological basis of g and for dismissing as 'trivial' the predictive power of IQ in The Bell Curve sample. Murray invites Gould to "count the ways" in which g does in fact capture "a real property in the head". The higher the g loading of a subtest, the higher its heritability, the greater the inbreeding depression (an established genetic phenomenon) it exhibits, the stronger its relation to elementary cognitive tasks like reaction time, and the more closely it is related to physiological processes such as cortical evoked potentials and the brain's consumption of glucose. Murray also accuses Gould of misleading readers by focusing on the R² statistics given in the appendix, rather than on the IQ predictions given in the text. As Murray concludes, "The relationships between IQ and social behaviors that we present in the book are so powerful that they will revolutionize sociology" (p. 569).
Gould likes to leave his readers chanting the mantra that "g is nothing more than an artifact of the mathematical procedure used to calculate it". Jensen and Weng (1994) and Carroll (1995) provide detailed empirical and analytical demonstrations of the reality of g. Suffice it to note for the purposes of this review that they find that g is remarkably robust and invariant across different data sets, different statistical procedures, and even simulated data, and that Gould avoids any mention of these studies.
In his critique of The Bell Curve, Gould acknowledges (p. 369) and then quickly sidesteps the finding that Orientals have a small average IQ advantage over Whites and a large one over Blacks, despite being aware that The Bell Curve brought Richard Lynn's (1991) detailed compilation of these data to wide attention. Because Gould dodged the issue, allow me to address it. Lynn (1991, 1996) showed that, on average, Orientals score higher on tests of mental ability than do Whites, both within the U.S.A. and in Asia, whereas Africans and Caribbeans score lower. Oriental populations in East Asia and North America typically have mean IQs falling between 101 and 111. White populations in Europe, South Africa, Australasia, and North America have mean IQs of from 85 to 115, with an overall mean of 100. Black populations living south of the Sahara, in the Caribbean, in Britain, and in North America average IQs of from 70 to 90.
Especially contentious was Lynn's calculation of a mean IQ of only 70 for Black Africans living south of the Sahara. Many reviewers have expressed skepticism about such a low IQ, holding it impossible that, by European standards, 50 percent of Black Africa is 'mentally retarded'. But a mean African IQ of 70 has been confirmed in three studies since Lynn's review, each of which used Raven's Progressive Matrices, a test regarded as an excellent measure of the non-verbal component of general intelligence and one not bound by culturally specific information. Kenneth Owen (1992) found it (a mean IQ of 70) in a sample of over 1,000 South African 13-year-olds, Fred Zindi (1994), a Black Zimbabwean, found it in a study of 12- to 14-year-olds in Zimbabwe, and Richard Lynn (1994a) found it in a study of Ethiopian immigrants to Israel. In a reply to Leon Kamin regarding these data, Charles Murray (1995) wrote: "When data are as carefully collected and analyzed as these, attention must be paid" (p. 22).
Speed of decision making (reaction time) in 9- to 12-year-olds, in which children decide which of several lights stands out from others, shows that the racial differences in mental ability are not restricted to paper-and-pencil tests. All children can perform the task in less than one second, but more intelligent children, as measured by traditional IQ tests, perform the task faster than do less intelligent children. Lynn (1991) found Oriental children from Hong Kong and Japan were faster on average in decision time (controlling for movement time) than were White children from Britain and Ireland, who in turn were faster than Black children from South Africa. Using the same decision time tasks, Jensen (1993) found the same racial ordering in California school children.
It seems unlikely that Gould's scornful remarks about early studies of racial differences in brain size were based on an objective assessment of the literature. First, investigation of the studies Gould does cite shows him up to his usual tricks of hiding and distorting data. Second, although numerous modern studies have appeared since his 1981 edition went to press, he fails to make the corrections required by them or even to acknowledge their existence.
Consider, for example, a section titled "A Curtain Raiser With a Moral". In this, Gould (1996, pp. 109-114) reviewed a technical debate over Black/White brain-size differences between Robert Bennett Bean (1906), a Virginia physician, and Franklin P. Mall (1909), Bean's mentor at Johns Hopkins Medical School. Bean (1906) published a study finding that the weight of 103 American Negro brains at autopsy varied with the amount of Caucasian admixture, from 0 admixture = 1,157 grams, 1/16 = 1,191 grams, 1/8 = 1,335 grams, 1/4 = 1,340 grams, to 1/2 = 1,347 grams. Bean also reported that the 103 Negro brains were less convoluted than were 49 White brains and that Whites had a proportionately larger genu to splenium ratio (front to back part of the corpus callosum), implying that Whites may have more activity in the frontal lobes, which were thought to be the seat of intelligence. Mall (1909) disagreed and found that he was unable to replicate the results on genu/splenium ratios when he remeasured a subset of the brains under 'blind' conditions regarding the race of the brain. Gould elevated this disagreement on one of the findings into a morality play (Mall "became suspicious"; "prior prejudice dictates conclusions"). What Gould neglects to tell us is that Mall himself (p. 7) reported a Black/White difference in brain weight of 100 grams and that he did not refute the data on racial admixture or on complexity of convolutions.
J. S. Michael's (1988) revelation of Gould's mistreatment of Samuel George Morton's 19th-century data has been described above. Nonetheless, Michael remained doubtful that Morton's data could be used to examine race differences in brain size. Rushton (1989a), however, showed that Morton's data, even as reassessed by Gould, indicated that in cubic inches, Mongoloids averaged 86.5, Caucasoids 85.5, and Negroids 83.0, which convert to 1,401, 1,385, and 1,360 cm³, respectively. To ensure there is no misunderstanding about these data and to allow readers to combine the subgroups in their own preferred ways, Table 1 presents Gould's own retabulation of Morton's data (1981, p. 66, Table 2.5; 1996, p. 98, Table 2.5). Gould dismisses these differences as "trivial". But, as noted, a difference of 1 cubic inch (16 cm³) in brain size translates into millions of neurons and hundreds of millions of synapses, which is far from trivial.
Table 1. S. J. Gould's 'corrected' final tabulation of Morton's assessment of racial differences in cranial capacity
Population         | Cubic inches | Cubic centimeters |
Native Americans   | 86           | 1,410             |
Mongolians         | 87           | 1,427             |
Modern Caucasians  | 87           | 1,427             |
Malays             | 85           | 1,394             |
Ancient Caucasians | 84           | 1,378             |
Africans           | 83           | 1,361             |
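The cubic-centimeter column of Table 1 can be approximated from the cubic-inch column. The sketch below uses the exact conversion that follows from the definition of the inch (1 in³ = 2.54³ cm³ ≈ 16.387 cm³). The printed values run a cubic centimeter or two higher than that, and they are reproduced by a rounded factor of 16.4. That rounded factor is an assumption inferred from the printed numbers; the source does not state which factor was used.

```python
# Exact by definition of the inch (2.54 cm): about 16.387064 cm^3 per cubic inch.
CM3_PER_IN3 = 2.54 ** 3

# ASSUMPTION: a rounded factor inferred from the printed table, not stated in the source.
ASSUMED_ROUNDED_FACTOR = 16.4

# (cubic inches, cubic centimeters) as printed in Table 1.
TABLE_1 = {
    "Native Americans": (86, 1410),
    "Mongolians": (87, 1427),
    "Modern Caucasians": (87, 1427),
    "Malays": (85, 1394),
    "Ancient Caucasians": (84, 1378),
    "Africans": (83, 1361),
}


def to_cm3(cubic_inches: float, factor: float = CM3_PER_IN3) -> float:
    """Convert a volume in cubic inches to cubic centimeters."""
    return cubic_inches * factor


for population, (in3, printed_cm3) in TABLE_1.items():
    exact = round(to_cm3(in3))
    rounded = round(to_cm3(in3, ASSUMED_ROUNDED_FACTOR))
    print(f"{population}: printed {printed_cm3}, exact factor {exact}, factor 16.4 {rounded}")
```

Either factor preserves the ordering of the rows; the choice affects only the last digit or two of each entry.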
Consider the following statistically significant comparisons (sexes combined) from recently conducted studies using the four techniques mentioned above. Using brain mass at autopsy, Ho et al. (1980) summarized data for 1,261 individuals. They reported a mean brain weight of 1,323 grams for White Americans and 1,223 grams for Black Americans. Using endocranial volume, Beals et al. (1984) analyzed about 20,000 skulls from around the world and found that East Asians, Europeans, and Africans averaged cranial volumes of 1,415, 1,362, and 1,268 cm³, respectively. Using external head measurements from a stratified random sample of 6,325 U.S. Army personnel, Rushton (1992) found that Asian Americans, European Americans, and African Americans averaged 1,416, 1,380, and 1,359 cm³, respectively. Using external head measures from tens of thousands of men and women from around the world collated by the International Labour Office, Rushton (1994) found that Asians, Europeans, and Africans averaged 1,308, 1,297, and 1,241 cm³, respectively. Finally, an MRI study in Britain found that people of African and of Caribbean background averaged a smaller brain volume than did those of European background (Harvey et al., 1994).
Contrary to most purely environmental theories, racial differences in brain size show up early in life. Data from the U.S. National Collaborative Perinatal Project on 19,000 Black children and 17,000 White children showed that Black children had a smaller head perimeter at birth and, although Black children were born shorter in stature and lighter in weight than White children, by age 7 'catch-up growth' led Black children to be larger in body size than White children. However, Blacks remained smaller in head perimeter (Broman et al., 1987). Further, head perimeter at birth, 1 year, 4 years, and 7 years correlated with IQ scores at age 7 in both Black and White children (r = 0.13 to 0.24).
An absolute difference in brain size between men and women has not been disputed since at least the time of Broca (1861). He assembled a series of 292 male brains and found an average weight of 1,325 grams, while 140 female brains averaged 1,144 grams, a difference of 181 grams. Gould claimed that the sex difference disappears when appropriate statistical corrections are made for body size or age of people sampled. However, when Gould used multiple regression to remove the simultaneous influence of height and age, he only succeeded in reducing the sex difference by one third, to 113 grams. Gould then invoked additional unspecified age and body parameters, claiming that if these could be controlled the entire difference would disappear.
David Ankney (1992) questioned Gould's methodology. He reexamined autopsy data on 1,261 American adults (Ho et al., 1980) and found that at any given body surface area or height, men's brains are heavier than are women's brains. For example, among those who are 168 cm tall (5' 7"; approximately the overall mean height for men and women combined), brain mass of men averages about 100 g heavier than that of women, whereas the average difference in brain mass, uncorrected for body size, is 140 g. Thus, only about 30% of the sex difference in brain size is due to differences in body size.
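The "about 30%" figure follows from the two differences just quoted. As a minimal sketch, taking the 140 g and 100 g values at face value (the variable names are illustrative only):

```python
# Sex difference in brain mass as quoted from Ankney (1992), in grams.
UNCORRECTED_GAP_G = 140.0   # raw difference, no body-size adjustment
CORRECTED_GAP_G = 100.0     # difference at a fixed height of 168 cm

# Share of the raw difference that disappears once height is held constant.
share_due_to_body_size = (UNCORRECTED_GAP_G - CORRECTED_GAP_G) / UNCORRECTED_GAP_G

print(f"{share_due_to_body_size:.1%}")
```

The result is a little under 29%, which the text rounds to about 30%.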
Ankney's (1992) results were confirmed in the study of cranial capacity in a stratified random sample of 6,325 U.S. Army personnel (Rushton, 1992). After adjustment, via analysis of covariance, for effects of age, stature, weight, military rank, and race, men averaged 1,442 cm³ and women 1,332 cm³. This difference was found in all of 20 or more separate analyses performed to rule out any body-size effect (see Rushton, 1992, pp. 406-408). Moreover, the male/female difference was replicated across samples of Asians, Whites, and Blacks, as well as across samples of officers and enlisted personnel. The sex difference of 110 cm³ found by Rushton (1992) from analysis of external head measurements is remarkably similar to the 100 grams obtained in Ankney's (1992) analysis of brain mass (1 cm³ = 1.036 grams; Hofmann, 1991).
The brain size studies do present a paradox. Women have proportionately smaller brains than do men but, apparently, the same intelligence scores. This was recognized in stronger form over 100 years ago. Gould cites G. Hervé, a colleague of Broca's, who wrote in 1881: "Men of the black races have a brain scarcely heavier than that of a white woman." Gould's (1996, p. 135) response was a political one, namely "I do not regard as empty rhetoric a claim that the battles of one group are for all of us". David Ankney (1992, 1995) had a more scientific response. He suggested that the difference in brain size relates to those intellectual abilities at which men excel; that spatial and mathematical ability may require more "brain" power than do verbal abilities. Other theories are that men average slightly higher in general intelligence than do women (Lynn, 1994b); or that these particular differences in brain size have nothing to do with cognitive ability but reflect greater male muscle mass and physical co-ordination on tasks like throwing and catching.
As mentioned earlier, Gould inexplicably deleted a table from Hooton (1939), which had appeared on p. 109 of his first edition, showing that average head size increased with each of 8 steps of vocational status. Numerous other nineteenth- and early twentieth-century data sets (Broca, 1861; Sorokin, 1927; Topinard, 1878) confirmed that people of higher status occupations averaged a larger brain or head size than did those in lower ones. For example, Galton collected head measurements and information on educational and occupational background from thousands of individuals at his laboratory in the South Kensington Natural History Museum in London. However, he had no statistical method for testing the significance of the differences in head size between various occupational groups. Nearly a century later, Galton's data were analyzed by Johnson et al. (1985), who found that the professional and semiprofessional groups averaged significantly larger head sizes (both length and width) than did unskilled groups. The results were striking for men but less clear-cut for women. Rushton and Ankney (1996) calculated cranial capacities from Johnson et al.'s (1985) summary of Galton's head-size data and found that cranial capacity increased from unskilled to professional classes from 1,324 to 1,468 cm3 in men but only from 1,256 to 1,264 cm3 in women (figures uncorrected for body size). Gould mentions none of this more recent work in his purported revision.
In his revised edition, Gould (pp. 151-175) continues to ridicule the 'ape-in-some-of-us' hypothesis proposed by Cesare Lombroso (1836-1909), the Italian physician and anthropologist who founded the discipline of criminology. Lombroso argued that many criminals were throwbacks to man's ancestral past, ill-suited to life in civilized society, and that therefore 'natural born criminals' could be identified by the presence of the anatomical signs of primitiveness he termed 'stigmata'. But, contrary to Gould, Lombroso was no monomaniac and also believed that criminal behavior could arise in 'normal' men.
Lombroso carried out several anthropometric surveys of the heads and bodies of criminals and noncriminals, including a sample of 383 crania from dead convicts. He claimed that, as a group, criminals evidenced many features he considered primitive, including smaller brains, thicker skulls, simpler cranial sutures, larger jaws, preeminence of the face over the cranium, a low and narrow forehead, long arms, and large ears. Lombroso also examined African tribes in the Upper Nile region finding so many of these allegedly primitive traits that he concluded criminality would be considered normal behavior among them.
While Gould delights in lampooning such early evolutionary thinking, he fails to tell his readers that though Lombroso's description of the individual trees was distorted by the prejudicial lens of his time, he correctly saw the forest. Lombroso was the first to understand how Darwin's theory of evolution provides a biological understanding for why some people are more prone to criminality than are others, how certain physical indicators allow us to predict criminality, and to recognize the critical role of the forebrain in inhibiting violent and antisocial behavior.
The reader of The Mismeasure of Man will search in vain for even a dismissive reference to any of the following recent studies of the biological correlates of criminal behavior. Raine (1993) reviewed several studies that used the state-of-the-art techniques of Computerized Tomography (CT), Magnetic Resonance Imaging (MRI), and Positron Emission Tomography (PET) to study the brains of violent and sexual offenders. He tentatively concluded that frontal lobe dysfunction was associated with violent behavior including rape. Moreover, given the relation between brain size and IQ (Rushton & Ankney, 1996; see above), Lombroso's finding of a smaller brain in criminals relative to non-criminals is likely correct. Numerous American studies from those of H. H. Goddard in 1917 to the present, including The Bell Curve's 12-year longitudinal study of over 12,000 youth (Herrnstein & Murray, 1994), have established the predictive relationship between IQ and crime.
Nor does Gould feel compelled to let his readers know that Lombroso's ideas have received considerable support from recent work in behavioral genetics, a science that barely existed when Lombroso conducted his pioneering work. The same 1993 review by Raine (neither cited nor mentioned by Gould) describes 10 twin studies of adult crime based on official convictions. These studies yielded 13 analyses that together gave a concordance rate for criminal behavior of 52% for 202 monozygotic twins and only 21% for 345 dizygotic twins.
American, Danish, and Swedish studies of children who were adopted in infancy provide a means of testing the genetic theory of criminal behavior against the environmental theory. These studies support the findings of the twin studies and Lombroso's theory of 'natural born criminals'. Adopted children were at greater risk for criminal convictions if their biological parents had been convicted of a crime than if their adoptive parents had been. In a Danish study of some 14,000 adoptees, boys who had neither adoptive nor biological criminal parents themselves had a 14% rate of criminal conviction. If the adoptive but not the biological parents were criminals, boys still had a conviction rate of only 15%. But if the biological but not the adoptive parents were criminal, the rate increased to 20%. And if both biological and adoptive parents were criminal, the rate increased to 25% (Mednick et al., 1984).
Studies that use self-reports of criminal behavior tell the same story as do studies of official arrest records. In one massive study, Rowe (1986) sampled almost all the eighth to twelfth graders in the Ohio Public Schools and found that MZ twins were roughly twice as alike in their self-report delinquency as were DZ twins, yielding a heritability of about 50%. Another recent study (Rushton, 1996) of 274 adult twin pairs used retrospective self-reports about destroying property, fighting, carrying and using a weapon, and struggling with the police and found a 50% heritability for such violent behaviors. Questionnaire studies of related traits such as altruism, aggression, and empathy in adults also typically show a 50% heritability (Rushton et al., 1986). Within the same family (that is, where socioeconomic status is identical), studies show it is the less intelligent and the more aggressive siblings who are more prone to delinquency.
Nor is Lombroso's concept of stigmata as far out as Gould would have you believe. In fact, the theory of bodily markers of abnormal behavior is making a comeback, albeit from an environmentalist as well as a genetic perspective. During gestation, an insult to the fetus (such as a drug in the mother's body) that disturbs brain development may simultaneously produce a minor physical anomaly (termed an MPA) on the external body surface. For example, during the course of pregnancy, the ears start low on the neck of the fetus and gradually drift into their standard positions. An insult to development can prematurely stop this upward migration of the ears and result in low-set ears -- an observable MPA. Thus, the number of MPAs serves as a rough index of (perhaps hidden) central nervous system anomalies. For children raised in unstable families, Raine (1993) found that the number of MPAs at age 12 years was related to violent behaviors at age 21. More generally, Raine's review found that antisocial children often appear markedly less attractive than normal children. In one sample of over 11,000 criminals and 7,000 controls, 60% of criminals but only 20% of controls had facial deformities, as judged by expert plastic surgeons.
Finally, consider the striking racial differences in criminal behavior. These differences are consistent across time, national boundaries, and political-economic system, which argues strongly for their having some genetic component. For example, as far back as records go, in the U.S., Orientals have been underrepresented and Blacks overrepresented in crime statistics relative to Whites. This pattern is not specific to the U.S. but is repeated around the world. Analyses of INTERPOL Yearbooks throughout the 1980s show that African and Caribbean countries have double the rate for violent crime of European countries and three times the rate of the countries in the Pacific Rim. The combined figures for murder, rape, and serious assault per 100,000 population for 1984 and 1986 were Africans -- 142, Europeans -- 74, and Asians -- 43. For 1989-90, the pattern was unchanged: Africans -- 240, Europeans -- 75, and Asians -- 32 (Rushton, 1990, 1995a).
It is unfortunate that Gould does not even cite, let alone attempt to refute, any of these studies. Even if all of them are in some way biased and all my reasoning flawed, Gould owes it to those who rely upon his work to explain how this is so. More unfortunate is that by dismissing out of hand the hypothesis of the inclination to criminal behavior by some sneering remarks on the early work of the long-dead Lombroso and ignoring the latest research, Gould is actively obstructing scientists from finding the biogenetic treatments and environmental intervention strategies that could spare both future victims and delinquents (who, in their own way, are victims of their genes and their environments). It is thus Gould who is -- in Lombroso's words -- the delinquent man.
Gould (1996, pp. 186-187, 369-370) continues to disparage the possibility of generalizing within-group findings to the causes of between-group differences. When environmentalists use nutrition as an explanation of both within-group and between-group differences this is (sensibly) not disputed. But when the exact same inference is made for heritabilities to explain both within-group and between-group differences, Gould argues it is inappropriate. But, if poor nutrition is shown to have an effect 'within' Whites and Blacks, it is sensible to suppose that nutrition has an effect on differences 'between' Whites and Blacks. If so for environmental generalization, why not for genetic generalization?
What Gould especially fails to mention is the striking and critically important finding that 'genetic weights on IQ subtests predict racial differences'. Although the White/Black IQ gap averages 15 points, the difference 'is more pronounced on subtests that are highly heritable within races than it is on less heritable tests' (Jensen, 1985; Rushton, 1989b). This observation is important because it provides a test of differential predictions. Environmental theory predicts that racial differences will be greater on more culturally or environmentally influenced tests whereas genetic theory predicts they will be greater on more heritable tests. Because higher heritabilities are stronger indicators of underlying genetic substrates than are lower heritabilities, the data support the genetic hypothesis, not Gould.
It is in fact an important 'empirical' question whether heritabilities for Blacks are the same as, or different from, those for Whites. Reason alone tells us that as environments become more benign and more equal, genetic sources of variation will become larger. For example, over the last 50 years, as environmental barriers to health and educational attainment have fallen, the variance in health and educational attainment accounted for by genetic factors has increased (Scriver, 1984; Heath et al., 1985). In animal studies, low heritabilities for body size variables are typically interpreted as showing the suppressant effect of the environment on natural growth (e.g. Larsson, 1993). The relevant question thus becomes: 'Are IQ heritabilities for Blacks lower than those for Whites?' Most of the evidence favors the view of equal heritabilities across the three major races. There is, however, some evidence of lower heritabilities in Blacks which would support the hypothesis of a more damaging environment. For example, Rushton and Osborne (1995) studied cranial capacity in several hundred Black and White twins and found a range of higher heritabilities (depending on corrections for age and body size) for Whites than for Blacks (47% to 56% vs 12% to 31%). The differences, however, were not statistically significant. These are, however, precisely the kinds of analyses Gould should be conducting if he wants to make a scientific, rather than a political argument about heritability!
Most transracial adoption studies also provide evidence for the heritability of racial differences in IQ. Studies of Korean and Vietnamese children adopted into White American and White Belgian homes have been conducted (Clark & Hanisee, 1982; Frydman & Lynn, 1989; Winick et al., 1975). As babies, many adoptees had been hospitalized for malnutrition. Nonetheless, they went on to develop IQs 10 or more points higher than their adoptive national norms. By contrast, Black and Mixed-Race (Black/White) children adopted into White middle class families typically perform at a lower level than similarly adopted White children. For example, in the well known Minnesota Adoption Study, by age 17, adopted children with two White biological parents had an average IQ of 106, adopted children with one White and one Black biological parent had an average IQ of 99 and adopted children with two Black biological parents had an average IQ of 89 (Weinberg, Scarr & Waldman, 1992).
The only adoption studies Gould refers to (p. 370) are those showing IQ gains of very young Black children adopted into affluent and intellectual homes (presumably based on an earlier account of the Minnesota study when the children were only 7 years old) and a study of prepubertal mixed-race German children fathered by Black soldiers compared with those fathered by White soldiers which found 'no difference'. But these apparent exceptions may 'prove the rule'. In general, behavior genetic studies show that as people age, trait heritability increases while environmentality decreases. Differences not apparent before puberty often emerge by age 17.
Given that Gould doesn't believe that either brain size or intelligence differs by race and sex, it is not surprising that he offers no evolutionary explanations for the origins of these differences. Gould (p. 399) acknowledges the accumulating evidence in favor of the 'Out of Africa' model of human origins. It holds that Homo sapiens arose in Africa 200,000 years ago, exited Africa with an African/non-African split about 110,000 years ago, and migrated east with a European/East Asian split about 40,000 years ago (Stringer & Andrews, 1988). But Gould refuses to acknowledge any relationship between this evolutionary sequence and the parallel rankings of major racial groups in behavioral traits. Nor does he tell his readers that evolutionary selection pressures were different in the hot savanna where Africans evolved than in the cold Arctic where East Asians evolved.
Rushton (1995b) and others have proposed that the farther north the populations migrated, out of Africa, the more they encountered the cognitively demanding problems of gathering and storing food, gaining shelter, making clothes, and raising children during prolonged winters. Consequently, as the original African populations evolved into present-day Europeans and East Asians, they did so by moving in the direction of larger brains and greater intelligence, but also towards slower rates of maturation, lower levels of sex hormone, and concomitant reductions in sexual potency and aggressiveness, and increases in family stability and social conformity.
Such an evolutionary scenario fits the data from Rushton's (1995b) review of the international literature on race differences, which found that on more than 60 variables Orientals and Africans consistently averaged at opposite ends of a continuum with Europeans averaging intermediately. For example, the rate of dizygotic twinning based on a double ovulation is less than 4 per 1,000 births among East Asians, 8 among Europeans, and 16 or greater among Africans. Multiple birthing is known to be heritable through the race of the mother. No known environmental factor can explain why Africans average the smallest brains and the highest twinning rates, East Asians average the largest brains and the lowest twinning rates, and Europeans average intermediately in both. Clearly, there is a need for a genetic-evolutionary explanation.
In fact, Vincent Sarich, who helped initiate the research program on biochemical taxonomy from which the 'Out of Africa' model developed (Sarich & Wilson, 1967), argues that Gould got his evolutionary ideas about race completely upside down. As Sarich (1995, p. 86) pointed out, "it is the Out of Africa model, not that of regional continuity, which makes racial differences more functionally significant. It does so because the amount of time involved in the raciation process is much smaller, while, obviously, the degree of racial differentiation is the same -- large. The shorter the period of time required to produce a given amount of morphological difference, the more selectively important the differences become." Sarich (1982, 1995) has labelled the argument that natural selection would result in geographically separated populations evolving the exact same brain size 'behavioral creationism'. Although Gould is comfortable talking about the evolution of different body types in humans, he often writes as though he believes that societies, cultures, and mental differences spring into being full-blown, as if from the brow of Zeus or the hand of God.
With respect to the evolution of sex differences in brain size, Ankney (1992, 1995) hypothesized that differing roles of men and women during human evolution produced a sexual divergence in brain size and in abilities. Men roamed from the home base to hunt, which would select for targeting ability and navigational skills; women were relatively sedentary. Such additional abilities would have selected for relatively larger brains in men as it may require more brain tissue to process spatial information. Lynn (1994b) has also proposed that men evolved larger (more costly) brains because they enhance their probability of becoming socially dominant and thus more reproductively successful; female reproductive success is much less dependent on social status.
Others have speculated on the extent to which Gould's political outlook has colored his scientific work (Davis, 1986; Dennett, 1995; Ruse, 1993). In Darwin's Dangerous Idea, Dennett (1995) brilliantly documents how Gould has been systematically misleading his readers for decades, attempting to smuggle anti-Darwinian mechanisms into evolutionary theory with a lot of clever talk of "spandrels", "punctuated equilibrium", and "dialectical processes". Gould notwithstanding, Darwinian adaptation is the way evolution works and the mechanism on which working evolutionary scientists base their research programs.
Gould himself tells us (1996, p. 19) that he originally considered titling his book Great Is Our Sin from Charles Darwin's remark: "If the misery of the poor be caused not by the laws of nature, but by our institutions, great is our sin." Gould avers that the scientific study of human differences in mental ability is nothing but an apology for elitist European enslavement and oppression of the rest of the world -- so it was in the beginning, is now, and ever shall be, world without end, amen. This has become the Apostles' Creed of the Adversary Culture. (Do not blame criminals from poor backgrounds, they are but helpless victims of a wicked system; affirmative action and multiculturalism must be invoked to exorcise the demons of capitalist oppression, racism, and sexism). In Gould's (1996) benediction, he keeps the faith of "political correctness", while grudgingly confessing that many see it as "leftist fascism" (his words, p. 424).
In his preface, Gould describes his background and how it has affected his work. All his grandparents were Eastern European Jews whose entry into America, he believes, Goddard "would have so severely restricted" (p. 38). Thus the book is dedicated to "Grammy and Papa Joe, who came, struggled, and prospered, Mr. Goddard notwithstanding". Gould's father fought for the leftist International Brigade in the Spanish Civil War (p. 39). He himself actively campaigned against racial oppression in the U.S.A. and in England (p. 38). I for one admire Gould for having the candor to divulge this background. No doubt personal experience affects all scholarship (including mine). However, even the most deeply held values cannot justify withholding evidence, engaging in character assassination, and repeating unfounded charges despite published refutations.
No doubt we are all prisoners of our background as well as slaves to our genes, but facts remain facts. Brain size and IQ are correlated. Men do average larger and heavier brains than do women. Asians and Europeans do average larger and heavier brains than do Africans. Higher SES groups do average larger and heavier brains than do lower SES groups.
Perhaps more than any scientist in recent memory, Gould has wielded his influence not only as a professor of science at Harvard but also through the pages of the New York Review of Books and through broadcasts on educational television, to seriously and intentionally misrepresent the science and politics of IQ. By his own standard, Gould has consigned himself to the innermost circle of hell. But science, fortunately, is not religion or politics. Gould need only own up to the facts and end his career of relentless special pleading. The second edition of The Mismeasure of Man does not measure up to Gould's own standard of "honest assessment and best judgment of evidence for empirical truth".
J. PHILIPPE RUSHTON
Department of Psychology
University of Western Ontario
London, Ontario, Canada N6A 5C2