Population Differences In Intelligence
Chapter 12, "Population Differences in Intelligence: Causal Hypotheses," from
Arthur Jensen's book The g Factor: The Science of Mental Ability, published in
1998.
The relationship of the g factor to a number of biological variables and
its relationship to the size of the white-black differences on various
cognitive tests (i.e., Spearman's hypothesis) suggests that the average
white-black difference in g has a biological component. Human races are viewed
not as discrete, or Platonic, categories, but rather as breeding populations
that, as a result of natural selection, have come to differ statistically in
the relative frequencies of many polymorphic genes. The "genetic distances"
between various populations form a continuous variable that can be measured in
terms of differences in gene frequencies. Racial populations differ in many
genetic characteristics, some of which, such as brain size, have behavioral
and psychometric correlates, particularly g. What I term the default
hypothesis states that the causes of the phenotypic differences between
contemporary populations of recent African and European descent arise from the
same genetic and environmental factors, and in approximately the same
magnitudes, that account for individual differences within each population.
Thus genetic and environmental variances between groups and within groups are
viewed as essentially the same for both populations. The default hypothesis is
able to account for the present evidence on the mean white-black difference in
g. There is no need to invoke any ad hoc hypothesis, or a Factor X, that is
unique to either the black or the white population. The environmental
component of the average g difference between groups is primarily attributable
to a host of microenvironmental factors that have biological effects. They
result from non-genetic variation in prenatal, perinatal, and neonatal
conditions and specific nutritional factors.
The many studies of Spearman's hypothesis using the method of correlated
vectors show a strong relationship between the g loadings of a great variety
of cognitive tests and the mean black-white differences on those tests. The
fact that the same g vectors that are correlated with W-B differences are also
correlated (and to about the same degree) with vectors composed of various
cognitive tests' correlations with a number of genetic, anatomical, and
physiological variables suggests that certain biological factors may be
related to the average black-white population difference in the level of g.
The degree to which each of many different psychometric tests is
correlated with all of the other tests is directly related to the magnitude of
the test's g loading. What may seem surprising, however, is the fact that the
degree to which a given test is correlated with any one of the following
variables is a positive function of that test's g loading:
* Heritability of test scores.
* Amount of inbreeding depression of test scores.
* Heterosis (hybrid vigor, that is, raised test scores due to outbreeding).
* Head size (also, by inference, brain size).
* Average evoked potential (AEP) habituation and complexity.
* Glucose metabolic rate as measured by PET scan.
* Average reaction time to elementary cognitive tasks.
* Size of the mean W-B difference on various cognitive tests.
The one (and probably the only) common factor that links all of these
non-psychometric variables to psychometric test scores and also links
psychometric test scores to the magnitude of the mean W-B difference is the g
factor. The critical role of g in these relationships is shown by the fact
that the magnitude of a given test's correlation with any one of the
above-listed variables is correlated with the magnitude of the W-B difference
on that test. For example, Rushton reported a correlation (r = +.48)
between the magnitudes of the mean W-B differences (in the American
standardization sample) on eleven subtests of the WISC-R and the effect of
inbreeding depression on the eleven subtest scores of the Japanese version of
the WISC. Further, the subtests' g loadings in the Japanese data predicted the
American W-B differences on the WISC-R subtests with r = .69, striking
evidence of the g factor's robustness across different cultures. Similarly,
the magnitude of the mean W-B difference on each of seventeen diverse
psychometric tests was predicted (with r = .71, p < .01) by the tests'
correlations with head size (a composite measure of length, width, and
circumference).
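The method of correlated vectors described above amounts to computing a Pearson correlation between two vectors: the subtests' g loadings and the mean group differences on those same subtests. A minimal sketch follows; every number in it is invented for illustration and is not Jensen's or Rushton's data.

```python
# Illustrative sketch of the method of correlated vectors: correlate the
# vector of subtests' g loadings with the vector of mean group differences
# on those subtests. All numbers are invented, NOT data from the text.

def pearson_r(x, y):
    """Plain Pearson product-moment correlation between two vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# Hypothetical g loadings for six subtests, and hypothetical standardized
# mean differences (d) on the same six subtests.
g_loadings = [0.45, 0.55, 0.60, 0.70, 0.75, 0.85]
mean_diffs = [0.50, 0.62, 0.66, 0.80, 0.83, 0.95]

r = pearson_r(g_loadings, mean_diffs)
print(round(r, 3))  # a high r is what Spearman's hypothesis predicts
```

In practice the vectors would contain a test battery's actual g loadings and actual mean differences, and a high positive correlation between them would count as support for Spearman's hypothesis.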
This association of psychometric tests' g loadings, the tests' correlations
with genetic and other biological variables, and the mean W-B differences in
test scores cannot be dismissed as happenstance. The failure of theories of
group differences in IQ that are based exclusively on attitudinal, cultural,
and experiential factors to predict or explain such findings argues strongly
that biological factors, whether genetic or environmental in origin, must be
investigated. Before examining possible biological factors in racial
differences in mental abilities, however, we should be conceptually clear
about the biological meaning of the term "race."
THE MEANING OF RACE
Nowadays one often reads in the popular press (and in some anthropology
textbooks) that the concept of human races is a fiction (or, as one well-known
anthropologist termed it, a "dangerous myth"), that races do not exist in
reality, but are social constructions of politically and economically dominant
groups for the purpose of maintaining their own status and power in a society.
It naturally follows from this premise that, since races do not exist in any
real, or biological, sense, it is meaningless even to inquire about the
biological basis of any racial differences. I believe this line of argument
has five main sources, none of them scientific:
o Heaping scorn on the concept of race is deemed an effective way of
combating racism, here defined as the belief that individuals who visibly
differ in certain characteristics deemed "racial" can be ordered on a
dimension of "human worth" from inferior to superior, and that therefore
various civil and political rights, as well as social privileges, should be
granted or denied according to a person's supposed racial origin.
o Neo-Marxist philosophy (which still has exponents in the social sciences
and the popular media) demands that individual and group differences in
psychologically and socially significant traits be wholly the result of
economic inequality, class status, or the oppression of the working classes in
a capitalist society. It therefore excludes consideration of genetic or
biological factors (except those that are purely exogenous) from any part in
explaining behavioral differences among humans. It views the concept of race
as a social invention by those holding economic and political powers to
justify the division and oppression of unprivileged classes.
o The claim that the concept of race (not just the
misconceptions about it) is scientifically discredited is seen as a way to
advance more harmonious relations among the groups in our society that are
commonly perceived as "racially" different.
o The universal revulsion to the Holocaust, which grew out of the racist
doctrines of Hitler's Nazi regime, produced a reluctance on the part of
democratic societies to sanction any inquiry into biological aspects of race
in relation to any behavioral variables, least of all socially important ones.
o Frustration with age-old, wrong-headed popular conceptions of race
has led some experts in population genetics to abandon the concept instead of
attempting candidly to make the public aware of how the concept of race is
viewed by most present-day scientists.
Wrong Conceptions of Race. The root of most wrong conceptions of race is
the Platonic view of human races as distinct types, that is, discrete,
mutually exclusive categories. According to this view, any observed variation
among the members of a particular racial category merely represents individual
deviations from the archetype, or ideal type, for that "race." Since,
according to this Platonic view of race, every person can be assigned to one
or another racial category, it naturally follows that there is some definite
number of races, each with its unique set of distinctive physical
characteristics, such as skin color, hair texture, and facial features. The
traditional number has been three: Caucasoid, Mongoloid, and Negroid, in part
derived from the pre-Darwinian creationist view that "the races of mankind"
could be traced back to the three sons of Noah: Shem, Ham, and Japheth.
The Cause of Biological Variation. All that is known today about the
worldwide geographic distribution of differences in human physical
characteristics can be understood in terms of the synthesis of Darwinian
evolution and population genetics developed by R. A. Fisher, Sewall Wright,
Theodosius Dobzhansky, and Ernst Mayr. Races are defined in this context as
breeding populations that differ from one another in gene frequencies and that
vary in a number of intercorrelated visible features that are highly
heritable.
Racial differences are a product of the evolutionary process working on the
human genome, which consists of about 100,000 polymorphic genes (that is,
genes that contribute to genetic variation among members of a species) located
in the twenty-three pairs of chromosomes that exist in every cell of the human
body. The genes, each with its own locus (position) on a particular
chromosome, contain all of the chemical information needed to create an
organism. In addition to the polymorphic genes, there are also a great many
other genes that are not polymorphic (that is, are the same in all individuals
in the species) and hence do not contribute to the normal range of human
variation. Those genes that do produce variation are called polymorphic genes,
as they have two or more different forms called alleles, whose codes differ in
their genetic information. Different alleles, therefore, produce different
effects on the phenotypic characteristic determined by the gene at a
particular chromosomal locus. Genes that do not have different alleles (and
thus do not have variable phenotypic effects) are said to have gone to
fixation; that is, alternative alleles, if any, have long since been
eliminated by natural selection in the course of human or mammalian evolution.
The physiological functions served by most basic "housekeeping" genes are so
crucial for the organism's development and viability that almost any mutation
of them proves lethal to the individual who harbors it; hence only one form of
the gene is possessed by all members of a species. A great many such essential
genes are in fact shared by closely related species; the number of genes that
are common to different species is inversely related to the evolutionary
distance between them. For instance, the two living species closest to Homo
sapiens in evolutionary distance, chimpanzees and gorillas, have at least 97
percent of their genes (or total genetic code) in common with present-day
humans, scarcely less than chimps and gorillas have in common with each other.
This means that even the very small percentage of genes (<3 percent) that
differ between humans and the great apes is responsible for all the
conspicuous and profound phenotypic differences observed between apes and
humans. The genetic difference appears small only if viewed on the scale of
differences among all animal species.
A particular gene's genetic code is determined by the unique sequences of
four chemical bases of the DNA, arranged in the familiar double-helix
structure of the gene. A change in a gene's code (one base pair), however
slight, can produce a new or different allele that manifests a different
phenotypic effect. (Many such mutations, however, have no phenotypic effect
because of redundancy in the DNA.) Such changes in the DNA result from
spontaneous mutation. Though mutations occur at random, some gene loci have
much higher mutation rates than others, ranging for different loci from less
than one per million to perhaps more than 500 per million sex cells, not a
trivial number considering that each male ejaculation contains from 200 to 500
million sperm. While natural or spontaneous mutations have largely unknown
causes, aptly referred to as biological "noise," it has been shown
experimentally that mutations can result from radiation (X-rays, gamma rays,
cosmic rays, and ultraviolet radiation). Certain chemical substances are also
mutagenic.
The creation of new alleles by spontaneous mutation along with the
recombination of alleles in gametogenesis are essential conditions for the
evolution of all forms of life. A new allele with phenotypic effects that
decrease an individual's fitness in a given environment, compared to the
nonmutated allele that would normally occupy the same chromosomal locus, will
be passed on to fewer descendants and will eventually go to extinction. The
gene is driven out of existence, so to speak, by losing in the competition
with other alleles that afford greater fitness. Biological fitness (also known
as Darwinian fitness), as a technical term in evolutionary genetics, refers
only to an individual's reproductive success, often defined operationally as
the number of surviving fertile progeny of that individual. (A horse mated
with a donkey, for example, might produce many surviving offspring, but
because they are all sterile, the horse and donkey in this mating have a
fitness of zero.) The frequency of a particular gene in all of an individual's
relatives is termed the inclusive fitness of that gene. The inclusive fitness
of a gene is a measure of its effect on the survival and reproductive success
of both the individual bearing the gene and all of the individual's relatives
bearing the identical gene. Technically speaking, an individual's biological
fitness denotes nothing more than that individual's genetic contribution to
the next generation's gene pool relative to the average for the population.
The term does not necessarily imply any traits one may deem personally
desirable, such as vigor, physical strength, or a beautiful body, although
some such traits, to the extent that they are heritable, were undoubtedly
genetically selected in the course of evolution only because, we know in
retrospect, they enhanced individuals' reproductive success in succeeding
generations. The survival of any new allele and its rate of spreading through
subsequent generations is wholly a function of the degree to which its
phenotypic expression enhances the inclusive fitness of those who inherit the
allele. An allele with any advantageous phenotypic effect, in this respect,
spreads to an ever-larger part of the breeding population in each successive
generation.
New alleles created by mutation are subject to natural selection according
to the degree of fitness they confer in a particular environment. Changed
environmental conditions can alter the selection pressure for a certain
allele, depending on the nature of its phenotypic expression, thereby either
increasing or decreasing its frequency in a breeding population. Depending on
its fitness in a given environment, it may go to extinction in the population
or it may go to fixation (with every member of the population eventually
possessing the allele). Many polymorphic gene loci harbor one or another
allele of a balanced polymorphism, wherein two or more alleles with comparable
fitness values (in a particular environment) are maintained at equilibrium in
the population. Thus spontaneous genetic mutation and recombination, along
with differential selection of new alleles according to how their phenotypic
expression affects inclusive fitness, are crucial mechanisms of the whole
evolutionary process. The variation in all inherited human characteristics has
resulted from this process, in combination with random changes caused by
genetic drift and gene frequency changes caused by migration and intermarriage
patterns.
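The spread of an advantageous allele described above can be sketched with a simple one-locus haploid selection model. The selection coefficient s and the starting frequency below are illustrative assumptions, not estimates from the text.

```python
# One-locus haploid selection sketch: a mutant allele with relative fitness
# 1+s competes with the resident allele (fitness 1). The values of s and
# the starting frequency are illustrative assumptions.

def next_freq(p, s):
    """Allele frequency after one generation of selection."""
    return p * (1 + s) / (p * (1 + s) + (1 - p))

p = 0.01  # a new, initially rare mutant allele
for _ in range(500):
    p = next_freq(p, s=0.02)  # a 2% fitness advantage per generation
print(round(p, 3))  # after 500 generations the allele is near fixation
```

Even a modest fitness advantage compounds across generations, which is why the text can say that an advantageous allele "spreads to an ever-larger part of the breeding population in each successive generation."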
Races as Breeding Populations with Fuzzy Boundaries. Most anthropologists
and population geneticists today believe that the preponderance of evidence
from both the dating of fossils and the analysis of the geographic
distribution of many polymorphic genes in present-day indigenous populations
argues that genus Homo originated in Africa. Estimates are that our direct
distant hominid precursor split off from the great apes some four to six
million years ago. The consensus of human paleontologists (as of 1997) accepts
the following basic scenario of human evolution.
Australopithecus afarensis was a small (about 3'6"), rather ape-like
hominid that appears to have been ancestral to all later hominids. It was
bipedal, walking more or less upright, and had a cranial capacity of 380 to
520 cm3 (about the same as that of the chimpanzee, but relatively larger for
its overall body size). Branching from this species were at least two
lineages, one of which led to a new genus, Homo.
Homo also had several branches (species). Those that were precursors of
modern humans include Homo habilis, which lived about 2.5 to 1.5 million years
ago. It used tools and even made tools, and had a cranial capacity of 510 to
750 cm3 (about half the size of modern humans). Homo erectus lived about 1.5
to 0.3 million years ago and had a cranial capacity of 850 to 1100 cm3 (about
three-fourths the size of modern humans). The first hominid whose fossil
remains have been found outside Africa, Homo erectus, migrated as far as the
Middle East, Europe, and Western and Southeastern Asia. No Homo erectus
remains have been found in Northern Asia, whose cold climate probably was too
severe for their survival skills.
Homo sapiens branched off the Homo erectus line in Africa at least 100
thousand years ago. During a period from about seventy to ten thousand years
ago they spread from Africa to the Middle East, Europe, all of Asia,
Australia, and North and South America. To distinguish certain archaic
subspecies of Homo sapiens (e.g., Neanderthal man) that became extinct during
this period from their contemporaries who were anatomically modern humans, the
latter are now referred to as Homo sapiens sapiens (or Homo s. sapiens); it is
this line that branched off Homo erectus in Africa and spread to every
continent during the last 70,000 years. These prehistoric humans survived as
foragers living in small groups that frequently migrated in search of food.
GENETIC DISTANCE
As small populations of Homo s. sapiens separated and migrated further away
from Africa, genetic mutations kept occurring at a constant rate, as occurs in
all living creatures. Geographic separation and climatic differences, with
their different challenges to survival, provided an increasingly wider basis
for populations to become genetically differentiated through natural
selection. Genetic mutations that occurred after each geographic separation of
a population had taken place were differentially selected in each
subpopulation according to the fitness the mutant gene conferred in the
respective environments. A great many mutations and a lot of natural selection
and genetic drift occurred over the course of the five or six thousand
generations that humans were gradually spreading over the globe.
The extent of genetic difference, termed genetic distance, between
separated populations provides an approximate measure of the amount of time
since their separation and of the geographic distance between them. In
addition to time and distance, natural geographic hindrances to gene flow
(i.e., the interchange of genes between populations), such as mountain ranges,
rivers, seas, and deserts, also restrict gene flow between populations. Such
relatively isolated groups are termed breeding populations, because a much
higher frequency of mating occurs between individuals who belong to the same
population than occurs between individuals from different populations. (The
ratio of the frequencies of within/between population matings for two breeding
populations determines the degree of their genetic isolation from one
another.) Hence the combined effects of geographic separation [or cultural
separation], genetic mutation, genetic drift, and natural selection for
fitness in different environments result in population differences in the
frequencies of different alleles at many gene loci.
There are also other causes of relative genetic isolation resulting from
language differences as well as from certain social, cultural, or religious
sanctions against persons mating outside their own group. These restrictions
of gene flow may occur even among populations that occupy the same territory.
Over many generations these social forms of genetic isolation produce breeding
populations (including certain ethnic groups) that evince relatively slight
differences in allele frequencies from other groups living in the same
locality.
When two or more populations differ markedly in allele frequencies at a
great many gene loci whose phenotypic effects visibly distinguish them by a
particular configuration of physical features, these populations are called
subspecies. Virtually every living species on earth has two or more
subspecies. The human species is no exception, but in this case subspecies are
called races. Like all other subspecies, human races are interfertile breeding
populations whose individuals differ on average in distinguishable physical
characteristics.
Because all the distinguishable breeding populations of modern humans were
derived from the same evolutionary branch of the genus Homo, namely, Homo s.
sapiens, and because breeding populations have relatively permeable
(non-biological) boundaries that allow gene flow between them, human races can
be considered as genetic "fuzzy sets." That is to say, a race is one of a
number of statistically distinguishable groups in which individual membership
is not mutually exclusive by any single criterion, and individuals in a given
group differ only statistically from one another and from the group's central
tendency on each of the many imperfectly correlated genetic characteristics
that distinguish between groups as such. The important point is that the
average difference on all of these characteristics that differ among
individuals within the group is less than the average difference between the
groups on these genetic characteristics.
What is termed a cline results where groups overlap at their fuzzy
boundaries in some characteristic, with intermediate gradations of the
phenotypic characteristic, often making the classification of many individuals
ambiguous or even impossible, unless they are classified by some arbitrary
rule that ignores biology. The fact that there are intermediate gradations or
blends between racial groups, however, does not contradict the genetic and
statistical concept of race. The different colors of a rainbow do not consist
of discrete bands but are a perfect continuum, yet we readily distinguish
different regions of this continuum as blue, green, yellow, and red, and we
effectively classify many things according to these colors. The validity of
such distinctions, and of the categories based on them, obviously does not
require that they form perfectly discrete Platonic categories.
It must be emphasized that the biological breeding populations called races
can only be defined statistically, as populations that differ in the central
tendency (or mean) on a large number of different characteristics that are
under some degree of genetic control and that are correlated with each other
through descent from common ancestors who are relatively recent in the time
scale of evolution (i.e., those who lived about ten thousand years ago, at
which time all of the continents and most of the major islands of the world
were inhabited by relatively isolated breeding populations of Homo s.
sapiens).
Of course, any rule concerning the number of gene loci that must show
differences in allele frequencies (or any rule concerning the average size of
differences in frequency) between different breeding populations for them to
be considered races is necessarily arbitrary, because the distribution of
average absolute differences in allele frequencies in the world's total
population is a perfectly continuous variable. Therefore, the number of
different categories, or races, into which this continuum can be divided is,
in principle, wholly arbitrary, depending on the degree of genetic difference
a particular investigator chooses as the criterion for classification or the
degree of confidence one is willing to accept with respect to correctly
identifying the area of origin of one's ancestors.
Some scientists have embraced all of Homo sapiens in as few as two racial
categories, while others have claimed as many as seventy. These probably
represent the most extreme positions in the "lumper" and "splitter" spectrum.
Logically, we could go on splitting up groups of individuals on the basis of
their genetic differences until we reach each pair of monozygotic twins, which
are genetically identical. But as any pair of MZ twins are always of the same
sex, they of course cannot constitute a breeding population. (If
hypothetically they could, the average genetic correlation between all of the
offspring of any pair of MZ twins would be 2/3; the average genetic
correlation between the offspring of individuals paired at random in the total
population is 1/2; the offspring of various forms of genetic relatedness, such
as cousins [a preferred match in some parts of the world], falls somewhere
between 2/3 and 1/2.) However, as I will explain shortly, certain multivariate
statistical methods can provide objective criteria for deciding on the number
and composition of different racial groups that can be reliably determined by
the given genetic data or that may be useful for a particular scientific
purpose. But one other source of genetic variation between populations must
first be explained.
Genetic Drift. In addition to mutation, natural selection, and migration,
another means by which breeding populations may differ in allele frequencies is
through a purely stochastic (that is, random) process termed genetic drift.
Drift is most consequential during the formation of new populations when their
numbers are still quite small. Although drift occurs for all gene loci,
Mendelian characters (i.e., phenotypic traits), which are controlled by a
single gene locus, are more noticeably affected by drift than are polygenic
traits (i.e., those caused by many genes). The reason is purely statistical.
Changes in a population's allele frequencies attributable to genetic drift
can be distinguished from changes due to natural selection for two reasons:
(1) Many genes are neutral in the sense that their allele frequencies have
remained unaffected by natural selection, because they neither increase nor
decrease fitness; over time they move across the permeable boundaries of
different breeding populations. (2) When a small band of individuals emigrates
from the breeding population of origin to found a new breeding population, it
carries with it only a random sample of all of the alleles, including neutral
alleles, that existed in the entire original population. That is, the allele
frequencies at all gene loci in the migrating band will not exactly match the
allele frequencies in the original population. The band of emigrants, and of
course all its descendants (who may eventually form a large and stable
breeding population), therefore differs genetically from its parent population
as the result of a purely random process. This random process is called
founder effect. It applies to all gene loci. All during the time that genetic
drift was occurring, gene mutations steadily continued, and natural selection
continued to produce changes in allele frequencies at many loci. Thus the
combined effects of genetic drift, mutation, and natural selection ensure that
a good many alleles are maintained at different frequencies in various
relatively isolated breeding populations. This process did not happen all at
once and then cease. It is still going on, but it takes place too slowly to be
perceived in the short time span of a few generations.
It should be noted that the phenotypic differences between populations that
were due to genetic drift are considerably smaller than the differences in
those phenotypic characteristics that were strongly subject to natural
selection, especially those traits that reflect adaptations to markedly
different climatic conditions, such as darker skin color (thought to have
evolved as protection from the tropical sun's rays that can cause skin cancer
and to protect against folate decomposition by sunlight), light skin color (to
admit more of the ultraviolet rays needed for the skin's formation of vitamin
D in northern regions; also because clothing in northern latitudes made dark
skin irrelevant selectively and it was lost through random mutation and
drift), and globular versus elongated body shape and head shape (better to
conserve or dissipate body heat in cold or hot climates, respectively).
Since the genetic drift of neutral genes is a purely random process, and
given a fairly constant rate of drift, the differing allele frequencies of
many neutral genes in various contemporary populations can be used as a
genetic clock to determine the approximate time of their divergence. The same
method has been used to estimate the extent of genetic separation, termed
genetic distance, between populations.
Measurement and Analysis of Genetic Distance Between Groups. Modern genetic
technology makes it possible to measure the genetic distance between different
populations objectively with considerable precision, or statistical
reliability. This measurement is based on a large number of genetic
polymorphisms for what are thought to be relatively neutral genes, that is,
genes whose allele frequencies therefore differ across populations more
because of mutations and genetic drift than because of natural selection.
Population allele frequencies can be as low as zero or as high as 1.0 (as
there are certain alleles that have large frequencies in some populations but
are not found at all in other populations). Neutral genes are preferred in
this work because they provide a more stable and accurate evolutionary "clock"
than do genes whose phenotypic characters have been subjected to the kinds of
diverse external conditions that are the basis for natural selection. Although
neutral genes provide a more accurate estimate of populations' divergence
times, it should be noted that, by definition, they do not fully reflect the
magnitude of genetic differences between populations that are mainly
attributable to natural selection.
The technical rationale and formulas for calculating genetic distance are
fully explicated elsewhere. For present purposes, the genetic distance, D,
between two groups can be thought of here simply as the average difference in
allele frequencies between two populations, with D scaled to range from zero
(i.e., no allele differences) to one (i.e., differences in all alleles). One
can also think of D as the complement of the correlation coefficient r (i.e.,
D = 1 - r, and r = 1 - D). This conversion of D to r is especially useful, because
many of the same objective multivariate statistical methods that were
originally devised to analyze large correlation matrices (e.g., principal
components analysis, factor analysis, hierarchical cluster analysis,
multidimensional scaling) can also be used to analyze the total matrix of
genetic distances (after they are converted to correlations) between a large
number of populations with known allele frequencies based on some large number
of genes.
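As a rough sketch of this bookkeeping, the D-to-r conversion and the count of distinct pairwise distances can be written in a few lines. The matrix values below are hypothetical, not the Stanford study's data:

```python
import numpy as np

# Hypothetical genetic-distance matrix D among four populations
# (symmetric, zero diagonal); values are illustrative only.
D = np.array([
    [0.00, 0.10, 0.30, 0.32],
    [0.10, 0.00, 0.28, 0.30],
    [0.30, 0.28, 0.00, 0.12],
    [0.32, 0.30, 0.12, 0.00],
])

# Convert distances to correlations, as in the text: r = 1 - D.
R = 1.0 - D

# Number of distinct pairwise distances among n populations: n(n - 1)/2.
n = D.shape[0]
n_pairs = n * (n - 1) // 2
```

With n = 42 populations, the same formula gives the 861 D values reported in the text.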
The most comprehensive study of population differences in allele
frequencies to date is that of the Stanford University geneticist Luigi Luca
Cavalli-Sforza and his coworkers. Their recent 1,046-page book reporting the
detailed results of their study is a major contribution to the science of
population genetics. The main analysis was based on blood and tissue specimens
obtained from representative samples of forty-two populations, from every
continent (and the Pacific islands) in the world. All the individuals in these
samples were aboriginal or indigenous to the areas in which they were
sampled; their ancestors have lived in the same geographic area since no later
than 1492, a familiar date that generally marks the beginning of extensive
worldwide European explorations and the consequent major population movements.
In each of the Stanford study's population samples, the allele frequencies of
120 alleles at forty-nine gene loci were determined. Most of these genes
determine various blood groups, enzymes, and proteins involved in the immune
system, such as human lymphocyte antigens (HLA) and immunoglobulins. These
data were then used to calculate the genetic distance (D) between each group
and every other group. (DNA sequencing was also used in separate analyses of
some groups; it yields finer genetic discrimination between certain groups
than can the genetic polymorphisms used in the main analysis.) From the total
matrix of (42 X 41)/2 = 861 D values, Cavalli-Sforza et al. constructed a
genetic linkage tree. The D value between any two groups is represented
graphically by the total length of the line that connects the groups in the
branching tree. (See Figure 12.1.)
The greatest genetic distance, that is, the largest D, is between the five
African groups (listed at the top of Figure 12.1) and all the other groups.
The next largest D is between the Australian + New Guinean groups and the
remaining other groups; the next largest split is between the South Asians +
Pacific Islanders and all the remaining groups, and so on. The clusters at the
lowest level (i.e., at far right in Figure 12.1) can also be clustered to show
the D values between larger groupings, as in Figure 12.2. Note that these
clusters produce much the same picture as the traditional racial
classifications that were based on skeletal characteristics and the many
visible physical features by which non-specialists distinguish "races."
It is noteworthy, but perhaps not too surprising, that the grouping of
various human populations in terms of invisible genetic polymorphisms for many
relatively neutral genes yields results that are highly similar to the classic
methods of racial classification based on directly observable anatomical
features.
Another notable feature of the Stanford study is that the geographic
distances between the locations of the groups that are less than 5,000 miles
apart are highly correlated (r ~.95) with the respective genetic distances
between these groups. This argues that genetic distance provides a fairly good
measure of the rate of gene flow between populations that were in place before
A.D. 1492.
None of the 120 alleles used in this study has equal frequencies across all
of the forty-two populations. This attests to the ubiquity of genetic
variation among the world's populations and subpopulations.
All of the modern human population studies based on genetic analysis
(including analyses based on DNA markers and sequences) are in close agreement
in showing that the earliest, and by far the greatest, genetic divergence
within the human species is that between Africans and non-Africans (see
Figures 12.1 and 12.2).
Cavalli-Sforza et al. transformed the distance matrix to a correlation
matrix consisting of 861 correlation coefficients among the forty-two
populations, so they could apply principal components (PC) analysis to their
genetic data. (PC analysis is similar to factor analysis; the essential
distinction between them is explained in Chapter 3, Note 13.) PC analysis is a
wholly objective mathematical procedure. It requires no decisions or judgments
on anyone's part and yields identical results for everyone who does the
calculations correctly. (Nowadays the calculations are performed by a computer
program specifically designed for PC analysis.) The important point is that if
the various populations were fairly homogeneous in genetic composition,
differing no more genetically than could be attributable only to random
variation, a PC analysis would not be able to cluster the populations into a
number of groups according to their genetic propinquity. In fact, a PC
analysis shows that most of the forty-two populations fall very distinctly
into the quadrants formed by using the first and second principal components
as axes (see Figure 12.3). They form quite widely separated clusters of the
various populations that resemble the "classic" major racial groups: Caucasians
in the upper right, Negroids in the lower right, Northeast Asians in the upper
left, and Southeast Asians (including South Chinese) and Pacific Islanders in
the lower left. The first component (which accounts for 27 percent of the
total genetic variation) corresponds roughly to the geographic migration
distances (or therefore time since divergence) from sub-Saharan Africa,
reflecting to some extent the differences in allele frequencies that are due
to genetic drift. The second component (which accounts for 16 percent of the
variation) appears to separate the groups climatically, as the groups'
positions on PC2 are quite highly correlated with the degrees latitude of
their geographic locations. This suggests that not all of the genes used to
determine genetic distances are entirely neutral, but at least some of them
differ in allele frequencies to some extent because of natural selection for
different climatic conditions. I have tried other objective methods of
clustering on the same data (varimax rotation of the principal components,
common factor analysis, and hierarchical cluster analysis). All of these types
of analysis yield essentially the same picture and identify the same major
racial groupings.
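The clustering step can be sketched as a principal-components analysis of a distance-derived correlation matrix. The four-population matrix below is hypothetical, chosen only so that two pairs of populations are genetically close:

```python
import numpy as np

# Hypothetical correlation matrix R = 1 - D among four populations;
# populations (0, 1) form one close pair and (2, 3) another.
R = np.array([
    [1.00, 0.90, 0.70, 0.68],
    [0.90, 1.00, 0.72, 0.70],
    [0.70, 0.72, 1.00, 0.88],
    [0.68, 0.70, 0.88, 1.00],
])

# Principal components: eigendecomposition of the correlation matrix.
eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
order = np.argsort(eigvals)[::-1]          # reorder largest-first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Percentage of total variation carried by each component.
pct = 100.0 * eigvals / eigvals.sum()

# Coordinates of each population on PC1 and PC2; genetically close
# populations land near each other in this plane, as in Figure 12.3.
coords = eigvecs[:, :2] * np.sqrt(np.abs(eigvals[:2]))
```

Here PC1 is a general component on which all populations load in the same direction, while PC2 separates the two clusters, analogous to the quadrant structure described in the text.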
African-Americans. The first Africans arrived in North America in 1619, and
for more than two centuries thereafter, mostly between 1700 and 1800, Africans
were brought to America as slaves. The end of this involuntary migration came
between 1863 and 1865, with the Emancipation Proclamation. Nearly all of the
Africans who were enslaved came from
sub-Saharan West Africa, specifically the coastal region from Senegal to
Angola. The populations in this area are often called West African or North
West and Central West Bantu.
Steadily over time, the real but relatively low frequency of cross-mating
between blacks and whites produced an infusion of Caucasoid genes into the
black gene pool. As a result, the present-day population of black Americans is
genetically different from the African populations from whom they descended.
Virtually 100 percent of contemporary black Americans have some Caucasian
ancestry. Most of the Caucasian genes in the present-day gene pool of black
Americans entered the black gene pool during the period of slavery.
Estimates of the proportion of Caucasoid genes in American blacks are based
on a number of genetic polymorphisms that have fairly high allele frequencies
in the European population but zero or near-zero frequencies in the West
African population, or vice versa. For any given allele, the estimated
proportion (M) of white European ancestry in American blacks is obtained by
the formula M = (qB - qAf) / (qW - qAf), where qB is the given allele's
frequency in the black American population, qAf is its frequency in the
African population, and qW is its frequency in the white European population.
The average value of M is taken over the twenty or so genes with alleles that
are unique either to Africans or to Europeans. The largest studies, which yield estimates with
the greatest precision, give mean values of M close to 25 percent, with a
standard error of about 3 percent. This is probably the best estimate for the
African-American population overall. However, M varies across different
regions of the United States, being as low as 4 percent to 10 percent in some
southeastern States and spreading out in a fan-shaped gradient toward the
north and the west to reach over 40 percent in some northeastern and
northwestern states. Among the most typical and precise estimates of M are
those for Oakland, California (22.0 percent) and Pittsburgh, Pennsylvania
(25.2 percent). This regional variation in M reflects the pattern of selective
migration of blacks from the Deep South since the mid-nineteenth century. Gene
flow, of course, goes in both directions. In every generation there has been a
small percentage of persons who have some African ancestry but whose ancestry
is predominantly Caucasian and who permanently "pass as white." The white
American gene pool therefore contains some genes that can be traced to
Africans who were brought over as slaves (estimated by analyses of genetic
polymorphisms to be less than 1 percent).
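A minimal sketch of the admixture estimate M; the function name and all allele frequencies below are made-up illustrations, not values from the studies cited:

```python
def admixture_proportion(q_black, q_african, q_white):
    """Estimate the proportion M of European ancestry from one allele's
    frequencies: M = (qB - qAf) / (qW - qAf)."""
    return (q_black - q_african) / (q_white - q_african)

# One hypothetical marker: absent in the West African parent population,
# common in Europeans.
m = admixture_proportion(q_black=0.10, q_african=0.0, q_white=0.40)

# In practice M is averaged over ~20 such markers; frequencies are
# invented here purely for illustration ((qB, qAf, qW) triples).
markers = [(0.10, 0.0, 0.40), (0.12, 0.02, 0.45), (0.16, 0.22, 0.0)]
mean_m = sum(admixture_proportion(b, a, w) for b, a, w in markers) / len(markers)
```

Note that the formula works in either direction: the third marker above is one that is common in Africans but absent in Europeans, and the same expression still yields a valid estimate.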
Genetic Distance and Population Differences in g. The preceding discourse
on the genetics of populations is germane to any discussion of population
differences in g. The differences in gene frequencies that originally created
different breeding populations largely explain the physical phenotypic
differences observed between populations called races. Most of these
differences in visible phenotypic characteristics are the result of natural
selection working over the course of human evolution. Selection changes gene
frequencies in a population by acting directly on any genetically based
phenotypic variation that affects Darwinian fitness for a given environment.
This applies not only to physical characteristics, but also to behavioral
capacities, which are necessarily to some degree a function of underlying
physical structures. Structure and function are intimately related, as their
evolutionary origins are inseparable.
The behavioral capacities or traits that demonstrate genetic variation can
also be viewed from an evolutionary perspective. Given the variation in allele
frequencies between populations for virtually every known polymorphic gene, it
is exceedingly improbable that populations do not differ in the alleles that
affect the structural and functional basis of heritable behavioral traits. The
empirical generalization that every polygenic physical characteristic that
shows differences between individuals also shows mean differences between
populations applies to behavioral as well as physical characteristics. Given
the relative genetic distances between the major racial populations, one might
expect some behavioral differences between Asians and Europeans to be of
lesser magnitude than those between these groups and sub-Saharan Africans.
The behavioral, psychological, or mental characteristics that show the
highest g loadings are the most heritable and have the most biological
correlates (see Chapter 6) and are therefore the most likely to show genetic
population differences. Because of the relative genetic distances, they are
also the most likely to show such differences between Africans (including
predominantly African descendants) and Caucasians or Asians.
Of the approximately 100,000 human polymorphic genes, about 50,000 are
functional in the brain and about 30,000 are unique to brain functions. The
brain is by far the structurally and functionally most complex organ in the
human body and the greater part of this complexity resides in the neural
structures of the cerebral hemispheres, which, in humans, are much larger
relative to total brain size than in any other species. A general principle of
neural organization states that, within a given species, the size and
complexity of a structure reflect the behavioral importance of that structure.
The reason, again, is that structure and function have evolved conjointly as
an integrated adaptive mechanism. But as there are only some 50,000 genes
involved in the brain's development and there are at least 200 billion neurons
and trillions of synaptic connections in the brain, it is clear that any
single gene must influence some huge number of neurons-not just any neurons
selected at random, but complex systems of neurons organized to serve special
functions related to behavioral capacities.
It is extremely improbable that the evolution of racial differences since
the advent of Homo sapiens left untouched precisely those 50,000 genes that
are involved with the brain.
Brain size has increased almost threefold during the course of human
evolution, from about 500 cm3 in the australopithecines to about 1,350 cm3
(the present estimated worldwide average) in Homo sapiens. Nearly all of this
increase in brain volume has occurred in connection with those parts of the
cerebral hemispheres associated with cognitive processes, particularly the
prefrontal lobes and the posterior association areas, which control foresight,
planning, goal-directed behavior, and the integration of sensory information
required for higher levels of information processing. The parts of the brain
involved in vegetative and sensorimotor functions per se differ much less in
size, relative to total brain size, even between humans and chimpanzees than
do the parts of the brain that subserve cognitive functions. Moreover, most of
the evolutionary increase in brain volume has resulted not from a uniform
increase in the total number of cortical neurons per se, but from a much
greater increase in the number and complexity of the interconnections between
neurons, making possible a higher level of interneuronal communication on
which complex information processing depends. Although the human brain is
three times larger than the chimpanzee brain, it has only 1.25 times as many
neurons; the much greater difference is in their degree of arborization, that
is, their number of synapses and interconnecting branches.
No other organ system has evolved as rapidly as the brain of Homo sapiens,
a species that is unprecedented in this respect. Although in hominid evolution
there was also an increase in general body size, it was not nearly as great as
the increase in brain size. In humans, the correlation between individual
differences in brain size and in stature is only about +.20. One minus the
square of this correlation (1 - .04 = .96) reflects the proportion of the
total variance in brain size that cannot be accounted for by variation in
overall body size. Much of this residual variance in brain size
presumably involves cognitive functions.
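The residual-variance arithmetic in the preceding paragraph can be checked directly:

```python
# With a brain-size x stature correlation of about +.20, the share of
# brain-size variance unexplained by body size is 1 - r^2.
r = 0.20
residual = 1 - r ** 2  # .96, i.e., 96 percent of the variance
```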
Bear in mind that, from the standpoint of natural selection, a larger brain
size (and its corresponding larger head size) is in many ways decidedly
disadvantageous. A large brain is metabolically very expensive, requiring a
high-calorie diet. Though the human brain is less than 2 percent of total body
weight, it accounts for some 20 percent of the body's basal metabolic rate
(BMR). In other primates, the brain accounts for about 10 percent of the BMR,
and for most carnivores, less than 5 percent. A larger head also greatly
increases the difficulty of giving birth and incurs much greater risk of
perinatal trauma or even fetal death, which are much more frequent in humans
than in any other animal species. A larger head also puts a greater strain on
the skeletal and muscular support. Further, it increases the chances of being
fatally hit by an enemy's club or missile. Despite such disadvantages of
larger head size, the human brain, in fact, evolved markedly in size, with its
cortical layer accommodating to a relatively lesser increase in head size by
becoming highly convoluted in the endocranial vault. In the evolution of the
brain, the effects of natural selection had to have reflected the net
selective pressures that made an increase in brain size disadvantageous versus
those that were advantageous. The advantages obviously outweighed the
disadvantages to some degree or the increase in hominid brain size would not
have occurred.
The only conceivable advantage to an increase in the size and complexity of
the brain is the greater behavioral capacity this would confer. This would
include: the integration of sensory information, fine hand-eye coordination,
quickness of responding or voluntary response inhibition and delayed reaction
depending on the circumstances, perceiving functional relationships between
two things when only one or neither is physically present, connecting past
and future events, learning from experience, generalization, far transfer of
learning, imagery, intentionality and planning, short-term and long-term
memory capacity, mentally manipulating objects without need to handle them
physically, foresight, problem solving, use of denotative language in vocal
communication, as well as all of the information processes that are inferred
from performance on what were referred to in Chapter 8 as "elementary
cognitive tasks." These basic information processes are involved in coping
with the natural exigencies and the contingencies of humans' environment. An
increase in these capabilities and their functional efficiency are, in fact,
associated with allometric differences in brain size between various species
of animals, those with greater brain volume in relation to their overall body
size generally displaying more of the kinds of capabilities listed above. The
functional efficiency of the various behavioral capabilities that are common
to all members of a given species can be enhanced differentially by natural
selection, in the same way (though probably not to the same degree) that
artificial selection has made dogs of various breeds differ in propensities
and trainability for specific types of behavior.
What kinds of environmental pressures encountered by Homo erectus and early
Homo sapiens would have selected for increased size and complexity of the
brain? Evolutionists have proposed several plausible scenarios. Generally, a
more complex brain would be advantageous in hunting skill, cooperative social
interaction, and the development of tool use, followed by the higher-order
skill of using tools to make other tools, a capacity possessed by no
contemporary species other than Homo sapiens.
The environmental forces that contributed to the differentiation of major
populations and their gene pools through natural selection were mainly
climatic, but parasite avoidance and resistance were also instrumental. Homo
sapiens evolved in Africa from earlier species of Homo that originated there.
In migrating from Africa and into Europe and Asia, they encountered highly
diverse climates. These migrants, like their parent population that remained
in sub-Saharan Africa, were foragers, but they had to forage for sustenance
under the highly different conditions of their climatically diverse habitats.
Foraging was possible all during the year in the tropical and subtropical
climates of equatorial regions, while in the more northern climate of Eurasia
the abundance of food that could be obtained by hunting and gathering greatly
fluctuated with the seasons. This necessitated the development of more
sophisticated techniques for hunting large game, requiring vocal communication
and cooperative efforts (e.g., by ambushing, trapping, or corralling), along
with foresight in planning ahead for the preservation, storage, and rationing
of food in order to survive the severe winter months when foraging is
practically impossible. Extreme seasonal changes and the cold climate of the
northern regions (now inhabited by Mongoloids and Caucasians) also demanded
the ingenuity and skills for constructing more permanent and sturdy dwellings
and designing substantial clothing to protect against the elements. Whatever
bodily and behavioral adaptive differences between populations were wrought by
the contrasting conditions of the hot climate of sub-Saharan Africa and the
cold seasons of northern Europe and northeast Asia would have been markedly
intensified by the last glaciation, which occurred approximately 30,000 to
10,000 years ago, after Homo sapiens had inhabited most of the globe. During
this long period of time, large regions of the Northern Hemisphere were
covered by ice and the north Eurasian winters were far more severe than they
have been at any time in the past 10,000 years.
It seems most plausible, therefore, that behavioral adaptations of a kind
that could be described as complex mental abilities were more crucial for
survival of the populations that migrated to the northern Eurasian regions,
and were therefore under greater selection pressure as fitness characters,
than in the populations that remained in tropical or subtropical regions.
Climate has apparently also influenced the evolution of brain size
indirectly, through its direct effect on head size, particularly the shape of
the skull. Head size and shape are more closely related to climate than is the body as
a whole. Because the human brain metabolizes 20 percent of the body's total
energy supply, it generates more heat in relation to its size than any other
organ. The resting rate of energy output of the average European adult male's
brain is equal to about three-fourths that of a 100-watt light bulb. Because
temperature changes of only four to five degrees Celsius seriously disrupt
the brain's normal functioning, the brain must conserve heat (in a cold
environment) or dissipate heat (in a hot environment). Simply in terms of
solid geometry, a sphere contains a larger volume (or cubic capacity) for its
total surface area than does any other shape. Conversely, a given volume can
be enclosed by a sphere with a smaller surface area than by any non-spherical
shape (an elongated oval shape, for instance). Since heat radiation takes
place at the surface, a more spherical shape radiates less heat, and so
conserves more heat, for a given volume, whereas a less spherical shape loses
more heat by radiation. Applying these geometric
principles to head size and shape, one would predict that natural selection
would favor a smaller head with a less spherical (dolichocephalic) shape
because of its better heat dissipation in hot climates, and would favor a more
spherical (brachycephalic) head to accommodate a larger volume of brain matter
with a smaller surface area because of its better heat conservation in cold
climates. (The dolichocephalic-brachycephalic dimension is related to the
head's width:length ratio, known as the cephalic index.) In brief, a smaller,
dolichocephalic cranium is advantageous for thermoregulation of the brain in a
hot climate, whereas a larger, brachycephalic cranium is advantageous in a
cold climate. In the world's populations, head breadth is correlated about +.8
with cranial capacity; head length is correlated about +.4.
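The geometric argument can be illustrated numerically by comparing a sphere with a prolate (elongated) spheroid of the same volume; the 1.4 elongation factor below is an arbitrary illustrative choice:

```python
import math

def sphere_area(volume):
    """Surface area of a sphere enclosing the given volume."""
    r = (3 * volume / (4 * math.pi)) ** (1 / 3)
    return 4 * math.pi * r * r

def prolate_spheroid_area(volume, elongation):
    """Surface area of a prolate spheroid (long axis = elongation x
    short axis, elongation > 1) enclosing the same volume."""
    # Volume = (4/3) * pi * a * b^2, with a = elongation * b.
    b = (3 * volume / (4 * math.pi * elongation)) ** (1 / 3)
    a = elongation * b
    e = math.sqrt(1 - (b * b) / (a * a))  # eccentricity
    # Standard closed form for the prolate spheroid's surface area.
    return 2 * math.pi * b * b * (1 + (a / (b * e)) * math.asin(e))

v = 1350.0  # cm^3, roughly the average human cranial capacity
round_area = sphere_area(v)
long_area = prolate_spheroid_area(v, elongation=1.4)
# The elongated shape exposes more surface per unit volume, so it
# radiates heat faster -- the geometric point made in the text.
```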
Evidence that the average endocranial volume of various populations is
related to cranial shape and that both phenomena are, in some part,
adaptations to climatic conditions in different regions has been shown by
physical anthropologist Kenneth Beals and his co-workers. They amassed
measurements of endocranial volume in modern humans from some 20,000
individual crania collected from every continent, representing 122 ethnically
distinguishable populations. They found that the mean cranial capacity for
populations in hot climates is 1,297 ± 10.5 cm3, while for populations in cold
and temperate climates it is 1,386 ± 6.7 cm3, a highly significant (p < 10^-4)
difference of 89 cm3. Beals also plotted a correlation scatter diagram of the
mean cranial capacity in cm3 of each of 122 global populations as a function
of their distance from the equator (in absolute degrees north or south
latitude). The Pearson correlation between absolute distance from the equator
and cranial capacity was r = +.62 (p < 10^-5). (The regression equation is:
cranial capacity = 2.5 cm3 X {degrees latitude} + 1257.3 cm3; that is, an
average increase of 2.5 cm3 in cranial capacity for every 1 degree increase in
latitude.) The same analysis applied to populations of the African-Eurasian
landmass showed a cranial capacity X latitude correlation of +.76 (p < 10^-4)
and a regression slope of 3.1 cm3 increase in cranial capacity per every 1
degree of absolute latitude in distance from the equator. The indigenous
populations of the North and South American continents show a correlation of +.44
and a regression slope of 1.5; the relationship of cranial capacity to
latitude is less pronounced in the New World than in the Old World, probably
because Homo sapiens inhabited the New World much more recently, having
migrated from Asia to North America only about 15,000 years ago, while Homo
sapiens have inhabited the African and Eurasian continents for a much longer
period.
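The global regression reported by Beals et al. can be applied directly; the function below simply encodes the stated slope and intercept (the Old World comparison uses only the stated slope, since no intercept is given for it):

```python
def predicted_cranial_capacity(abs_latitude, slope=2.5, intercept=1257.3):
    """Beals et al.'s global regression: mean cranial capacity (cm^3)
    as a function of absolute degrees latitude from the equator."""
    return slope * abs_latitude + intercept

equator = predicted_cranial_capacity(0)    # 1257.3 cm^3 at the equator
at_50 = predicted_cranial_capacity(50)     # 125 cm^3 more at 50 degrees

# Old World (African-Eurasian) regression has a steeper stated slope of
# 3.1 cm^3 per degree; only the gain over 50 degrees is computed here.
old_world_gain = 3.1 * 50
```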
RACIAL DIFFERENCES IN HEAD/BRAIN SIZE
Are the climatic factors associated with population differences in cranial
capacity, as summarized in the preceding section, reflected in the average
cranial or brain-size measurements of the three broadest contemporary
population groups, generally termed Caucasoid (Europeans and their
descendants), Negroid (Africans and descendants), and Mongoloid (Northeast
Asians and descendants)? A recent comprehensive review summarized the
worldwide literature on brain volume in cm3 as determined from four kinds of
measurements: (a) direct measurement of the brain obtained by autopsy, (b)
direct measurement of endocranial volume of the skull, (c) cranial capacity
estimated from external head measurements, and (d) cranial capacity estimated
from head measurements and corrected for body size. The aggregation of data
obtained by different methods, based on large samples, from a number of
studies tends to average-out the sampling error and method effects and
provides the best overall estimates of the racial group means in head/brain
size measurements. The results of this aggregation are shown in Table 12.1.
Probably the technically most precise data on brain size for American
whites and blacks were obtained from a study of autopsied brains by a team of
experts at the Case-Western Reserve University's Medical School in Cleveland,
Ohio. It measured the autopsied brains of 811 whites and 450 blacks matched
for mean age (sixty years). Subjects with any brain pathology were excluded
from the study. The same methods were used to remove, preserve, and weigh the
brains for all subjects. The results for each race X sex group are shown in
Table 12.2. As the total sample (N = 1,261) ranged in age from 25 to 80 years,
with a mean of 60 years in both racial groups, it was possible to estimate (by
regression) the mean brain weight for each race X sex group at age 25 based on
all of the data for each group (shown in the last column of Table 12.2). For
the mean height-adjusted brain weight, the W-B difference in standard
deviation units is 0.76s for males, 0.78s for females. (The actual
height-adjusted W-B differences are 102 g for males and 95 g for females.)
Neurologically, a difference of 100 g in brain weight corresponds to
approximately 550 million cortical neurons. But this average estimate ignores
any sex differences in brain size and density of cortical neurons.
Note that for each racial group the sexes differ in brain weight by about
130 g, which is about 30 g more than the average racial difference. This
presents a paradox, because while brain size is correlated with IQ, there is
little or no sex difference in IQ (even the largest IQ differences that have
been claimed by anyone are much smaller than would be predicted by the sex
difference in brain size). Attempts to explain this paradox amount to
plausible speculations. One thing seems certain: Because of the small
correlation (about .20) between brain size and body size, the sex difference
in brain volume and weight can be only partially accounted for by the
regression of brain size on body size. The resolution of this paradox may come
from the evidence that females have a higher density of neurons in the
posterior temporal cortex, which is the major association area and is involved
in higher thought processes. Females have 11 percent more neurons per unit
volume than do males, which, if true for the brain as a whole, would more than
offset the 10 percent male-female difference in overall brain volume. This sex
difference in neuronal packing density is considered a true sexual dimorphism,
as are the sex differences in overall body size, skeletal form, the proportion
and distribution of body fat, and other secondary sexual characteristics.
Sexual dimorphism is seen throughout the animal kingdom and in many species is
far more extreme than in Homo sapiens. I have not found any investigation of
racial differences in neuron density that, as in the case of sex differences,
would offset the racial difference in brain weight or volume. Until doubts on
this point are empirically resolved, however, interpretations of the
behavioral significance of the racial difference in brain size remain
tentative. One indication that the race difference in brain weight is not of
the same nature as the sex difference is that the allometric ratio of brain
weight (in g) to body weight (in kg) is less similar between the racial groups
than between the sexes within each racial group.
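The offset argument in the preceding paragraph reduces to a product of two factors; the normalized values below are illustrative:

```python
# Take male brain volume and neuron density as 1.0 (arbitrary units).
male_volume, male_density = 1.00, 1.00
female_volume = 0.90    # ~10 percent smaller overall volume
female_density = 1.11   # ~11 percent more neurons per unit volume

male_neurons = male_volume * male_density        # 1.000
female_neurons = female_volume * female_density  # 0.90 * 1.11 = 0.999

# The two effects very nearly cancel: total neuron counts come out
# essentially equal, which is the sense in which the higher packing
# density would offset the volume difference.
```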
Also, we must take into account the fact that, on average, about 30 percent
of total adult female body weight is fat, as compared to 15 percent for males.
Because body fat is much less innervated than muscle tissue, brain size is
more highly correlated with fat-free body weight than with total body weight.
Statistically controlling for fat-free body weight (instead of total body
weight) has been found to reduce the sex difference in head circumference by
about 77 percent, or about three times as much as controlling for total body
weight. Because head circumference is an imperfect proxy for brain size, the
percentage reduction of the sex difference in directly measured brain volume
(or weight) that would be achieved by controlling for fat-free weight will be
uncertain until such studies are performed. Measuring fat-free body weight
should become routine in the conduct of brain-size studies based on autopsied
brains or on in vivo brain measurements obtained by imaging techniques.
The white-black difference in head/brain size is significant in neonates
(about 0.4s difference in head circumference) and within each racial group
head size at birth is correlated (about +.13) with IQ at age seven years, when
the average within-groups correlation with IQ is +.21. A retrospective study
of two groups of seven-year-old children, those with IQ < 80 and those with
IQ > 120, found that the two groups had differed by 0.5s in head circumference
measured at one year of age. Also, small head size measured at eight months has been found
to interact most unfavorably with birth weight; infants with very low birth
weight who had subnormal head size at eight months had an average IQ about
nine points (0.6s) lower at school age than did infants of comparable birth
weight but with normal head size (corrected for prematurity).
I have not found an estimate of the heritability of directly measured brain
size. However, the heritability, h2, of cranial capacity (estimated by formula
from head length, width, and circumference) based on Falconer's formula
[h2 = 2(rMZ - rDZ)] applied to 107 MZ twin pairs and 129 DZ twin pairs ranged widely
for different race X sex subgroups, for a within-subgroup average of .19. When
the estimates of cranial capacity were adjusted for age, stature, and weight,
the h2 values averaged 0.53. The narrow h2 (i.e., the proportion of the total
variance attributable only to additive genetic effects) of various head
measurements determined in a Caucasoid sample (Bulgarians) by the midparent X
offspring correlation (all offspring over fifteen years of age) were: length
.37, height .33, breadth .46, circumference .52. All of these estimates of the
heritability of cranial size indicate a considerable amount of nongenetic (or
environmental) variance, at least as much as for IQ. Moreover, much more of
the nongenetic variance is within-families (i.e., unshared among siblings
reared together) than is between-families (shared) variance. This implies that
shared environmental effects, such as those associated with parents'
education, occupation, and general socioeconomic level, are not the major
source of variance in cranial capacity as estimated from head measurements.
Also, what little evidence we have suggests that the total environmental
variance in head measurements is greater for blacks than for whites. (The
nature of these environmental influences is discussed later in this chapter.)
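The twin-based heritability estimates described above rest on Falconer's formula. A minimal sketch in Python; the twin correlations used below are hypothetical values chosen only to illustrate the arithmetic, not Jensen's data:

```python
def falconer_h2(r_mz, r_dz):
    """Falconer's estimate of broad heritability from twin correlations:
    h2 = 2 * (rMZ - rDZ).  Subtracting the DZ from the MZ correlation
    removes the shared-environment component, leaving half of the additive
    genetic variance; doubling rescales it to the full heritability."""
    return 2.0 * (r_mz - r_dz)

# Hypothetical twin correlations (illustrative only):
print(falconer_h2(0.86, 0.60))   # ~0.52
```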
Implications of Brain Size for IQ Differences. Chapter 6 reviewed the major
evidence showing that head measurements and brain size itself are
significantly correlated with IQ. The only available correlations for blacks
are based on head length, width, and circumference (and cranial capacity
estimated by formula from these measurements); as yet there are no reported
correlations between IQ and directly measured brain size for blacks. However,
the head measurements are significantly correlated with IQ for age-matched
whites and blacks, both on raw measurements and on measurements corrected for
height and weight, although the correlations are somewhat lower in blacks.
Longitudinal data show that the head circumference X IQ correlation
significantly increases between ages 4 and 7, and cross-sectional data
indicate that the correlation gradually increases up to 15 years of age, by
which time the average growth curves for head size and brain size have reached
asymptote.
It is especially important to note that for both racial groups the head
size X IQ correlation exists within-families as well as between-families,
indicating an intrinsic, or functional, relationship, as explained in Chapter
6. Equally important is the fact that within each sex, whites and blacks share
precisely one and the same regression line for the regression of head size on
IQ. When blacks and whites are perfectly matched for true-score IQ (i.e., IQ
corrected for measurement error), either at the black mean or at the white
mean, the overall average W-B difference in head circumference is virtually
nil, as shown in Table 12.3.
Taken together, these findings suggest that head size is related to IQ in
the same way for both blacks and whites. Although matching blacks and whites
for IQ virtually eliminates the average difference in head size, matching the
groups on head size does not equalize their IQs. This is what we in fact
should expect if brain size is only one of a number of brain factors involved
in IQ. When matched on IQ, the groups are thereby also equal on at least one
of these brain factors, in this case, size. But when black and white groups
are matched on head or brain size, they still differ in IQ, though to a lesser
degree than in unmatched or representative samples of each population.
The black-white difference in head/brain size is also related to Spearman's
hypothesis. A study in which head measurements were correlated (within racial
groups) with each of seventeen diverse psychometric tests showed that the
column vector of seventeen correlations was rank-order correlated + .64 (p <
.01) with the corresponding vector composed of each test's g loading (within
groups). In other words, a test's g loading significantly predicts the degree
to which that test is correlated with head/brain size. We would also predict
from Spearman's hypothesis that the degree to which each test was correlated
with the head measurements should correlate with the magnitude of the W-B
difference on each test. In fact, the column vector of test X head-size
correlations and the vector of standardized mean W-B differences on each of
the tests correlate + .51 (p < .05).
From the available empirical evidence, we can roughly estimate the fraction
of the mean IQ difference between the black and white populations that could
be attributed to the average difference in brain size. As noted in Chapter 6,
direct measurements of in vivo brain size obtained by magnetic resonance
imaging (MRI) show an average correlation with IQ of about + .40 in several
studies based on white samples. Given the reasonable assumption that this
correlation is the same for blacks, statistical regression would predict that
an IQ difference equivalent to 1s would be reduced by 0.4s, leaving a
difference of only 0.6s, for black and white groups matched on brain size.
This is a sizable effect. As the best estimate of the W-B mean IQ difference
in the population is equivalent to 1.1s or 16 IQ points, then 0.40 X 16 = 6 IQ
points of the black-white IQ difference would be accounted for by differences
in brain size. (Slightly more than 0.4s would predictably be accounted for if
a hypothetically pure measure of g could be used.) Only MRI studies of brain
size in representative samples of each population will allow us to improve
this estimate.
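The regression arithmetic in this passage can be laid out explicitly. A sketch under the text's stated assumptions (a brain-size X IQ correlation of +.40 and a 1.1s, or 16-point, mean difference):

```python
# Assumptions taken from the text, not independently estimated here:
r_brain_iq = 0.40    # average MRI brain-size x IQ correlation (white samples)
gap_sd = 1.0         # a 1-sigma IQ gap, for the regression illustration
gap_points = 16      # the 1.1s W-B difference expressed in IQ points

# Regression toward the mean: matching on brain size is predicted to reduce
# a 1-sigma gap by r * 1 sigma, leaving 0.6 sigma unaccounted for.
residual_sd = gap_sd - r_brain_iq * gap_sd
accounted_points = r_brain_iq * gap_points   # 0.40 * 16 = 6.4, ~6 IQ points
print(residual_sd, accounted_points)
```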
Other evidence of a systematic relationship between racial differences in
cranial capacity and IQ comes from an "ecological" correlation, which is
commonly used in epidemiological research. It is simply the Pearson r between
the means of three or more defined groups, which disregards individual
variation within the groups. Referring back to Table 12.1, I have plotted the
median IQ of each of the three populations as a function of the overall mean
cranial capacity of each population. The median IQ is the median value of all
of the mean values of IQ reported in the world literature for Mongoloid,
Caucasoid, and Negroid populations. (The source of the cranial capacity means
for each group was explained in connection with Table 12.1.) The result of
this plot is shown in Figure 12.4. The regression of median IQ on mean cranial
capacity is almost perfectly linear, with a Pearson r = +.998. Unless the data
points in Figure 12.4 are themselves highly questionable, the near-perfect
linearity of the regression indicates that IQ can be regarded as a true
interval scale. No mathematical transformation of the IQ scale would have
yielded a higher correlation. Thus it appears that the central tendency of IQ
for different populations is quite accurately predicted by the central
tendency of each population's cranial capacity.
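An ecological correlation of this kind is simply a Pearson r computed over group means, one data point per group. A sketch in Python; the three data points below are hypothetical placeholders for illustration, not the Table 12.1 values:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation; with one (x, y) point per group this is an
    'ecological' correlation that ignores within-group variation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical group means, for illustration only:
mean_cranial_cc = [1360, 1345, 1270]   # mean cranial capacity per group (cm^3)
median_iq = [106, 100, 85]             # median of reported mean IQs per group
print(round(pearson_r(mean_cranial_cc, median_iq), 3))
```

With only three points, a near-perfect r is easy to obtain, which is why the text stresses that the result hinges on the quality of the data points themselves.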
POPULATION DIFFERENCES IN g: THE DEFAULT HYPOTHESIS
Consider the following items of evidence: the many biological correlates of
g; the fact that among all of the psychometric factors in the domain of
cognitive abilities the g factor accounts for the largest part of the mean
difference between blacks and whites; the evolutionary history of Homo sapiens
and the quantitative differentiation of human populations in allele
frequencies for many characteristics, including brain size, largely through
adaptive selection for fitness in highly varied climates and habitats; the
fact that the brain evolved more rapidly than any other organ and that half of
humans' polymorphic genes affect brain development; the primary evolutionary
largest genetic distance between human populations is that between the African
populations and all others; the intrinsic positive correlation between brain
size and measures of g; the positive mean white-black difference in brain
size; the positive correlation between the variable heritability of individual
differences in various measures of cognitive abilities and the variable
magnitudes of their g loadings. All these phenomena, when viewed together,
provide the basis for what I shall call the default hypothesis concerning the
nature of population or racial differences in g.
Although we are concerned here with variation between populations, it is
also important to keep in mind that, from an evolutionary perspective, it is
most unlikely that there are intraspecies differences in the basic structural
design and operating principles of the brain. The main structural and
functional units of the brain found in any one normal human being should be
validly generalizable to all other normal humans. That is to say, the
processes by which the brain perceives, learns, reasons, remembers, and the
like are the same for everyone, as are the essential structures and functions
of every organ system in the entire body. Individual differences and
population differences in normal brain processes exist at a different level,
superimposed, as it were, over and above the brain's common structures and
operating principles.
The default hypothesis states that human individual differences and
population differences in heritable behavioral capacities, as products of the
evolutionary process in the distant past, are essentially composed of the same
stuff, so to speak, controlled by differences in allele frequencies, and that
differences in allele frequencies between populations exist for all heritable
characteristics, physical or behavioral, in which we find individual
differences within populations.
With respect to the brain and its heritable behavioral correlates, the
default hypothesis holds that individual differences and population
differences do not result from differences in the brain's basic structural
operating mechanisms per se, but result entirely from other aspects of
cerebral physiology that modify the sensitivity, efficiency, and effectiveness
of the basic information processes that mediate the individual's responses to
certain aspects of the environment. A crude analogy would be differences in
the operating efficiency (e.g., miles per gallon, horsepower, maximum speed)
of different makes of automobiles, all powered by internal combustion engines
(hence the same operating mechanisms) but differing in, say, the number of
cylinders, their cubic capacity, and the octane rating of the gasoline they
are using. Electric motor cars and steam-engine cars (analogous to different
species or genera) would have such distinctively different operating
mechanisms that their differences in performance would call for quite
different explanations.
In brief, the default hypothesis states that the proximal causes of both
individual differences and population differences in heritable psychological
traits are essentially the same, and are continuous variables. The population
differences reflect differences in allele frequencies of the same genes that
cause individual differences. Population differences also reflect
environmental effects, as do individual differences, and these may differ in
frequency between populations, as do allele frequencies.
In research on population differences in mean levels of g, I think that the
default hypothesis should be viewed as the true "null" hypothesis, that is,
the initial hypothesis that must be disproved. The conventional null
hypothesis of inferential statistics (i.e., no differences between
populations) is so improbable in light of evolutionary knowledge as to be
scientifically inappropriate for the study of population differences in any
traits that show individual differences. The real question is not whether
population differences exist for a given polygenic trait, but rather the
direction and magnitude of the difference.
The question of direction of a difference brings up another aspect of the
default hypothesis, namely, that it is rare in nature for genotypes and
phenotypes of adaptive traits to be negatively correlated. It is exceedingly
improbable that racial populations, which are known to differ, on average, in
a host of genetically conditioned physical characteristics, would not differ
in any of the brain characteristics associated with cognitive abilities, when
half of all segregating genes in the human genome are involved with the brain.
It is equally improbable that heritable variation among individuals in
polygenic adaptive traits, such as g, would not show nontrivial differences
between populations, which are aggregations of individuals. Again, from a
scientific standpoint, the only real questions about population differences
concern their direction, their magnitude, and their causal mechanism(s). One
may also be interested in the social significance of the phenotypic
differences. Research will be most productively focused not on whether or not
genes are involved in population differences, but on discovering the relative
effects of genetic and environmental causes of differences and the nature of
these causes, so they can be better understood and perhaps influenced.
The rest of this chapter deals only with the scientific aspect of the
default hypothesis. (For a discussion of its social significance, see Chapter
14.) Since far more empirical research relevant to the examination of the
default hypothesis with respect to g has been done on the black-white
difference, particularly within the United States, than on any other
populations, I will focus exclusively on the causal basis of the mean
black-white difference in the level of g.
HERITABILITY OF IQ WITHIN GROUPS AND BETWEEN GROUPS
One of the aims of science is to comprehend as wide a range of phenomena as
possible within a single framework, using the fewest possible mechanisms with
the fewest assumptions and ad hoc hypotheses. With respect to IQ, the default
hypothesis relating individual differences and population differences is
consistent with this aim, as it encompasses the explanation of both
within-group (WG) and between-group (BG) differences as having the same causal
sources of variance. The default hypothesis that the BG and WG differences are
homogeneous in their causal factors implies that a phenotypic difference of PD
between two population groups in mean level of IQ results from the same causal
effects as does any difference between individuals (within either of the two
populations) whose IQs differ by PD (i.e., the phenotypic difference). In
either case, PD is the joint result of both genetic (G) and environmental (E)
effects. In terms of the default hypothesis, the effects of genotype X
environment covariance are the same between populations as within populations.
The same is hypothesized for genotype X environment interaction, although
studies have found that it contributes negligibly to within-population
variance in g.
It is possible for a particular allele to be present in one population but
absent in another, or for alleles at certain loci to be turned on in some
environments and turned off in others, or to be regulated differently in
different environments. These conditions would constitute exceptions to the
default hypothesis. But without empirical evidence of these conditions with
respect to population differences in g, which is a highly polygenic trait in
which most of the variance within (and probably between) populations is
attributable to quantitative differences in allele frequencies at many loci,
initial investigation is best directed at testing the default hypothesis.
In terms of the black-white IQ difference, the default hypothesis means
that the question of why (on average) two whites differ by amount PD in IQ,
two blacks differ by amount PD, or a black and a white differ by amount PD can
all be answered in the same terms. There is no need to invoke any special
"racial" factor, either genetic or cultural.
The countervailing dual hypothesis contends that: (1) within-group
individual differences (WG), on the one hand, and between-group mean
differences (BG), on the other, have different, independent causes; and (2)
there is no relationship between the sources of WG differences and of BG
differences. In this view, the high heritability of individual differences in
g within groups tells us nothing about the heritability (if any) of g between
groups.
The empirical fact that there is a large genetic component in WG individual
differences in g is so well established by now (see Chapter 7) that, with rare
exceptions, it is no longer challenged by advocates for the dual hypothesis.
The defining tenet of the dual hypothesis, at least as it applies to the
phenotypic black-white IQ difference, is that there is no genetic component in
the mean BG difference; that is, the causes of the observed BG difference in
IQ are entirely environmental. These environmental sources may include
nutrition and other biological conditions, as well as socioeconomic,
attitudinal, or cultural group differences, to name the most frequently
hypothesized causal factors. (Psychometric test bias, as such, has been
largely ruled out; see Chapter 11, pp. 360-67.)
Within-Group Heritability of IQ in Black and in White Groups. Before
contrasting the dual and the default hypotheses in terms of their formal
implications and their consistency with empirical findings, we need to
understand what is, and is not, known about the heritability of individual
differences in IQ within each population.
The many studies of IQ heritability based on white samples are summarized
in Chapter 7. They give estimates that range mostly between .40 and .60 for
children and adolescents, and between .60 and .80 for adults.
The few studies of IQ heritability in black samples have all been performed
in conjunction with age-matched white samples, so that group comparisons would
be based on the same tests administered under the same conditions. Only two
such studies based on large samples (total Ns of about 300 and 700) of black
and white twins of school age have been reported. The data of these studies do
not support rejection of the null hypothesis of no black-white difference in
the heritability coefficients for IQ. Nor do these studies show any evidence
of a statistically significant racial difference between the magnitudes of the
correlations for either MZ or DZ twins. But the sample sizes in these studies,
though large, are not large enough to yield statistical significance for real,
though small, group differences. The small differences between the black and
white twin correlations observed in these studies are, however, consistent
with the black-white differences in the correlations between full siblings
found in a study of all of the school-age sibling pairs in the total black and
white populations of the seventeen elementary schools of Berkeley, California.
The average sibling correlations for IQ in that study were +.38 for blacks and
+.40 for whites. (For height, the respective age-corrected correlations were
.45 and .42.) Because the samples totaled more than 1,500 sibling pairs, even
differences as small as .02 are statistically significant. If the heritability
of IQ, calculated from twin data, were very different in the black and white
populations, we would expect the difference to show up in the sibling
correlations as well. The fact that sibling correlations based on such large
samples differ so little between blacks and whites suggests that the
black-white difference in IQ heritability is so small that rejection of the
null hypothesis of no W-B difference in IQ heritability would require enormous
samples of black and white MZ and DZ twins, far more than any study has yet
attempted or is ever likely to attempt. Such a small difference, even if it
were statistically reliable, would be of no theoretical or practical
importance. On the basis of the existing evidence, therefore, it is reasonable
to conclude that the difference between the U.S. black and white populations
in the proportion of within-group variance in IQ attributable to genetic
factors (that is, the heritability of IQ) is probably too small to be
detectable.
The Relationship of Between-Group to Within-Group Heritability.
The mantra invoked to ward off any unpalatable implications of the fact
that IQ has substantially equal heritability in both the black and the white
populations is that "heritability within groups does not imply (or prove, or
generalize to) heritability between groups." Arguing that the fact that there
is genetic variance in individual differences within groups gives no warrant
to generalize to differences between groups is, of course, formally equivalent
to saying exactly the same thing about environmental variance, which is the
complement of the within-groups heritability (i.e., 1-h2). But a little
analysis is required to understand the peculiar nature of the relationship
between within-group heritability (WGH) and between-group heritability (BGH).
To say there is no relationship of any kind between WGH and BGH is wrong.
They are mathematically related according to the following equation:
BGH = WGH x [rg(1 - rp)] / [rp(1 - rg)]

where BGH is the between-group heritability and WGH is the within-group
heritability; rg is the genetic intraclass correlation within groups, i.e.,
rg = (genetic variance between groups)/(genetic variance between groups +
genetic variance within groups); and rp is the phenotypic intraclass
correlation within groups, equal to the squared point-biserial correlation
between individuals' nominal group membership (e.g., black or white,
quantitized as 0 or 1) and the quantitative variable of interest (e.g., IQ).
This is termed the formal relationship between WGH and BGH. Although there
is no argument about the mathematical correctness of this formulation, it is
not empirically applicable, because a single equation containing two unknowns
(i.e., BGH and rg) cannot be solved. (It is also clear mathematically that
the formula must assume that WGH is greater than zero and that rg is less than
unity.) The value of rp can easily be obtained empirically. (For example, if
two groups each have the same standard deviation on a given variable and the
group means differ by one such standard deviation, the value of rp = .20). If
we knew the value of rg we could solve the equation for BGH (or vice versa).
(If the between-groups difference were entirely nongenetic, as strict
environmentalists maintain, then of course rg would be zero.) But we know
neither rg nor BGH, so the formula is empirically useless.
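Although the equation cannot be solved without knowing rg, both rp and the mapping from WGH to BGH are easy to compute for hypothesized values. A sketch in Python; the rg values tried below are arbitrary conjectures, not estimates:

```python
def rp_from_gap(d):
    """Phenotypic intraclass correlation rp for two equal-sized groups whose
    means differ by d within-group standard deviations.  rp equals the
    squared point-biserial correlation between group membership and score,
    which for equal-sized groups is r_pb = d / sqrt(d**2 + 4)."""
    return d ** 2 / (d ** 2 + 4)

def bgh(wgh, rg, rp):
    """Between-group heritability from the formal relation in the text:
    BGH = WGH * [rg(1 - rp)] / [rp(1 - rg)]."""
    return wgh * (rg * (1 - rp)) / (rp * (1 - rg))

rp = rp_from_gap(1.0)
print(rp)                      # 0.2, reproducing the text's example
for rg in (0.0, 0.1, 0.2):     # hypothesized genetic intraclass correlations
    print(rg, round(bgh(0.75, rg, rp), 3))
```

Note that when the hypothesized rg equals rp, the formula yields BGH = WGH, which is the strong form of the default hypothesis.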
However, this formula does indicate that for a hypothesized value of rg
greater than zero, BGH is a linearly increasing function of WGH. As I will
point out, the hypothesized relationship between WGH and BGH can suggest some
useful conjectures and empirical analyses. The formal relationship between WGH
and BGH makes no assumptions about the sources of either the genetic or the
environmental variance in BGH and WGH, or whether BGH and WGH are
qualitatively the same or different in this respect. The default hypothesis,
however, posits that the genetic and the environmental factors that cause the
between-groups difference exist within each group (but not necessarily in
equal degrees). The opposing dual hypothesis is that the environmental factors
that cause variance between groups are different not just in degree, but in
kind, from the environmental factors that cause individual differences within
a group. This conjecture raises problems that I will examine shortly.
The between-groups (BG) versus within-groups (WG) problem can be visualized
as shown in Figure 12.5. Assume a population is composed of two equal-sized
subpopulations, A and B, and assume that on some characteristic (e.g., IQ) the
phenotypic means of these two subpopulations differ, that is, A-B = PD.
(Sampling error and measurement error are assumed to be zero in this didactic
diagram.) The measurement of the phenotypic characteristic (P) is standardized
in the total population, so its population standard deviation is 1s and the
total variance is the square of the standard deviation, 1s2. Any variance can
be visualized as the area of a square. The square in Figure 12.5 represents
the total phenotypic variance (1s2) of the whole population, and its square
root is the standard deviation (1s) of the phenotypic measurements. The total
variance (area of the square) is partitioned horizontally into the variance
between groups (BG) and the variance within groups (WG). The total variance is
partitioned vertically into the genetic (G) variance, i.e., heritability (h2)
and the environmental (E) variance, i.e., environmentality (e2). At present,
the only variables we are able to determine empirically are the total
phenotypic variance and the within-group genetic and environmental variances,
h2WG and e2WG. The between-group variables, h2BG and e2BG, are
undetermined (and so are shown in parentheses). As the genetic and
environmental proportions of the BG variance have not been empirically
determined, they are shown separated by a dotted line in Figure 12.5. This
dotted line could move either to the left or to the right, based on new
empirical evidence. Its approximate position is the bone of contention between
the advocates of the default hypothesis and those of the conventional null
hypothesis.
Extreme "environmentalists" argue that both h2WG=0 and h2BG=0, leaving
environmental agents as the source of all observed phenotypic variance.
(Hardly anyone now holds this position with respect to IQ.) A much more common
position nowadays is to accept the empirically established WG values, but
maintain that the BG variance is all environmental. "Agnostics" would say
(correctly) that h2BG is not empirically known, and some might add that,
though unknown, it is plausibly greater than zero.
The strong form of the default hypothesis is represented in Figure 12.5 by
the dotted-line extension of the solid vertical line, thus partitioning both
the WG and BG variances into the same proportions of genetic and environmental
variance. A "relaxed" form of the default hypothesis still posits h2BG > 0,
but allows h2BG to differ from h2WG. In general, this is closer to reality
than is the strong form of the default hypothesis. In both forms of the
default hypothesis, WG variance and BG variance are attributable to the same
causal factors,
although they may differ in degree. The purpose of hypothesizing some fairly
precise value for h2BG is not because one necessarily thinks it is true, or
wants to "sell" it to someone, but rather because scientific knowledge
advances by the process that Karl Popper described as "conjectures and
refutations": a strong hypothesis (or conjecture) permits certain potentially
testable deductions or inferences, and can be decisively refuted only if it is
formulated precisely, and preferably quantitatively. Any hypothesis is merely
the temporary scaffolding that assists in discovering new facts about nature.
It helps us to formulate questions precisely and further focuses investigative
efforts on research that will yield diacritical results. Beyond this purpose,
a hypothesis has no other use. It is not a subject for advocacy.
A clear quantitative statement of the default hypothesis depends upon
understanding some important technical points about variance and its relation
to linear measurement. The large square in Figure 12.6 represents the total
variance (1s2) of a standardized phenotypic variable (P), with a standard
deviation sp = 1. The area of the large square (total phenotypic variance) is
partitioned into its genetic and environmental components, corresponding to a
heritability of .75 (which makes it easy to visualize). The genetic variance
sG2 in Figure 12.6 (unshaded area) is equal to .75, leaving the environmental
component sE2 (shaded area) equal to .25. Since the variance of each effect is
shown in the diagram as an area, the square root of the area represents the
standard deviation of that effect. The linear distances or differences between
points on a scaled variable are shown as line segments scaled in standard
deviation units, not in variance units. Thus the line segments that form the
area in the lower right of the shaded square in Figure 12.6 are each equal to
0.25^0.5 or .5 (in standard deviation units). The linear distance represented
by the environmental variance is thus 0.5, and the linear distance represented
by the genetic variance is 0.866 (i.e., 0.75^0.5). Notice that these two
linear measurements do not add up to the length of the side of the total
square, which is 1. That is,
standard deviation units are not additive. To obtain the standard deviation of
the total of two or more component elements, one must take the square root of
the sum of their squared standard deviations (i.e., the square root of the sum
of their variances).
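The non-additivity of standard deviations is easy to verify numerically; a minimal check using the h2 = .75 example from the text:

```python
import math

h2, e2 = 0.75, 0.25            # genetic and environmental variance shares
sd_g = math.sqrt(h2)           # ~0.866, side length of the genetic area
sd_e = math.sqrt(e2)           # 0.5, side length of the environmental area

print(sd_g + sd_e)             # ~1.366: the SDs do NOT sum to the total SD of 1
print(math.sqrt(h2 + e2))      # 1.0: variances are additive; take the sqrt last
```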
[From this point on -- I have eliminated many hard to edit formulas. See
the original text for the complete derivations of expressions]
We can now ask, "How many units of environmental variance are needed to add
up to the total phenotypic variance?" The answer is 4 (i.e., 1/.25). This
ratio is in variance units; to express it in linear terms it must be converted
into standard deviation units, that is, 4^0.5 = 2.
Suppose we obtain IQ scores for all members of two equal-size groups called
A and B. Further assume that within each group the IQs have a normal
distribution, and the mean of group A is greater than the mean of group B. To
keep the math simple, let the IQ scores have perfect reliability, let the
standard deviation of the scores be the same in both groups, and let the mean
phenotypic difference be equal to the average within-group phenotypic standard
deviation.
Now consider the hypothesis that the between-group heritability (BGH) is
zero and that therefore the cause of the A-B difference is purely
environmental. Assume that the within-group heritability (WGH) is the same in
each group, say, WGHA = WGHB = .75. Now, if we remove the variance
attributable to genetic factors (WGH) from the total variance of each group's
scores, the remainder gives us the proportion of within-group variance
attributable to purely environmental factors. If both the genetic and
environmental effects on test scores are normally distributed within each
group, the resulting curves after the genetic variance has been removed from
each represent the distribution of environmental effects on test scores. Note
that this does not refer to variation in the environment per se, but rather to
the effects of environmental variation on the phenotypes (i.e., IQ scores, in
this case.) The standard deviation of this distribution of environmental
effects provides a unit of measurement for environmental effects.
The distribution of just the total environmental effects (assuming WGH =
.75) is shown in the two curves in the bottom half of Figure 12.7. The
phenotypic difference between the group means is kept constant, but on the
scale of environmental effects (measured in environmental standard deviation
units), the mean environmental effects for groups A and B differ by the ratio
2sE, as shown in the lower half of Figure 12.7. What this means is that for
two groups to differ phenotypically by 1sP when WGH = .75 and BGH = 0, the two
groups would have to differ by 2sE on the scale of environmental effects. This
is analogous to two groups in which each member of one group has a monozygotic
twin in the other group, thus making the distribution of genotypes exactly the
same in both groups. For the test score distributions of these two
genotypically matched groups to differ by 1sP, the groups would have to differ
by 2sE on the scale of environmental effects (assuming WGH = .75).
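The 2sE figure follows directly from rescaling the phenotypic gap by the standard deviation of environmental effects; a sketch under the text's assumptions (WGH = .75, BGH = 0):

```python
import math

wgh = 0.75                  # within-group heritability (assumed equal in A and B)
gap_p = 1.0                 # phenotypic gap between group means, in sigma_P units

# If BGH = 0, the entire gap must be carried by environmental effects, whose
# standard deviation is sqrt(1 - WGH) = sqrt(.25) = .5 on the phenotypic scale.
sd_e = math.sqrt(1.0 - wgh)
gap_in_e_units = gap_p / sd_e
print(gap_in_e_units)       # 2.0, i.e., the groups differ by 2 sigma_E
```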
The simplest model for decomposing a mean phenotypic difference between two
groups holds that the phenotypic difference is completely determined by the
groups' genetic difference and their environmental difference. These variables
are related quantitatively by the simple path model shown in Figure 12.8. The arrows
represent the direction of causation; each arrow is labeled with the
respective regression coefficients (also called path coefficients), h and e,
between the variables, which, in this model, are mathematically equivalent to
the respective correlation coefficients and to the standard deviations of the
genetic and environmental effects. In reality, of course, there could also be
a causal or correlational path between the genetic and environmental effects,
but this would not alter the essential point of the present argument. We see
that the phenotypic difference can be represented as a weighted sum of the
genetic and the environmental effects on PD, the weights being h and e. Since
these values are equivalent to standard deviations, they are not themselves
additive; it is the corresponding variances (h2 and e2) that sum to the total
phenotypic variance.
A phenotypic difference between the means of two groups can be expressed in
units of the standard deviation of the average within-groups environmental
effect as sqrt[(1 - BGH)/(1 - WGH)] sE, where BGH is the between-groups
heritability and WGH is the within-groups heritability. Thus the phenotypic
difference between the means of the two curves in the lower half of Figure
12.7 is 2sE. That is, the means of
the two environmental-effect curves differ by two standard deviations. The
body of empirical evidence shows that an environmental effect on IQ this large
would predictably occur only rarely in pairs of monozygotic twins reared apart
(whose IQs are correlated .75) except for random errors of measurement. The
difference in IQ attributable solely to nongenetic differences between random
pairs of individuals in a population in which h2 is .75 is about the same as
for MZ twins reared apart. On an IQ scale with s = 15, a difference of 2sE is
approximately equal to 30 IQ points (i.e., 2 X 15). But the largest IQ
difference between MZ twins reared apart reported in the literature is 1.5s,
or 23 IQ points. Further, the average absolute difference in IQ (assuming a
perfectly normal distribution of IQ) between all random pairs of persons in
the population (who differ both in g and in E) would be 1.1284s, or
approximately 17 IQ points.
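The 1.1284s figure and the 17-point conversion follow from properties of the normal distribution (for two independent draws, the difference is normal with standard deviation sqrt(2)s, and the mean absolute value of a zero-mean normal is its standard deviation times sqrt(2/pi), giving 2s/sqrt(pi)). A minimal Python check, with a Monte Carlo confirmation:

```python
import math
import random

SIGMA_IQ = 15  # conventional IQ standard deviation

# Closed form: for independent X, Y ~ N(mu, sigma^2), X - Y is
# N(0, 2*sigma^2), so E|X - Y| = 2*sigma/sqrt(pi) ~= 1.1284*sigma.
exact = 2 * SIGMA_IQ / math.sqrt(math.pi)

# Monte Carlo check of the same quantity.
rng = random.Random(0)
n = 200_000
mc = sum(abs(rng.gauss(100, SIGMA_IQ) - rng.gauss(100, SIGMA_IQ))
         for _ in range(n)) / n

print(round(exact, 2))  # 16.93, i.e., approximately 17 IQ points
print(round(mc, 2))
```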
Now consider again the two groups in the upper half of Figure 12.7, called
A and B. They differ in their mean test scores by a phenotypic difference of
A - B = 1sP, which corresponds to a difference of 2sE on the scale of
within-group environmental effects. If we hypothesize that the difference
between the phenotypic means is entirely nongenetic (i.e., environmental),
then the phenotypic difference of 1sP must be equal to 2sE.
By the same reasoning, we can determine the size of the environmental
effect that is required to produce a phenotypic difference of 1sP, given any
values of the within-groups heritability (WGH) and the between-groups
heritability (BGH); the required values for a phenotypic difference of 1sP are
shown in Table 12.4. The strong default hypothesis is defined in terms of
BGH = WGH; the relaxed default hypothesis allows independent values of BGH and
WGH.
For example, in the first column inside Table 12.4(A), the BGH = .00. This
represents the hypothesis that the cause of the mean group difference in test
scores is purely environmental. When WGH is also equal to .00, the
environmental difference of 1sE between the groups accounts for all of the
phenotypic difference of 1sP, and thus accords perfectly with the
environmental hypothesis that 1sP = 1sE. Table 12.4(A) shows that when WGH =
BGH = .00, the value of sE = 1.00.
Maintaining the same purely environmental hypothesis that the BGH = 0, but
with the WGH = .10, for two groups to differ phenotypically by 1sP they must
differ by 1.05sE in environmental effect, which deviates .05 from the
hypothesized value of 1sE. The critical point of this analysis is that if the
BGH = 0, values of WGH greater than 0 then require that sE be greater than
1.00. We can see in Table 12.4(A) that as the WGH increases, the required
value of sE must increasingly deviate from the hypothesized value of 1sE,
thereby becoming increasingly more problematic for empirical explanation.
Since the empirical value of WGH for the IQ of adults lies within the range of
.60 to .80, with a mean close to .70, it is particularly instructive to
examine the values of sE for this range of WGH. When WGH = .70 and BGH = 0,
for example, the 1sP difference between the groups is entirely due to
environmental causes and amounts to 1.83sE. Table 12.4(A) indicates that as we
hypothesize levels of BGH that approach the empirically established levels of
WGH, the smaller is the size of the environmental effect required to account
for the phenotypic difference of 1sP in group means.
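The arithmetic behind the Table 12.4(A) values quoted in this passage can be sketched in Python. The generating formula below is an inference from the quoted numbers (2sE at WGH = .75, 1.05sE at WGH = .10, 1.83sE at WGH = .70, all with BGH = 0), not a formula stated verbatim in the text:

```python
import math

def env_effect_units(wgh: float, bgh: float) -> float:
    """Environmental difference (in sE units) required to produce a
    1-sP group difference, under the decomposition assumed here:
    sqrt((1 - BGH) / (1 - WGH))."""
    return math.sqrt((1 - bgh) / (1 - wgh))

# Values quoted in the text for the purely environmental hypothesis
# (BGH = 0):
print(round(env_effect_units(0.10, 0.0), 2))  # 1.05 sE
print(round(env_effect_units(0.70, 0.0), 2))  # 1.83 sE
print(round(env_effect_units(0.75, 0.0), 2))  # 2.0 sE
```

As BGH rises toward WGH the required environmental effect shrinks toward 1sE, which is the trend the text describes.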
Factor X. Recall that the strong form of the default hypothesis states that
the average difference in test scores observed between groups A and B results
from the same kinds of genetic (G) and environmental (E) influences acting to
the same degree to produce individual differences within each group. The
groups may differ, however, in the mean values of either G, or E, or both.
Stated in terms of the demonstration in Table 12.4(A), this means that if WGH
is the same for both groups, A and B, then, given any empirically obtained
value of WGH, the limits of BGH are constrained, as shown. The hypothesis that
BGH = 0 therefore appears improbable, given the typical range of empirical
values of WGH.
To accept the preponderance of evidence that WGH > 0 and still insist that
BGH = 0 regardless of the magnitude of the WGH, we must attribute the cause of
the group difference to either of two sources: (1) the same kinds of
environmental factors that influence the level of g but that do so at much
greater magnitude between groups than within either group, or (2) empirically
identified environmental factors that create variance between groups but do
not do so within groups. The "relaxed" default hypothesis allows both of these
possibilities. The dual hypothesis, on the other hand, requires either much
larger environmental effects between groups than are empirically found, on
average, within either group, or the existence of some additional empirically
unidentified source of nongenetic variance that causes the difference between
groups but does not contribute to individual differences within either group.
If the two groups are hypothesized not to differ in WGH or in total phenotypic
variance, this hypothesized additional source of nongenetic variance between
groups must either have equal but opposite effects within each group, or it
must exist only within one group but without producing any additional variance
within that group. In 1973, I dubbed this hypothesized additional nongenetic
effect Factor X. When groups of blacks and whites who are matched on virtually
all of the environmental variables known to be correlated with IQ within
either racial population still show a substantial mean difference in IQ,
Factor X is the favored explanation in lieu of the hypothesis that genetic
factors, though constituting the largest source of variance within groups, are
at all involved in the IQ difference between groups. Thus Factor X is an ad
hoc hypothesis that violates Occam's razor, the well-known maxim in science
which states that if a phenomenon can be explained without assuming some
hypothetical entity, there is no ground for assuming it.
The default hypothesis also constrains the magnitude of the genetic
difference between groups, as shown in Table 12.4(B). (The explanations that
were given for interpreting Table 12.4(A) apply here as well.) For two groups,
A and B, whose phenotypic means differ by A - B = 1sP, the strong default
hypothesis (i.e., BGH = WGH) means that the groups differ on the scale of
genetic effect by sqrt(BGH/WGH) sG = 1sG.
The values of sG in Table 12.4(B) show that the strong default hypothesis
is not the same as a purely genetic hypothesis of the group difference. For
example, for WGH = .70 and BGH = .70, the groups differ by 1sG (Table 12.4B),
and also the groups differ by 1sE (Table 12.4A). For the relaxed default
hypothesis, the environmental and genetic differences associated with each and
every intersection of WGH and BGH in Tables 12.4A and 12.4B add up to 1sP.
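Under the same reconstruction, the genetic counterpart in Table 12.4(B) and the claim that the two components jointly account for the full 1sP difference can be checked numerically. Both formulas here are inferred from the quoted table values rather than stated verbatim in the text:

```python
import math

def genetic_effect_units(wgh: float, bgh: float) -> float:
    # Reconstructed Table 12.4(B) entry: group difference on the scale
    # of genetic effect (in sG units) per 1-sP phenotypic difference.
    return math.sqrt(bgh / wgh)

def env_effect_units(wgh: float, bgh: float) -> float:
    # Reconstructed Table 12.4(A) entry (in sE units).
    return math.sqrt((1 - bgh) / (1 - wgh))

wgh = bgh = 0.70  # strong default hypothesis: BGH = WGH
g = genetic_effect_units(wgh, bgh)  # 1.0 sG
e = env_effect_units(wgh, bgh)      # 1.0 sE

# Converting back to phenotypic units (sG^2 = WGH * sP^2 and
# sE^2 = (1 - WGH) * sP^2), the two squared contributions always
# total 1 sP^2, since BGH + (1 - BGH) = 1.
total = wgh * g**2 + (1 - wgh) * e**2
print(g, e, total)  # 1.0 1.0 1.0
```

Note that it is the variance contributions, weighted by WGH and 1 - WGH, that sum to the full phenotypic difference; this is the sense in which each intersection of the two tables "adds up" to 1sP.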
The foregoing analysis is relevant to the often repeated "thought
experiment" proposed by those who argue for the plausibility of the dual
hypothesis, as in the following example from an article by Carol Tavris:
"Suppose that you have a bag of tomato seeds that vary genetically; all things
being equal, some seeds will produce tomatoes that are puny and tasteless, and
some will produce tomatoes that are plump and delicious. You take a random
bunch of seeds in your left hand and random bunch in your right. Though one
seed differs genetically from another, there is no average difference between
the seeds in your left hand and those in your right. Now you plant the left
hand's seeds in Pot A. You have doctored the soil in Pot A with nitrogen and
other nutrients. You feed the pot every day, sing arias to it from La
Traviata, and make sure it gets lots of sun. You protect it from pests, and
you put in a trellis, so even the weakest little tomatoes have some support.
Then you plant the seeds in your right hand in Pot B, which contains sandy
soil lacking nutrients. You don't feed these tomatoes, or water them; you
don't give them enough sun; you let pests munch on them. When the tomatoes
mature, they will vary in size within each pot, purely because of genetic
differences. But there will also be an average difference between the tomatoes
of enriched Pot A and those of depleted Pot B. This difference between pots is
due entirely to their different soils and tomato-rearing experiences."
Statistically stated, the argument is that WGH = 1 and BGH = 0. What is
the expected magnitude of the required environmental effect implied by these
conditions? In terms of within-group standard deviation units, it is sE = 1/0.
But of course the quotient of any fraction with zero in the denominator is
undefined, so no inference about the magnitude is possible at all, given these
conditions. However, if we make the WGH slightly less than perfect, say, .99,
the expected difference in environmental effect becomes 10sE. This is an
incredibly large, but in this case probably not unrealistic, effect given
Tavris's descriptions of the contrasting environments of Pot A and Pot B.
The story of tomatoes-in-two-pots doesn't contradict the default
hypothesis. Rather, it makes the very point of the default hypothesis by
stating that Pots A and B each contain random samples of the same batch of
seeds, so the same massive difference would have been observed if the
left-hand and right-hand seeds had been planted in the opposite pots. Factor X
is not needed
to explain the enriched and deprived tomatoes; the immense difference in the
environmental conditions is quite sufficient to produce a difference in tomato
size ten times greater than the average differences produced by environmental
variation within each pot.
Extending the tomato analogy to humans, Tavris goes on to argue, "Blacks
and whites do not grow up, on the average, in the same kind of pot." The
question, then, is whether the average environmental difference between blacks
and whites is sufficient to cause a 1sP difference in IQ if BGH = 0 and WGH is
far from zero. The default hypothesis, positing values of BGH near those of
the empirical values of WGH, is more plausible than the hypothesis that BGH =
0. (A third hypothesis, which can be ruled out of serious consideration on
evolutionary grounds, given the observed genetic similarity between all human
groups, is that the basic organization of the brain and the processes involved
in mental development are qualitatively so different for blacks and whites
that any phenotypic difference between the groups cannot, even in principle,
be analyzed in terms of quantitative variation on the same scale of the
genetic or of the environmental factors that influence individual development
of mental ability within one racial group.)
The Default Hypothesis in Terms of Multiple Regression. The behavioral
geneticist Eric Turkheimer has proposed an approach for relating the
quantitative genetic analysis of individual and of group differences.
Phenotypic variance can be conceptually partitioned into its genetic and its
environmental components in terms of a multiple regression equation.
Turkheimer's method allows us to visualize the relationship of within-group
and between-group genetic effects and environmental effects in terms of a
regression plane located in a three-dimensional space in which the orthogonal
dimensions are phenotype (P), genotype (G), and environment (E). Both
individual and group mean phenotypic values (e.g., IQ) can then be represented
on the surface of this plane. This amounts to a graphic statement of the
strong default hypothesis, where the phenotypic difference between two
individuals (or two group means), A and B, can be represented by the multiple
regression of the phenotypic difference on the genetic and environmental
differences (GD and ED).
According to the default hypothesis, mental development is affected by the
genetic mechanisms of inheritance and by environmental factors in the same way
for all biologically normal individuals in either group. (Rejection of this
hypothesis would mean that evolution has caused some fundamental intraspecies
differences in brain organization and mental development, a possibility which,
though seemingly unlikely, has not yet been ruled out.) Thus the default
hypothesis implies that a unit increase in genetic value G for individuals in
group A is equal to the same unit increase in G for individuals in group B,
and likewise for the environmental value E. Within these constraints posited
by the default hypothesis, however, the groups may differ, on average, in the
mean values of G, or E, or both. Accordingly, individuals of either group will
fall at various points (depending on their own genotype and environment) on
the same regression lines (i.e., for the regression of P on G and of P on E).
This can be visualized graphically as a regression plane inside a square box
(Figure 12.9). The G and E values for individuals (or for group means) A and B
are projected onto the tilted plane; the projections are shown as a dot and a
square. Their positions on the plane are then projected onto the phenotype
dimension of the box.
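The shared-plane claim can be illustrated numerically. In the sketch below the path values and group-mean offsets are hypothetical, chosen only to show that, by linearity, a group-mean difference decomposes along the very same plane that describes individuals:

```python
import math
import random

WGH = 0.70
h = math.sqrt(WGH)      # path coefficient, phenotype <- genotype
e = math.sqrt(1 - WGH)  # path coefficient, phenotype <- environment

def phenotype(g: float, env: float) -> float:
    # One regression plane for everyone (strong default hypothesis):
    # P = h*G + e*E, all variables in standardized units.
    return h * g + e * env

rng = random.Random(1)
# Hypothetical groups differing only in mean G and mean E; the plane
# itself is identical for both.
group_a = [(rng.gauss(0.5, 1.0), rng.gauss(0.5, 1.0)) for _ in range(1000)]
group_b = [(rng.gauss(-0.5, 1.0), rng.gauss(-0.5, 1.0)) for _ in range(1000)]

def means(group):
    gs = sum(g for g, _ in group) / len(group)
    es = sum(env for _, env in group) / len(group)
    ps = sum(phenotype(g, env) for g, env in group) / len(group)
    return gs, es, ps

ga, ea, pa = means(group_a)
gb, eb, pb = means(group_b)

# The group-mean difference lies on the same plane as the individual
# scores: P_A - P_B = h*(G_A - G_B) + e*(E_A - E_B), exactly.
assert abs((pa - pb) - (h * (ga - gb) + e * (ea - eb))) < 1e-9
```

A between-groups G x E interaction would break this linearity, which is why its existence would be inconsistent with the default hypothesis.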
The important point here is that the default hypothesis states that, for
any value of WGH, the predicted scores of all individuals (and consequently
the predicted group means) will lie on one and the same regression plane.
Assuming the default hypothesis, this clearly shows the relationship between
the heritability of individual differences within groups (WGH) and the
heritability of group differences (BGH). This formulation makes the default
hypothesis quantitatively explicit and therefore highly liable to empirical
refutation. If there were some environmental factor(s) that is unique to one
group and that contributes appreciably to the mean difference between the two
groups, their means would not lie on the same plane. This would result, for
example, if there were a between-groups G X E interaction. The existence of
such an interaction would be inconsistent with the default hypothesis, because
it would mean that the groups differ phenotypically due to some nonadditive
effects of genes and environment so that, say, two individuals, one from each
group, even if they had identical levels of IQ, would have had to attain that
level by different developmental processes and environmental influences. The
fact that significant G X E interactions with respect to IQ (or g) have not
been found within racial groups renders such an interaction between groups an
unlikely hypothesis.
It should be noted that the total nongenetic variance has been represented
here as e2. As explained in Chapter 7, the true-score nongenetic variance can
be partitioned into two components: between-families environment (BFE is also
termed shared environment because it is common to siblings or to any children
reared together) and within-family environment (WFE, or unshared environment,
that part of the total environmental effect that differs between persons
reared together).
The WFE results largely from an accumulation of more or less random
microenvironmental factors. We know from studies of adult MZ twins reared
apart and studies of genetically unrelated adults who were reared together
from infancy in adoptive homes that the BFE has little effect on the phenotype
of mental ability, such as IQ scores, even over a quite wide range of
environments (see Chapter 7 for details). The BF environment certainly has
large effects on mental development for the lowest extreme of the physical and
social environment, conditions such as chronic malnutrition, diseases that
affect brain development, and prolonged social isolation, particularly in
infancy and early childhood. These conditions occur only rarely in First World
populations. But some would argue that American inner cities are Third World
environments, and they certainly resemble them in some ways. On a scale of
environmental quality with respect to mental development, these adverse
environmental conditions probably fall more than 2s below the average
environment experienced by the majority of whites and very many blacks in
America. The hypothetical function relating phenotypic mental ability (e.g.,
IQ) to the total range of BFE effects (termed the reaction range or reaction
norm for the total environmental effect) is shown in Figure 12.10.
Pseudo-race Groups and the Default Hypothesis. In my studies of test bias,
I used what I termed pseudo-race groups to test the hypothesis that many
features of test performance are simply a result of group differences in the
mean and distribution of IQ per se rather than a result of any cultural
differences between groups. Pseudo-race groups are made up entirely of white
subjects. The standard group is composed of individuals selected on the basis
of estimated true-scores so as to be normally distributed, with a mean and
standard deviation of the IQ distribution of whites in the general population.
The pseudo-race group is composed of white individuals from the same
population as the standard group, but selected on the basis of their estimated
true-scores so as to be normally distributed, but with a mean and standard
deviation of the IQ distribution of blacks in the general population. The two
groups, with age controlled, are intentionally matched with the white and
black populations they are intended to represent only on the single variable
of interest, in this case IQ (or preferably g factor scores). Therefore, the
groups should not differ systematically on any other characteristics, except
for whatever characteristics may be correlated with IQ. Estimated true-scores
must be used to minimize the regression effect (i.e., regression toward the
white mean of 100) that would otherwise result from selecting white subjects
on IQ so as to form a group with a lower mean IQ than that of the population
from which they were selected.
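The true-score estimate invoked here is the classical regression estimate (Kelley's formula), which shrinks an observed score toward the mean of the population the subject was drawn from. The numbers below are hypothetical illustrations, not values from the text:

```python
def estimated_true_score(observed: float, group_mean: float,
                         reliability: float) -> float:
    """Classical true-score estimate: T = M + r_xx * (X - M).
    Selecting subjects on this estimate, rather than on the observed
    score, avoids the regression-toward-the-mean artifact described
    in the text."""
    return group_mean + reliability * (observed - group_mean)

# Hypothetical case: population mean 100, test reliability .90,
# a subject observed at IQ 85.
print(estimated_true_score(85, 100, 0.90))  # 86.5
```

Because the estimate is pulled toward 100, a pseudo-race group selected on observed scores alone would drift back toward the white mean on retest; selection on estimated true-scores keeps the group's expected mean where it was designed to be.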
The creation of two groups that, in this manner, are made to differ on a
single trait can be viewed as another model of the strong default hypothesis.
This method is especially useful in empirically examining various
nonpsychometric correlates of the standard group versus pseudo-race group
difference. These differences can then be compared against any such
differences found between representative samples of the actual white and black
populations. The critical question is, in the circumstances of daily life how
closely does the behavior of the pseudo-race group resemble that of a
comparable sample of actual blacks? The extent of the pseudo-race versus
actual race difference in nonpsychometric or "real-life" behavior would
delimit the g factor's power to account for the observed racial differences in
many educationally, occupationally, and socially significant variables.
Notice that the standard and pseudo-race groups would perfectly simulate
the conditions of the strong default hypothesis. Both genetic and
environmental sources of variance exist in nearly equal degrees within each
group, and the mean difference between the groups necessarily comprises
comparable genetic and environmental sources of variance. If this particular
set of genetic and environmental sources of IQ variance within and between the
standard and pseudo-race groups simulates actual white-black differences in
many forms of behavior that have some cognitive aspect but are typically
attributed solely to cultural differences, it constitutes strong support for
the default hypothesis. Experiments of this type could tell us a lot and
should be performed.
EMPIRICAL EVIDENCE ON THE DEFAULT HYPOTHESIS
Thus far the quantitative implications of the default hypothesis have been
considered only in theoretical or formal terms, which by themselves prove
nothing, but are intended only to lend some precision to the statement of the
hypothesis and its predicted empirical implications. It should be clear that
the hypothesis cannot feasibly be tested directly in terms of applying
first-order statistical analyses (e.g., the t test or analysis of variance
applied to phenotypic measures) to determine the BGH of a trait, as is
possible in the field of experimental genetics with plants or animals. In the
latter field, true breeding experiments with cross-fostering in controlled
environments across different subspecies and subsequent measurement of the
phenotypic characteristics of the progeny of the cross-bred strains for
comparison with the same phenotypes in the parent strains are possible and, in
fact, common. In theory, such experiments could be performed with different
human subspecies, or racial groups, and the results (after replications of the
experiment to statistically reduce uncertainty) would constitute a nearly
definitive test of the default hypothesis. An even more rigorous test of the
hypothesis than is provided by a controlled breeding and cross-fostering
experiment would involve in vitro fertilization to control for possible
differences in the prenatal environment of the cross-fostered progeny. Such
methods have been used in livestock breeding for years without any question as
to the validity of the results. But, of course, for ethical reasons the
methods of experimental genetics cannot be used for research in human
genetics. Therefore, indirect methods, which are analytically and
statistically more complex, have been developed by researchers in human
genetics.
The seemingly intractable problem with regard to phenotypic group
differences has been the empirical estimation of the BGH. To estimate the
genetic variance within groups one needs to know the genetic kinship
correlations based on the theoretically derived proportions of alleles common
to relatives of different degrees (e.g., MZ twins = 1.00; DZ twins, full
siblings, and parent-child = 0.50 [or more with assortative mating];
half-siblings = 0.25; first cousins = 0.125; etc.). These unobserved but
theoretically known
genetic kinship correlations are needed as parameters in the structural
equations used to estimate the proportion of genetic variance (heritability)
from the phenotypic correlations between relatives of different degrees of
kinship. But we generally do not have phenotypic correlations between
relatives that bridge different racial groups. Since few members of one racial
group have a near relative (by common descent) in a different racial group, we
don't have the parameters needed to estimate between-group heritability.
Although interracial matings can produce half-siblings and cousins who are
members of different racial groups, the offspring of interracial matings are
far from ideal for estimating BGH because, at least for blacks and whites, the
parents of the interracial offspring are known to be unrepresentative of these
populations. Thus such a study would have doubtful generality.
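The simplest instance of estimating within-group heritability from kinship correlations, of the kind the structural equations above generalize, is Falconer's classic twin formula. The correlations below are illustrative, not from the text:

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Crude within-group heritability estimate from twin correlations.
    MZ twins share all segregating genes and DZ twins half on average,
    so the correlation gap reflects half the additive genetic variance:
    h^2 ~= 2 * (r_MZ - r_DZ)."""
    return 2 * (r_mz - r_dz)

# Illustrative twin correlations for an IQ-like trait:
print(falconer_h2(0.86, 0.60))  # ~0.52
```

The between-group case has no analogous shortcut precisely because, as the text notes, the cross-racial kinship correlations that would play the role of r_MZ and r_DZ are generally unavailable.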
An example of cross-racial kinships that could be used would be a female of
group A who had two offspring by a male of group A and later had two offspring
by a male of group B, resulting finally in two pairs of full-siblings who are
both AA and two pairs of half-siblings who are both AB. A biometric genetic
analysis of phenotypic measurements obtained on large samples of such
full-siblings and half-siblings would theoretically afford a way of estimating
both WGH and BGH. Again, however, unless such groups arose from a controlled
breeding experiment, the resulting estimate of BGH would probably not be
generalizable to the population groups of interest but would apply only to the
specific groups used for this determination of BGH (and other groups obtained
in the same way). There are two reasons: First, the degree of assortative
mating for IQ is most likely the same, on average, for interracial and
intraracial matings; that is, the A and B mates of the hypothetical female in
our example would probably be phenotypically close in IQ, so at least one of
them would be phenotypically (hence also probably genetically)
unrepresentative of his own racial population. Therefore, the mixed offspring
AB are not likely to differ genetically much, if at all, on average, from the
unmixed offspring AA. Second, aside from assortative mating, it is unlikely
that interracial half-siblings are derived from parents who are random or
representative samples of their respective racial populations. It is known,
for example, that present-day blacks and whites in interracial marriages in
the United States are not typical of their respective populations in IQ
related variables, such as levels of education and occupation.
How then can the default hypothesis be tested empirically? It is tested
exactly as is any other scientific hypothesis; no hypothesis is regarded as
scientific unless predictions derived from it are capable of risking
refutation by an empirical test. Certain predictions can be made from the
default hypothesis that are capable of empirical test. If the observed result
differs significantly from the prediction, the hypothesis is considered
disproved, unless it can be shown that the tested prediction was an incorrect
deduction from the hypothesis, or that there are artifacts in the data or
methodological flaws in their analysis that could account for the observed
result. If the observed result does in fact accord with the prediction, the
hypothesis survives, although it cannot be said to be proven. This is because
it is logically impossible to prove the null hypothesis, which states that
there is no difference between the predicted and the observed result. If there
is an alternative hypothesis, it can also be tested against the same observed
result.
For example, if we hypothesize that no tiger is living in the Sherwood
Forest and a hundred people searching the forest fail to find a tiger, we have
not proved the null hypothesis, because the searchers might have failed to
look in the right places. If someone actually found a tiger in the forest,
however, the hypothesis is absolutely disproved. The alternative hypothesis is
that a tiger does live in the forest; finding a tiger clearly proves the
hypothesis. The failure of searchers to find the tiger decreases the
probability of its existence, and the more searching, the lower is the
probability, but it can never prove the tiger's nonexistence.
Similarly, the default hypothesis predicts certain outcomes under specified
conditions. If the observed outcome does not differ significantly from the
predicted outcomes, the default hypothesis is upheld but not proved. If the
prediction differs significantly from the observed result, the hypothesis must
be rejected. Typically, it is modified to accord better with the existing
evidence, and then its modified predictions are empirically tested with new
data. If it survives numerous tests, it conventionally becomes a "fact." In
this sense, for example, it is a "fact" that the earth revolves around the
sun, and it is a "fact" that all present-day organisms have evolved from
primitive forms.
Structural Equation Modeling. Probably the most rigorous methodology
presently available to test the default hypothesis is the application of
structural equation modeling to what is termed the biometric decomposition of
a phenotypic mean difference into its genetic and environmental components.
This methodology is an extraordinarily complex set of mathematical and
statistical procedures, an adequate explanation of which is beyond the scope
of this book, but for which detailed explanations are available. It is
essentially a multiple regression technique that can be used to statistically
test the differences in "goodness-of-fit" between alternative models, such as
whether (1) a phenotypic mean difference between groups consists of a linear
combination of the same genetic (G) and environmental (E) factors that
contribute to individual differences within the groups, or (2) the group
difference is attributable to some additional factor (an unknown Factor X)
that contributes to variance between groups but not to variance within groups.
Biometric decomposition by this method requires quite modern and
specialized computer programs (LISREL VII) and exacting conditions of the data
to which it is applied -- above all, large and representative samples of the
groups whose phenotypic means are to be decomposed into their genetic and
environmental components. All subjects in each group must be measured with at
least three or more different tests that are highly loaded on a common factor,
such as g, and this factor must have high congruence between the two groups.
Also, of course, each group must comprise at least two different degrees of
kinship (e.g., MZ and DZ twins, or full-siblings and half-siblings) to permit
reliable estimates of WGH for each of the tests. Further, in order to meet the
assumption that WGH is the same in both groups, the estimates of WGH obtained
for each of the tests should not differ significantly between the groups.
Given these stringent conditions, one can test whether the mean group
difference in the general factor common to the various tests is consistent
with the default model, which posits that the between-groups mean difference
comprises the same genetic and environmental factors as do individual
differences within each group. The goodness-of-fit of the data to the default
model (i.e., group phenotypic difference = G + E) is then compared against the
three alternative models, which posit only genetic (G) factors, or only
environment (E), or neither G nor E, respectively, as the cause of the group
difference. The method has been applied to estimate the genetic and
environmental contributions to the observed sex difference in average blood
pressure.
This methodology was applied to a data set that included scores on thirteen
mental tests (average g loading = .67) given to samples of black and white
adolescent MZ and DZ twins totaling 190 pairs. Age and a measure of
socioeconomic status were regressed out of the test scores. The data showed by
far the best fit to the default model, which therefore could not be rejected,
while the fit of the data to the alternative models, by comparison with the
default model, could be rejected at high levels of confidence (p < .005 to p <
.001). That is, the observed W-B group difference is probably best explained
in terms of both G and E factors, while either G or E alone is inadequate,
given the assumption that G and E are the same within both groups. This
result, however, does not warrant as much confidence as the above p values
would indicate, as these particular data are less than ideal for one of the
conditions of the model. The data set shows rather large and unsystematic
(though nonsignificant) differences in the WGHs of blacks and whites on the
various tests. Therefore, the estimate of BGH, though similar to the overall
WGH of the thirteen tests (about .60), is questionable. Even though the WGHs
of the general factor do not differ significantly between the races, the
difference is large enough to leave doubt as to whether it is merely due to
sampling error or is in fact real but cannot be detected given the sample
size. If the latter is true, then the model used in this particular method of
analysis (termed the psychometric factor model) cannot rigorously be applied
to these particular data.
A highly similar methodology (using a less restrictive model termed the
biometric factor model) was applied to a much larger data set by behavioral
geneticists David Rowe and co-workers. But Rowe's large-scale preliminary
studies should first be described. He began by studying the correlations
between objective tests of scholastic achievement (which are substantially
loaded on g as well as on specific achievement factors) and assessment of the
quality of the child's home environment based on environmental variables that
previous research had established as correlates of IQ and scholastic
achievement and which, overall, are intended to indicate the amount of
intellectual stimulation afforded by the child's environment outside of
school. Measures of the achievement and home environment variables were
obtained on large samples of biologically full-sibling pairs, each tested
twice (at ages 6.6 and 9.0 years). The total sample comprised three groups:
white, black, and Hispanic, and represented the full range of socioeconomic
levels in the United States, with intentional oversampling of blacks and
Hispanics.
The data on each population group were treated separately, yielding three
matrices (white, black, and Hispanic), each comprising the correlations
between (1) the achievement and the environmental variables within and between
age groups, (2) the full-sibling correlations on each variable at each age,
and (3) the cross-sibling correlations on each variable at each age --
yielding twenty-eight correlation coefficients for each of the three ethnic
groups.
Now if, in addition to the environmental factors measured in this study,
there were some unidentified Factor X that is unique to a certain group and is
responsible for most of the difference in achievement levels between the
ethnic groups, one would expect that the existence of Factor X in one (or
two), but not all three, of the groups should be detectable by an observed
difference between groups in the matrix of correlations among all of the
variables. That is, a Factor X hypothesized to represent a unique causal
process responsible for lower achievement in one group but not in the others
should cause the pattern of correlations between environment and achievement,
or between siblings, or between different ages, to be distinct for that group.
However, since the correlation matrices were statistically equal, there was
not the slightest evidence of a Factor X operating in any group. The
correlation matrices of the different ethnic groups were as similar to one
another as were correlation matrices derived from randomly selected
half-samples within each ethnic group.
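The logic of this comparison can be sketched with a small simulation (illustrative Python with invented variables and effect sizes, not the study's data): if the same developmental process generates both achievement and home-environment scores in two groups, their correlation structures should agree to within sampling error, leaving no room for a Factor X.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_group(n, rng):
    """Simulate achievement and home-environment scores that share a
    common factor -- the same developmental process in every group."""
    achievement, environment = [], []
    for _ in range(n):
        common = rng.gauss(0, 1)            # shared developmental factor
        achievement.append(0.7 * common + rng.gauss(0, 0.7))
        environment.append(0.5 * common + rng.gauss(0, 0.9))
    return achievement, environment

rng = random.Random(42)
a1, e1 = simulate_group(2000, rng)   # "group 1"
a2, e2 = simulate_group(2000, rng)   # "group 2", same generating process

r1, r2 = pearson(a1, e1), pearson(a2, e2)
# Same process in both groups: the achievement-environment correlations
# match to within sampling error -- no evidence of a Factor X.
print(round(r1, 2), round(r2, 2), round(abs(r1 - r2), 2))
```

A real Factor X operating in one group would perturb that group's correlations away from the other's by more than sampling error alone could explain.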
Further analyses by Rowe et al. that included other variables yielded the
same results. Altogether the six data sets used in their studies included
8,582 whites, 3,392 blacks, 1,766 Hispanics, and 906 Asians. None of the
analyses required a minority-unique developmental process or a
cultural-environmental Factor X to explain the correlations between the
achievement variables and the environmental variables in any of the
minority groups. The results are consistent with the default hypothesis, as
explained by Rowe et al.: "Our explanation for the similarity of developmental
processes is that (a) different racial and ethnic groups possess a common gene
pool, which can create behavioral similarities, and that (b) among
second-generation ethnic and racial groups in the United States, cultural
differences are smaller than commonly believed because of the omnipresent
force of our mass-media culture, from television to fast-food restaurants.
Certainly, a burden of proof must shift to those scholars arguing a cultural
difference position. They need to explain how matrices representing
developmental processes can be so similar across ethnic and racial groups if
major developmental processes exert a minority-specific influence on school
achievement."
The dual hypothesis, which attributes the within-group variance to both
genetic and environmental factors but excludes genetic factors from the mean
differences between groups, would, in the light of these results, have to
invoke a Factor X which, on the one hand, is so subtle and ghostly as to be
perfectly undetectable in the whole matrix of correlations among test scores,
environmental measures, full-siblings, and ages, yet sufficiently powerful to
depress the minority group scores, on average, by as much as one-half a
standard deviation.
To test the hypothesis that genetic as well as environmental factors are
implicated in the group differences, Rowe and Cleveland designed a study that
used the kind of structural equation modeling methodology (with the biometric
factor model) mentioned previously. The study used full-siblings and
half-siblings to estimate the WGH for large samples of blacks and whites
(total N = 1,220) on three Peabody basic achievement tests (Reading
Recognition, Reading Comprehension, and general Mathematics). A previous study
had found that the heritability (WGH) of these tests averaged about .50 and
their average correlation with verbal IQ = .65. The achievement tests were
correlated among themselves about .75, indicating that they all share a large
common factor, with minor specificities for each subtest.
The default hypothesis that the difference between the black and white
group means on the single general achievement factor has the same genetic and
non-genetic causes that contribute to individual differences within each group
could not be rejected. The data fit the default model extremely well, with a
goodness-of-fit index of .98 (which, like a correlation coefficient, is scaled
from zero to one). The authors concluded that the genetic and environmental
sources of individual differences and of differences between racial means
appear to be identical. Compared to the white siblings, the black siblings had
lower means on both the genetic and the environmental components. To
demonstrate the sensitivity of their methodology, the authors substituted a
fake mean value for the real mean for whites on the Reading Recognition test
and did the same for blacks on the Math test. The fake white mean
approximately equaled the true black mean and vice versa. When the same
analysis was applied to the data set with the fake means, it led to a
clear-cut rejection of the default hypothesis. For the actual data set,
however, the BGH did not differ significantly from the WGH. The values of the
BGH were .66 to .74 for the verbal tests and .36 for the math test. On the
side of caution, the authors state, "These estimates, of course, are imprecise
because of sampling variation; they suggest that a part of the Black versus
White mean difference is caused by racial genetic differences, but that it
would take a larger study, especially one with more genetically informative
half-sibling pairs, to make such estimates quantitatively precise".
Regression to the Population Mean. In the 1860s, Sir Francis Galton
discovered a phenomenon that he first called reversion to the mean and later
gave it the more grandiloquent title the law of filial regression to
mediocrity. The phenomenon so described refers to the fact that, on every
quantitative hereditary trait that Galton examined, from the size of peas to
the size of persons, the measurement of the trait in the mature offspring of a
given parent (or both parents) was, on average, closer to the population mean
(for their own sex) than was that of the parent(s). An exceptionally tall
father, for example, had sons who were shorter than he; and an exceptionally
short father had sons who were taller than he. (The same for mothers and
daughters.)
This "regression to the mean" is probably better called regression toward
the mean, the mean being that of the subpopulation from which the parent and
offspring were selected. In quantitative terms, Galton's "law" predicts that
the more that variation in a trait is determined by genetic factors, the more
closely the degree of regression (from one parent to one child), on average,
approximates one-half. This is because an offspring receives exactly one-half
of its genes from each parent, and therefore the parent-offspring genetic
correlation equals .50. The corresponding phenotypic correlation, of course,
is subject to environmental influences, which may cause the phenotypic
parent-offspring correlation to be greater than or (more usually) less than the genetic
correlation of .50. The more that the trait is influenced by nongenetic
factors, the greater is the departure of the parent-offspring correlation from
.50. The average of the parent-child correlations for IQ reported in
thirty-two studies is +.42. Traits in which variation is almost completely
genetic, such as the number of fingerprint ridges, show a parent-offspring
correlation very near .50. Mature height is also quite near this figure, but
lower in childhood, because children attain their adult height at different
rates. (Differences in both physical and mental growth curves are also largely
genetic.)
Regression occurs for all degrees of kinship, its degree depending on the
genetic correlation for the given kinship. Suppose we measure individuals
(termed probands) selected at random from a given population and then measure
their relatives (all of the same degree of kinship to the probands). Then,
according to Galton's "law" and the extent to which the trait of interest is
genetically determined, the expected value (i.e., best prediction) of the
proband's relative (in standardized units, z) is ZR = rGZP. The expected
difference between a proband and his or her relative will be equal to (1 -
rG)ZP, where rG is the theoretical genetic correlation between relatives of a
given degree of kinship, ZP is the standardized phenotypic measurement of the
proband, and ZR is the predicted or expected measurement of the proband's
relative. It should be
emphasized that this prediction is statistical and therefore achieves a high
degree of accuracy only when averaged over a large number of pairs of
relatives. The standard deviation of the errors of prediction for individual
cases is quite large.
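In raw-score units, the prediction ZR = rGZP amounts to shrinking the proband's deviation from the population mean by the factor rG. A minimal sketch (the function name is ours; the mean of 100 is the conventional IQ mean, and .50 and .25 are the theoretical genetic correlations for full and half siblings):

```python
def expected_relative_score(proband, mean=100.0, r=0.50):
    """Best (least-squares) prediction of a relative's score:
    the proband's deviation from the mean, shrunk by the factor r."""
    return mean + r * (proband - mean)

# An IQ-130 proband with a full sibling (genetic correlation about .50):
print(expected_relative_score(130))          # 115.0
# A more distant relative (e.g., half sibling, r about .25) regresses further:
print(expected_relative_score(130, r=0.25))  # 107.5
```

As the text emphasizes, these are expected values: they are accurate only as averages over many pairs of relatives, and the prediction error for any single pair is large.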
A common misconception is that regression to the mean implies that the
total variance in the population shrinks from one generation to the next,
until eventually everyone in the population would be located at the mean on a
given trait. In fact, the population variance does not change at all as a
result of the phenomenon of regression. Regression toward the mean works in
both directions. That is, offspring with phenotypes extremely above (or below)
the mean have parents whose phenotypes are less extreme, but are, on average,
above (or below) the population mean. Regression toward the mean is a
statistical result of the imperfect correlation between relatives, whatever
the causes of the imperfect correlation, of which there may be many.
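This point is easy to verify by simulation (illustrative Python; the correlation of .50 is the theoretical parent-offspring value discussed above): extreme parents have less extreme offspring on average, yet the offspring generation's variance is undiminished.

```python
import random
import statistics

rng = random.Random(0)
r = 0.5  # parent-offspring correlation (standardized units)

parents = [rng.gauss(0, 1) for _ in range(20000)]
# Offspring = r * parent + an independent remainder, scaled so that the
# offspring distribution has the same variance as the parents'.
offspring = [r * p + (1 - r ** 2) ** 0.5 * rng.gauss(0, 1) for p in parents]

# Extreme parents have less extreme offspring on average (regression) ...
tall = [o for p, o in zip(parents, offspring) if p > 1.5]
print(round(statistics.mean(tall), 2))           # well below 1.5
# ... yet the population variance does not shrink across generations.
print(round(statistics.variance(parents), 2),
      round(statistics.variance(offspring), 2))  # both near 1.0
```

Regression and constant variance coexist because the offspring of unexceptional parents occasionally land in the tails, exactly replenishing what the extremes lose.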
Genetic theory establishes the genetic correlations between various
kinships and thereby indicates how much of the regression for any given degree
of kinship is attributable to genetic factors. Without the genetic prediction,
any particular kinship regression (or correlation) is causally not
interpretable. Resemblance between relatives could be attributed to any
combination of genetic and nongenetic factors.
Empirical determination of whether regression to the mean accords with the
expectation of genetic theory, therefore, provides another means of testing
the default hypothesis. Since regression can result from environmental as well
as from genetic factors (and always does to some extent, unless the trait
variation has perfect heritability [i.e., h2 = 1] and the phenotype is without
measurement error), the usefulness of the regression phenomenon based on only
one degree of kinship to test a causal hypothesis is problematic, regardless
of its purely statistical significance. However, it would be remarkable (and
improbable) if environmental factors consistently simulated the degree of
regression predicted by genetic theory across a number of degrees of kinship.
A theory that completely excludes any involvement of genetic factors in
producing an observed group difference offers no quantitative prediction as to
the amount of regression for a given kinship and is unable to explain certain
phenomena that are both predictable and explainable in terms of genetic
regression. For example, consider Figure 11.2 (p. 358) in the previous
chapter. It shows a phenomenon that has been observed in many studies and
which many people not familiar with Galton's "law" find wholly surprising. One
would expect, on purely environmental grounds, that the mean IQ difference
between black and white children should decrease at each successively higher
level of the parental socioeconomic status (i.e., education, occupational
level, income, cultural advantages, and the like). It could hardly be argued
that environmental advantages are not greater at higher levels of SES, in both
the black and the white populations. Yet, as seen in Figure 11.2, the black
and white group means actually diverge with increasing SES, although IQ
increases with SES for both blacks and whites. The specific form of this
increasing divergence of the white and black groups is also of some
theoretical interest: the black means show a significantly lower rate of
increase in IQ as a function of SES than do the white means. These two related
phenomena, black-white divergence and rate of increase in mean IQ as a
function of SES, are predictable and explainable in terms of regression, and
would occur even if there were no difference in IQ between the mean IQs of the
black and the white parents within each level of SES. These results are
expected on purely genetic grounds, although environmental factors also are
most likely involved in the regression. For a given parental IQ, the offspring
IQs (regardless of race) regress about halfway to their population mean. As
noted previously, this is also true for height and other heritable physical
traits.
Probably the single most useful kinship for testing the default hypothesis
is full siblings reared together, because they are plentiful, they have
developed in generally more similar environments than have parents and their
own children, and they have a genetic correlation of about .50. I say "about
.50" because there are two genetic factors that tend slightly to alter this
correlation. As they work in opposite directions, their effects tend to cancel
each other. When the total genetic variance includes nonadditive genetic
effects (particularly genetic dominance) it slightly decreases the genetic
correlation between full siblings, while assortative mating (i.e., correlation
between the parents' genotypes) slightly increases the sibling correlation.
Because of nongenetic factors, the phenotypic correlation between siblings is
generally below the genetic correlation. Meta-analyses of virtually all of the
full-sibling IQ correlations reported in the world literature yield an overall
average r of only slightly below the predicted +.50.
Some years ago, an official from a large school system came to me with a
problem concerning the school system's attempt to find more black children who
would qualify for placement in classes for the "high potential" or
"academically gifted" pupils (i.e., IQ of 120 or above). Black pupils were
markedly underrepresented in these classes relative to whites and Asians
attending the same schools. Having noticed that a fair number of the white and
Asian children in these classes had a sibling who also qualified, the school
system tested the siblings of the black pupils who had already been placed in
the high-potential classes. However, exceedingly few of the black siblings in
regular classes were especially outstanding students or had IQ scores that
qualified them for the high-potential program. The official, who was concerned
about bias in the testing program, asked if I had any other idea as to a
possible explanation for their finding. His results are in fact fully
explainable in terms of regression toward the mean.
I later analyzed the IQ scores on all of the full-sibling pairs in grades
one through six who had taken the same IQ tests (Lorge-Thorndike) normed on a
national sample in all of the fourteen elementary schools of another
California school district. As this study has been described more fully
elsewhere, I will only summarize here. There were over 900 white sibling pairs
and over 500 black sibling pairs. The sibling intraclass correlations for
whites and blacks were .40 and .38, respectively. The departure of these
correlations from the genetically expected value of .50 indicates that
nongenetic factors (i.e., environmental influences and unreliability of
measurement) affect the sibling correlation similarly in both groups. In this
school district, blacks and whites who were perfectly matched for a true-score
IQ of 120 had siblings whose average IQ was 113 for whites and 99 for blacks.
In about 33 percent of the white sibling pairs both siblings had an IQ of 120
or above, as compared with only about 12 percent of black siblings.
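These sibling means follow directly from regression toward each group's own mean. A sketch using the study's sibling correlations of .40 and .38 but hypothetical subgroup means (the district means are not reported here; the reported white sibling mean of 113 would imply a local white mean somewhat above 100):

```python
def expected_sibling_iq(proband_iq, group_mean, sibling_r):
    """Expected sibling IQ: regression toward the group's own mean."""
    return group_mean + sibling_r * (proband_iq - group_mean)

# Sibling correlations .40 and .38 are from the study; the group means
# of 100 and 85 are illustrative assumptions, not the district's values.
white = expected_sibling_iq(120, group_mean=100, sibling_r=0.40)
black = expected_sibling_iq(120, group_mean=85, sibling_r=0.38)
print(round(white, 1), round(black, 1))  # 108.0 98.3
```

Under these assumed means, the predicted black sibling mean of about 98 is close to the reported 99; the point of the exercise is that matched probands from groups with different means must have differently regressed siblings.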
Of more general significance, however, was the finding that Galton's "law"
held true for both black and white sibling pairs over the full range of IQs
(approximately IQ 50 to IQ 150) in this school district. In other words, the
sibling regression lines for each group showed no significant deviation from
linearity. (Including nonlinear transformations of the variables in the
multiple regression equation produced no significant increment in the simple
sibling correlation.) These regression findings can be regarded, not as a
proof of the default hypothesis, but as wholly consistent with it. No purely
environmental theory would have predicted such results. Of course, ex post
facto and ad hoc explanations in strictly environmental terms are always
possible if one postulates environmental influences on IQ that perfectly mimic
the basic principles of genetics that apply to every quantitative physical
characteristic observed in all sexually reproducing plants and animals.
A number of different mental tests besides IQ were also given to the pupils
in the school district described above. They included sixteen age-normed
measures of scholastic achievement in language and arithmetic skills,
short-term memory, and a speeded paper-and-pencil psychomotor test that mainly
reflects effort or motivation in the testing situation. Sibling intraclass
correlations were obtained on each of the sixteen tests. IQ, being the most g
loaded of all the tests, had the largest sibling correlation. All sixteen of
the sibling correlations, however, fell below +.50 to varying degrees; the
correlations ranged from .10 to .45, averaging .30 for whites and .28 for
blacks. (For comparison, the average age-adjusted sibling correlations for
height and weight in this sample were .44 and .38, respectively.) Deviations
of these sibling correlations from the genetic correlation of .50 are an
indication that the test score variances do reflect nongenetic factors to
varying degrees. Conversely, the closer the obtained sibling correlation
approaches the expected genetic correlation of .50, the larger its genetic
component. These data, therefore, allow two predictions, which, if borne out,
would be consistent with the default hypothesis: (1.) The varying magnitudes
of the sibling correlations on the sixteen diverse tests in blacks and whites
should be positively correlated. In fact, the correlation between the vector
of sixteen black sibling correlations and the corresponding vector of sixteen
white sibling correlations was r = +.71, p = .002. (2.) For both blacks and
whites, there should be a positive correlation between (a) the magnitudes of
the sibling correlations on the sixteen tests and (b) the magnitudes of the
standardized mean W-B differences on the sixteen tests. The results show that
the correlation between the standardized mean W-B differences on the sixteen
tests and the sibling correlations is r = +.61, p < .013 for blacks, and r =
+.80, p < .001 for whites.
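Both predictions are tested by correlating vectors of statistics across tests. A sketch of the computation with invented values (the sixteen actual sibling correlations and mean differences are not listed in the text; only the method is shown):

```python
def pearson(xs, ys):
    """Pearson correlation between two equal-length vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical values for a handful of tests: each test's sibling
# correlation paired with its standardized mean W-B difference.
sibling_r   = [0.45, 0.40, 0.35, 0.30, 0.22, 0.15, 0.10]
mean_diff_d = [1.00, 0.95, 0.80, 0.70, 0.55, 0.40, 0.30]

# The default hypothesis predicts a positive vector correlation.
print(round(pearson(sibling_r, mean_diff_d), 2))
```

Each element of the vectors is itself a statistic computed over many pupils, so the vector correlation asks whether the tests that are most genetically loaded (highest sibling correlation) are also the tests showing the largest group differences.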
Note that with regard to the second prediction, a purely environmental
hypothesis of the mean W-B differences would predict a negative correlation
between the magnitudes of the sibling correlations and the magnitudes of the
mean W-B differences. The results in fact showing a strong positive
correlation contradict this purely nongenetic hypothesis.
CONTROLLING THE ENVIRONMENT: TRANSRACIAL ADOPTION
Theoretically, a transracial adoption study should provide a strong test of
the default hypothesis. In reality, however, a real-life adoption study can
hardly meet the ideal conditions necessary to make it definitive. Such
conditions can be perfectly met only through the cross-fostering methods used
in animal behavior genetics, in which probands can be randomly assigned to
foster parents. Although adoption in infancy is probably the most
comprehensive and powerful environmental intervention possible with humans,
under natural conditions the adoption design is unavoidably problematic
because the investigator cannot experimentally control the specific selective
factors that affect transracial adoptions -- the adopted children themselves,
their biological parents, or the adopting parents. Prenatal and perinatal
conditions and the preadoption environment are largely uncontrolled. So, too,
is the willingness of parents to volunteer their adopted children for such a
study, which introduces an ambiguous selection factor into the subject
sampling of any adoption study. It is known that individuals who volunteer as
subjects in studies that involve the measurement of mental ability generally
tend to be somewhat above-average in ability. For these reasons, and given the
scarcity of transracial adoptions, few such studies have been reported in the
literature. Only one of these, known as the Minnesota Transracial Adoption
Study, is based on large enough samples of black and white adoptees to permit
statistical analysis. While even the Minnesota Study does not meet the
theoretically ideal conditions, it is nevertheless informative with respect to
the default hypothesis.
Initiated and conducted by Sandra Scarr and several colleagues, the
Minnesota Transracial Adoption Study examined the same groups of children when
they were about age 7 and again in a 10-year follow-up when they were about
age 17. The follow-up study is especially important, because it has been found
in other studies that family environmental influences on IQ decrease from
early childhood to late adolescence, while there is a corresponding increase
in the phenotypic expression of the genetic component of IQ variance.
Therefore, one would have more confidence in the follow-up data (obtained at
age 17) as a test of the default hypothesis than in the data obtained at age
7.
Four main groups were compared on IQ and scholastic performance:
1. Biological offspring of the white adoptive parents.
2. Adopted children whose biological father and mother were both white
(WW).
3. Adopted interracial children whose biological fathers were black and
whose mothers were white (BW).
4. Adopted children whose biological father and mother were both black
(BB).
The adoptive parents were all upper-middle class, employed in professional
and managerial occupations, with an average educational level of about sixteen
years (college graduate) and an average WAIS IQ of about 120. The biological
parents of the BB and BW adoptees averaged 11.5 years and 12.5 years of
education, respectively. The IQs of the adoptees' biological parents were not
known. Few of the adoptees ever lived with their biological parents; some
lived briefly in foster homes before they were legally adopted. The average
age of adoption was 32 months for the BB adoptees, 9 months for the BW
adoptees, and 19 months for the WW adoptees. The adoptees came mostly from the
North Central and North Eastern regions of the United States. The
Stanford-Binet and the Wechsler Intelligence Scale for Children (WISC) were
used in the first study (at age seven); the Wechsler Adult Intelligence Scale
(WAIS) was used in the follow-up study (at age seventeen).
The investigators hypothesized that the typical W-B IQ difference results
from the lesser relevance of the specific information content of IQ tests to
the blacks' typical cultural environment. They therefore suggest that if black
children were reared in a middle or upper-middle class white environment they
would perform near the white average on IQ tests and in scholastic
achievement. This cultural-difference hypothesis therefore posits no genetic
effect on the mean W-B IQ difference; rather, it assumes equal black and white
means in genotypic g. The default hypothesis, on the other hand, posits both
genetic and environmental factors as determinants of the mean W-B IQ
difference. It therefore predicts that groups of black and white children
reared in highly similar environments typical of the white middle-class
culture would still differ in IQ to the extent expected from the heritability
of IQ within either population.
The data of the Minnesota Study also allow another prediction based on the
default hypothesis, namely, that the interracial children (BW) should score,
on average, nearly (but not necessarily exactly) halfway between the means of
the WW and BB groups. Because the alleles that enhance IQ are genetically
dominant, we would expect the BW group mean to be slightly closer to the mean
of the WW group than to the mean of the BB group. That is, the heterosis
(outbreeding enhancement of the trait) due to dominance deviation would raise
the BW group's mean slightly above the midpoint between the BB and WW groups.
(This halfway point would be the expected value if the heritability of IQ
reflected only the effects of additive genetic variance.) Testing this
predicted heterotic effect is unfortunately compromised by the fact that the IQs
of the biological parents of the BB and BW groups were not known. As the BB
biological parents had about one year less education than the BW parents,
given the correlation between IQ and education, it is likely that the mean IQ
of the BB parents was somewhat lower than the mean IQ of the BW parents, and
so would produce a result similar to that predicted in terms of heterosis. It
is also possible, though less likely, that the later age of adoption (by
twenty-one months) of the BB adoptees than of the BW adoptees would produce an
effect similar to that predicted in terms of heterosis.
The results based on the subjects who were tested on both occasions are
shown in Table 12.5. Because different tests based on different
standardization groups were used in the first testing than were used in the
follow-up testing, the overall average difference of about eight IQ points
(evident for all groups) between the two test periods is of no theoretical
importance for the hypothesis of interest. The only important comparisons are
those between the WW, BW, and BB adopted groups within each age level. They
show that:
* The biological offspring have about the same average IQ as has been
reported for children of upper-middle-class parents. Their IQs are lower, on
average, than the average IQ of their parents, consistent with the expected
genetic regression toward the population mean (mainly because of genetic
dominance, which is known to affect IQ -- see Chapter 7, pp. 189-91). The
above-average environment of these adoptive families probably counteracts the
predicted genetic regression effect to some extent, expectably more at age
seven than at age seventeen.
* The BB adoptees' mean IQ is close to the mean IQ of ninety for blacks in
the same North Central area (from which the BB adoptees came) reared by their
own parents. At age seventeen the BB group's IQ is virtually identical to the
mean IQ of blacks in the North Central part of the United States. Having been
reared from two years of age in a white upper-middle-class environment has
apparently had little or no effect on their expected IQ, that is, the average
IQ of black children reared in the average black environment. This finding
specifically contradicts the expectation of the cultural-difference
explanation of the W-B IQ difference, but is consistent with the default
hypothesis.
* The BB group is more typical of the U.S. black population than is the BW
group. The BB group's IQ at age seventeen is sixteen points below that of the
white adoptees and thirteen points below the mean IQ of whites in the national
standardization sample of the WAIS. Thus the BB adoptees' IQ is not very
different from what would be expected if they were reared in the average
environment of blacks in general (i.e., IQ eighty-five).
* The mean IQ of the interracial adoptees (BW), both at ages seven and
seventeen, is nearly intermediate between the WW and BB adoptees, but falls
slightly closer to the WW mean. This is consistent with, but does not prove,
the predicted heterotic effect of outbreeding on IQ. The intermediate IQ at
age seven is (WW + BB)/2 = (117.6 + 95.4)/2 = 106.5, or three points below the
observed IQ of the BW group; at age seventeen the intermediate IQ is 97.5, or
one point below the observed IQ of the BW group. Of course, mean deviations of
this magnitude, given the sample sizes in this study, are not significant.
Hence no conclusion can be drawn from these data regarding the predicted
heterotic effect. But all of the group IQ means do differ significantly from
one another, both at age seven and at age seventeen, and the fact that the BW
adoptees are so nearly intermediate between the WW and BB groups is hard to
explain in purely environmental or cultural terms. But it is fully consistent
with the genetic prediction. An ad hoc explanation would have to argue for the
existence of some cultural effects that quantitatively simulate the prediction
of the default hypothesis, which is derived by simple arithmetic from accepted
genetic theory.
* Results similar to those for IQ were also found for scholastic
achievement measured at age seventeen, except that the groups differed
slightly less on the scholastic achievement measures than on IQ. This is
probably because the level of scholastic achievement is generally more
susceptible to family influences than is the IQ. The mean scores based on the
average of five measures of scholastic achievement and aptitude expressed on
the same scale as the IQ were: Nonadopted biological offspring = 107.2, WW
adoptees = 103.1, BW adoptees = 100.1, BB adoptees = 95.1. Again, the BW
group's mean is but one point above the midpoint between the means of the WW
and BB groups.
In light of what has been learned from many other adoption studies, the
results of this transracial adoption study are hardly surprising. As was noted
in Chapter 7 (pp. 177-79), adoption studies have shown that the between-family
(or shared) environment is the smallest component of true-score IQ variance by
late adolescence.
It is instructive to consider another adoption study by Scarr and Weinberg,
based on nearly 200 white children who, in their first year of life, were
adopted into 104 white families. Although the adoptive families ranged rather
widely in socioeconomic status, by the time the adoptees were adolescents
there were nonsignificant and near-zero correlations between the adoptees' IQs
and the characteristics of their adoptive families, such as the parents'
education, IQ, occupation, and income. Scarr and Weinberg concluded that,
within the range of "humane environments," variations in family socioeconomic
characteristics and in child-rearing practices have little or no effect on IQ
measured in adolescence. Most "humane environments," they claimed, are
functionally equivalent for the child's mental development.
In the transracial adoption study, therefore, one would not expect that the
large differences between the mean IQs of the WW, BW, and BB adoptees would
have been mainly caused by differences in the unquestionably humane and
well-above-average adoptive family environments in which these children grew
up. Viewed in the context of adoption studies in which race is not a factor,
the group differences observed in the transracial adoption study would be
attributed to genetic factors.
There is simply no good evidence that social environmental factors have a
large effect on IQ, particularly in adolescence and beyond, except in cases of
extreme environmental deprivation. In the Texas Adoption Study, for example,
adoptees whose biological mothers had IQs of 95 or below were
compared with adoptees whose biological mothers had IQs of 120 or above.
Although these children were given up by their mothers in infancy and all were
adopted into good homes, the two groups differed by 15.7 IQ points at age 7
years and by 19 IQ points at age 17. These mean differences, which are about
one-half of the mean difference between the low-IQ and high-IQ biological
mothers of these children, are close to what one would predict from a simple
genetic model according to which the standardized regression of offspring on
biological parents is .50.
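The arithmetic of this prediction can be sketched in a few lines. The mother-group means below are hypothetical stand-ins, since the text reports only the offspring differences and the regression coefficient:

```python
# Sketch of the simple genetic model cited above: the expected offspring
# gap is the parental gap scaled by the standardized regression of
# offspring on biological parents (.50).

def predicted_offspring_gap(parent_gap, regression=0.50):
    """Expected mean IQ difference between two groups of offspring."""
    return regression * parent_gap

# Hypothetical mother means consistent with the study's description
# (low-IQ group <= 95, high-IQ group >= 120): a 36-point maternal gap
# predicts an 18-point offspring gap, within the observed 15.7-19 range.
print(predicted_offspring_gap(126 - 90))
```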
In still another study, Turkheimer used a quite clever adoption design in
which each of the adoptee probands was compared against two nonadopted
children, one who was reared in the same social class as the adopted proband's
biological mother, the other who was reared in the same social class as the
proband's adoptive mother. (In all cases, the proband's biological mother was
of lower SES than the adoptive mother.) This design would answer the question
of whether a child born to a mother of lower SES background and adopted into a
family of higher SES background would have an IQ that is closer to children
who were born and reared in a lower SES background than to children born and
reared in a higher SES background. The result: the proband adoptees' mean IQ
was nearly the same as the mean IQ of the nonadopted children of mothers of
lower SES background but differed significantly (by more than 7.5 IQ points)
from the mean IQ of the nonadopted children of mothers of higher SES
background. In other words, the adopted probands, although reared by adoptive
mothers of higher SES than that of the probands' biological mothers, turned
out about the same with respect to IQ as if they had been reared by their
biological mothers, who were of lower SES. Again, it appears that the family
social environment has a surprisingly weak influence on IQ. This broad factor
therefore would seem to carry little explanatory weight for the IQ differences
between the WW, BW, and BB groups in the transracial adoption study.
There is no evidence that the effect of adoption is to lower a child's IQ
from what it would have been if the child were reared by its own parents, and
some evidence indicates the contrary. Nor is there evidence that transracial
adoption per se is disadvantageous for cognitive development. Three
independent studies of Asian children (from Cambodia, Korea, Thailand, and
Vietnam) adopted into white families in the United States and Belgium have
found that, by school age, their IQ (and scholastic achievement), on average,
exceeds that of middle-class white American and Belgian children by at least
ten IQ points, despite the fact that many of the Asian children
had been diagnosed as suffering from malnutrition prior to adoption.
The authors of the Minnesota Study suggest the difference in age of
adoption of the BB and BW groups (32 months and 9 months, respectively) as a
possible cause of the lower IQ of the BB group (by 12 points at age 7, 9
points at age 17). The children were in foster care prior to adoption, but
there is no indication that the foster homes did not provide a humane
environment. A large-scale study specifically addressed to the effect of early
versus late age of adoption on children's later IQ did find that infants who
were adopted before one year of age had significantly higher IQs at age four
years than did children adopted after one year of age, but this difference
disappeared when the children were retested at school age. The adoptees were
compared with nonadopted controls matched on a number of biological, maternal,
prenatal, and perinatal variables as well as on SES, education, and race. The
authors concluded, "The adopted children studied in this project not only did
not have higher IQ than the [matched] controls, but also did not perform at
the same intellectual level as the biologic children from the same high
socioeconomic environment into which they were adopted. . . . the better
socioeconomic environment provided by adoptive parents is favorable for an
adopted child's physical growth (height and weight) and academic achievement
but has no influence on the child's head measurement and intellectual
capacity, both of which require a genetic influence."
In the Minnesota Transracial Adoption Study, multiple regression analyses
were performed to compare the effects of ten environmental variables with the
effects of two genetic variables in accounting for the IQ variance at age
seventeen in the combined black and interracial groups (i.e., BB & BW). The
ten environmental variables were those associated with the conditions of
adoption and the adoptive family characteristics (e.g., age of placement, time
in adoptive home, number of preadoptive placements, quality of preadoptive
placements, adoptive mother's and father's education, IQ, occupation, and
family income). The two genetic variables were the biological mother's race
and education. (The biological father's education, although it was known, was
not used in the regression analysis; if it were included, the results might
lend slightly more weight to the genetic variance accounted for by this
analysis.) The unbiased multiple correlation (R) between the ten environmental
variables and IQ was .28. The unbiased R between the two genetic variables and
IQ was .39. This is a fairly impressive correlation, considering that mother's
race was treated as a dichotomous variable with a 72% (BW mothers)/28% (BB
mothers) split. (The greater the departure from the optimal 50%/50% split, the
more restricted is the size of the obtained correlation. If the obtained
correlation of .39 were corrected to compensate for this suboptimal split, the
corrected value would be .43.) Moreover, mother's education (measured in
years) is a rather weak surrogate for IQ; it is correlated about +.7 with IQ
in the general population. (In the present sample, the biological mothers'
years of education in the BB group had a mean of 10.9, SD = 1.9 years, range
6-14 years; the BW group had a mean of 12.4, SD = 1.8, range 7-18.)
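The cited correction of the obtained R from .39 to .43 is consistent with the standard attenuation of a point-biserial correlation under an uneven split; the following is a minimal sketch, with the formula inferred from the reported values rather than stated in the text:

```python
import math

def correct_for_split(r, p):
    """Rescale a point-biserial correlation obtained with a p/(1 - p)
    dichotomy to what a maximally sensitive 50/50 split would yield.
    The attenuation factor relative to a 50/50 split is
    sqrt(p * (1 - p)) / 0.5."""
    return r * 0.5 / math.sqrt(p * (1 - p))

# Mother's race coded with a 72% / 28% split, obtained R = .39:
print(round(correct_for_split(0.39, 0.72), 2))  # 0.43, matching the text
```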
The two critiques, by Levin and by Lynn, of the authors'
social-environmental interpretation of the results of their follow-up study
are well worth reading, as is the authors' detailed reply, in which they
state, "We think that it is exceedingly implausible that these differences are
either entirely genetically based or entirely environmentally based."
STUDIES BASED ON RACIAL ADMIXTURE
In the Minnesota Transracial Adoption Study, the interracial adoptees
labeled BW (black father, white mother) had a mean IQ approximately
intermediate between those of the white (WW) and the black (BB) adoptees. One
might expect, therefore, that individual variation in IQ among the population
of black Americans would be correlated with individual variation in the
percentage of Caucasian admixture. (The mean percentage of European genes in
American blacks today is approximately 25 percent, with an undetermined
standard deviation for individual variation.) This prediction could be used to
test the hypothesis that blacks and whites differ in the frequencies of the
alleles whose phenotypic effects are positively correlated with g. The several
attempts to do so, unfortunately, are riddled with technical difficulties and
so are unable to reduce the uncertainty as to the nature of the mean W-B
difference in IQ.
An ideal study would require that the relative proportions of European and
African genes in each hybrid individual be known precisely. This, in turn,
would demand genealogical records extending back to each individual's earliest
ancestors of unmixed European and African origin. In addition, for the results
to be generalizable to the present-day populations of interest, one would also
need to know how representative of the white and black populations the
interracial ancestors of the study probands (i.e., the present hybrid
individuals whose level of g is measured) were in each generation. A high degree of
assortative mating for g, for example, would mean that these ancestors were
not representative and that cross-racial matings transmitted much the same
g-related alleles from each racial line. Also, the results would be ambiguous
if there were a marked systematic difference in the g levels of the black and
white mates (e.g., in half of the matings the black [or hybrid] g > white g
and vice versa in the other half). This situation would act to cancel any
racial effect in the offspring's level of g.
A large data set that met these ideal conditions would provide a strong
test of the genetic hypothesis. Unfortunately, such ideal data do not exist,
and are probably impossible to obtain. Investigators have therefore resorted
to estimating the degree of European admixture in representative samples of
American blacks by means of blood-group analyses, using those blood groups
that differ most in frequency between contemporary Europeans and Africans in
the regions of origin of the probands' ancestors. Each marker blood group is
identified with a particular polymorphic gene. Certain antigens or
immunoglobulins in the blood serum, which have different polymorphic gene
loci, are also used in the same way. The gene loci for all of the known human
blood groups constitute but a very small fraction of the total number of genes
in the human genome. To date, only two such loci, the Fy (Duffy) blood group
and the immunoglobulin Gm, have been identified that discriminate very
markedly between Europeans and Africans, with near-zero frequencies in one
population and relatively high frequencies in the other. A number of other
blood groups and blood serum antigens also discriminate between Europeans and
Africans, but with much less precision. T. E. Reed, an expert on the genetics
of blood groups, has calculated that a minimum of eighteen gene loci with
perfect discrimination power (i.e., 100 percent frequency in one population
and 0 percent in the other) are needed to determine the proportions of
European/African admixture with a 5 percent or less error rate for specific
individuals. This condition is literally impossible to achieve given the small
number of blood groups and serum antigens known to differ in racial
frequencies. However, blood group data, particularly that of Fy and Gm,
aggregated in reasonably large samples are capable of showing statistically
significant mean differences in mental test scores between groups if in fact
the mean difference has a genetic component.
A critical problem with this methodology is that we know next to nothing
about the level of g in either the specific European or African ancestors or
of the g-related selective factors that may have influenced mating patterns
over the many subsequent generations of the hybrid offspring, from the time of
the first African arrivals in America up to the present. Therefore, even if
most of the European blood-group genes in present-day American blacks had been
randomly sampled from European ancestors, the genes associated with g may not
have been as randomly sampled, if systematic selective mating took place
between the original ancestral groups or in the many generations of hybrid
descendants.
Another problem with the estimation of racial admixture from blood-group
frequencies is that most of the European genes in the American black gene pool
were introduced generations ago, mostly during the period of slavery.
According to genetic principles, the alleles of a particular racial origin
would become increasingly disassociated from one another in each subsequent
generation. The genetic result of this disassociation, which is due to the
phenomena known as crossing-over and independent segregation of alleles, is
that any allele that shows different frequencies in the ancestral racial
groups becomes increasingly less predictive of other such alleles in each
subsequent generation of the racially hybridized population. If a given blood
group of European origin is not reliably correlated with other blood groups of
European origin in a representative sample of hybrid individuals, we could
hardly expect it to be correlated with the alleles of European origin that
affect g. In psychometric terms, such a blood group would be said to have
little or no validity for ranking hybrid individuals according to their degree
of genetic admixture, and would therefore be useless in testing the hypothesis
that variation in g in a hybrid (black-white) population is positively
correlated with variation in amount of European admixture.
This disassociation among various European genes in black Americans was
demonstrated in a study based on large samples of blacks and whites in Georgia
and Kentucky. The average correlations among the seven blood-group alleles
that differed most in racial frequencies (out of sixteen blood groups tested)
were not significantly different from zero, averaging -.015 in the white
samples (for which the theoretically expected correlation is zero) and -.030
in the black samples. (Although the correlations between blood groups in
individuals were nil, the total frequencies of each of the various blood
groups were quite consistent [r=.88] across the Georgia and Kentucky samples.)
Gm was not included in this correlation analysis but is known to be correlated
with Fy. These results, then, imply that virtually all blood groups other than
Fy and Gm are practically useless for estimating the proportions of Caucasian
admixture in hybrid black individuals. It is little wonder, then, that, in
this study, the blood-group data from the hybrid black sample yielded no
evidence of being significantly or consistently correlated with g (which was
measured as the composite score on nineteen tests).
A similar study, but much more complex in design and analyses, by Sandra
Scarr and co-workers, ranked 181 black individuals (in Philadelphia) on a
continuous variable, called an "odds" index, estimated from twelve genetic
markers that indicated the degree to which an individual's genetic markers
resembled those of Africans without any Caucasian ancestry versus the genetic
markers of Europeans (without any African ancestry). This is probably an even
less accurate estimate of ancestral admixture than would be a direct measure
of the percentage of African admixture, which (for reasons not adequately
explained by the authors) was not used in this study, although it was used
successfully in another study of the genetic basis of the average white-black
difference in diastolic blood pressure. The "odds" index of African ancestry
showed no significant correlation with individual IQs. It also failed to
discriminate significantly between the means of the top and bottom one-third
of the total distribution on the "ancestral odds" index of Caucasian ancestry.
In brief, the null hypothesis (i.e., no relationship between hybrid mental
test score and amount of European ancestry) could not be rejected by the data
of this study. The first principal component of four cognitive tests yielded a
correlation of only -.05 with the ancestral index. Among these tests, the best
measure of fluid g, Raven matrices, had the largest correlation (-.13) with
the estimated degree of African ancestry. (In this study, a correlation of
±.14 would be significant at p < .05, one-tailed.) But even the correlation
between the ancestral odds index based on the three best genetic markers and
the ancestral odds index based on the remaining nine genetic markers was a
nonsignificant +.10. A measure of skin color (which has a much greater
heritability than mental test scores) correlated .27 (p < .01) with the index
of African ancestry. When skin color and SES were partialed out of the
correlation between ancestry and test scores, all the correlations were
reduced (e.g., the Raven correlation dropped from -.13 to -.10). Since both
skin color and SES have genetic components that are correlated with the
ancestral index and with test scores, partialing out these variables further
favors the null hypothesis by removing some of the hypothesized genetic
correlation between racial admixture and test scores.
It is likely that the conclusions of this study constitute what
statisticians refer to as Type II error, acceptance of the null hypothesis
when it is in fact false. Although these data cannot reject the null
hypothesis, it is questionable whether they are capable in fact of rejecting
an alternative hypothesis derived from the default theory. The specific
features of this data set severely diminish its power to reject the null
hypothesis. In a rather complex analysis, I have argued that the limitations
of this study (largely the lack of power due to the low validity of the
ancestral index when used with an insufficient sample size) would make it
incapable of rejecting not only the null hypothesis, but also any reasonable
alternative hypothesis. This study therefore cannot reduce the
heredity-environment uncertainty regarding the W-B difference in psychometric
g. In another instance of Type II error, the study even upholds the null
hypothesis regarding the nonexistence of correlations that are in fact well
established by large-scale studies. It concludes, for example, that there is
no significant correlation between lightness of skin color and SES of American
blacks, despite the fact that correlations significant beyond the .01 level
are reported in the literature, both for individuals' SES of origin and for
attained SES.
Skin Color and IQ. Earlier researchers relied on objective measures of skin
color as an index of the amount of African/European admixture. In sixteen out
of the eighteen studies of the IQ of American blacks in which skin color was
measured, the correlations between lightness of skin color and test scores
were positive (ranging from +.12 to +.30).
Although these positive correlations theoretically might well reflect the
proportion of Caucasian genes affecting IQ in the hybrid blacks, they are weak
evidence, because skin color is confounded with social attitudes that may
influence IQ or its educational and occupational correlates. It is more likely
that the correlations are the result of cross-assortative mating for skin
color and IQ, which would cause these variables to be correlated in the black
population. (There is no doubt that assortative mating for skin color has
taken place in the black population.) The same is of course true for the other
visible racial characteristics that may be correlated with IQ. If, in the
black population, lighter skin color (or a generally more Caucasoid
appearance) and higher IQ (or its correlates: education, occupation, SES) are
both considered desirable in a mate, they will be subject to assortative
mating and to cross-assortative mating for the two characteristics, and the
offspring would therefore tend to possess both characteristics. But any
IQ-enhancing genes are as likely to have come from the African as from the
European ancestors of the hybrid descendants.
In general, skin color and the other visible physical aspects of racial
differences are unpromising variables for research aimed at reducing the
heredity-environment uncertainty of the causal basis of the average W-B
difference in g.
Black-White Hybrids in Post-World War II Germany. We saw in the Minnesota
Transracial Adoption Study that the interracial (BW) adoptees, whose
biological fathers were black and whose biological mothers were white,
averaged lower in IQ than the adoptees who had two white parents (WW). This
finding appears to be at odds with the study conducted by Eyferth in Germany
following World War II, which found no difference between offspring of BW and
WW matings who were reared by their biological mothers. All of the fathers
(black or white) were members of the U.S. occupation forces stationed in
Germany. The mothers were unmarried German women, mostly of low SES. There
were about ninety-eight interracial (BW) children and about eighty-three white
children (WW). The mothers of the BW and WW children were approximately
matched for SES. The children averaged about 10 years of age, ranging between
ages 5 and 13 years. They all were tested with the German version of the
Wechsler Intelligence Scale for Children (WISC). The results are shown in
Table 12.6. The overall WW-BW difference is only one IQ point. As there is no
basis for expecting a difference between boys and girls (whose average IQs are
equal in the WISC standardization sample), the eight-point difference between
the WW boys and WW girls in this study is most likely due to sampling error.
But sampling error does not only result in sample differences that are larger
than the corresponding population difference; it can also result in sample
differences that are smaller than the population difference, and this could be
the case for the overall mean WW-BW difference.
This study, although consistent with a purely environmental hypothesis of
the racial difference in test scores, is not conclusive, however, because the
IQs of the probands' mothers and fathers were unknown and the white and black
fathers were not equally representative of their respective populations, since
about 30 percent of blacks, as compared with about 3 percent of whites, failed
the preinduction mental test and were not admitted into the armed services.
Further, nothing was known about the Army rank of the black or white fathers
of the illegitimate offspring; they could have been more similar in IQ than
the average black or white in the occupation forces because of selective
preferences on the part of the German women with whom they had sexual
relations. Then, too, nearly all of the children were tested before
adolescence, which is before the genotypic aspect of IQ has become fully
manifested. Generally in adoption studies, the correlation of IQ and genotype
increases between childhood and late adolescence, while the correlation
between IQ and environment decreases markedly. Finally, heterosis (the
outbreeding effect; see Chapter 7, p. 196) probably enhanced the IQ level of
the interracial children, thereby diminishing the IQ difference between the
interracial children and the white children born to German women. A heterotic
effect equivalent to about +4 IQ points was reported for European-Asian
interracial offspring in Hawaii.
Genetic Implications of IQ and Fertility for Black and White Women.
Fertility is defined as the number of living children a woman (married or
unmarried) gives birth to during her lifetime. If, in a breeding population,
IQ (and therefore g) is consistently correlated with fertility, it will have a
compounded effect on the trend of the population's mean IQ in each generation
-- an increasing trend if the correlation is positive, a decreasing trend if
it is negative (referred to as positive or negative selection for the trait).
This consequence naturally follows from the fact that mothers' and children's
IQs are correlated, certainly genetically and usually environmentally.
If IQ were more negatively correlated with fertility in one population than
in another (for example, the American black and white populations), over two
or more generations the difference between the two populations' mean IQs would
be expected to diverge increasingly in each successive generation. Since some
part of the total IQ variance within each population is partly genetic (i.e.,
the heritability), the intergenerational divergence in population means would
also have to be partly genetic. It could not be otherwise, unless one assumed
that the mother-child correlation for IQ is entirely environmental (an
assumption that has been conclusively ruled out by adoption studies).
Therefore, in each successive generation, as long as there is a fairly
consistent difference in the correlation between IQ and fertility for the
black and white populations, some part of the increasing mean group difference
in IQ is necessarily genetic. If fertility is negatively correlated with a
desirable trait that has a genetic component, IQ for example, the trend is
called dysgenic; if positively correlated, eugenic.
The phenomenon of regression toward the population mean (see Chapter 12,
pp. 467-72) does not mitigate a dysgenic trend. Regression to the mean does
not predict that a population's genotypic mean in one generation regresses
toward the genotypic mean of the preceding generation. In large populations,
changes in the genotypic mean of a given trait from one generation to the next
can come about only through positive (or negative) selection for that trait,
that is, by changes in the proportions of the breeding population that fall
into different intervals of the total distribution of the trait in question.
It is also possible that a downward genetic trend can be phenotypically
masked by a simultaneous upward trend in certain environmental factors that
favorably affect IQ, such as advances in prenatal care, obstetrical practices,
nutrition, decrease in childhood diseases, and education. But as the positive
effect of these environmental factors approaches asymptote, the downward
dysgenic trend will continue, and the phenotypic (IQ) difference between the
populations will begin to increase.
Is there any evidence for such a trend in the American black and white
populations? There is, at least for the last half of this century, the period
for which the relevant U.S. Census data have been available.
A detailed study based on data from the U.S. Census Bureau and affiliated
agencies was conducted by Daniel Vining, a demographer at the University of
Pennsylvania. His analyses indicate that, if IQ is, to some degree, heritable
(which it is), then throughout most of this century (and particularly since
about 1950) there has been an overall downward trend in the genotypic IQ of
both the white and the black populations. The trend has been more unfavorable
for the black population.
But how could the evidence for a downward trend in the genotypic component
of IQ be true, when other studies have shown a gradual rise in phenotypic IQ
over the past few decades? (This intergenerational rise in IQ, known as the
"Flynn effect," is described in Chapter 10, pp. 318-22). Since the evidence
for both of these effects is solid, the only plausible explanation is that the
rapid improvement in environmental conditions during this century has offset
and even exceeded the dysgenic trend. However, this implies that the effect of
the dysgenic trend should become increasingly evident at the phenotypic level
as improvements in the environmental factors that enhance mental development
approach their effective asymptote for the whole population.
Table 12.7 shows the fertility (F) of white and black women within each one
standard deviation interval of the total distribution of IQ in each
population. (The average fertility estimates include women who have had
children and women who have not had any children by age thirty-four.) Assuming
a normal distribution (which is closely approximated for IQ within the range
of ±2σ), the table also shows: (a) the estimated proportion (P) of the
population within each interval, (b) the product of F X P, and (c) the mean IQ
of the women within each interval. The average fertility in each of the IQ
intervals and the average IQs in those intervals are negatively correlated
(-.86 for whites, -.96 for blacks), indicating a dysgenic trend in both
populations, though stronger in the black population.
Now, as a way of understanding the importance of Table 12.7, let us suppose
that the mean IQ for whites was 100 and the mean IQ for blacks was 85 in the
generation preceding that of the present sample of women represented in Table
12.7. Further, suppose that in that preceding generation the level of
fertility was the same within each IQ interval. Then their offspring (that is,
the present generation) would have an overall mean IQ equal to the weighted
mean of the average IQ within each IQ interval (the weights being the
proportion, P, of the population falling within each IQ interval). These means
would also be 100 and 85 for the white and black populations,
respectively.
But now suppose that in the present generation there is negative selection
for IQ, with the fertility of the women in each IQ interval exactly as shown
in Table 12.7. (This represents the actual condition in 1978 as best as we can
determine.)
What then will be the overall mean IQ of the subsequent generation of
offspring? The weights that must be used in the calculation are the products
of the average fertility (F) in each interval and the proportion (P) of women
in each interval (i.e., the values of F X P shown in Table 12.7). The
predicted overall weighted mean IQ, then, turns out to be 98.2 for whites and
82.6 for blacks, a drop of 1.8 IQ points and of 2.4 IQ points, respectively.
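The weighted-mean calculation just described can be sketched directly. Table 12.7's actual fertility figures are not reproduced in this text, so the declining rates below are hypothetical; the interval proportions and interval mean IQs follow from the normal-distribution assumption stated above:

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def weighted_mean_iq(mu, sigma, fertility, cuts):
    """Offspring-generation mean IQ as the F x P weighted average of the
    interval mean IQs, with the proportions P and the within-interval
    means derived from the normal model (truncated-normal means)."""
    total_w = 0.0
    total_wiq = 0.0
    for F, a, b in zip(fertility, cuts[:-1], cuts[1:]):
        P = Phi(b) - Phi(a)                     # proportion of women in interval
        m = mu + sigma * (phi(a) - phi(b)) / P  # mean IQ within interval
        total_w += F * P
        total_wiq += F * P * m
    return total_wiq / total_w

inf = float("inf")
cuts = [-inf, -2, -1, 0, 1, 2, inf]  # one-SD intervals, as in Table 12.7
flat = [2.0] * 6                     # equal fertility in every interval
declining = [2.6, 2.4, 2.2, 2.0, 1.8, 1.6]  # hypothetical dysgenic pattern

print(weighted_mean_iq(100, 15, flat, cuts))       # 100.0: no selection
print(weighted_mean_iq(100, 15, declining, cuts))  # below 100
```

With equal fertility the offspring mean reproduces the parental mean exactly; any monotonically declining fertility pattern pulls it downward, which is the mechanism behind the 98.2 and 82.6 figures cited above.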
The effect thus increases the W-B IQ difference from 15 IQ points in the
parent generation to 15.6 IQ points in the offspring generation -- an increase
in the W-B difference of 0.6 IQ points in a single generation. Provided that
IQ has substantial heritability within each population, this difference must
be partly genetic. So if blacks have had a greater relative increase in
environmental advantages that enhance IQ across the generations than whites
have had, the decline of the genetic component of the black mean would be
greater than the decline of the white genetic mean, because of environmental
masking, as previously explained. We do not know just how many generations
this differential dysgenic trend has been in effect, but extrapolated over
three or four generations it would have worsening consequences for the
comparative proportions in each population that fall above or below 100 IQ.
(Of course, fertility rates could change in the positive direction, but so far
there is no evidence of this.) In the offspring generation of the population
samples of women shown in Table 12.7, the percentage of each population
above/below IQ 100 would be: whites 43.6%/56.4%, blacks 12.4%/87.6% (assuming
no increase in environmental masking between the generations). The W/B ratio
above 100 IQ is about 43.6%/12.4% = 3.5; the B/W ratio below 100 IQ is
87.6%/56.4% = 1.55. These ratios or any approximations of them would have
considerable consequences if, for example, an IQ of 100 is a critical cutoff
score for the better-paid types of employment in an increasingly technological
and information-intensive economy (see Chapter 14). Because generation time
(measured as mother's age at the birth of her first child) is about two years
less in blacks than in whites, the dysgenic trend would compound faster over
time in the black population than in the white. Therefore, the figures given
above probably underestimate any genetic component of the W-B IQ difference
attributable to differential fertility.
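As a check on the figures above, the tail proportions and ratios follow from the normal model. An SD of 15 within each population is an assumption here; the cited white figure of 43.6% evidently reflects the exact interval-weighted offspring distribution rather than a smooth normal:

```python
import math

def tail_above(cut, mu, sigma=15.0):
    """P(IQ > cut) under a normal model with the given mean and SD."""
    z = (cut - mu) / sigma
    return 0.5 * (1 - math.erf(z / math.sqrt(2)))

print(tail_above(100, 82.6))  # about 0.123, near the 12.4% cited for blacks
print(0.436 / 0.124)          # W/B ratio above IQ 100: about 3.5
print(0.876 / 0.564)          # B/W ratio below IQ 100: about 1.55
```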
This prediction follows from recent statistics on fertility rates. A direct
test of this effect would require a comparison of the average IQ of women in
one generation with the average IQ of all of their children who constitute the
next generation. Such cross-generational IQ data are available from the
National Longitudinal Survey of Youth (NLSY). Large numbers of youths,
including whites and blacks, originally selected as part of a nationally
representative sample of the U.S. population, were followed to maturity. The
mean IQ of the women in this group was compared with the mean IQ of their
school-age children. Whereas the mean IQ difference between the white and
black mothers in the study was 13.2 IQ points, the difference between the
white and black children was 17.5 IQ points. That is, the overall mean W-B IQ
difference in this sample had increased by about four IQ points in one
generation. As there is no indication that the children had been reared in
less advantaged environments than their mothers, this effect is most
reasonably attributable to the negative correlation between the mothers' IQs
and their fertility, which is more marked in the NLSY sample than in the
Census sample represented in Table 12.7. But I have not found any bona fide
data set that disconfirms either the existence of a dysgenic trend for IQ of
the population as a whole or the widening disparity in the mean W-B IQ
difference.
Racial Differences in Neonate Behavior. Although individual differences in
infant psychomotor behavior (i.e., reactivity to sensory stimulation, muscular
strength, and coordination) have very little, if any, correlation with mental
ability measured from about age three years and up (and therefore are not
directly relevant to individual or group differences in g), black and white
infants, both in Africa and in America, differ markedly in psychomotor
behavior even within the first few days and weeks after birth. Black neonates
are more precocious in psychomotor development, on average, than whites, who
are more precocious in this respect than Asians. This is true even when the
black, white, and Asian babies were born in the same hospital to mothers of
similar SES background who gave birth under the same obstetrical conditions.
Early precocity in motor behavior among blacks also appears to be positively
related to degree of African ancestry and is negatively related to their SES.
African blacks are more precocious than American blacks, and, at least in the
United States, black infants of lower SES are more precocious in motor
development than blacks of middle and upper-middle SES. (The same SES
relationship is also observed in whites.) These behavioral differences appear
so early (e.g., one or two days after delivery, when the neonates are still in
hospital and have had little contact with the mothers) that purely cultural or
environmental explanations seem unlikely. Substantiated in at least three
dozen studies, these findings constitute strong evidence for innate behavioral
differences between groups.
Relationship of Myopia to IQ and Race. In Chapter 6 it was noted that
myopia (nearsightedness) is positively correlated with IQ and that the
relationship appears to be pleiotropic, that is, a gene affecting one of the
traits also has some effect on the other trait. Further, there are significant
racial and ethnic differences in the frequency of myopia. Among the major
racial groups measured, the highest rates of myopia are found in Asians
(particularly Chinese and Japanese); the lowest rates among Africans; and
Europeans are intermediate. Among Europeans, Jews have the highest rate of
myopia, about twice that of gentiles and about on a par with that of the
Asians. The same rank ordering of all these groups is found for the central
tendency of scores on highly g-loaded tests, even when these groups have had
comparable exposure to education. Cultural and environmental factors, except
as they may have had an evolutionary impact in the distant past, cannot
adequately explain the differences found among contemporary populations. Among
populations of the same ethnic background, no relationship has been found
between myopia and literacy. Comparisons of groups of the same ethnicity who
learned to read before age twelve with those who learned after age twelve
showed no difference in rates of myopia.
Table 12.8 shows the results of preinduction examinations of random samples
of 1,000 black and 11,000 white draftees for the U.S. Armed Services who were
diagnosed as (a) mildly myopic and accepted for service, and (b) too severely
myopic to be accepted. As myopia (measured in diopters) is approximately
normally distributed in the population, the percentages of whites and blacks
diagnosed as myopic can also be expressed in terms of their deviations from
the population mean in standard deviation (s) units. These average deviations
are shown on the right side of Table 12.8. They indicate the approximate
cutoff points (in s units) for the diagnosis of mild and of severe myopia in
the total frequency distribution of refractive error (extending from extreme
hyperopia, or farsightedness [+3s], to emmetropia, or normal vision [0s], to
extreme myopia [-3s]). The last column in Table 12.8 shows the W-B difference
in the cutoff point for the diagnosis of myopia, which is 1s for all who had
either mild or severe myopia. Unfortunately, mental test scores on these
subjects were not reported, but from other studies one would expect the group
diagnosed as myopic to score about 0.5s higher than the nonmyopic. Studies in
Europe and in the United States have reported differences of about seven to
eight IQ points between myopes and nonmyopes.
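The conversion used in Table 12.8, from the percentage of a group falling beyond a diagnostic cutoff to that cutoff expressed in standard-deviation (s) units, is a standard inverse-normal calculation. A minimal sketch, using hypothetical percentages rather than the table's actual figures:

```python
from statistics import NormalDist

def cutoff_in_sd_units(percent_beyond_cutoff: float) -> float:
    """Given the percentage of a population falling at or beyond a
    diagnostic cutoff on an approximately normally distributed trait,
    return the implied cutoff in standard-deviation (s) units."""
    # The cutoff is the z-score above which the given fraction lies.
    return NormalDist().inv_cdf(1.0 - percent_beyond_cutoff / 100.0)

# Hypothetical rates (not the Table 12.8 data): a 15.87% diagnosis rate
# implies a cutoff near +1s; a 2.28% rate implies a cutoff near +2s.
print(round(cutoff_in_sd_units(15.87), 2))
print(round(cutoff_in_sd_units(2.28), 2))
```

The same transformation applied separately to two groups' diagnosis rates yields the between-group difference in cutoff points described in the text.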
Because myopia appears to be pleiotropic with IQ, the black-white
difference in myopia is consistent with the hypothesis of a genetic component
in the racial IQ difference. Further studies would be needed to make it an
importantly interesting hypothesis. For one thing, the pleiotropy of myopia is
not yet all that firmly established. Although one study provides fairly strong
evidence for it, confirming studies are needed before one can make any
inferences in regard to racial differences. More crucial, it is not known if
myopia and IQ are also pleiotropic in the black population; there are no
published studies of the correlation between IQ and myopia in blacks. Failure
to find such a relationship would nullify the hypothesis.
Other testable hypotheses could also be based on various highly heritable
physical traits that are correlated with g (see Chapter 6), some of which show
racial differences (e.g., the ability to taste phenylthiocarbamide, color
vision, visual acuity, susceptibility to perceptual illusions). But it is
first necessary to establish that the correlation of the physical trait with g
is pleiotropic within each racial group.
As each specific gene in the human genome related to g is discovered -- a
search that is getting underway -- a determination of the genes' frequencies
in different populations may make it possible to estimate the minimum
percentage of the between-race variance in g that has a genetic basis.
Assuming that the genetic research on quantitative trait loci already underway
continues apace, it is possible that the uncertainty regarding the existence,
and perhaps even the magnitude, of genetic group differences in g could
probably be resolved, should we so desire, within the first decade of the next
century.
ENVIRONMENTAL CAUSES OF GROUP DIFFERENCES IN g
From the standpoint of research strategy, it is sensible to ask where one
can best look for the environmental variables that are the most likely to
cause the nongenetic component of the black-white difference in g. The Factor
X hypothesis encourages a search for nongenetic factors that are unique to the
black-white difference and absent from individual differences among whites or
among blacks. The default hypothesis leads us to look at the same kinds of
environmental factors that contribute to g variance within each population as
causal factors in the g difference between groups.
Among the environmental factors that have been shown to be important within
either group, the between-families environmental variance markedly decreases
after childhood, becoming virtually nil by late adolescence (see Chapter 7,
pp. 179-81). In contrast, the within-family environmental variance remains
fairly constant from early childhood to maturity, when it accounts for nearly
all of the nongenetic variance and constitutes about 20 percent of the total
true-score variance in psychometric g. The macroenvironmental variables
responsible for the transient between-families variance in g would therefore
seem to be an unlikely source of the observed population difference in g. A
more likely source is the microenvironment that produces the within-family
variance. The macroenvironment consists of those aspects of interpersonal
behavior, values, customs, preferences, and life-style to which children are
exposed at home and which clearly differ between families and ethnic groups in
American society. The microenvironment consists of a great many small, often
random, events that take place in the course of prenatal and postnatal life.
Singly they have small effects on mental development, but in the aggregate
they may have a large cumulative effect on the individual. These
microenvironmental effects probably account for most of the nongenetic
variance in IQ that remains after childhood.
This difference in the potency and persistence of the macro- and
microenvironments has been consistently demonstrated in environmental
enrichment and intervention programs specifically intended to provide
underprivileged black children with the kinds of macroenvironmental advantages
typically experienced by white middle-class children. They include use of
educational toys and picture books, interaction with nurturing adults,
attendance in a preschool or cognitively oriented day-care center, early
adoption by well-educated white parents, and even extraordinarily intensive
cognitive development programs such as the Milwaukee Project and the
Abecedarian Project (Chapter 10, pp. 340-44). The effects of these programs on
IQ and scholastic performance have generally been short-lived, and it is still
debatable whether these improvements in the macroenvironment have actually
raised the level of g at all. This is not surprising if we consider that the
same class of environmental variables, largely associated with socioeconomic
status (SES), has so little, if any, positive effect on g or on IQ beyond
childhood within the white population. Recent research has shown that the
kinds of macroenvironmental factors typically used to describe differences
between white lower-middle class and white upper-middle class child-rearing
environments and long thought to affect children's cognitive development
actually have surprisingly little effect on IQ beyond childhood. The
macroenvironmental variables associated with SES, therefore, seem unlikely
sources of the black-white difference in g.
Hypothesizing environmental factors that are not demonstrably correlated
with IQ within one or both populations is useless from the standpoint of
scientific explanation. Unless an environmental variable can be shown to
correlate with IQ, it has no explanatory value. Many environment-IQ
correlations reported in the psychological literature, though real and
significant, can be disqualified, however, because the relevant studies
completely confound the environmental and the genetic causes of IQ variance.
Multiple correlations between a host of environmental assessments and
children's IQs ranging from below .50 to over .80 have been found for children
reared by their biological parents. But nearly all the correlations found in
these studies actually have a genetic basis. This is because children's IQs
have 50 percent of their genetic variance in IQ in common with their
biological parents, and the parents' IQs are highly correlated (usually about
.70) with the very environmental variables that supposedly cause the variance
in children's mental development. For children reared by adoptive parents for
whom there is no genetic relationship, these same environmental assessments
show little correlation with the children's IQs, and virtually zero
correlation when the children have reached adolescence. The kinds of
environmental variables that show little or no correlation with the IQs of the
children who were adopted in infancy, therefore, are not likely to be able to
explain IQ differences between subpopulations all living in the same general
culture. This is borne out by the study of transracial adoptions (reviewed
previously, pp. 472-78).
We can now review briefly the main classes of environmental variables that
have been put forth to explain the black-white IQ difference, and evaluate
each one in light of the above methodological criteria and the current
empirical evidence.
Socioeconomic Status. Measures of SES are typically a composite of
occupation, education, income, location of residence, membership in civic or
social organizations, and certain amenities in the home (e.g., telephone, TV,
phonograph, records, books, newspapers, magazines). Children's SES is that of
their parents. For adults, SES is sometimes divided into "attained SES" and
"SES of origin" (i.e., the SES of the parents who reared the individual). All
of these variables are highly correlated with each other and they share a
large general factor in common. Occupation (rank ordered on a scale from
unskilled labor to professional and managerial) has the highest loading on
this general SES factor.
The population correlations between SES and IQ for children fall in the
range .30 to .40; for adults the correlations are .50 to .70, increasing with
age as individuals approach their highest occupational level. There has
probably been a higher degree of social mobility in the United States than in
any other country. The attained SES of between one-third and one-half of the
adult population in each generation ends up either above or below their SES of
origin. IQ and the level of educational attainments associated with IQ are the
best predictors of SES mobility. SES is an effect of IQ rather than a cause.
If SES were the cause of IQ, the correlation between adults' IQ and their
attained SES would not be markedly higher than the correlation between
children's IQ and their parents' SES. Further, the IQs of adolescents adopted
in infancy are not correlated with the SES of their adoptive parents. Adults'
attained SES (and hence their SES as parents) itself has a large genetic
component, so there is a genetic correlation between SES and IQ, and this is
so within both the white and the black populations. Consequently, if black and
white groups are specially selected so as to be matched or statistically
equated on SES, they are thereby also equated to some degree on the genetic
component of IQ. Whatever IQ difference remains between the two SES-equated
groups, therefore, does not represent a wholly environmental effect. (Because
the contrary is so often declared by sociologists, it has been termed the
sociologist's fallacy.)
When representative samples of the white and black populations are matched
or statistically equated on SES, the mean IQ difference is reduced by about
one-third. Not all of this reduction of five or six IQ points in the mean W-B
difference represents an environmental effect, because, as explained above,
whites and blacks who are equated on SES are also more alike in the genetic
part of IQ than are blacks and whites in general. In every large-scale study,
when black and white children were matched within each level on the scale of
the parents' SES, the children's mean W-B IQ difference increased, going from
the lowest to the highest level of SES. A statistical corollary of this
phenomenon is the general finding that SES has a somewhat lower correlation
(by about .10) with children's IQ in the black than in the white population.
Both of these phenomena simply reflect the greater effect of IQ regression
toward the population mean for black than for white children matched on
above-average SES, as previously explained in this chapter (pp. 467-72). The
effect shows up not only for IQ but for all highly g-loaded tests that have
been examined in this way. For example, when SAT scores were related to the
family income levels of the self-selected students taking the SAT for college
admission, Asians from the lowest income level scored higher than blacks from
the highest, and black students scored more than one standard deviation below
white students from the same income level. It is impossible to explain the
overall subpopulation differences in g-loaded test performance in terms of
racial group differences in the privileges (or their lack) associated with SES
and income.
Additional evidence that W-B differences in cognitive abilities are not the
same as SES differences is provided by the comparison of the profile of W-B
differences with the profile of SES differences on a variety of psychometric
tests that measure somewhat different cognitive abilities (in addition to g).
This is illustrated in the three panels of Figure 12.1. The W-B difference
in the national standardization sample on each of the thirteen subtests of the
Wechsler Intelligence Scale for Children-Revised (WISC-R) is expressed as a
point-biserial correlation between age-controlled scale scores and race
(quantitized as white = 1, black = 0). The upper (solid-line) profile in each
panel shows the full correlations of race (i.e., W or B) with the age-scaled
subtest scores. The lower (dashed-line) profile in each panel shows the
partial correlations, with the Full Scale IQ partialed out. Virtually all of
the g factor is removed in the partial correlations, thus showing the profile
of W-B differences free of g. The partial correlations (i.e., W-B differences)
fall to around zero and differ significantly from zero on only six of the
thirteen subtests (indicated by asterisks). The profile points for subtests on
which whites outperform blacks are positive; those on which blacks outperform
whites are negative (i.e., below zero).
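The partialing operation behind the dashed-line profiles is the standard first-order partial correlation. A minimal sketch on synthetic data (the variables here are illustrative, not the WISC-R data): two measures that correlate only through a shared factor should show a near-zero correlation once that factor is partialed out.

```python
import random
from math import sqrt
from statistics import correlation  # Python 3.10+

def partial_corr(x, y, z):
    """First-order partial correlation r(x,y | z): the correlation of
    x and y with the linear effect of z removed from both."""
    rxy, rxz, ryz = correlation(x, y), correlation(x, z), correlation(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz**2) * (1 - ryz**2))

# Synthetic data: x and y correlate only via the shared factor z,
# so partialing z out should drive their correlation toward zero.
random.seed(0)
z = [random.gauss(0, 1) for _ in range(5000)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]
print(round(correlation(x, y), 2))      # sizable, roughly 0.5
print(round(partial_corr(x, y, z), 2))  # near zero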
Whites perform significantly better than blacks on the subtests called
Comprehension, Block Design, Object Assembly, and Mazes. The latter three
tests are loaded on the spatial visualization factor of the WISC-R. Blacks
perform significantly better than whites on Arithmetic and Digit Span. Both of
these tests are loaded on the short-term memory factor of the WISC-R. (As the
test of arithmetic reasoning is given orally, the subject must remember the
key elements of the problem long enough to solve it.) It is noteworthy that
Vocabulary is the one test that shows zero W-B difference when g is removed.
Along with Information and Similarities, which even show a slight (but
nonsignificant) advantage for blacks, these are the subtests most often
claimed to be culturally biased against blacks. The same profile differences
on the WISC-R were found in another study based on 270 whites and 270 blacks
who were perfectly matched on Full Scale IQ.
Panels B and C in Figure 12.1 show the profiles of the full and the
partial correlations of the WISC-R subtests with SES, separately for whites
and blacks. SES was measured on a five-point scale, which yields a mean W-B
difference of 0.67 in standard deviation units. Comparison of the profile for
race in Panel A with the profiles for SES in Panels B and C reveals marked
differences. The Pearson correlation between profiles serves as an objective
measure of their degree of similarity. The profiles of the partial
correlations for race and for SES are negatively correlated: -.45 for whites;
-.63 for blacks. The SES profiles for whites and for blacks are positively
correlated: +0.59. While the profile of race X subtest correlations and the
profile of SES X subtest correlations are highly dissimilar, the black profile
of SES X subtest scores and the white profile of SES X subtest scores are
fairly similar. Comparable results were found in another study that included
racial and SES profiles based on seventy-five cognitive variables measured in
a total sample of 70,000 high school students. The authors concluded,
"[C]omparable levels of socioeconomic status tend to move profiles toward
somewhat greater degrees of similarity, but there are also powerful causal
factors that operate differentially for race [black-white] that are not
revealed in these data. Degree of [economic] privilege is an inadequate
explanation of the differences" (p. 205).
Race and SES Differences in Educational Achievement. Because the specific
knowledge content of educational achievement tests is explicitly taught and
learned in school, of course, scores on such tests reflect not only the
individual's level of g but also the amount and type of schooling, the quality
of teaching, and the degree of motivation for scholastic achievement.
Nevertheless, tests of educational achievement are quite g-loaded, especially
for groups of high school age with comparable years of schooling.
It is informative, therefore, to look at the black-white difference on
achievement tests for the two most basic scholastic subjects, reading/verbal
skills and mathematics, when a number of SES-related factors have been
controlled. Such data were obtained on over 28,000 high school students in two
independent large-scale surveys, the National Longitudinal Survey of Youth
(NLSY) and the National Education Longitudinal Survey (NELS). In the two
studies, the actual W-B mean differences on three tests (Math, Verbal,
Reading) ranged from about 0.75 to 1.25s. Regression analyses of the test
scores obtained in each study controlled for a number of SES-related factors:
family income, mother's education, father's education, age of mother at the
birth of the proband, sex, number of siblings, mother single or married,
mother working (or not), region of the country in which the proband lives.
When the effects of these SES factors on test scores were statistically
removed by regression, the mean W-B differences in the NLSY were: for Math
0.49s, for Verbal 0.55s; in the NELS, for Math 0.59s, for Reading 0.51s. In a
multiple-regression analysis for predicting the achievement test scores from
twenty-four demographic and personal background variables, no other variable
among the twenty-four had a larger predictive weight (independently of all the
other variables in the regression equation) than the dichotomous W/B variable.
Parents' education was the next most strongly predictive variable
(independently of race and all other variables), averaging only about half as
much predictive weight as the W/B variable. That most of the predictive power
of parental education in these analyses is genetically mediated is inferred
from the studies of individuals reared by adoptive parents, whose IQs and
educational attainment have a near-zero correlation with those of the
adoptees (see Chapter 7). Thus for measures of educational achievement, as for
IQ, demographic and SES variables have been shown to account for only a small
part of the W-B difference.
The Cumulative Deficit Theory. Cumulative deficit is really an empirical
phenomenon that, in the 1960s, became a general theory of how environmental
deprivation progressively decreased the IQ and scholastic performance of black
children with increasing age relative to white age norms. The phenomenon
itself is more accurately termed 'age-related decrement in IQ and
achievement,' which is neutral as regards its nature and cause. The theory of
cumulative deficit, its history, and empirical literature have been reviewed
elsewhere. The theory says that environmental and educational disadvantages
that cause a failure to learn something at an early age cause further failure
at a later age and the resulting performance deficit, which affects IQ and
scholastic achievement alike, increases with age at an accelerating rate,
accumulating like compound interest. At each stage of learning, the increasing
deficit of prerequisite knowledge and skills hinders learning at each later
stage of learning. This theory of the cause of shortfall in IQ and achievement
of blacks and other poorly achieving groups was a prominent feature of the
rationale for the large-scale federal programs begun in the 1960s to
ameliorate these conditions -- interventions such as Head Start,
compensatory education, and a host of experimental preschool programs for
disadvantaged children.
The raw scores on all mental tests, including tests of scholastic
achievement, show an increasing divergence among individuals as they mature,
from early childhood to the late teens. In other words, both the mean and the
standard deviation of raw scores increase with age. Similarly, the mean W-B
difference in raw scores increases with age. This age-related increase in the
mean W-B raw score difference, however, is not what is meant by the term
"cumulative deficit." The cumulative deficit effect can only be measured at
each age in terms of the standardized scores (i.e., measures in units of the
standard deviation) for each age. A significant increase of the mean W-B
difference in standardized scores (i.e., in s units) constitutes evidence for
cumulative deficit, although this term does not imply the nature of its cause,
which has remained purely hypothetical.
The mental test and scholastic achievement data of large-scale studies,
such as those from the famous Coleman Report based on 450,000 pupils in 6,000
schools across the nation, failed to find any sign of the cumulative deficit
effect for blacks in the nation as a whole. However, suggestive evidence was
found for some school districts in the rural South, where the W-B difference
in tests of verbal ability increased from 1.5s to 1.7s to 1.9s in Grades 6, 9,
and 12, respectively. These findings were only suggestive because they were
entirely based on cross-sectional data (i.e., different samples tested at
each grade level) rather than longitudinal data (the same sample tested at
different grade levels).
Cross-sectional studies of age effects are liable to migratory and
demographic changes in the composition of a local population.
Another method with fewer disadvantages even than a longitudinal study
(which can suffer from nonrandom attrition of the study sample) compares the
IQs of younger and older siblings attending the same schools. Cumulative
deficit would be revealed by consistent IQ differences in favor of younger (Y)
rather than older (O) siblings. This is measured by the signed difference
between younger and older siblings (i.e., Y-O) on age-standardized test
scores that constitute an equal-interval scale throughout their full range.
Averaged over a large number of sibling pairs, the mean Y-O difference
represents only an environmental or nongenetic effect, because there is
nothing in genetic theory that relates sibling differences to birth order. The
expected mean genotypic value of the signed differences between younger and
older full siblings is therefore necessarily zero. A phenotypic Y-O difference
would indicate the presence of a cumulative IQ deficit with increasing age.
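The sibling comparison reduces to a paired-difference test: under the null hypothesis of no age-related decrement, the mean signed Y-O difference is zero. A minimal sketch with made-up score pairs (the numbers are illustrative only):

```python
from math import sqrt
from statistics import mean, stdev

def younger_minus_older(pairs):
    """Given (younger, older) sibling score pairs on an age-standardized,
    equal-interval scale, return the mean signed Y-O difference and its
    paired t statistic. Under the null hypothesis (no age-related
    decrement) the expected mean difference is zero."""
    diffs = [y - o for y, o in pairs]
    m = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))  # standard error of the mean
    return m, m / se

# Illustrative pairs: younger siblings scoring higher on average would
# suggest an age-related decrement in the older siblings.
pairs = [(103, 100), (98, 97), (110, 104), (95, 96), (101, 99), (99, 95)]
m, t = younger_minus_older(pairs)
print(round(m, 2), round(t, 2))
```

In a real application the t statistic (or its large-sample equivalent) over many sibling pairs determines whether the observed Y-O difference is significant.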
This method was applied to IQ data obtained from all of the full siblings
from kindergarten through grade six in a total of seventeen schools in
California that had about 60 percent white and 40 percent black pupils. In
general, there was no evidence of a cumulative deficit effect, either for
blacks or for whites, with the exception of blacks in the primary grades, who
showed the effect only on the verbal part of the IQ test that required some
reading skill; the effect was largely attributable to the black males' greater
lag in early reading skills compared to the black females; in the early years
of schooling, boys in general tend to advance less rapidly in reading than do
girls. Blacks showed no cumulative deficit effect at all in nonverbal IQ, and
beyond the elementary grades there was no trace of a cumulative deficit in
verbal IQ.
Overall, the cumulative deficit hypothesis was not borne out in this
California school district, although the mean W-B IQ difference in this school
population was greater than 1s. However, the black population in this
California study was socioeconomically more advantaged and socially more
integrated with the white population than is true for blacks in many other
parts of the country, particularly those in the rural South. It is possible that
the California black pupils did not show a cumulative deficit in IQ because
the vast majority of them had grown up in a reasonably good environment and
the cumulative deficit phenomenon might be manifested only when the blacks'
degree of environmental disadvantage falls below some critical threshold for a
normal rate of mental growth.
Exactly the same methodology, based on Y-O sibling differences in IQ, was
therefore applied in an entire school system of a county in rural Georgia. It
perfectly exemplified a generally poor community, especially its black
population, which was well below the national black average in SES. Although
the school population (49 percent white and 51 percent black) had long since
been racially desegregated when the test data were obtained, the blacks' level
of scholastic performance was exceedingly low by national standards. The mean
W-B IQ difference for the entire school population was 1.95s (white mean 102,
SD 16.7; black mean 71, SD 15.1). If cumulative deficit were a genuine
phenomenon and not an artifact of uncontrolled demographic variables in
previous cross-sectional studies, the sibling methodology should reveal it in
this rural Georgia community. One would be hard put to find a more
disadvantaged black community, by all indices, anywhere in the United States.
This study, therefore, provides a critical test of the cumulative deficit
hypothesis.
The rural Georgia study included all of the full siblings of both racial
groups from kindergarten through grade twelve. Appropriate forms of the same
standardized IQ test (California Test of Mental Maturity) were used at each
grade level. An examination of the test's scale properties in this population
showed that it measured IQ as an interval scale throughout the full range of
IQ at every age in both the black and white groups, had equally high
reliability for both groups, and, despite the nearly two-standard-deviation
IQ difference between the groups, IQ had an approximately normal distribution
within each group.
No cumulative deficit effect could be detected in the white group. The Y-O
sibling differences for whites showed no increase with age and they were
uncorrelated with the age difference between siblings.
The result for blacks, however, was markedly different. The cumulative
deficit effect was manifested at a high level of significance (p < .001).
Blacks showed large decrements in IQ with increasing age that were almost
linear from five to sixteen years of age, for both verbal and nonverbal IQ.
For total IQ, the blacks had an average rate of IQ decrement of 1.42 points
per year during their first ten or eleven years in school -- in all, a total
decrement of about sixteen IQ points, or about half the total W-B difference
of thirty-one IQ points that existed in this population.
It would be difficult to attribute the cause of this result to anything
other than the effect of an exceedingly poor environment. A genetic hypothesis
of the cumulative deficit effect seems highly unlikely in view of the fact
that it was not found in blacks in the California study, although the sample
size was large enough to detect even a very small effect size at a high level
of statistical significance. Even if the blacks in California had, on average,
a larger amount of Caucasian ancestry than blacks in rural Georgia, the
cumulative deficit effect should have been evident, even if to a lesser
degree, in the California group if genetic factors were involved. Therefore,
the cause of the cumulative deficit, at least as observed in this study, is
most probably of environmental origin. But the specific nature of the
environmental cause remains unknown. The fact that it did not show up in the
California sample suggests that a cumulative deficit does not account for any
appreciable part of the overall W-B IQ difference of about 1σ in nationally
representative samples.
The overall W-B IQ difference of 1.95σ in the rural Georgia sample would be reduced to about 1σ if the decrement attributable to the cumulative effect
were removed. What aspects of the environment could cause that large a
decrement? It would be worthwhile to apply the sibling method used in these
studies in other parts of the country, and in rural, urban or "inner city,"
and suburban populations of whites and blacks to determine just how widespread
this cumulative deficit effect is in the black population. It is probably the
most promising strategy for discovering the specific environmental factors
involved in the W-B IQ difference.
The Interaction of Race X Sex X Ability. In 1970, it came to my attention
that the level of scholastic achievement was generally higher for black
females than for black males. A greater percentage of black females than of
black males graduate from high school, enter and succeed in college, pass
high-level civil service examinations, and succeed in skilled and professional
occupations. A comparable sex difference is not found in the white population.
To investigate whether this phenomenon could be attributed to a sex difference
in IQ that favored females relative to males in the black population, I
proposed the hypothesis I called the race X sex X ability interaction. It
posits a sex difference in g (measured as IQ), which is expressed to some
extent in all of the "real life" correlates of g. Because of the normal
distribution of g for both sexes, selection on criteria that demand levels of
cognitive ability that are well above the average level of ability in the
population will be most apt to reveal the hypothesized sex difference in g and
all its correlates. Success in passing high-level civil service examinations,
in admission to selective colleges, and in high-level occupations, all require
levels of ability well above the population average. They should therefore
show a large difference in the proportions of each sex that can meet these
high selection criteria, even when the average sex difference in the
population as a whole is relatively small. This hypothesis is shown
graphically in Figure 12.12. For example, if the cutoff score on the criterion
for selection is at the white mean IQ of 100 (which is shown as 1σ above the black mean IQ of 85), and if the black female-male difference (F-M) in IQ is only 0.2σ (i.e., three IQ points), the F/M ratio above the cutoff score would be about 1.4 females to 1 male. If the selection cutoff score (X) is placed 2σ above the black mean, the F/M ratio would be 1.6 females to 1 male.
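The cited selection ratios follow from normal-curve tail areas. A quick check, assuming an IQ standard deviation of 15 and (as an illustrative assumption, since the text specifies only the 0.2σ gap) female and male means of 86.5 and 83.5 centered on the black mean of 85:

```python
from math import erfc, sqrt

def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))

SD = 15.0
female_mean, male_mean = 86.5, 83.5  # assumed symmetric split of the 0.2-sigma gap

for cutoff in (100.0, 115.0):  # 1 sigma and 2 sigma above the black mean of 85
    f = upper_tail((cutoff - female_mean) / SD)
    m = upper_tail((cutoff - male_mean) / SD)
    print(f"cutoff {cutoff:.0f}: F/M ratio above cutoff ~ {f / m:.2f}")
```

This yields ratios of roughly 1.36 and 1.61, in line with the approximate 1.4 and 1.6 figures in the text; the ratio grows as the cutoff moves further above the means, which is the point of the hypothesis.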
This hypothesis seemed highly worthy of empirical investigation, because if
the sex difference in IQ for the black population were larger than it is for
the white population (in which it is presumed to be virtually zero), the sex
difference could help identify specific environmental factors in the W-B IQ
difference itself. It is well established that the male of every mammalian
species is generally more vulnerable to all kinds of environmental stress than
is the female. There are higher rates of spontaneous abortion and of
stillbirths for male fetuses and also a greater susceptibility to communicable
diseases and a higher rate of infant mortality. Males are also psychologically
less well buffered against unfavorable environmental influences than are
females. Because a higher proportion of blacks than of whites grow up in poor
and stressful environmental conditions that would hinder mental development, a
sex difference in IQ, disfavoring males, would be greater for blacks than for
whites.
I tested this race X sex X ability interaction hypothesis on all of the
test data I could find on white and black samples that provided test
statistics separately for males and females within each racial group. The
analyses were based on a collection of various studies which, in all, included
seven highly g-loaded tests and a total of more than 20,000 subjects, all of
school age and most below age thirteen. With respect to the race X sex
interaction, the predicted effect was inconsistent for different tests and in
different samples. The overall effect for the combined data showed a mean
female-male (F-M) difference for blacks of +0.2σ and for whites of +0.1σ.
Across various tests and samples, the F-M differences for whites and for
blacks correlated +.54 (p < .01), indicating that similar factors for both
races accounted for the slight sex difference, but had a stronger effect for
blacks. With the large sample sizes, even these small sex differences
(equivalent to 3 and 1.5 IQ points for blacks and whites, respectively) are
statistically significant. But they are too small to explain the quite large
differences in cognitively demanding achievements between male and female
blacks. Apparently the sex difference in black achievement must be attributed
to factors other than g per se. These may be personality or motivational
factors, or sexually differential reward systems for achievement in black
society, or differential discrimination by the majority culture. Moreover,
because the majority of subjects were of elementary school age and because
girls mature more rapidly than boys in this age range, some part of the
observed sex difference in test scores might be attributable to differing
rates of maturation. Add to this the fact that the test data were not
systematically gathered so as to be representative of the whole black and
white populations of the United States, or even of any particular region, and
it is apparent that while this study allows statistical rejection of the null
hypothesis, it does so without lending strong support to the race X sex
interaction hypothesis.
The demise of the hypothesized race X sex interaction was probably assured
by a subsequent large-scale study that examined the national standardization
sample of 2,000 subjects on the WISC-R, the 3,371 ninth-grade students in
Project TALENT who were given an IQ test, and a sample of 152,944 pupils in
grades 5, 8, and 11 in Pennsylvania, who were given a test measuring verbal
and mathematical achievement. The subjects' SES was also obtained in all three
data sets. In all these data, the only significant (p < .05 with an N of
50,000) evidence of a race X sex X ability interaction was on the verbal
achievement test for eleventh graders, and even it is of questionable
significance when one considers the total number of statistical tests used in
this study. In any case, it is a trifling effect. Moreover, SES did not enter
into any significant interaction with race and sex.
Still another large data set used the Vocabulary and Block Design subtests
of the WISC-R administered to a carefully selected national probability sample
of 7,119 noninstitutionalized children aged six to eleven years. The
Vocabulary + Block Design composite of the WISC-R has the highest correlation with Full Scale IQ of any pair of subtests, and both
Vocabulary and Block Design are highly g loaded. These data also showed no
effects that are consistent with the race X sex X ability interaction
hypothesis for either Vocabulary or Block Design. Similarly, the massive data
of the National Collaborative Perinatal Project, which measured the IQs of
more than 20,000 white and black children at ages four and seven years,
yielded such a small interaction effect as to make its statistical
significance virtually irrelevant.
Although the race X sex interaction hypothesis must now be discarded, it
has nevertheless raised an important question about the environmental factors
that have biological consequences for mental development as a possible cause
of the W-B difference in g.
NONGENETIC BIOLOGICAL FACTORS IN THE W-B DIFFERENCE
The psychological, educational, and social factors that differ between
families within racial groups have been found to have little, if any, effect
on individual differences in the level of g after childhood. This class of
variables, largely associated with socioeconomic differences between families,
has similarly little effect on the differing average levels of g between
native-born, English-speaking whites and blacks. By late adolescence, the IQs
of black and white infants adopted by middle or upper-middle SES white parents
are, on average, closer to the mean IQ of their respective populations than to
that of either their adoptive parents or their adoptive parents' biological
children. Preschool programs such as Head Start and the much more intensive
and long-term educational interventions (e.g., the Milwaukee Project and the
Abecedarian Project) have been shown to have little effect on g.
It is reasonable, therefore, to look beyond these strictly social and
educational variables and to consider the nongenetic, or environmental,
factors of a biological nature that may have adverse effects on mental
development. These include prenatal variables such as the mother's age,
general health, and life-style during pregnancy (e.g., maternal nutrition,
smoking, drinking, drug habits), number of previous pregnancies, spacing of
pregnancies, blood-type incompatibility (e.g., kernicterus) between mother and
fetus, trauma, and history of X-ray exposure. To these can be added the many
obstetrical and perinatal variables, including premature birth, low birth
weight, duration of labor, forceps delivery, anoxia at birth. Postnatal
factors shown to have adverse effects include neonatal and childhood diseases,
head trauma, and malnutrition during the period of maximum growth of the brain
(from birth to five years of age). Although each of these biological factors
singly may have only a very small average effect on IQ in the population, the
cumulative effect of many such adverse microenvironmental factors on any one
individual can produce a decrement in g that has significant consequences for
that individual's educability. Also, certain variables, though they may have a
large negative effect on later IQ for some individuals, occur with such low
frequency in the population as to have a negligible effect on the total
variance in IQ, either within or between groups.
The largest study of the relationship between these nongenetic factors and
IQ is the National Collaborative Perinatal Project conducted by the National
Institutes of Health. The study pooled data gathered from twelve metropolitan
hospitals located in different regions of the United States. Some 27,000
mothers and their children were studied over a period of several years,
starting early in the mother's pregnancy, through the neonatal period, and at
frequent intervals thereafter up to age four years (when all of the children
were given the Stanford-Binet IQ test). Most of this sample was also tested at
age seven years with the Wechsler Intelligence Scale for Children (WISC).
About 45 percent of the sample children were white and 55 percent were black.
The white sample was slightly below the national average for whites in SES;
the black sample was slightly higher in SES than the national black average.
The white mothers and black mothers differed by 1.02σ on a nonverbal IQ test. The mean W-B IQ difference for the children was 0.86σ at age four years and 1.01σ at age seven years.
A total of 168 variables (in addition to race) were screened. They measured
family characteristics, family history, maternal characteristics, prenatal
period, labor and delivery, neonatal period, infancy, and childhood. The first
point of interest is that eighty-two of the 168 variables showed highly
significant (p < .001) correlations with IQ at age four in the white or in the
black sample (or in both). Among these variables, 59 (or 72 percent) were also
correlated with race; and among the 33 variables that correlated .10 or more
with IQ, 31 (or 94 percent) were correlated with race.
Many of these 168 variables, of course, are correlated with each other and
therefore are not all independently related to IQ. However, a multiple
regression analysis applied to the set of sixty-five variables for which there
was complete data for all the probands in the study reveals the proportion of
the total variance in IQ that can be reliably accounted for by all sixty-five
variables. The regression analyses were performed separately within groups,
both by sex (male-female) and by race (white-black), yielding four separate
analyses. The percentage of IQ variance accounted for by the sixty-five
independent variables (averaged over the four sex X race groups) was 22.7
percent. This is over one-fifth of total IQ variance.
However, not all of this variance in these sixty-five variables is
necessarily environmental. Some of the IQ variance is attributable to regional
differences in the populations surveyed, as the total subject sample was
distributed over twelve cities in different parts of the country. And some of
the variance is attributable to the mother's education and socioeconomic
status. (This information was not obtained for fathers.) Mother's education
alone accounts for 13 percent of the children's IQ variance, but this is most
likely a genetic effect, since adopted children of this age show about the
same degree of relationship to their biological mothers with whom they have
had no social contact. The proband's score on the Bayley Scale obtained at
eight months of age also should not be counted as an environmental variable.
This yields four variables in the regression analysis that should not be
counted strictly as environmental factors -- region, mother's education, SES,
and child's own test score at eight months. With the effects of these
variables removed, the remaining sixty-one environmental variables account for
3.4 percent of the variance in children's IQ, averaged over the four race X
sex groups. Rather unexpectedly, the proportion of environmental variance in
IQ was somewhat greater in the white sample than in the black (4.2 percent vs.
2.6 percent). The most important variable affecting the probands' IQ
independently of mother's education and SES in both racial groups was mother's
age, which was positively correlated with child's IQ for mothers in the age
range of twelve to thirty-six years.
How can we interpret these percentage figures in terms of IQ points?
Assuming that the total variance in the population consisted only of the
variance contributed by this large set of environmental variables, virtually
all of a biological but nongenetic nature, the standard deviation of
true-score IQs in the population would be 2.7 IQ points. The average absolute
IQ difference between pairs of individuals picked at random from this
population would be three IQ points. This is the average effect that the
strictly biological environmental variables measured in the Collaborative
Project have on IQ. It amounts to about one-fifth of the mean W-B IQ difference.
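The conversion from a variance percentage to IQ points works as follows, assuming the conventional IQ standard deviation of 15: the standard deviation of the environmental component is 15√0.034, and for two independent draws from a normal distribution the expected absolute difference is 2σ/√π. A minimal sketch:

```python
from math import sqrt, pi

IQ_SD = 15.0       # conventional IQ standard deviation (assumed)
env_share = 0.034  # 3.4% of IQ variance, from the 61 environmental variables

env_sd = IQ_SD * sqrt(env_share)       # SD if this were the only variance source
mean_abs_diff = 2 * env_sd / sqrt(pi)  # E|X - Y| for independent normal X, Y

print(f"SD of environmental component: {env_sd:.2f} IQ points")
print(f"Mean absolute pairwise difference: {mean_abs_diff:.2f} IQ points")
```

This gives about 2.77 and 3.12 IQ points, matching the text's rounded figures of 2.7 and three points.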
Unfortunately, the authors of the Collaborative Project performed only
within-group regression analyses. They did not enter race as an independent
variable into the multiple regression analysis, stating explicitly that the
independent effect of race was not assessed. A regression analysis in which
race, as an independent variable, was entered after all of the nongenetic
environmental variables could have shown the independent effect of race on IQ
when the effect of the environmental variables was removed. This would have
allowed testing of the strict form of the default hypothesis. It posits that
the environmental variance between groups is the same as the environmental
variance within groups, in which case about three points of the fifteen points
mean W-B IQ difference would be attributable to nongenetic biological
environment, assuming that all of these environmental factors worked in a
harmful direction for blacks.
There are three reasons to suspect that this study underrepresents the
effects of the nongenetic biological environment on the IQ of blacks in the
general population.
1. The black sample is somewhat above average in SES compared to the black
population as a whole. What today is termed the underclass, which includes
some one-fourth to one-third of the total black population, is
underrepresented in the study sample; much of the U.S. black population is at
or below the zero point on the scale of SES used in this study, as shown in
Figure 12.13. The biological factors that adversely affect IQ almost certainly
have a higher incidence in this poorest segment of the population, which was
underrepresented in the Collaborative Project.
2. The selection of mothers entering the study excluded all women who had
not received care in the prenatal clinic from early in their pregnancies. All
of the subjects in the study, both black and white, received prenatal care,
while many underclass mothers do not receive prenatal care. The Project
mothers also received comparable high-quality obstetrical and perinatal
treatment, followed up with comparable neonatal and infant medical care
provided by the collaborating hospitals. Pregnancies in the underclass are
typically without these medical advantages.
3. Certain environmental factors that in recent years have been studied in
relation to IQ, such as nutrition, breast feeding, fetal alcohol syndrome, and
drug abuse, were not considered in the Collaborative Project conducted three
decades ago. The causal role of these factors should be examined, as should
the increasing incidence of premature delivery and low birth weight. The
latter variables are in fact the strongest correlates of low IQ.
Low Birth Weight (LBW). Infant mortality can be viewed as the extreme point
on a continuum of pathology and reproductive casualty. The rate of neonatal
and infant mortality in a particular population, therefore, serves as an
indicator of other sublethal but nevertheless damaging health conditions,
which negatively affect children's mental development. While the infant
mortality rate has steadily declined in the population as a whole over the
last several decades, it is still about twice as great in the U.S. black
population (17.6 per 1,000 live births) as in the white population (8.5 per
1,000). Other minority populations differ only slightly from whites; among the groups with lower SES than the white average (such as Hispanics, American Indians, and Native Alaskans), the infant mortality rate averages about 8.6 per 1,000. Asians have by far the lowest average, about 4.3 per 1,000.
LBW is defined as a birth weight under 2,500 grams (5.5 pounds). It
represents a region on the risk continuum of which infant death is the end
point. Therefore, the rates of LBW and of infant mortality are highly
correlated across different subpopulations. Although premature birth incurs
its own risks for the neonate's development, it is not the same as LBW,
because a premature baby may have normal weight for its gestational age. LBW
also occurs in full-term babies, who are thereby at increased risk for
retarded mental development and for other developmental problems, such as
behavioral adjustment, learning disabilities, and poor scholastic performance.
Throughout the full range of LBW, all of these developmental risks increase as
birth weight decreases. For present purposes, it is important to note that a
disproportionate number of the babies born to black women are either premature
or of LBW. Although black women have about 17 percent of all the babies born
in the United States today, they have about 32 percent of the LBW babies.
The mother's age is the strongest correlate of LBW and is probably its
chief causal factor. Teenage mothers account for about one-fourth of LBW
babies. Even teenage girls under age eighteen who have had proper health care
during pregnancy are twice as likely to have premature or LBW babies as women
in their twenties. One suggested explanation is that teenage girls are still
in their growing period, which causes some of the nutrients essential for
normal development to be diverted from the growing fetus to the growing
mother. In addition to teenage pregnancy, other significant correlates of LBW
are unmarried status, maternal anemia, substance abuse of various kinds, and
low educational levels. SES per se accounts for only about 1 percent of the
total variance in birth weight, and race (black/white) has a large effect on
birth weight independently of SES. Most of the W-B difference in birth weight
remains unaccounted for by such variables as SES, poverty status, maternal
age, and education. Prenatal medical care, however, has a small effect.
LBW, independently of SES, is related to low maternal IQ. Controlling for
IQ reduces the B-W disparity in the percentage of LBW babies by about
one-half. But even college-educated black women have higher rates of LBW
babies and therefore also higher rates of infant mortality than occur for
white women of similar educational background (10.2 per thousand vs. 5.4 per
thousand live births). When black babies and white babies, both born to
college-educated parents, are statistically equated for birth weight, they
have the same mortality rates in the first year of life. In the general
population, however, black infants who are not of LBW have a mortality rate
almost twice that of white infants.
The cause of the high rate of LBW (and the consequently higher infant
mortality rate) in the black population as compared with other racial or
ethnic groups, including those that are less advantaged than blacks, remains a
mystery. Researchers have been able to account for only about half of the
disparity in terms of the combined obvious factors such as poverty, low levels
of SES, education, health and prenatal care, and mother's age. The
explanations run the gamut from the largely genetic to the purely
environmental. Some researchers regard LBW as an inherent, evolved, genetic
racial characteristic. Others have hypothesized that black mothers may have
subtle health problems that span generations, and some have suggested subtle
but stressful effects of racism as a cause.
Since the specific causes of LBW largely remain unidentified while the
survival rate of LBW babies has been increasing over the past 20 years,
researchers are now focusing on ways to mitigate its risks for developmental
disabilities and to enhance the cognitive and behavioral development of LBW
babies. The largest program of this kind, conducted with nearly one thousand LBW infants in eight states, used an experimental treatment highly similar to that provided in the Abecedarian Project described in Chapter 10 (pp. 342-44). It showed large Stanford-Binet IQ gains (compared against a control group) for LBW children when they were tested at thirty-six months of age. The heavier
LBW probands (BW between 2,001 and 2,500 grams) scored an average of 13.2 IQ
points above the untreated control group (98.0 vs. 84.8); the lighter probands
(<2,000 grams) scored 6.6 IQ points above the controls (91.0 vs. 84.4). Because
IQ measured at thirty-six months is typically unstable, follow-up studies are
crucial to determine if these promising IQ gains in the treated group would
persist into the school years. The data obtained in the first follow-up,
conducted when the children were five years of age, show that the apparent
initial gain in IQ had not been maintained; the intervention group scored no
higher than the control group. There was a further follow-up at age eight, but
its results have not yet been reported.
A study of forty-six LBW black and forty-six LBW white children matched for
gestational age and birth weight (all between 1,000 and 2,500 grams and
averaging 1,276 grams for blacks and 1,263 grams for whites) showed that when
the degree of LBW and other IQ-related background variables were controlled,
the W-B IQ difference, even at three years of age, was nearly the same as that
found for the general population. None of the LBW children in these selected
samples had any chronic illness or neurological abnormality; all were born to
mothers over eighteen years of age and had parents who were married. The black
mothers and white mothers were matched for educational level. (Black mothers
actually had slightly more education than white mothers, although the
difference was statistically insignificant, t < 1). When the children were
tested at thirty-three to thirty-four months, the mean Stanford-Binet IQs of the black and the white groups were 90 and 104, respectively, a difference of 1σ. In the same study, groups of middle-class black and white children of normal birth weight and gestational age, matched on maternal education, had mean Stanford-Binet IQs of 97 and 111, respectively (a 1.2σ difference).
Nutrition. A most remarkable study conducted at Cambridge University showed
that the average IQ of preterm, LBW babies was strongly influenced by whether
the babies received mother's milk or formula while in hospital. The probands
were 300 babies who weighed under 1,850 grams at birth. While in hospital, 107
of the babies received formula, and 193 received mother's milk. The effects of
breast feeding per se were ruled out (at least while the babies were in
hospital), as all of the babies were fed by tube. At 7.5 to 8 years of age, WISC-R IQs were obtained for all 300 children. Astonishingly, those who
had received maternal milk outscored those who had been formula-fed by 10.2 IQ
points (103.0 vs. 92.8). The Verbal and Performance scales showed identical
effects. After a regression analysis that adjusted for confounding factors
(SES, mother's age and education, birth weight, gestational age, birth rank,
sex, and number of days in respirator), the difference between the two groups
was still a highly significant 8.3 IQ points. Not all of the group who
received mother's milk had it exclusively; some received variable proportions
of mother's milk and formula. It was therefore possible to perform a critical
test of whether the effect was genuinely attributable to the difference
between mother's milk and formula or was attributable to some other factor.
There was in fact a significant linear dose-response relationship between the
amount of mother's milk the babies received and IQ at age 7.5 to 8 years.
Whether the milk was from the baby's own mother or from donors, it had a
beneficial effect on IQ compared against the formula. The study did not
attempt to determine whether mother's milk has a similarly advantageous effect
for babies who are full-term and of normal birth weight.
The results, however, would seem to be highly relevant to the IQ of black children in the contemporary United States, for two reasons: (1) as was already pointed out,
black infants are much more frequently of LBW than are those of other
racial/ethnic groups, and (2) they are much less frequently breast fed.
Surveys of the National Center for Health Statistics show that, as of 1987, 61.1 percent of non-Hispanic white babies and 25.3 percent of non-Hispanic black babies were breast fed. Black women who breast feed also end nursing
sooner than do white mothers. These data suggest that some part of the average
W-B IQ difference may be attributable to the combined effects of a high rate
of LBW and a low frequency of breast feeding. Nationwide in the 1940s and
1950s, breast feeding declined markedly to less than 30 percent, as greater
numbers of women entered the work force. But since the late 1950s there has
been an overall upward trend in the percentage of babies who are breast fed,
now exceeding 60 percent.
The practice of breast feeding itself is positively correlated with SES,
maternal age and education, and, interestingly, with birth weight. The
frequency of breast feeding for LBW babies (<2,500 grams) is only 38.4 percent
as against 56.1 percent for babies of normal birth weight (>2,500 grams). But
as regards mental development it is probably the LBW babies that stand to
benefit the most from mother's milk. Human milk apparently contains factors
that affect nervous system development, probably long-chain lipids, hormones,
or other nutrients involved in brain growth, that are not present in formulas.
More generally, Eysenck has hypothesized that nutritional deficiencies may be a major nongenetic cause of the W-B IQ difference and that research should be
focused on dietary supplements to determine their effect on children's IQ. He
is not referring here to the type of malnutrition resulting from low caloric
intake and insufficient protein, which is endemic in parts of the Third World
but rare in the United States. Rather, he is referring to more or less
idiosyncratic deficiencies associated with the wide range of individual
differences in the requirements for certain vitamins and minerals essential
for optimal brain development and cognitive functions. These individual
differences can occur even among full siblings reared together and having the
same diet. The dietary deficiency in these cases is not manifested by the
gross outward signs of malnutrition seen in some children of Third World
countries, but can only be diagnosed by means of blood tests. Dietary
deficiencies, mainly in certain minerals and trace elements, occur even in
some middle-class white families that enjoy a normally wholesome diet and show
no signs of malnutrition. Blood samples were taken from all of the children in such families before certain minerals were added to their diet, and were analyzed later. The analyses revealed that only those children who showed a significant IQ gain (twice the test's standard error of measurement, or nine IQ points) after receiving the supplements for several months had previously shown deficiencies of one or more of the minerals in their blood. The
children for whom the dietary supplement resulted in IQ gains were called
"responders." The many children who were nonresponders showed little or no
blood evidence of a deficiency in the key nutrients. Most interesting from a
theoretical standpoint is that the IQ gains showed up on tests of fluid g
(Gf), which measures immediate problem-solving ability, but failed to do so on
tests of crystallized g (Gc), such as general information and vocabulary,
which measure the past learning that had taken place before dietary
supplements were begun. Eysenck believes it is more likely that a much larger
percentage of black children than of white children have a deficiency of the
nutritional elements that, when supplemented in the diet, produce the observed
gain in Gf, which eventually, of course, would also be reflected in Gc through
the child's improved learning ability. This promising hypothesis, which has
not yet been researched with respect to raising black children's level of g,
is well worth studying.
Drug Abuse during Pregnancy. Many drugs can be more damaging to the
developing fetus than to an adult, and drug abuse takes a higher toll on the
mental development of newborns in the underclass than it does in the general
population. Among all drugs, prenatal exposure to alcohol is the most frequent
cause of developmental disorders, including varying degrees of mental
retardation. Fetal alcohol syndrome (FAS), a severe form of prenatal damage
caused by the mother's alcohol intake, is estimated to affect about three per
1,000 live births. The signs of FAS include stunted physical development and
characteristic facial features, besides some degree of behavioral impairment
-- at school age about half of such children are diagnosed as mentally
retarded or as learning disabled. The adverse effect of prenatal exposure to
alcohol on the infant's later mental development appears to be a continuous
variable; there is no safe threshold of maternal alcohol intake below which
there is zero risk to the fetus. Therefore the U.S. Surgeon General has
recommended that women not drink at any time during pregnancy. Just how much
of the total population variance in IQ might be attributed to prenatal alcohol
is not known, but in the underclass segment of the population its effect,
combined with other microenvironmental factors that lower IQ, is apt to be
considerable.
After alcohol, the use of barbiturates and other sedative drugs by pregnant
women is the most prevalent source of adverse effects on their children's IQ.
Between 1950 and 1970, an estimated twenty-two million children were born in
the United States to women who were taking prescribed barbiturates. Many
others, without prescription, abused these drugs. Two major studies were
conducted in Denmark to determine the effect of phenobarbital, a commonly used
barbiturate, on the adult IQ of men whose mothers had used this drug during
pregnancy. The men's IQs were compared with the IQs of controls matched on ten
background variables that are correlated with IQ, such as proband's age,
family SES when the probands were infants, parents' ages, whether the
pregnancy was "wanted" or "not wanted," etc. Further control of background
variables was achieved statistically by a multiple regression technique. In
the first study, IQ was measured by the Wechsler Adult Intelligence Scale
(WAIS), an individually administered test; the second study used the Danish
Military Draft Board Intelligence Test, a forty-five-minute group test. In
both studies the negative effect of prenatal phenobarbital on adult IQ, after
controlling for background variables, was considerable. In the authors' words:
"The individuals exposed to phenobarbital are not mentally retarded nor did
they have any obvious physical abnormalities. Rather, because of their
exposure more than 20 years previously, they ultimately test at approximately
0.5 SD or more lower on measured intelligence than otherwise would have been
expected. Analysis of various subclasses of the total sample showed that the
negative drug exposure effect was greater among those from lower SES
background, those exposed in the third trimester and earlier, and the
offspring of an unwanted pregnancy."
AD HOC THEORIES OF THE WHITE-BLACK IQ DIFFERENCE
The totality of environmental factors now known to affect IQ within either
the white or the black population taken together cannot account for a larger
amount of the total variance between groups than does the default hypothesis.
The total between-populations variance accounted for by empirically
demonstrable environmental factors does not exceed 20 to 30 percent. According
to the default hypothesis, the remaining variance is attributable to genetic
factors. But one can still eschew genetic factors and instead hypothesize a
second class of nongenetic factors to explain the observed differences --
factors other than those already taken into account as sources of nongenetic
variance within groups. However, exceptionally powerful effects would have to
be attributed to these hypothesized nongenetic factors if they are to explain
fully the between-groups variance that the default hypothesis posits as
genetic.
The explanations so far proposed to account for so large a part of the IQ
variance in strictly nongenetic terms involve subtle factors that seem
implausible in light of our knowledge of the nature and magnitude of the
effects that affect IQ. Many researchers in the branches of behavioral science
related to this issue, as opposed to journalists and commentators, are of the
opinion that the W-B difference in IQ involves genetic factors. A
questionnaire survey conducted in 1987 solicited the anonymous opinions of 661
experts, most of them in the fields of differential psychology, psychometrics,
and behavioral genetics. Here is how they responded to the question: "Which of
the following best characterizes your opinion of the heritability of the
black-white difference in IQ?"
15% said: The difference is entirely due to environmental variation.
1% said: The difference is entirely due to genetic variation.
45% said: The difference is a product of both genetic and environmental
variation.
24% said: The data are insufficient to support any reasonable opinion.
14% said: They did not feel qualified to answer the question.
Those behavioral scientists who attribute the difference entirely to the
environment typically hypothesize factors that are unique to the historical
experience of blacks in the United States, such as a past history of slavery,
minority status, caste status, white racism, social prejudice and
discrimination, a lowered level of aspiration resulting from restricted
opportunity, peer pressure against "acting white," and the like. The obvious
difficulty with these variables is that we lack independent evidence that they
have any effect on g or other mental ability factors, although in some cases
one can easily imagine how they might adversely affect motivation for certain
kinds of achievement. But as yet no mechanism has been identified that
causally links them to g or other psychometric factors. There are several
other problems with attributing causality to this class of variables:
1. Some of the variables (e.g., a past history of slavery, minority or
caste status) do not explain the W-B mean difference of 1σ to 1.5σ on
psychometric tests in places where blacks have never been slaves in a nonblack
society, or where they have never been a minority population, or where there
has not been a color line.
2. These theories are made questionable by the empirical findings for other
racial or ethnic groups that historically have experienced as much
discrimination as have blacks, in America and other parts of the world, but do
not show any deficit in mean IQ. Asians (Chinese, Japanese, East Indian) and
Jews, for example, are minorities (some are physically identifiable) in the
United States and in other countries, and have often experienced
discrimination and even persecution, yet they perform as well or better on
g-loaded tests and in g-loaded occupations than the majority population of any
of the countries in which they reside. Social discrimination per se obviously
does not cause lower levels of g. One might even conclude the opposite,
considering the minority subpopulations in the United States and elsewhere
that show high g and high g-related achievements, relative to the majority
population.
3. The causal variable posited by these theories is unable to explain the
detailed empirical findings, such as the large variability in the size of the
W-B difference on various kinds of psychometric tests. As noted in Chapter 11,
most of this variability is quite well explained by the modified Spearman
hypothesis. It states that the size of the W-B difference on various
psychometric tests is mainly related to the tests' g loadings, and the
difference is increased if the test is also loaded on a spatial factor and it
is decreased if the test is also loaded on a short-term memory factor. It is
unlikely that broad social variables would produce, within the black and white
populations, the ability to rank-order the various tests in a battery in terms
of their loadings on g and the spatial and memory factors and then to
distribute their effort on these tests to accord with the prediction of the
modified Spearman hypothesis. (Even Ph.D. psychologists cannot do this.) Such
a possibility is simply out of the question for three-year-olds, whose
performance on a battery of diverse tests has been found to accord with
Spearman's hypothesis (see Chapter 11, p. 385). It is hard to even imagine a
social variable that could cause systematic variation in the size of the W-B
difference across different tests that is unrelated to the specific
informational or cultural content of the tests, but is consistently related to
the tests' g loadings (which can only be determined by performing a factor
analysis).
4. Test scores have the same validity for predicting educational and
occupational performance for all American-born, English-speaking
subpopulations whatever their race or ethnicity. Blacks, on average, do not
perform at a higher level educationally or on the job, relative to other
groups, than is predicted by g-loaded tests. An additional ad hoc hypothesis
is required, namely, that the social variables that depress blacks' test
scores must also depress blacks' performance on a host of nonpsychometric
variables to a degree predicted by the regression of the nonpsychometric
variables on the psychometric variables within the white population. This
seems highly improbable. In general, the social variables hypothesized to
explain the lower average IQ of blacks would have to simulate consistently all
of the effects predicted by the default hypothesis and Spearman's hypothesis.
To date, the environmental theories of the W-B IQ difference put forward have
been unable to do this. Moreover, it is difficult or impossible to perform an
empirical test of their validity.
A theory that seems to have gained favor among some social anthropologists
is the idea of "caste status" put forth by the anthropologist John Ogbu. He
states the key point of his theory as follows: "The people who have most
difficulty with IQ tests and other forms of cognitive tasks are involuntary or
nonimmigrant minorities. This difficulty arises because their cultures are not
merely different from that of the dominant group but may be in opposition to
the latter. Therefore, the tests acquire symbolic meanings for these
minorities, which cause additional but as yet unrecognized problems. It is
more difficult for them to cross cognitive boundaries."
Ogbu's answer to criticism number 2 (above) is to argue that cultural
factors that depress IQ do so only in the case of involuntary or nonimmigrant
minorities and their descendants. In the United States this applies only to
blacks (who were brought to America involuntarily to be sold as slaves) and
Native Americans (who score, on average, intermediate between blacks and
whites on tests of fluid g). This theory does not account for the relatively
high test scores and achievements of East Indians in Africa, whose ancestors
were brought to Africa as indentured laborers during the nineteenth century,
but Ogbu could reply that the indentured Indians were not truly involuntary
immigrants. American blacks, in Ogbu's theory, have the status of a caste that
is determined by birth and from which there is no mobility. Lower-caste
status, it is argued, depresses IQ. Ogbu cites as evidence the Harijans
(untouchables) of India and the Burakumin in Japan as examples. (The Burakumin
constitute a small subpopulation of Asian origin that engages in work the
Japanese have traditionally considered undesirable, such as tanning leather.)
Although it is true that these "lower-caste" groups generally do have lower
test scores and perform less well in school than do higher-status groups in
India or Japan, the body of psychometric evidence is much smaller than that
for American blacks. We know hardly anything about the magnitude, the
psychometric nature, or the degree of genetic selection for g in the origins
of these caste-like groups in India and Japan.
Ogbu also argues that conventional IQ tests measure only those types of
cognitive behavior that are culturally valued by Western middle-class
societies, and IQ tests therefore inevitably discriminate against minorities
within such societies. But since such tests have equal predictive validity for
blacks and whites, this would have to imply that performance on the many
practical criteria predicted by the tests is also lowered by involuntary but
not voluntary minority status. According to Ogbu, the "Western intelligence"
measured by our psychometric tests represents only a narrow set of specialized
cognitive abilities and skills. These have been selected on the basis of
Western values from the common species pool of capabilities for adaptation to
specific environmental circumstances. It logically follows, then, that the g
factor and the spatial factor themselves represent specialized Western
cognitive skills. The question that Ogbu neither asks nor answers is why this
set of Western-selected abilities has not been acquired to the same degree by
a population of African descent that has been exposed to a Western society for
many generations, while first-generation immigrants and refugees in America
who came from the decidedly non-Western Oriental and East Indian cultures soon
perform on a par with the dominant population of European descent.
A similar view of racial and ethnic IQ differences has been expressed by
the economist Thomas Sowell. He does not offer a formal or explanatory theory,
but rather a broad analogy between American blacks and other ethnic and
national groups that have settled in the United States at different times in
the past. Sowell points out that many immigrant groups performed poorly on
tests at one time (usually soon after their arrival in America) and had
relatively low educational standing, which limited their employment to
low-paying jobs. The somewhat lower test scores of recent immigrants are
usually attributable to unfamiliarity with the English language, as evidenced
by their relatively superior performance on nonverbal tests. Within a single
generation, most immigrant groups (typically those from Europe or Asia)
performed on various intellectual criteria at least on a par with the
established majority population. Sowell views the American black population as
a part of this same general phenomenon and expects that in due course it, too,
will rise to the overall national level. Only one generation, he points out,
has grown up since inception of the Civil Rights movement and the end of de
jure segregation.
But Sowell's analogy between blacks and other immigrant groups seems
strained when one examines the performance of comparatively recent arrivals
from Asia. The W-B difference in IQ (as distinguished from educational and
socioeconomic performance) has not decreased significantly since World War I,
when mental tests were first used on a nationwide scale. On the other hand,
the children of certain recent refugee and immigrant groups from Asia, despite
their different language and culture, have scored as high as the native white
population on nonverbal IQ tests and they often exceed the white average in
scholastic performance. Like Ogbu, Sowell does not deal with the detailed
pattern of psychometric differences between blacks and whites. He attributes
the lower black performance on tests involving abstract reasoning ability to
poor motivation, quoting a statement by observers that black soldiers tested
during World War I tended to "lapse into inattention and almost into sleep"
during abstract tests. Spearman, to the contrary, concluded on the basis of
factor analyzing more than 100 varied tests that "abstractness" is one of the
distinguishing characteristics of the most highly g-loaded tests.
Recently, a clearly and specifically formulated hypothesis, termed
stereotype threat, has been proposed to explain at least some part of the
black shortfall on cognitive tests. It should not be classed as a Factor X
theory, because specific predictions can be logically derived from the
hypothesis and tested empirically. Its authors have done so, with positive,
though somewhat limited, results.
Stereotype threat is defined as the perceived risk of confirming, as
self-characteristic, a negative stereotype about one's group. The phenomenon
has been demonstrated in four independent experiments. Groups of black and
white undergraduates at Stanford University took mentally demanding verbal
tests under preliminary instructions that were specifically intended to elicit
stereotype threat. This was termed the diagnostic condition, since the
instructions emphasized that the students' scores (which they would be given)
would be a true indicator of their verbal ability and of their limitations.
Their test performance was statistically compared with that of a control
group, for whom the preliminary instructions were specifically intended to
minimize stereotype threat by making no reference to ability and telling the
subjects that the results were being used only for research on difficult
verbal problems. This was termed the nondiagnostic condition. Under both
conditions, subjects were asked to do their best. The theoretically predicted
outcome is that the difference in test performance between the diagnostic and
the nondiagnostic conditions will be greater for blacks than for whites. With
the black and white groups statistically equated for SAT scores, the
hypothesis was generally borne out in the four studies, although the predicted
interaction (race × condition) in two of the experiments failed to reach the
conventional 5 percent level of confidence.
Standard deviations were not reported for any of the performance measures,
so the effect size of the stereotype threat cannot be precisely determined.
From the reported analysis of variance, however, I have estimated the effect
size to be about 0.3σ, on average. Applied to IQ in the general population,
this would be equivalent to about five IQ points. Clearly, the stereotype
threat hypothesis should be further studied using samples of blacks and whites
that are less highly selected for intellectual ability than are the students
at Stanford. One wonders if stereotype threat affects the IQ scores even of
preschool-age children (at age three), for whom the W-B difference is about
1σ. Do children at this age have much awareness of stereotypes?
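The conversion behind the "about five IQ points" figure is simple arithmetic, since IQ is conventionally scaled with a standard deviation of 15. A minimal sketch (the function name is my own, for illustration):

```python
# Convert an effect size expressed in sigma (SD) units to IQ points,
# using the conventional IQ scale (mean 100, SD 15).
IQ_SD = 15

def sigma_to_iq_points(effect_sigma: float) -> float:
    return effect_sigma * IQ_SD

print(sigma_to_iq_points(0.3))  # 4.5 -- "about five IQ points"
print(sigma_to_iq_points(1.0))  # 15.0 -- the ~1 sigma preschool difference
```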
In fact, the phenomenon of stereotype threat can be explained in terms of a
more general construct, test anxiety, which has been studied since the early
days of psychometrics. Test anxiety tends to lower performance levels on tests
in proportion to the degree of complexity and the amount of mental effort they
require of the subject. The relatively greater effect of test anxiety in the
black samples, who had somewhat lower SAT scores, than the white subjects in
the Stanford experiments constitutes an example of the Yerkes-Dodson law. It
describes the empirically observed nonlinear relationship between three
variables: (1) anxiety (or drive) level, (2) task (or test) complexity and
difficulty, and (3) level of test performance. According to the Yerkes-Dodson
law, the maximal test performance occurs at decreasing levels of anxiety as
the perceived complexity or difficulty level of the test increases (see Figure
12.14). If, for example, two groups, A and B, have the same level of test
anxiety, but group A is higher than group B in the ability measured by the
test (so group B finds the test more complex and difficult than does group A),
then group B would perform less well than group A. The results of the Stanford
studies, therefore, can be explained in terms of the Yerkes-Dodson law,
without any need to postulate a racial group difference in susceptibility to
stereotype threat or even a difference in the level of test anxiety. The
outcome predicted by the Yerkes-Dodson law has been empirically demonstrated
in large groups of college students who were either relatively high or
relatively low in measured cognitive ability; increased levels of anxiety
adversely affected the intelligence test performance of low ability students
(for whom the test was frustratingly difficult) but improved the level of
performance of high-ability students (who experienced less difficulty).
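The Yerkes-Dodson relationship described above can be sketched numerically. The Gaussian curve and the specific parameters below are illustrative assumptions, not the law's canonical formulation; the point is only that two groups with identical anxiety levels can differ in performance when the task's effective difficulty differs for them:

```python
import math

def performance(anxiety: float, difficulty: float) -> float:
    """Illustrative inverted-U: performance peaks at an optimal anxiety
    level that falls as task difficulty rises (the Yerkes-Dodson pattern).
    The Gaussian form and the 1/difficulty optimum are assumptions made
    purely for illustration."""
    optimal_anxiety = 1.0 / difficulty  # harder task -> lower optimal anxiety
    return math.exp(-((anxiety - optimal_anxiety) ** 2))

# Two groups with the SAME anxiety level (0.8) take the same test;
# group A finds it easy (difficulty 1.0), group B hard (difficulty 2.0).
score_a = performance(0.8, difficulty=1.0)
score_b = performance(0.8, difficulty=2.0)
print(score_a > score_b)  # True: B underperforms with no extra anxiety
```

This mirrors the argument in the text: no group difference in susceptibility to stereotype threat, or even in anxiety level, is needed to produce the observed interaction.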
This more general formulation of the stereotype threat hypothesis in terms
of the Yerkes-Dodson law suggests other experiments for studying the
phenomenon by experimentally manipulating the level of test difficulty and by
equating the tests' difficulty levels for the white and black groups by
matching items for percent passing the item within each group. Groups of
blacks and whites should also be matched on true-scores derived from g-loaded
tests, since equating the groups statistically by means of linear covariance
analysis (as was used in the Stanford studies) does not adequately take
account of the nonlinear relationship between anxiety and test performance as
a function of difficulty level.
Strong conclusions regarding the stereotype threat hypothesis are
unwarranted at present, as the total evidence for it is based on fairly small
samples of high-ability university students, with results of marginal
statistical significance. Research should be extended to more representative
samples of the black and white populations and using standard mental test
batteries under normal testing conditions except, of course, for the
preliminary instructions needed to manipulate the experimental variable (that
is, the inducement of stereotype threat). Further, by conducting the same type
of experiment using exclusively white (or black) subjects, divided into lower-
and higher-ability groups, it might be shown that the phenomenon attributed to
stereotype threat has nothing to do with race as such, but results from the
interaction of ability level with test anxiety as a function of test
complexity.
In contrast to these various ad hoc hypotheses intended to explain the
average W-B population difference in cognitive ability, particularly g, the
default hypothesis has the attributes of simplicity, internal coherence, and
parsimony of explanation. Further, it does not violate Occam's razor by
treating one particular racial population as a special case, culturally far
more different from all other populations than any of them are from one
another. The size of the cultural
difference that needs to be hypothesized by a purely environmental theory of
the W-B difference is far greater than the relatively small genetic difference
implied by our evolution from common human ancestors.
The default hypothesis explains differences in g between populations in
terms of quantitative variation in the very same genetic and environmental
factors that influence the neural substrate of g and cause individual
variation within all human populations. This hypothesis is consistent with a
preponderance of psychometric, behavior-genetic, and evolutionary lines of
evidence. And like true scientific hypotheses generally, it continually
invites empirical refutation. It should ultimately be judged on the same
basis, so aptly described by the anthropologist Owen Lovejoy, for judging the
Darwinian theory of human evolution: "Evolutionary scenarios must be evaluated
much in the same way that jury members must judge a prosecutor's narrative.
Ultimately they must make their judgment not on the basis of any single fact
or observation, but on the totality of the available evidence. Rarely will any
single item of evidence prove pivotal in determining whether a prosecutor's
scenario or the defense's alternative is most likely to be correct. Many
single details may actually fail to favor one scenario over another. The most
probable account, instead, is the one which is the most internally consistent
-- the one in which all the facts mesh together most neatly with one another
and with the motives in the case. Of paramount importance is the economy of
explanation. There are always alternative explanations of any single isolated
fact. The greater the number of special explanations required in a narrative,
however, the less probable its accuracy. An effective scenario almost always
has a compelling facility to explain a chain of facts with a minimum of such
special explanations. Instead the pieces of the puzzle should fall into
place."
Notes:
4. One often hears it said that the genetic differences within racial
groups (defined as statistically different breeding populations) are much
greater than the differences between racial groups. This is true, however,
only if one is comparing the range of individual differences on a given
characteristic (or on a number of characteristics) within each population with
the range of the differences that exist between the means of each of the
separate populations on the given characteristic. In fact, if the differences
between the means of the various populations were not larger than the mean
difference between individuals within each population, it would be impossible
to distinguish different populations statistically. Thinking statistically in
terms of the analysis of variance, if we obtained a very large random sample
of the world's population and computed the total variance (i.e., the total sum
of squares based on individuals) of a given genetic character, we would find
that about 85 percent of the total genetic variance exists within the several
major racial populations and 15 percent exists between these populations. But
when we then divide the sum of squares (SS) between populations by its degrees
of freedom to obtain the mean square (MS) and we do the same for the sum of
squares within populations, the ratio of the two mean squares, i.e., Between
MS/Within MS (known as the variance ratio, or F ratio, named for its
inventor, R. A. Fisher), would be an extremely large value and, of course,
would be highly significant statistically, thus confirming the population
differences as an objective reality.
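The statistical point of this note can be sketched with synthetic data. The population means, within-group SD, and sample sizes below are illustrative assumptions, chosen so that roughly 15 percent of the total variance lies between groups:

```python
import random
random.seed(0)

# Three synthetic "populations" whose means differ modestly relative to the
# within-group spread (SD = 1). With these means (~15% of variance between
# groups), a large sample still yields an enormous, highly significant F.
n, k = 10_000, 3                      # n per group, k groups
pop_means = [0.0, 0.5, 1.0]           # illustrative, not real data
groups = [[random.gauss(m, 1.0) for _ in range(n)] for m in pop_means]

group_means = [sum(g) / n for g in groups]
grand_mean = sum(group_means) / k     # valid because group sizes are equal

ss_between = sum(n * (gm - grand_mean) ** 2 for gm in group_means)
ss_within = sum((x - gm) ** 2
                for g, gm in zip(groups, group_means) for x in g)

# F ratio = Between MS / Within MS, with df = k-1 and N-k respectively.
F = (ss_between / (k - 1)) / (ss_within / (n * k - k))

print(ss_between / (ss_between + ss_within))  # ~0.14: small between share
print(F > 100)                                # True: yet F is enormous
```

The between-group share of variance stays near 15 percent, yet F is in the thousands, illustrating how a modest between-group share and a statistically decisive group separation coexist.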
5. Among the genetically conditioned physical differences in central
tendency, nearly all attributable to natural selection, that exist between
various contemporary breeding populations in the world are: pigmentation of
skin, hair, and eyes, body size and proportions, endocranial capacity, brain
size, cephalic index (100 × head-width/head-length), number of vertebrae and
many other skeletal features, bone density, hair form and distribution, size
and shape of genitalia and breasts, testosterone level, various facial
features, interpupillary distance, visual and auditory acuity, color
blindness, myopia (nearsightedness), number and shape of teeth, fissural
patterns on the surfaces of teeth, age at eruption of permanent teeth,
consistency of ear wax, blood groups, blood pressure, basal metabolic rate,
finger and palm prints, number and distribution of sweat glands, galvanic skin
resistance, body odor, body temperature, heat and cold tolerance, length of
gestation period, male/female birth ratio, frequency of dizygotic twin births,
degree of physical maturity at birth, physical maturation rate, rate of
development of alpha (brain) waves in infancy, congenital anomalies, milk
intolerance (after childhood), chronic and genetic diseases, resistance to
infectious diseases. Modern medicine has recognized the importance of racial
differences in many physical characteristics and in susceptibilities to
various diseases, chronic disorders, birth defects, and the effective dosage
for specific drugs. There are textbooks that deal entirely with the
implications of racial differences for medical practice. Forensic pathologists
also make extensive use of racial characteristics for identifying skeletal
remains, body parts, hair, blood stains, etc.
6. Two of the most recent and important studies of genetic distances and
human evolution are: (a) Cavalli-Sforza et al., 1994; (b) Nei & Roychoudhury,
1993. Although these major studies measured genetic distances by slightly
different (but highly correlated) quantitative methods based on somewhat
different selections of genetic polymorphisms, and they did not include all of
the same subpopulations, they are in remarkably close agreement on the genetic
distances between the several major clusters that form what are conventionally
regarded as the world's major racial groups.