Review
Nature Reviews Genetics 7, 9-20 (January 2006) | doi:10.1038/nrg1747
The eloquent ape: genes, brains and the evolution of language
Simon E. Fisher and Gary F. Marcus
The human capacity to acquire complex language seems to be without
parallel in the natural world. The origins of this remarkable trait
have long resisted adequate explanation, but advances in fields that
range from molecular genetics to cognitive neuroscience offer new
promise. Here we synthesize recent developments in linguistics,
psychology and neuroimaging with progress in comparative genomics,
gene-expression profiling and studies of developmental disorders. We
argue that language should be viewed not as a wholesale innovation, but
as a complex reconfiguration of ancestral systems that have been
adapted in evolutionarily novel ways.
Human language seems to be unique in the natural world. Non-human
communication is predominantly restricted to simple messages such as
alarm calls and identification signals, with little in the way of
complex structure1, 2. By contrast, the average human has access to a
vocabulary of tens of thousands of words and can, guided by an
intricate set of structural rules, assemble them into a potentially
infinite number of meaningful sentences, referring not only to the here
and now, but also to the past, the future and the abstract3, 4.
Remarkably, this rich linguistic system is usually acquired without
conscious effort or formal instruction. The drive to acquire language
is so robust that a lack of aural input does not necessarily abate it;
deaf babies who are exposed to sign language babble using their hands5,
and deaf children who have had little access to sign-language input can
develop language-like gesture systems6. In comparison, no other living
primate naturally acquires more than a few signals, and these are
combined in rudimentary ways2.
Given such sharp distinctions between communication in humans and that
found in other species, language has often been investigated as an
isolated phenomenon. Experts in linguistics have studied aspects of
language that include its sound systems (phonetics and phonology), the
ways in which words can be put together from smaller meaningful units
(morphology), and the principles that govern sentence construction
(syntax) and meaning (semantics), with little or no reference to the
biology or psychology of non-human species. Similarly, neuroscientists
who seek to understand the neural basis of human communication have
tended to focus their attention on two regions of the cerebral cortex
that were thought to provide specialized human-specific substrates for
processing language - Broca's area (commonly described as the seat of
grammar) and Wernicke's area (described as the seat of meaning and
sound structure)7, 8.
These efforts have proved effective for many purposes, such as
clarifying the nature of language (Box 1) and probing
electrophysiological activity in the brain during the production or
comprehension of a sentence. Even so, although researchers have made
progress by studying language purely on its own terms, it does not
follow that language should be studied only in this way. Few if any
phenotypic traits are entirely without precedent. The avian wing, for
example, can be thought of as a specially modified version of the basic
vertebrate forelimb - an idea that is supported by a well-described
molecular and developmental basis9. As suggested by Darwin over a
century ago10, the behavioural and cognitive peculiarities of Homo
sapiens - including our extraordinary capacity for language -
should be similarly explicable as the product of descent with
modification17. Here we argue that with recent progress across many
disciplines - including genetics and genomics, which are the focus of
this article - the scientific community is finally approaching a
position in which it can fulfil Darwin's promise.
Box 1 | What is language?
Approaching language evolution
The search for the origins of language is far from new. A whole host of
different (often conflicting) hypotheses have been proposed11, which
have been framed with respect to a wide range of questions. Can
language be explained by the same kind of adaptive evolution that has
shaped other traits12? Was language itself subject to selective forces,
or did it emerge as a secondary by-product of other properties, such as
a larger and more complex brain, with greater computational
resources13? Is language the consequence of a single radical
macromutation14, 15, or was it honed in successive steps12? What
selective advantages might be associated with this trait? Suggestions
have ranged from enhanced communication of information12 to improved
organization of internal thought16, sexual selection18 and increased
social cohesion19. What came first - a means for the fine
articulation of the vocal tract (speech) or a means for combining
individual communicative elements and coordinating them with meaning
(language)? Or did the two co-evolve20?
Until recently, relevant empirical investigations were mainly
restricted to three domains - archaeological studies, linguistic
reconstructions of intermediate forms of language and computational
modelling of constraints on language evolution. These approaches have
yielded interesting findings, but each has been hampered by
uncertainty. Archaeological approaches are limited because cognitive
systems do not leave any direct physical fossil record. Although
studies of fossilized hominin skeletons have provided evidence about
the position of the larynx21, degree of tongue innervation22 and
sophistication of breathing control23 during human evolution, the
significance of these changes for the emergence of language remains
highly controversial24, 25, 26. Putative precursors of language systems
- which are based on studying aspects of modern usage27, 28 and the
ways that newly formed languages develop27, 29 - are not open to
independent verification. Mathematical and computational approaches30,
31 face similar problems. For example, studies have identified
circumstances under which a language that has a lexicon but no rules
for combining words into sentences could evolve into a system that
contains rules for constructing new sentences to describe novel
situations30. However, at present there is no way to validate the core
assumption that lexicons evolved before grammar.
Against this daunting backdrop we see several reasons to be optimistic.
First, contemporary data have highlighted the flaws in traditional
views of the neurological bases of language8, 32, 33 (Box 2). Because
the classical model that revolves around Broca's and Wernicke's areas
invokes neural substrates that are unique to language and to humans, it
unduly minimizes the possibility of understanding language origins
through studies of animals or other cognitive systems. However, neither
Broca's nor Wernicke's area is devoted entirely to language
processing34, 35 and, in fact, these substrates might not be
human-specific36, 37, 38. It is also now generally accepted that
language capacity involves a complex network of cortical and
subcortical circuits that is broadly distributed across the brain32, 39
(Box 2).
Box 2 | Evolving views of the neurological basis of language
Second, although non-human primate communication shows qualitative
differences from human language, studies have established that most
components of language show some degree of continuity with other
species. For example, the human vocal tract supports a wider repertoire
of speech sounds than could be produced by other primates26, but the
capacity to create richly modulated formants is not unique to humans40.
Likewise, many mammals and birds can distinguish different human speech
sounds, and adult tamarin monkeys can discriminate between the
distinctive rhythmic properties of different languages41. Debate
continues about exactly how much of the machinery of language is
species- or language-specific; for example, opinion is divided over
whether recursion represents the only component that is genuinely new
to the human species42, 43. Nevertheless, views that consider language
to be fully independent of ancestral systems are no longer tenable, and
there is a growing recognition that cognitive, physiological,
neuroanatomical and genetic data from non-speaking species can greatly
inform our understanding of the nature and evolution of language32, 42,
43, 44.
A third principal reason for optimism comes from developments in
molecular genetics, including large-scale comparative genomics45,
investigations of gene expression46 and explorations of specific genes
that have been suggested by studies of developmental disorders47. As
described below, these advances collectively offer new types of
empirical data for addressing hypotheses about how humans diverged from
other primates.
Comparing primate genomes
One new avenue seeks to investigate the origins of language by
comparing the genomic sequences of humans and other closely related
species. Although we currently lack adequate genetic material from
extinct hominin species48 (but see Ref. 49), the complete draft genome
sequence of Pan troglodytes, the closest extant primate cousin of H.
sapiens, yields a catalogue of almost every sequence difference that
distinguishes a human from a chimpanzee50. Furthermore, genomic
sequencing of the rhesus macaque and orangutan is well underway.
Surprisingly, it seems that most human-chimpanzee genomic differences
have accumulated through genetic drift during the several million years
since the two lineages diverged50. To determine which specific changes
have shaped the distinctive features of human biology, researchers have
begun by searching coding regions for evidence of accelerated
amino-acid change that has exceeded that expected from local mutation
rates (Box 3). These studies have found that genes which are
specifically or maximally expressed in the brain tend to show a much
lower degree of amino-acid divergence than other genes50, 51, 52. A
probable explanation is that the proteins involved in nervous-system
biology are usually subject to strong functional constraint52. Despite
the overall pattern of reduced divergence, a subset of genes that are
implicated in brain development seem to have evolved more rapidly in
primates than would be expected on the basis of studies of rodent
species50, 52, 53. Further work is needed to determine whether this
observation reflects the action of positive selection, but detailed
investigation of such outliers might provide candidates for involvement
in human brain evolution.
Box 3 | Signatures of selection
The combination of primate sequences and within-species diversity data
from human populations offers a potential route for identifying
signatures of recent selection anywhere in the genome (Box 3). Using
chimpanzee data it is often possible to determine which allele of a
human SNP (single nucleotide polymorphism) represents the state that
was present in the common human-chimpanzee ancestor. Preliminary
analyses of 120,000 validated human SNPs50 highlighted 7 large genomic
regions that show a reduced rate of diversity54 and a large proportion
of high-frequency, non-ancestral alleles55. These features are
suggestive of a selective sweep during human evolution, and so could be
another source of candidate genes for exploring the emergence of traits
such as language.
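The polarization step described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the method used in the cited analyses: the function names, the string-based data layout and the 0.8 frequency threshold are all hypothetical, and the chimpanzee base is simply taken as a proxy for the ancestral state, which ignores changes on the chimpanzee lineage.

```python
from typing import List, Optional, Tuple


def derived_allele_frequency(human_alleles: str, chimp_allele: str) -> Optional[float]:
    """Frequency of the derived (non-ancestral) allele at one human SNP.

    human_alleles: one character per sampled human chromosome, e.g. "AAGAG".
    chimp_allele: chimpanzee base at the orthologous site, used here as a
    proxy for the ancestral state (an assumption of this sketch).
    Returns None when the site cannot be polarized.
    """
    observed = set(human_alleles)
    # Only biallelic sites where the chimpanzee base matches one human allele
    # can be polarized with this simple rule.
    if len(observed) != 2 or chimp_allele not in observed:
        return None
    derived_count = sum(1 for allele in human_alleles if allele != chimp_allele)
    return derived_count / len(human_alleles)


def fraction_high_frequency_derived(
    snps: List[Tuple[str, str]], threshold: float = 0.8
) -> Optional[float]:
    """Share of polarizable SNPs whose derived allele frequency is >= threshold.

    The threshold is an arbitrary illustrative value, not one from the source.
    """
    frequencies = [derived_allele_frequency(h, c) for h, c in snps]
    polarized = [f for f in frequencies if f is not None]
    if not polarized:
        return None
    return sum(1 for f in polarized if f >= threshold) / len(polarized)


# Hypothetical region with three SNPs; the third cannot be polarized.
region = [("GGGGA", "A"), ("AAGAG", "A"), ("GGTT", "A")]
print(fraction_high_frequency_derived(region))
```

A region in which this fraction is unusually high, and in which overall diversity is unusually low, would be the kind of candidate the text describes; a real analysis would also need to account for sample size, recombination and demographic history.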
At the same time, it is important to realize that access to a complete
chimpanzee genome sequence is no panacea. Although early reports of a
high degree of genetic overlap still stand56, the genomes of humans and
chimpanzees are replete with differences. By itself, the
between-species 1.23% substitution rate in single-copy genomic regions
corresponds to over 35 million changes50, and it is accompanied by at
least 5 million indel events, which represent another 3% or so of
genomic divergence50. Further sources of change that might prove to be
important include differences in non-coding RNAs57, DNA methylation58,
chromosomal rearrangements50 and altered gene-copy numbers that result
from lineage-specific duplications or losses59, 60. Our ability to
distinguish functional adaptive change from the considerable amount of
background noise remains limited, particularly for non-coding
regions61, and defining the role of any given gene represents a major
undertaking.
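The arithmetic behind the substitution count can be checked with a short Python sketch. The 1.23% rate and the figure of 35 million changes come from the text; the size of the aligned single-copy sequence is not stated there, so the value of 2.9 billion bases used below is an assumption chosen only for illustration.

```python
SUBSTITUTION_RATE = 0.0123      # 1.23% between-species divergence (from the text)
REPORTED_CHANGES = 35_000_000   # "over 35 million changes" (from the text)

# Assumption: roughly 2.9 billion aligned single-copy bases. This figure is
# not given in the text and is used here only as a plausible round number.
ASSUMED_ALIGNED_BASES = 2.9e9


def expected_substitutions(rate: float, aligned_bases: float) -> float:
    """Number of single-base differences implied by a per-site divergence rate."""
    return rate * aligned_bases


def implied_aligned_bases(changes: float, rate: float) -> float:
    """Aligned sequence length needed to yield a given number of changes."""
    return changes / rate


print(f"{expected_substitutions(SUBSTITUTION_RATE, ASSUMED_ALIGNED_BASES):,.0f}")
print(f"{implied_aligned_bases(REPORTED_CHANGES, SUBSTITUTION_RATE):,.0f}")
```

Under the assumed alignment length, the rate yields between 35 and 36 million substitutions, which is consistent with the figure quoted above. Run in reverse, 35 million changes at 1.23% implies at least about 2.8 billion aligned bases.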
In terms of understanding the origins of language, it is also worth
noting that many of the adaptive changes in the human genome might be
unrelated to this trait; human biology is distinctive for a range of
anatomical, metabolic, biomedical and behavioural characteristics45.
Some - such as bipedalism, increased relative brain size and advanced
tool use - might be defining features of the human condition.
However, other traits - such as the human-specific susceptibility to
malaria - simply involve adaptive differences of a kind that are
commonly found between closely related species45. Identifying the
genomic contributions to language evolution will ultimately depend not
only on evidence of positive selection, but also on a demonstration
that variation in a given candidate gene affects linguistic capacities.
Expressive apes and expression arrays
Humans and chimpanzees differ not just in the genes that they carry,
but also in how these genes are expressed. With the advent of
high-throughput techniques for characterizing gene-expression profiles,
such as hybridization of RNA to oligonucleotide microarrays46,
species-wide (and organism-wide) differences in spatial and temporal
regulation of gene expression can now be directly investigated. This
approach is in its infancy, but recent studies of neural tissue from
equivalent regions of different primate species, or distinct regions of
the same species, support several preliminary conclusions.
First, convergent data from many studies indicate an acceleration of
neural gene-expression changes during human evolution62, 63, 64, 65,
66. It should be emphasized that human and chimpanzee brains show
considerably less absolute divergence than other tissues, such as the
liver and the heart, both in terms of the number of genes that are
differentially expressed and the magnitudes of the differences63, 64,
65, which is probably due to higher functional constraint in neural
tissue52. However, alteration of neural gene expression on the branch
leading to humans seems to have been more dramatic than that found for
the chimpanzee branch during the same period. A parallel (although less
significant) tendency for human-lineage acceleration has been proposed
for rates of amino-acid change in brain-expressed genes52, 53. So,
human cognitive evolution might have involved a complex combination of
changes in the regulation of gene expression and in protein structure.
Second, most studies report that the above acceleration of expression
differences is associated with a general upregulation of expression in
the human cortex compared with that found in the chimpanzee63, 64, 65
(although see Refs 66,67 for alternative interpretations). Upregulated
genes tend to be enriched in genomic regions that have been recently
duplicated in human evolution68, but the functional importance of this
finding is not known.
Third, coincident with cortical asymmetries in language function, a
recent investigation identified marked expression differences between
the left and right hemispheres in the early development of human
embryonic brains, preceding the emergence of morphological
distinctions69. In particular, LIM domain only 4 (LMO4), a gene that is
linked to cortical patterning, is more highly expressed in part of the
right cortex than the equivalent region of the left cortex at 12-14
weeks of gestation. Intriguingly, although Lmo4 expression in the mouse
cortex shows moderate asymmetry in individual brains, there is no
consistent lateralization to one or other hemisphere; therefore,
altered regulation of this gene during evolution might be relevant to
the emergence of human asymmetries. Given that asymmetries of brain
structure have been suggested for other primates36, 37, and that
functional asymmetry is associated with several aspects of cognition
(such as spatial and facial recognition70), it is unclear how relevant
these results are to language, but they clearly represent another
avenue that is worth pursuing.
Fourth, studies have yet to identify any specific human gene (or set of
genes) that is uniquely expressed in language-related regions of adult
brains. An investigation of three human brains found that
language-related substrates in the cerebral cortex showed similar
expression profiles to those of other cortical areas68; differences
between individuals tended to outweigh differences between regions
within any single individual. Expression profiles did not differ
significantly between Broca's area and the corresponding area of the
right hemisphere68, despite well-documented asymmetries of
cytoarchitecture71 and function72. Moreover, comparison to homologous
brain regions in chimpanzees indicated that the vast majority of
expression differences between these species are common to all cortical
regions, rather than being localized to areas that are related to
linguistic function68.
So far, neural expression profiling in humans and other primates has lacked
the power to detect effects that involve subsets of cells, which is
problematic as each region of the cortex comprises a highly complicated
mix of distinctive cell types. In addition, it is difficult to identify
clearly which expression changes were adaptive and which were
selectively neutral67. Finally, for ethical reasons, these
investigations have exploited only tissues from autopsies, as it is
currently unfeasible to characterize dynamic expression profiles in the
functioning human brain. As we discuss elsewhere in this review,
comparative analyses of neural expression patterns in other species,
particularly song-birds73, 74, 75, are not subject to the same
limitations as human-primate studies, and might provide further entry
points into language-related mechanisms.
Gene-driven strategies
An alternative approach to large-scale comparative studies begins by
pinpointing genes that are judged likely to influence human language,
and involves targeting these genes for detailed evolutionary
investigation. The exact nature of the neuromolecular pathways that
underlie language remains mysterious, so discovering these genes is far
from trivial, but some progress has been made through positional
cloning studies of human neurodevelopmental disorders47. If mutation of
a particular gene is implicated in neural abnormalities, then sequence
variation in the gene can, in principle, influence relevant
developmental outcomes76. Therefore, a close examination of between-
and within-species diversity is warranted, as it can reveal whether
alteration of the gene in question was involved in human evolution. For
example, studies of primary microcephaly77 - a disorder of reduced
brain size - have suggested mechanisms that could have contributed to
cortical expansion during primate evolution78, 79, 80, 81, 82 (Box 4),
whereas other rare syndromes (such as Joubert syndrome 3 (JBTS3)83 and
bilateral frontoparietal polymicrogyria (BFPP)84) might provide clues
to adaptive changes in brain organization (Table 1). Genetically
mediated changes in brain size and overall organization have clearly
been important in human evolution, but they do not themselves
adequately explain language origins. For more direct insights it is
worth focusing on neurodevelopmental disorders that primarily affect
speech and/or language skills47.
Table 1 | Insights into primate brain evolution from genetic studies of
human neurodevelopmental disorders
Box 4 | Big brains and language evolution
Insights from complex disorders. Although language acquisition seems to
be universal across the human species4, a significant proportion of
children have language-related deficits that cannot be explained by a
known cause, such as mental retardation, cerebral palsy, autism or
hearing impairments85. There is considerable phenotypic heterogeneity
among these children and the diagnosis of subtypes remains
controversial85, but genetic factors make a strong contribution86.
Identification of the relevant genes is proving to be challenging,
especially given that most speech and language disorders have a
multifactorial basis47. Nevertheless, advances in genotyping technology
and statistical methods, coupled with sophisticated phenotypic
characterization, are delivering encouraging results. For common forms
of disorder, genetic studies point to several chromosomal intervals
that might harbour risk variants that are involved in specific language
impairment (SLI; intervals SLI1-SLI3) (Refs 47,87,88) (Table 1) and
in speech-sound disorders (SSD)89. Quantitative trait analyses of SLI1
show that this region strongly influences a child's ability to repeat
pronounceable nonsense words88 - a skill that represents a highly
heritable behavioural marker of SLI86. Intriguingly, the SLI1 and SLI2
regions contain genes that show increased copy number in the human
lineage59, and that can therefore be prioritized for study (Table 1).
It is too early to tell whether this is merely coincidence, but this
does highlight how the integration of data from positional cloning
efforts and comparative genomics can generate new testable hypotheses,
even before identifying actual risk genes.
Other heritable neurodevelopmental syndromes, such as developmental
dyslexia (DD) and autistic disorder (AD), are relevant to language,
although they too are characterized by genetic complexity, with
multiple chromosomal regions highlighted by mapping studies90, 91
(Table 1). Dyslexia is not a linguistic disorder per se - it is
diagnosed when a child with overtly normal language has unexplained
difficulties with learning to read and/or spell92. (Reading and
spelling, unlike spoken language, do not develop naturally or
universally, and instead require extensive tuition.) However, most
dyslexic people have subtle underlying deficits in language processing,
particularly with handling phonemes93. As such, emerging genetic
discoveries about the aetiology of dyslexia (for example, see Refs
94-97) might inform our understanding of the basis of language.
Similarly, although autism is not primarily a language disorder,
deficits in the area of communication represent an important diagnostic
feature, along with problems in social interaction and repetitive or
stereotyped behaviours98. Many autistic children are completely
non-verbal, and those who do acquire language competence almost always
show pervasive deficits in their use of pragmatics98. So, the relevant
susceptibility genes, once identified, could be informative for
understanding the evolution of social cognition and how this relates to
language origins. Some of the putative loci that are linked to autism
have been proposed as being relevant to language99, 100; these studies
have been based on analysing subsets of children with language delay,
using language-related measures as endophenotypes and/or observing
concordant mapping in other disorders.
Insights from a Mendelian disorder: the FOXP2 gene. The first direct
evidence of a specific gene that influences speech and language
acquisition has come not from complex traits, but from an unusual
autosomal dominant form of communication disorder101 that is caused by
mutation of the forkhead box P2 (FOXP2) gene, which encodes a forkhead
box transcription factor102. The consequences of FOXP2 disruption
differ from typical SLI in that they include prominent difficulties in
learning and producing sequences of movements that involve the mouth
and lower part of the face103. Affected individuals have problems with
speech articulation (developmental verbal dyspraxia or DVD), which are
accompanied by wide-ranging deficits in many aspects of language and
grammar104, 105. Crucially, although general intelligence varies among
individuals who carry the same FOXP2 mutation, speech and language
deficits are always evident, even for children with normal non-verbal
intelligence105. Moreover, the associated problems with processing
language and grammar are not exclusively tied to speech - they are
evident in the written domain and occur for comprehension as well as
expression105. (For more detailed discussion see Refs 47,106,107.)
The link between FOXP2 and disordered language was initially identified
through genetic studies of a large three-generational family (known as
KE)108, 109, in which affected members carry a heterozygous missense
mutation that alters the DNA-binding domain of the FOXP2 protein102
(Fig. 1a). The KE substitution markedly affects the function of the
encoded protein (J. Nic