Harnad, Stevan (2008) 'Why and
How the Problem of the Evolution of Universal Grammar (UG) is Hard' Behavioral
and Brain Sciences
(forthcoming); Commentary on Christiansen, Morten H.
and Chater, Nick (2008) "Language as Shaped by the Brain" Behavioral
and Brain Sciences
(forthcoming) http://www.bbsonline.org/Preprints/Christiansen-12292006/
[Note: This is the long version. The
published version is shorter.]
Why and How the Problem of the Evolution of
Universal Grammar (UG) is Hard
Stevan Harnad
Chaire de recherche du Canada
Institut des sciences cognitives
Université du Québec à Montréal
Montreal, Quebec, Canada H3C 3P8
Department of Electronics and Computer Science
University of Southampton
Highfield, Southampton
SO17 1BJ UNITED KINGDOM
http://www.ecs.soton.ac.uk/~harnad/
Abstract: Universal
Grammar (UG) is a complicated set
of grammatical rules that underlies our grammatical capacity. We all
follow the
rules of UG, but we were never taught them, and we could not have
learned them
from trial and error experience either (not enough data, or time). So
UG must
be inborn. But for similar reasons, it seems implausible that UG was
'learned'
by trial and error evolution either: What was the variation and
competition?
And what were UG's adaptive advantages? So this leaves the hard problem
of
explaining where our brain's UG capacity came from. Christiansen &
Chater
(C&C) suggest an answer: Language is an organism, like us, and our
brains
were not selected for UG capacity; rather, languages were selected for
learnability with minimal trial and error experience by our brains.
This
explanation is circular: Where did our brains' selective capacity to
learn all
and only UG-compliant languages come from? Chomsky suggests it might be
a
combination of optimality and logical necessity.
It might be a
good idea to remind ourselves exactly why
and how
the problem of the evolutionary origins of Universal Grammar (UG) is hard (Harnad 1976) -- and hence perhaps
not quite as readily solvable as Christiansen & Chater (C&C)
suggest it
might be.
Universal Grammar (UG). First, what is UG? It is a
surprisingly complicated and abstract set of grammatical rules: not the
grammar
rules you learned in school (or figured out from hearing and reading),
but
rules that no one even knew existed until Noam Chomsky discovered them.
And UG
remains a set of rules that most of us (including me!) still don't know
to this
day -- don't know explicitly,
that is, in the sense that we were never taught them, we are not aware of
them, we cannot put them into words, and we would not recognize them
(or even
understand them, without considerable technical training) if they were
explicitly
told to us in words (and symbols) by a professional grammarian.
Implicit Knowledge. Yet we all in a sense do 'know' all
those rules of UG "implicitly," because they are the rules that make
us capable of speaking grammatically at all -- able to produce all and
only the
sentences that are grammatically well-formed, according to UG, and able
to
recognize and reject all the sentences that are grammatically
ill-formed
according to UG. It's rather as if we all knew implicitly how to play
chess --
we could make all and only the legal moves -- yet we had no explicit
idea what
rules we were following.
Learning. It makes little sense to imagine being able to play chess
without being able to say what rules we are following, and without
somehow having learned them from experience: either by being told them
explicitly, or by figuring them out through a combination of watching
and imitating others play and playing ourselves by trial and error, with
our wrong moves corrected by those who know. The reason is that the
rules of chess are simple, we all learned them one way or the other (if
we know how to play chess at all), and we can all perfectly well
verbalize them, or at least recognize them if someone else verbalizes
them.
Not so for
UG. UG's rules are abstract, complex
and technical. Since Chomsky first discovered their existence,
linguists have
gradually been figuring them out through decades of careful analysis,
through
hypothesis, trial, and error, based on consulting the grammatical
intuitions we
all share about what can and cannot be said, and then trying to
construct for
those a set of rules that will allow all and only the sentences we all
immediately recognize as well-formed and disallow all those we
recognize as
ill-formed. That set of rules -- not yet complete even today, but
already able
to explain a good-sized chunk of our grammatical capacity -- is UG, and
it
turned out to have some surprising properties:
Universality. First, UG turned out to be universal: All
languages have turned out to obey the very same set of rules. The
allowable
grammatical differences between languages are all there in UG too, as
available
"parameter settings" on the set of rules. When children learn a
particular language, they learn how to adjust the parameters on the
rules of UG
to configure them for that particular language. But the most surprising
thing
of all was that children do not learn the rules of UG itself.
Unlearnability. Children cannot learn the rules of UG because
-- unlike with chess -- the rules of UG are too complicated and
abstract to
learn by observation and trial and error on the basis of the
information
available to the language-learning child. (It took UG linguists many
years to
first 'learn' them from data by trial and error -- many more years and
much more data than those available to the language-learning child.)
And, as
noted, those rules are not taught or learned by explicit instruction
either --
because, before Chomsky and the field of UG linguistics he created, no
one even
knew the rules, let alone taught them, even though our species had been
speaking language for a hundred thousand years.
'Poverty of the Stimulus'.
We have now reached the point where I can state exactly why the problem
of the
evolution of UG is so hard, and why C&C's solution is too weak to
solve it:
The reason the child does not and cannot learn the rules of UG by
observation,
trial and error, and error-correction (let's forget about instruction,
because,
as noted, before Chomsky no one even knew what the rules of UG were,
let alone
tried to teach them to a child) is that the data on the basis of which the rules of UG would
have to be learned by the child do not contain anywhere near enough of
the
information that the child (or any learning system at all) would need
to have in
order to be able to infer the rules from them. (This is called the
'poverty of
the stimulus' or the computational 'underdetermination' of the rules of
UG by
the database from which it would have to be learned, if it was learned
at all.)
Error-Correction. To put it very simply: In order to
be learned at all, the rules of UG would have to be learnable through
trial and
error, with error-correction -- exactly as chess-rules have to be, in
order to
be learnable without explicit instruction: I try to move my bishop in a
certain
way, and you tell me, no, that's not a legal move, this is, and so on.
Well, in
a nutshell, children cannot learn the rules of UG that way because they
basically never make (or hear) any UG errors ('wrong moves'); hence
children
never get or hear any UG error-corrections.
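The underdetermination and error-correction points above can be made concrete with a minimal sketch in Python. Everything in it is an illustrative assumption: the three toy 'grammars' over the letters a and b, the sample strings, and all function names are hypothetical, and none of them is a rule of UG. The sketch only shows that a learner who hears well-formed strings alone cannot tell apart rival rule sets that all fit the data, whereas a single corrected error can.

```python
import re

# Hypothetical toy "grammars": each decides which strings count as well-formed.
# None of these is a rule of UG; they only illustrate underdetermination.
HYPOTHESES = {
    "a^n b^n (equal counts)": lambda s: (
        bool(re.fullmatch(r"a+b+", s)) and s.count("a") == s.count("b")
    ),
    "a+ b+ (any counts)": lambda s: bool(re.fullmatch(r"a+b+", s)),
    "any string over a, b": lambda s: bool(re.fullmatch(r"[ab]+", s)),
}

# Positive-only data: the learner hears well-formed strings and no errors.
POSITIVE_SAMPLE = ["ab", "aabb", "aaabbb"]


def consistent_hypotheses(sample):
    """Names of all hypotheses that accept every string in the sample."""
    return [
        name
        for name, accepts in HYPOTHESES.items()
        if all(accepts(s) for s in sample)
    ]


def disagreements(names, probes):
    """Probe strings on which the named hypotheses give different verdicts."""
    return [p for p in probes if len({HYPOTHESES[n](p) for n in names}) > 1]


def consistent_with_corrections(names, corrected_errors):
    """Hypotheses that reject every string the learner was told is wrong."""
    return [
        n for n in names
        if not any(HYPOTHESES[n](bad) for bad in corrected_errors)
    ]


# All three rival rule sets survive the positive-only sample.
survivors = consistent_hypotheses(POSITIVE_SAMPLE)
# Yet they disagree about strings the learner has never heard.
unseen = ["aab", "ba", "abab"]
print(survivors)
print(disagreements(survivors, unseen))
# One corrected error ("aab" is wrong) is enough to narrow the field.
print(consistent_with_corrections(survivors, ["aab"]))
```

In this toy setting the positive sample leaves all three hypotheses standing, and only the corrected error singles one out. The child's situation with respect to UG is the first case: no UG errors are made or heard, so no such correction is ever available.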
It is not
that children speak flawlessly from birth. We all know they cannot do
that. But
the observation, imitation, and error correction that the child does
experience
during the relatively brief period of transition from being unable to
speak to
being able to speak does not involve any errors (or
error-corrections) in
the rules of UG,
either from the child or from the speakers that the child hears. There
are
grammatical errors and corrections aplenty, to be sure, but they are
corrections pertaining to the old-fashioned grammatical rules that we
all know
or can know explicitly, not the complex, abstract, implicit rules of
UG. Those
UG rules are never violated by the child, nor by anyone the child ever
hears
(unless its parent is a Chomskian linguist, working aloud at home!).
At first
the child cannot speak at all. Then it begins producing agrammatical or
grammatically simple utterances alongside its rote imitations. And then
it is
speaking perfectly UG-compliantly. Insofar as the rules of UG are
concerned,
the child has learned only the parameter settings. The rules themselves
were
never broken, never corrected, hence never "learned": Therefore they
must already have been inborn.
Evolutionary Trial and Error.
Having made it explicit exactly why the UG problem is hard, I now turn
to why
and how I think C&C fail to solve it: The problem of the origin of
UG is
hard for the grammar-learning theorist because, owing to the poverty of
the
stimulus, UG is unlearnable by the child. The provisional solution
there is to
conclude that the child must therefore be born with the rules of UG
already
encoded in its brain. But that just raises the further problem of the
evolutionary origin of those inborn, genetically coded rules. That is
in fact
an even bigger problem, because in a sense evolution faces the same
learning
problem the child does. Evolution has more time available than the
child, but
it has an even more impoverished database: It is not at all clear what
would
serve as error-correction, and what would count as right and wrong, in
order to
shape UG in the usual Darwinian way: through trial and error genetic
variation,
and adaptive selection on the basis of advantages in survival and
reproduction.
The Adaptive Advantage of UG? In the
case of the evolution of other biological structures, such as
fins, wings or eyes, or the evolution of biological functions such as
the
capacity to see, learn, or reason, there is no problem in principle for
the
usual kind of evolutionary trial-and-error explanation, even in the
cases where
the adaptive explanation has not yet been fully worked out in practice.
But
with UG there is a deep problem in principle. The problem arises not
merely
because of UG's complexity (for many organs are complex, and evolution,
unlike
the language-learning child, has a lot of time available to 'shape'
them
through trial and error variation and selection). The hard problem
arises
because UG has no apparent adaptive advantages. For although a professional grammarian's
lifetime is long enough to work out most of UG's rules explicitly by
trial and
error induction, it turns out that (with the possible exception of a
few small
portions of UG) no logical or practical advantage has yet been
discerned
that favors what UG allows over what it disallows, or over an
altogether
different set of grammatical rules
(perhaps even a much simpler and learnable set). The absence of a
biological advantage for UG is an even greater handicap than the
poverty of the
stimulus. It means that even with all of evolutionary time at its
disposal,
there is no ordinary evolutionary explanation for how or why UG would
have been
selected (if its basis is the
usual genetic variation, selectively propagated through the
survival/reproduction advantages it confers on its bearers).
The Circularity of C&C's Co-Evolutionary Hypothesis.
C&C rightly express skepticism about alternative 'piggy-back'
theories of the evolutionary origin of UG, because there is simply no
credible
precursor structure or function, one that had a separate prior,
plausible
adaptive advantage of its own, for some other biological purpose, that
could
then have been co-opted to do the duties of UG as well: Nothing
homologous to
the complex and abstract formal rules of UG exists in brain or
behavior. But
C&C's alternative proposal is no more convincing: They say that
language,
too, is an 'organism,' like people and animals, that it too varies
across
generations, historically, and that the shape that language took was
selectively determined by the shape the brain already had, in that only
the
languages that were learnable by our brains successfully 'survived and
reproduced.'
The trouble
with this hypothesis is that it is circular: We were looking for the
evolutionary origin of the complex and abstract rules of UG. C&C
say (based
on their computer simulations of far simpler rule systems, not bound by
the
poverty of the stimulus): Don't ask how the UG rules evolved in the
brain. The
rules are in language, which is another organism, not the brain. The
brain
simply helped shape the language, in that the variant languages that
were not
learnable by the brain simply did not 'survive.'
This
hypothesis begs the question of why and how the brain acquired an
evolved
capacity to learn all and only UG-compliant languages in the first
place,
despite the poverty of the stimulus -- which was the hard problem we
started out with! It would be like saying that the reason we
reason we
are born already knowing the rules of chess without ever having to
learn them
by trial and error is that, in our evolutionary past, there was
variation in
the games (likewise 'organisms') that we organisms tried to play, and
only
those games that we could play without having to learn them by trial
and error
survived! (That still would not even begin to explain what it is about
our
brains that makes them able to play chess without trial and error!)
The Adaptive Advantage of Language.
This circularity is partly a result of a vagueness about what exactly
is the target of language evolution theory. Pinker & Bloom (1990)
had
already begun the misleading practice of freely conflating
evolutionarily
unproblematic questions (such as the origins of phonology, learnable
aspects of
grammar, vocabulary, 'parity') with the one hard problem of the origins
of UG,
which specifically concerns the evolutionary origins of complex
rules that
are unlearnable because of the poverty of the stimulus. Language, after all, is not just
grammar, let alone just UG. If, on the one hand, the adaptive value of
language
itself (Cangelosi & Harnad 2001; Harnad 2005, 2007) could have been
achieved with a much simpler grammar than UG (perhaps even a learnable
one),
then the evolutionary origin and adaptive function of UG become all
the harder
to explain, with C&C's historical variation in the language
'organism'
occurring far too late in the day to be of any help. If, on the other
hand, the
adaptive advantages of language were impossible without UG, then we are
still
left with the hard problem of explaining how and why they would have
been impossible without it.
UG As A Necessity for Thought? Chomsky (2005) himself has suggested
that UG may be a necessary (i.e., Platonic) property of being
able to think at all: A fundamental computational capacity in the form
of a
single (implicit) formal operation called 'unbounded Merge,' carried by
a
single mutation a hundred thousand years ago, conferred on our species
all the
power and adaptive advantages of thought -- and Merge carried UG with
it as a
necessary constraint, much the way the power to add carries with it the
necessary constraint that 2+2=4 rather than 5.
Chomsky has
been right on so much else that this possibility definitely needs to
be taken
seriously. But to solve the hard problem, the Merge-mutation theory
will need
to explain exactly how UG is
a matter of logical or functional necessity in order to be able
to think at all.
REFERENCES
Cangelosi, A. & Harnad, S. (2001) The Adaptive Advantage of Symbolic
Theft Over Sensorimotor Toil: Grounding Language in Perceptual
Categories. Evolution of Communication 4(1): 117-142.
http://cogprints.org/2036/
Chomsky, N. (2005) http://www.linguistics.stonybrook.edu/events/nyct05/abstracts/Chomsky.pdf
Harnad, Stevan (1976) Induction, evolution and accountability. In: Origins and
Evolution of Language and Speech (Harnad, Stevan, Steklis, Horst Dieter and
Lancaster, Jane B., Eds.), 58-60. Annals of the New York Academy of
Sciences. http://cogprints.org/0863
Harnad,
S. (2005) To Cognize is to Categorize: Cognition is Categorization, in
Lefebvre, C. and Cohen, H., Eds. Handbook of Categorization. Elsevier. http://eprints.ecs.soton.ac.uk/11725/
Harnad, S. (2007) From Knowing How To Knowing That: Acquiring Categories By
Word of Mouth. Presented at Kazimierz Naturalized Epistemology Workshop
(KNEW), Kazimierz, Poland, 2 September 2007. http://eprints.ecs.soton.ac.uk/14517/
Pinker, S. & Bloom, P. (1990) Natural language and natural selection.
Behavioral and Brain Sciences 13: 707-27.
http://www.bbsonline.org/Preprints/OldArchive/bbs.pinker.html