Harnad, Stevan (2002) Symbol grounding and the origin of language.

Symbol Grounding and the Origin of Language

Stevan Harnad

Cognitive Sciences Centre
University of Southampton
Highfield, Southampton
SO17 1BJ United Kingdom

ABSTRACT: What language allows us to do is to "steal" categories quickly and effortlessly through hearsay instead of having to earn them the hard way, through risky and time-consuming sensorimotor "toil" (trial-and-error learning, guided by corrective feedback from the consequences of miscategorisation). To make such linguistic "theft" possible, however, some, at least, of the denoting symbols of language must first be grounded in categories that have been earned through sensorimotor toil (or else in categories that have already been "prepared" for us through Darwinian theft by the genes of our ancestors); it cannot be linguistic theft all the way down. The symbols that denote categories must be grounded in the capacity to sort, label and interact with the proximal sensorimotor projections of their distal category-members in a way that coheres systematically with their semantic interpretations, both for individual symbols, and for symbols strung together to express truth-value-bearing propositions.

Meaning: Narrow and Wide

Philosophers tell us that there are two different approaches to explaining what our thoughts are about, one wide and one narrow. According to the wide approach, the object I am thinking about, that physical thing out there in the world, is to be reckoned as part of the meaning of that thought of mine about it; meaning is wider than my head. According to the narrow approach, the locus of any meaning of my thoughts is inside my head; meaning can be no bigger than my head. The question is: should psychologists who aspire to study and explain the meaning of thoughts adopt a wide or a narrow approach?

Here is an advantage of a wide approach: As wide meaning encompasses both the internal thought and its external object, a complete explanation of wide meaning would leave nothing out. Once you have arrived at a successful explanation, there are no further awkward questions to be asked about how thoughts get "connected" to what they mean, because what they mean is in a sense already part of what thoughts are. But there are disadvantages for a psychologist adopting this approach, because it would require him to be so much more than a psychologist: He would have to be an authority not only on what goes on in the head, but also on what goes on in the world, in order to be able to cover all the territory over which thoughts can range.

We can illustrate the plight of the wide psychological theorist with an analogy to the roboticist: If one were trying to do "wide robotics," accounting not only for everything that goes on inside the robot, but also for what goes on in the world, one would have to model both the robot and the world. One would have to design a virtual world and then describe the robot's states as encompassing both its internal states and the states of the virtual world in which it was situated. In this ecumenical era of "situated" robotics (Hallam & Malcolm 1994) this might not sound like such a bad thing, but in fact wide robotics would be quite contrary to the spirit and method of situated robotics, which emphasises using the real world to test and guide the design of the robot precisely because it is too hard to second-guess the world in any virtual-world model of it (the world is its own best model). At some point the "frame problem" (Pylyshyn 1987; Harnad 1993) always arises; this is where one has failed to second-guess enough about the real world in designing the robot's virtual world, and hence a robot that functions perfectly well in the virtual world proves to be helpless in the real world.

So perhaps it's better to model only the robot, and leave the modeling of the real world to cosmologists. The counterpart of this moral for the psychologist would be that he should restrict himself to the narrow domain of what's going on in the head, and likewise leave the wide world to the cosmologists.

The Symbol Grounding Problem

But there are disadvantages to the narrow approach too, for, having explained the narrow state of affairs in the head, there is the problem of ensuring that these states connect with the wide world in the right way. One prominent failure in this regard is the symbolic model of the mind, according to which cognition is computation: thinking is symbol manipulation (Pylyshyn 1984). After the initial and promising successes of Artificial Intelligence, and emboldened by the virtually universal power that was ascribed to computation by the Church-Turing Thesis (according to which just about everything in the world can be simulated computationally), computationalism seems to have run aground precisely on the problem of meaning: the symbol grounding problem (Harnad 1990a). For in a narrow computational or language-of-thought theory of meaning, thoughts are just strings of symbols, and thinking is merely the rule-governed manipulation of those symbols, as in a computer.

But the symbols in such a symbol system, although they are systematically interpretable as meaning what they mean, nevertheless fail to "contain" their meanings; instead, the meanings reside in the heads of outside interpreters who use the computations as tools. As such, symbol systems are not viable candidates for what is going on in the heads of those outside interpreters, on pain of infinite regress.

So cognition must be something more than computation. Was the fatal failing of computation that it was a narrow theory, operating only on the arbitrary shapes of its internal symbols, and leaving out their external meaning? Do we have no choice but to subsume the wide world after all?

Subsuming the wide world has not been the response of the cognitive modelling community. Some have chosen to abandon symbols and their attendant grounding problems altogether (Brooks 1991), and turn instead to other kinds of internal mechanisms, nonsymbolic ones, such as neural nets or other dynamical systems (Van Gelder 1998).

Others have held onto symbol systems and emphasised that the solution to the grounding problem is to connect symbol systems to the world in the right way (Fodor 1987, 1994). The problem with this approach is that it provides no clue as to how one is to go about connecting symbol systems to the world in the right way, the way that links symbols to their meanings. Hence one does have reason to suspect that this strategy is rather like saying that the problem of a robot that has succeeded in its virtual world but failed in the real world is that it has not been connected with the real world in the right way: The notion of the "right connection" may conceal a multitude of sins of omission that in the end amount to a failure of the model rather than the connection. (Or, to put it another way, the hardest problems of cognitive modelling may be in designing the robot's internal states in such a way that they do manage to connect to the world in the "right way.")

But let us grant that if the symbolic approach ever succeeds in connecting its meaningless symbols to the world in the right way, this will amount to a kind of wide theory of meaning, encompassing the internal symbols and their external meanings via the yet-to-be-announced "causal connection." Is there a narrow approach that holds onto symbols rather than giving up on them altogether, but attempts to ground them on the basis of purely internal resources?

There is a hybrid approach which in a sense internalises the problem of finding the connection between symbols and their meanings; but instead of looking for a connection between symbols and the wide world, it looks only for a connection between symbols and the sensorimotor projections of the kinds of things the symbols designate: It is not a connection between symbols and the distal objects they designate but a connection between symbols and the proximal "shadows" that the distal objects cast on the system's sensorimotor surfaces (Harnad 1987; Harnad et al., 1991).

Categorization: A Robotic Capacity

Such hybrid systems place great emphasis on one particular capacity that we all have, and that is the capacity to categorise, to sort the blooming, buzzing confusion that reaches our sensorimotor surfaces into the relatively orderly taxonomic kinds marked out by our differential responses to it -- including everything from instrumental responses such as eating, fleeing from, or mating with some kinds of things and not others, to assigning a unique, arbitrary name to some kinds of things and not others (Harnad 1987).

It is easy to forget that our categorisation capacity is indeed a sensorimotor capacity. In the case of instrumental responses, based on the Gibsonian invariants "afforded" by our sensorimotor interactions with the world, what we tend to forget is that these nonarbitrary but differential responses are actually acts of categorisation too, partitioning inputs into those you do this with and those you do that with. And in the case of unambiguously categorical acts of naming, what we forget is that this is a sensorimotor transaction too, albeit one subtended by an arbitrary response (a symbol) rather than a nonarbitrary one.

So the capacity to sort our sensorimotor projections into categories on the basis of sensorimotor interactions with the distal objects of which they are the proximal projections is undeniably a sensorimotor capacity; indeed, we might just as well call it a robotic capacity. The narrow hybrid approach to symbol grounding to which I referred attempts to connect the proximal sensorimotor projections of distal objects and events to either the instrumental responses or the arbitrary names that successfully sort them according to what is adaptive for the hybrid system. The instrumental part is just an adaptive robot; but the arbitrary category names (symbols) open up for the system a new world of possibilities whose virtues are best described as those of theft over honest toil:

Acquiring Categories by Sensorimotor Toil and Darwinian Theft

Before defining sensorimotor toil vs. symbolic theft, let me define a more primitive form of theft, which I will call "Darwinian theft." In each case, what are being either stolen or earned by honest toil are sensorimotor categories. It is undeniable that we have the capacity to detect the proximal sensorimotor projections of some distal categories without ever having to "earn" that capability in any way, because we are born that way. Just as the frog is born with its Darwinian legacy of bug-detectors, we also arrive with a "prepared" repertoire of invariance-detectors that pick out certain salient shapes or sounds from the otherwise blooming, buzzing confusion reaching our sensorimotor surfaces without our first having to learn them by trial and error: For sensorimotor trial and error, guided by corrective feedback from the consequences of miscategorising things, is what I am calling "honest toil." [FOOTNOTE 1 HERE]

We must infer that our brains, in addition to whatever prepared category-detectors they may be born with, are also born with the capacity to learn to distinguish the sensorimotor projections of members from those of nonmembers of countless categories through supervised learning, i.e., through trial and error, guided by corrective feedback from the consequences of categorising correctly and incorrectly (Harnad 1996b). The learning mechanism that learns the invariants in the sensorimotor projection that eventually allow the system to sort correctly is not known, but neural nets are a natural candidate: Learning to categorise by honest toil is what nets seem to do best (Harnad 1993b). [FOOTNOTE 2]
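A toy sketch of such supervised "toil": a single perceptron (the simplest neural net) adjusts its weights on corrective feedback after each miscategorisation, until it has found an invariant that sorts labelled "proximal projections" correctly. The binary features and the mushroom/toadstool category are illustrative assumptions, not a model the text commits to.

```python
# Minimal sketch of category learning by "honest toil": a perceptron
# receives labelled proximal projections (here, binary feature vectors)
# and adjusts its weights only when corrective feedback signals an error.
# Features and category are illustrative assumptions.

def train_perceptron(samples, epochs=20, lr=1.0):
    """samples: list of (features, label), label 1 = toadstool, 0 = mushroom."""
    n = len(samples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, label in samples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = label - pred  # corrective feedback: consequence of miscategorisation
            if err:
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b

def classify(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Features: [is_white, has_red_spots]; "toadstool" iff white AND red-spotted.
data = [([1, 1], 1), ([1, 0], 0), ([0, 1], 0), ([0, 0], 0)]
w, b = train_perceptron(data)
```

Because the invariant here is linearly separable, the trial-and-error loop is guaranteed to converge; the learned weights are the system's internal "invariance detector," earned entirely from feedback on its own sorting performance.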

Does a system such as the one described so far -- one that is able to sort its proximal projections, partly by Darwinian theft, partly by sensorimotor toil -- suggest a narrow or a wide explanation of meaning (insofar as it suggests anything about meaning at all)? It seems clear that whatever distal objects such a system may be picking out, the system is working entirely on the basis of their proximal projections, and whatever connection those may have with their respective distal objects is entirely external to the system; indeed, the system would have no way of distinguishing distal objects if the distinction were not somehow preserved in their proximal projections -- at least no way without the assistance either of sensorimotor aids of some kind, or of another form of theft, a much more important one, to which I will return shortly.

Categories Are Context-Dependent, Provisional, and Approximate

So for the time being we will note that the categories of such a system are only approximate, provisional, context-dependent ones: They depend on the sample of proximal projections of distal objects the system happens to have encountered during its history of sensorimotor toil (or what its ancestors happen to have encountered, as reflected by the detectors prepared by Darwinian theft): whatever systematic relation there might be between them and their distal objects is external and inaccessible to the system.

Philosophers tend, at this juncture, to raise the so-called "problem of error": If the system interacts only with its own proximal projections, how can its categories be said to be right or wrong? The answer is inherent in the very definition of sensorimotor toil: The categories are learned on the basis of corrective feedback from the consequences of miscategorisation. It is an error to eat a toadstool, because it makes you sick. But whether the proximal projection of what looks like a safely edible mushroom is really an edible mushroom, or a rare form of toadstool with an indistinguishable proximal projection that does not make you sick but produces birth defects in your great-grandchildren, is something of which a narrow system of the kind described so far must be conceded to be irremediably ignorant. What is the true, wide extension of its mushroom category, then? Does its membership include only edible mushrooms that produce no immediate or tardive negative effects? Or is it edible mushrooms plus genetically damaging toadstools? This is a distal difference that makes no detectable proximal difference to a narrow system such as this one. Does it invalidate a proximal approach to meaning?

Let's ask ourselves why we might think it might invalidate proximal meaning. First, we know that two different kinds of distal objects can produce the same proximal projections for a system such as this one. So we know that its narrow detector has not captured this difference. Hence, by analogy with the argument we made earlier against ungrounded symbol systems, such a narrow system is not a viable candidate for what is going on in our heads, because we ourselves are clearly capable of making the distal distinction that such a narrow system could not make.

Distinguishing the Proximally Indistinguishable

But in what sense are we able to make that distal distinction? I can think of three senses in which we can. One of them is based, trivially, on sensorimotor prostheses: an instrument could detect whether the mushroom was tardively carcinogenic, and its output could be a second proximal projection to me, on whose basis I could make the distinction after all. I take it that this adds nothing to the wide/narrow question, because I can sort on the basis of enhanced, composite sensorimotor projections in much the same way I do on the basis of simple ones.

The second way in which the distal distinction might be made would be in appealing to what I have in mind when I think of the safe mushroom and its tardively toxic lookalike, even if it were impossible to provide a test that could tell them apart. I don't have the same thing in mind when I think of these things, so how can a narrow system that would assign them to the same category be a proper model for what I have in mind? Let us set aside this objection for the moment, noting only that it is based on a qualitative difference between what I have in mind in the two cases, one that it looks as if a narrow model could not capture. I will return to this after I discuss the third and most important sense in which we could make the distinction, namely, verbally, through language.

We could describe the difference in words, thereby ostensibly picking out a wide difference that could not be picked out by the narrow categorisation model described so far.

Acquiring Asocial Categories Asocially, by Sensorimotor Toil

Let me return to the narrow model with the reminder that it was to be a hybrid symbolic/dynamic model. So far, we have considered only its dynamic properties: It can sort its proximal projections based on error-correcting feedback. But, apart from being capable of instrumental sensorimotor interactions such as eating, fleeing, mating or manipulating on the basis of invariants it has learned by trial and error to detect in its proximal projections, such a system could, as I had noted, simply assign a unique arbitrary name on the basis of those same proximal invariants.

What would the name refer to? Again, it would be an approximate, provisional subset of proximal projections that the system had learnt to detect as being members of the same category on the basis of sensorimotor interaction, guided by error-correcting feedback. What was the source of the feedback? Let us quickly set aside (contra Wittgenstein 1953) the idea that acquiring such a "private lexicon" would require interactions with anything other than objects such as mushrooms and toadstools: In principle, no verbal, social community of such hybrid systems is needed for it to produce a lexicon of arbitrary category names, and reliably use them to refer to the distal objects subtended by some proximal projections and not others (cf. Steels and Kaplan 1999; Steels 2001); there would merely be little point in doing so asocially. For it is not clear what benefit would be conferred by such a redundant repertoire of names (except possibly as a mnemonic rehearsal aid in learning), because presumably it is not the arbitrary response of naming but the instrumental response of eating, avoiding, etc., that would "matter" to such a system -- or rather, such an asocial system would function adaptively when it ate the right thing, not when it assigned it the right arbitrary name.

But if the system should happen to be not the only one of its kind, then in principle a new adaptive road is opened for it and its kind, one that saves them all a lot of honest toil: the road of theft, linguistic theft.

The Advantages of Acquiring Categories by Symbolic Theft

At this point some of the inadvertent connotations of my theft/toil metaphor threaten to obscure the concept the metaphor was meant to highlight: Acquiring categories by honest toil is doing it the hard way, by trial and error, which is time-consuming and sometimes perhaps too slow and risky. Getting them any "other" way is getting them by theft, because you do not expend the honest toil.

This is transparent in the case of Darwinian theft (which is perhaps better described as "inherited wealth"). In the case of symbolic theft, where someone else who has earned the category by sensorimotor toil simply tells you what's what, "theft" is also not such an apt metaphor, for this seems to be a victimless crime: Whoever tells you has saved you a lot of work, but he himself has not really lost anything in so doing; so you really haven't stolen anything from him. "Gift," "barter," or "reciprocal altruism" might be better images, but we are getting lost in the irrelevant details of the trope here. The essential point is that categories can be acquired by "nontoil" through the receipt of verbal information (hearsay) as long as the symbols in the verbal message are already grounded (either by sensorimotor toil or, recursively, by prior grounded verbal messages) -- and of course as long as there is someone around who has the category already and is able and willing to share it with you.

Nonvanishing Intersections

Language, according to the model being described here, is a means of acquiring categories by theft instead of honest toil. I will illustrate with some examples I have used before (Harnad 1987, 1996a), by way of a reply to a philosophical objection to sensorimotor grounding models, the "vanishing intersections" objection: Meaning cannot be grounded in shared sensorimotor invariants, because the intersection of the sensorimotor projections of many concrete categories and all abstract categories is empty: Perhaps the sensorimotor projections of all "triangles" and all "red things" have something in common -- though one wonders about triangles at peculiar angles, say, perpendicular to the viewer, reducing them to a line, or red things under peculiar lighting or contrast conditions where the reflected light is not in the red spectral range -- but surely the intersections of the sensorimotor projections of all "plane geometric shapes" vanish, as do those of "coloured things," or "chairs," "tables," "birds," "bears," or "games," not to mention the sensorimotor projections of all that is "good," or "true" or "beautiful"!

By way of response I make two suggestions:

(1) Successful Sorting Capacity Must Be Based on Detectable Invariance.
The theorist who wishes to explain organisms' empirical success in sorting sensorimotor projections by means other than a detectable invariance shared by those projections (an invariance that of course need not be positive, monadic and conjunctive, but could also be negative, disjunctive, polyadic, conditional, probabilistic, constructive -- i.e., the result of any operation performed on the projection, including invariance under a projective transformation or under a change in relative luminance -- indeed, any complex Boolean operation) has his work cut out for him if he wishes to avoid recourse to miracles, something a roboticist certainly cannot afford to do. Darwinian theft (innateness) is no help here, as Darwinian theft is as dependent on nonvanishing sensorimotor intersections as life-time toil is. It seems a reasonable methodological assumption that if the projections can be successfully partitioned, then an objective, nonvanishing basis for that success must be contained within them.
(2) The Invariance Can Be Learned Via Experience or Via Hearsay.
This is also the point at which linguistic theft comes into its own. Consider first the mushroom/toadstool problem: In a mushroom world I could earn these two important survival categories the hard way, through honest toil, sampling the sensorimotor projections and trying to sort them based on feedback from sometimes getting sick and sometimes getting nourished (Cangelosi & Harnad 2000). Assuming the problem is soluble (i.e., that the projections are successfully sortable), then if I have the requisite learning capacity, and there is enough time in the day, and I don't kill myself or die of hunger first, I will sooner or later get it right, and the basis of my success will be some sort of invariance in the projections that some internal mechanism of mine has laboriously learned to detect. Let's simplify and say that the invariant is the Boolean rule "if it's white and has red spots, it's a toxic toadstool, otherwise it's an edible mushroom."

Life is short, and clearly, if you knew that rule, you could have saved me a lot of toil and risk if you simply told me that that was the invariant: A "toadstool" is a "mushroom" that is "white" with "red spots." Of course, in order to be able to benefit from such a symbolic windfall, I would have had to know what "mushroom," "white," "red," and "spots" were, but that's no problem: symbolic theft is recursive, but not infinitely regressive: Ultimately, the vocabulary of theft must be grounded directly in honest toil (and/or Darwinian theft); as mentioned earlier, it cannot be symbolic theft all the way down.
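The recursive-but-not-regressive structure of symbolic theft can be sketched as a simple reachability check over a toy lexicon: a term is grounded either directly (earned by toil or Darwinian theft) or indirectly, if every term in its stolen verbal definition is itself grounded. The particular lexicon below, including the undefined "snark"/"boojum" pair, is an illustrative assumption.

```python
# Toy sketch of recursive grounding: a term is grounded either directly
# (earned by sensorimotor toil or Darwinian theft) or indirectly, if every
# word in its stolen (verbal) definition is itself grounded.
# The lexicon is an illustrative assumption, not the paper's model.

directly_grounded = {"mushroom", "white", "red", "spots", "horse", "horn", "vanish"}

definitions = {
    "toadstool": {"mushroom", "white", "red", "spots"},
    "unicorn": {"horse", "horn"},
    "peekaboo_unicorn": {"unicorn", "vanish"},
    "snark": {"boojum"},  # defined only via an undefined term: never grounded
}

def is_grounded(term, seen=None):
    """True iff the term bottoms out in the directly grounded vocabulary."""
    seen = seen or set()
    if term in directly_grounded:
        return True
    if term in seen or term not in definitions:
        return False  # circular or undefined: it cannot be theft all the way down
    return all(is_grounded(t, seen | {term}) for t in definitions[term])
```

Note that "peekaboo_unicorn" comes out grounded here even though it has no sensorimotor projections of its own, because its defining terms do; "snark" does not, since its definitional chain never reaches the directly grounded base.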

So far we have not addressed the vanishing-intersections problem, for I pre-emptively chose a concrete sensory case in which the intersection was stipulated to be nonvanishing. Based on (1), above, we can also add that empirical success in sensorimotor categorisation is already a posteriori evidence that a nonvanishing intersection must exist. So it is probably reasonable to assume that a repertoire of unproblematic concrete categories like "red" and "spotted" exists. The question is about the more problematic cases, like chairs, bears, games, and goodness. What could the sensorimotor projections of all the members of each of these categories possibly share, even in a Boolean sense?

The "Peekaboo Unicorn"

By way of a reply, I will pick an even more difficult case, one in which it is not only their intersection that fails to exist, but the sensorimotor projections themselves. Enter my "Peekaboo Unicorn": A Peekaboo Unicorn is a Unicorn, which is to say, it is a horse with a single horn, but it has the peculiar property that it vanishes without a trace whenever senses or measuring instruments are trained on it. So, not just in practice, but in principle, you can never see it; it has no sensorimotor projections. Is "Peekaboo Unicorn" therefore a meaningless term?

I want to suggest that not only is "Peekaboo Unicorn" perfectly meaningful, but it is meaningful in exactly the same sense that "a toadstool is a white mushroom with red spots" or "a zebra is a horse with black and white stripes" are meaningful. Via those sentences you could learn what "toadstool" and "zebra" meant without having to find out the hard way -- though you could certainly have done it the hard way in principle. With the Peekaboo Unicorn, you likewise learn what the term means without having to find out the hard way, except that you couldn't have found out the hard way; you had to have the stolen dope directly from God, so to speak.

Needless to say, most linguistic theft is non-oracular (though it certainly opens the possibility of getting meaningful, unverifiable categories by divine revelation alone); all it requires is that the terms in which it is expressed should themselves be grounded, either directly by honest toil (or Darwinian theft), or indirectly, by symbolic theft, whose own terms are grounded either by... etc.; but ultimately it must all devolve on terms grounded directly in toil (or Darwin).

In particular, with the Peekaboo Unicorn, it is "horse," "horn," "sense," "measuring instrument" and "vanish" that must be grounded. Given that, you would be as well armed with the description of the Peekaboo Unicorn as you would be with the description of the toadstool or the zebra to correctly categorise the first sensorimotor projection of one that you ever encountered -- except that in this case the sensorimotor projection will never be encountered, because it does not exist (a "quark" or "superstring" might be examples of unobservable-in-principle things that do exist). I hope it is transparent that the ontic issue about existence is completely irrelevant to the cognitive issue of the meaningfulness of the term. (So much for wide meaning, one is tempted to say, on the strength of this observation alone.)

Meaning or Grounding?

At this point I should probably confess, however, that I don't believe that I have really provided a theory of meaning here at all, but merely a theory of symbol grounding, a theory of what a robot needs in order to categorise its sensorimotor projections, either as a result of direct trial and error learning or as a result of having received a string of grounded symbols: A symbol is grounded if the robot can pick out which category of sensorimotor projections it refers to. Grounding requires an internal mechanism that can learn by both sensorimotor toil and symbolic theft. I and others have taken some steps toward proposing hybrid symbolic/nonsymbolic models for toy bits of such a capacity (Harnad et al. 1991; Tijsseling & Harnad 1997; Cangelosi, Greco & Harnad 2000), but the real challenge is of course to scale up the sensorimotor and symbolic capacity to Turing-Test scale (Harnad 2000d).

Imagine a robot that can do these things indistinguishably from the way we do. What might such a robot be missing? This is the difference between grounding and meaning, and it brings us back to a point I deferred earlier in my paper, the point about what I have in mind when I mean something.

Mind and Meaning

According to the usual way that the problem of mind and the problem of meaning are treated in contemporary philosophy, there are not one but two things we can wonder about with respect to that Turing-Scale Robot:

(i) Is it conscious? Is there anything it feels like to be that robot? Is there someone home in there? Or does it merely behave exactly as-if it were conscious, but it's all just our fantasy, with no one home in there?

(ii) Do its internal symbols or states mean anything? Or are they merely systematically interpretable exactly as-if they meant something, but it's all just our fantasy, with no meaning in there?

For (ii), an intermediate case has been pointed out: Books, and even computerised encyclopedias, are like Turing Robots in that their symbols are systematically interpretable as meaning this and that, and they can bear the weight of such a systematic interpretation (which is not a trivial cryptographic feat). Yet we don't want to say that books and computerised encyclopedias mean something in the sense that our thoughts mean something, because the meanings of the symbols in books are clearly parasitic on our thoughts; they are mediated by an external interpretation in the head of a thinker. Hence, on pain of infinite regress, they are not a viable account of the meaning in the head of the thinker.

But what is the difference between a Turing Robot and a computerised encyclopedia? The difference is that the robot, unlike the encyclopedia, is grounded: The connection between its symbols and what they are interpretable as being about is not mediated by an external interpreter; it is direct and autonomous. One need merely step aside, if one is sceptical, and let the robot interact with the world indistinguishably from us, for a lifetime, if need be.

Meaning and Feeling

What room is there for further uncertainty? The only thing left, to my mind, is (i) above, namely, while the robot interacts with the world, while its symbols connect with what they are grounded in, does it have anything in mind? There is, in other words, something it feels like to mean something; a thought means something only if (a) it is directly grounded in what it is otherwise merely interpretable-by-others as meaning and (b) it feels like something for the thinker to think that thought (Harnad 2000, 2001).

If you are not sure, ask yourself these two questions: (i) Would you still be sceptical about a grounded Turing-scale Robot's meanings if you were guaranteed that the robot was conscious (feeling)? And (ii) Would you have any clue as to what the difference might amount to if there were two Turing Robots, both guaranteed to be unconscious zombies, but one of them had meaning and grounding, whereas the other had only grounding? Exercise: flesh out the meaning, if any, of that distinction! (Explain, by hearsay, what it is that the one has that the other lacks, if it's neither grounding nor feeling; if you cannot, then the distinction is just a nominal one, i.e., you have simply chosen to call the same thing by two different names.)

Where does this leave our narrow theory of meaning? The onus is on the Turing Robot that is capable of toil and theft indistinguishable from our own. As with us, language gives it the power to steal categories far beyond the temporal and spatial scope of its sensorimotor experience. That is what I would argue is the functional value of language in both of us. Moreover, it relies entirely on its proximal projections, and on the internal mechanisms that operate on them, for its grounding. Its categories are all provisional and approximate; ontic considerations and distal connections do not figure in them, at least not for the roboticist. And whether it has mere grounding or full-blown meaning is a question to which only the robot can know the answer -- but unfortunately that's the one primal sensorimotor category that we cannot pilfer with symbols (Harnad 1991).


Andrews, J., Livingston, K. & Harnad, S. (1998). Categorical Perception Effects Induced by Category Learning. Journal of Experimental Psychology: Learning, Memory, and Cognition 24(3) 732-753.

Brooks, R. A. (1991) Intelligence Without Representation. Artificial Intelligence Journal 47: 139--159. http://www.ai.mit.edu/people/brooks/papers/representation.pdf

Cangelosi, A. & Harnad, S. (2000) The Adaptive Advantage of Symbolic Theft Over Sensorimotor Toil: Grounding Language in Perceptual Categories. Evolution of Communication (Special Issue on Grounding) http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.language.html

Cangelosi A., Greco A. & Harnad S. (2000). From robotic toil to symbolic theft: Grounding transfer from entry-level to higher-level categories. Connection Science 12(2) 143-162. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/cangelosi-connsci2.ps

Damper, R.I. & Harnad, S. (2000) Neural Network Modeling of Categorical Perception. Perception and Psychophysics 62(4): 843-867 http://www.bib.ecs.soton.ac.uk/records/513

Fodor, Jerry A. (1987) Psychosemantics: The Problem of Meaning in the Philosophy of Mind, Cambridge, MA: MIT Press.

Fodor, Jerry A. (1994). The Elm and the Expert: Mentalese and Its Semantics, Cambridge, Massachusetts: MIT Press.

Hallam, J.C.T. & Malcolm, C.A. (1994) Behaviour: Perception, Action and Intelligence -- the View from Situated Robotics. Philosophical Transactions of the Royal Society A 349(1689): 29-42.

Harnad, S. (1976) Induction, evolution and accountability. Annals of the New York Academy of Sciences 280: 58-60. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad76.induction.htm

Harnad, S. (1982) Consciousness: An afterthought. Cognition and Brain Theory 5: 29-47. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad82.consciousness.html

Harnad, S. (1987) The induction and representation of categories. In: Harnad 1987a. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad87.categorization.html

Harnad, S. (1990a) The Symbol Grounding Problem. Physica D 42: 335-346. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad90.sgproblem.html

Harnad, S. (1990b) Symbols and Nets: Cooperation vs. Competition. Review of: S. Pinker and J. Mehler (Eds.) (1988) Connections and Symbols. Connection Science 2: 257-260. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad88.symbols.nets.htm

Harnad, S. (1991) Other bodies, Other minds: A machine incarnation of an old philosophical problem. Minds and Machines 1: 43-54. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad91.otherminds.html

Harnad, S. (1992) Connecting Object to Symbol in Modeling Cognition. In: A. Clark & R. Lutz (Eds.) Connectionism in Context. Springer Verlag, pp. 75-90. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad92.symbol.object.html

Harnad, S. (1993a) Problems, Problems: The Frame Problem as a Symptom of the Symbol Grounding Problem. PSYCOLOQUY 4(34) frame-problem.11 http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad93.frameproblem.html http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?4.34

Harnad, S. (1993b) Grounding Symbols in the Analog World with Neural Nets. Think 2(1): 12-78 (Special issue on "Connectionism versus Symbolism," D.M.W. Powers & P.A. Flach, eds.). http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad93.symb.anal.net.html http://cwis.kub.nl/~fdl/research/ti/docs/think/2-1/index.stm

Harnad, S. (1994/1996) Computation Is Just Interpretable Symbol Manipulation: Cognition Isn't. Special Issue on "What Is Computation" Minds and Machines 4:379-390 http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad94.computation.cognition.html

Harnad, S. (1995) Why and How We Are Not Zombies. Journal of Consciousness Studies 1: 164-167. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad95.zombies.html

Harnad, S. (1996a) The Origin of Words: A Psychophysical Hypothesis. In: Velichkovsky, B. & Rumbaugh, D. (Eds.) Communicating Meaning: Evolution and Development of Language. NJ: Erlbaum, pp. 27-44. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad96.word.origin.html

Harnad, S. (1996b) Experimental Analysis of Naming Behavior Cannot Explain Naming Capacity. Journal of the Experimental Analysis of Behavior 65: 262-264. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad96.naming.html

Harnad, S. (2000a) Turing Indistinguishability and the Blind Watchmaker. In: Mulhauser, G. (ed.) "Evolving Consciousness" Amsterdam: John Benjamins (in press) http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad98.turing.evol.html

Harnad, S. (2000b) Correlation vs. Causality: How/Why the Mind/Body Problem Is Hard. [Invited Commentary of Humphrey, N. "How to Solve the Mind-Body Problem"] Journal of Consciousness Studies 7(4): 54-61. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.mind.humphrey.html

Harnad, S. (2000c) From Sensorimotor Praxis and Pantomime to Symbolic Representations. The Evolution of Language. Proceedings of 3rd International Conference, Paris, 3-6 April 2000: pp. 118-125. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.praxis.htm

Harnad, S. (2000d) Minds, Machines and Turing: The Indistinguishability of Indistinguishables, Journal of Logic, Language, and Information 9(4): 425-445. (special issue on "Alan Turing and Artificial Intelligence") http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.turing.html

Harnad, S. (2001) What's Wrong and Right About Searle's Chinese Room Argument? In: M. Bishop & J. Preston (eds.) Essays on Searle's Chinese Room Argument. Oxford University Press. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.searle.html

Harnad, S., Hanson, S.J. & Lubin, J. (1991) Categorical Perception and the Evolution of Supervised Learning in Neural Nets. Presented at Symposium on Symbol Grounding: Problems and Practice, Stanford University, March 1991. In: Proceedings of the AAAI Spring Symposium on Machine Learning of Natural Language and Ontology (D.W. Powers & L. Reeker, Eds.) Document D91-09, Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH, Kaiserslautern, FRG, pp. 65-74. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad91.cpnets.html

Hayes, P., Harnad, S., Perlis, D. & Block, N. (1992) Virtual Symposium on Virtual Mind. Minds and Machines 2: 217-238. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad92.virtualmind.html

Held, R. and Hein, A. (1963) Movement-produced stimulation in the development of visually guided behavior. Journal of Comparative and Physiological Psychology 56(5): 872-876

Pevtzow, R. & Harnad, S. (1997) Warping Similarity Space in Category Learning by Human Subjects: The Role of Task Difficulty. In: Ramscar, M., Hahn, U., Cambouropolos, E. & Pain, H. (Eds.) Proceedings of SimCat 1997: Interdisciplinary Workshop on Similarity and Categorization. Department of Artificial Intelligence, Edinburgh University: 189-195. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad97.textures.html

Pylyshyn, Z.W. (1984) Computation and Cognition: Toward a Foundation for Cognitive Science. MIT Press.

Pylyshyn, Z.W. (1987) (ed.) The Robot's Dilemma: The Frame Problem in Artificial Intelligence. Norwood, New Jersey: Ablex Publishing Company.

Steels, L. (2001) Language games for autonomous robots. IEEE Intelligent Systems 16(5) 16-22.

Steels, L. and Kaplan, F. (1999) Bootstrapping Grounded Word Semantics. In: Briscoe, T. (ed.) (1999) Linguistic evolution through language acquisition: formal and computational models. Cambridge University Press.

Tijsseling, A. & Harnad, S. (1997) Warping Similarity Space in Category Learning by Backprop Nets. In: Ramscar, M., Hahn, U., Cambouropolos, E. & Pain, H. (Eds.) Proceedings of SimCat 1997: Interdisciplinary Workshop on Similarity and Categorization. Department of Artificial Intelligence, Edinburgh University: 263-269. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad97.cpnets.html

Van Gelder, T. (1998) The Dynamical Hypothesis in Cognitive Science. Behavioral and Brain Sciences 21: 615-628.

Wittgenstein, L. (1953) Philosophical Investigations, translated by G. E. M. Anscombe, 3rd edition, 1967, Oxford: Blackwell.

Notes



1 There are no doubt intermediate cases, in which 'prepared' category detectors must first be "activated" through some early sensorimotor exposure to and interaction with their biologically "expected" members (Held and Hein 1963) before we can help ourselves to the category they pick out. There are also ways of quantifying how interconfusable the initial sensorimotor projections are, and how much toil it takes to resolve the confusion. For the most part, however, the distinction between prepared categories (Darwinian theft) and unprepared ones learned by trial and error (honest toil) can be demonstrated empirically.

2 Nets are also adept at unsupervised learning, in which there is no external feedback indicating which sensorimotor projections belong in the same category; in such cases, successful sorting can only arise from the pattern of structural similarities among the sensorimotor projections themselves -- the natural sensorimotor landscape, so to speak. Some of this may be based on proximal boundaries already created by Darwinian theft (inborn feature detectors that already sort projections in a certain way), but others may be based on natural gaps between the shapes of distal objects, as reflected in their proximal projections: sensorimotor variation is not continuous. We do not encounter a continuum of intermediate forms between the shape of a camel and the shape of a giraffe. This already reduces some of the blooming, buzzing confusion we must contend with, and unsupervised nets are particularly well suited to capitalising upon it, by heightening contrasts and widening gaps in the landscape through techniques such as competitive learning and lateral inhibition. Perhaps the basis for this should be dubbed 'cosmological theft' (Harnad 1976). There are also structural similarity gradients to which unsupervised nets are responsive, but, being continuous, such gradients can only be made categorical by enhancing slight disparities within them. This too amounts to either Darwinian or cosmological theft. In contrast, honest toil is supervised by the error-correcting feedback arising from the consequences to the system of having miscategorised something, rather than from the a priori structure of the proximal sensorimotor projection.
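
The contrast-heightening described in this note can be made concrete with a toy winner-take-all competitive-learning sketch (an illustrative example, not a model from this paper; the function, its parameters, and the data are all invented for exposition):

```python
def competitive_learning(points, n_prototypes=2, lr=0.1, epochs=50):
    """Toy winner-take-all competitive learning: each input moves only its
    nearest prototype toward itself, so prototypes drift apart into the
    natural gaps between clusters -- with no external category feedback."""
    # Initialise prototypes at spread-out data points, to avoid "dead"
    # units that never win and hence never learn.
    step = max(1, (len(points) - 1) // max(1, n_prototypes - 1))
    prototypes = [list(points[min(i * step, len(points) - 1)])
                  for i in range(n_prototypes)]
    for _ in range(epochs):
        for p in points:
            # The "winner" is the prototype nearest this input...
            winner = min(prototypes,
                         key=lambda w: sum((wi - pi) ** 2
                                           for wi, pi in zip(w, p)))
            # ...and only the winner is nudged toward the input.
            for i, pi in enumerate(p):
                winner[i] += lr * (pi - winner[i])
    return prototypes

# Two naturally separated clusters of "shapes" (camels vs. giraffes,
# so to speak): no teacher labels them.
cluster_a = [(0.1 * i, 0.0) for i in range(5)]
cluster_b = [(5.0 + 0.1 * i, 5.0) for i in range(5)]
protos = competitive_learning(cluster_a + cluster_b)
```

No corrective feedback from the consequences of miscategorisation is involved: the gap between the two clusters is what does the sorting work, which is why this would count as cosmological theft rather than honest toil.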