Harnad, S. (1992) The Turing Test Is Not A Trick: Turing Indistinguishability Is A Scientific Criterion. SIGART Bulletin 3(4) (October 1992) pp. 9 - 10. [Appears preceded by an Editorial on the Turing Test by Lewis Johnson, pp. 7 - 9, and followed by another commentary by Stuart Shapiro, p. 10]

THE TURING TEST IS NOT A TRICK: TURING INDISTINGUISHABILITY IS A SCIENTIFIC CRITERION

Stevan Harnad
Department of Psychology
Princeton University
Princeton NJ 08544

Cognition et Mouvement URA CNRS 1166
Universite d'Aix Marseille II
13388 Marseille cedex 13, France

harnad@cogsci.soton.ac.uk

It is important to understand that the Turing Test (TT) is not, nor was it intended to be, a trick; how well one can fool someone is not a measure of scientific progress. The TT is an empirical criterion: It sets AI's empirical goal, which is to generate human-scale performance capacity. That goal will have been met when the candidate's performance is totally indistinguishable from a human's. Until then, the TT simply represents what AI must endeavor eventually to accomplish scientifically.

Pen-Pals Versus Robots

In my own papers I have tried to explain how trickery, deception and impersonation have nothing at all to do with the scientific import of Turing's criterion (Harnad 1989, 1991). AI is not a party game; the game was just a metaphor. The real point of the TT is that if we had a pen-pal with whom we had corresponded for a lifetime, we would never need to have seen him to infer that he had a mind. So if a machine pen-pal could do the same thing, it would be arbitrary to deny it had a mind just because it was a machine. That's all there is to it!

This entirely valid methodological point of Turing's is based on the "other minds" problem (the problem of how I can know that anyone else but me actually has a mind, actually thinks, actually has intelligence or knowledge -- these all come to the same thing): It is arbitrary to ask for more from a machine than I ask from a person, just because it's a machine (especially since no one knows yet what either a person or a machine REALLY is). So if the pen-pal TT is enough to allow us to infer correctly that a real person has a mind, then it must by the same token be enough to allow us to make the same inference about a computer, given that the two are totally indistinguishable to us (not just for a 5-minute party trick or an annual contest but, in principle, for a lifetime). Neither the appearance of the candidate nor any facts about biology play any role in my judgment about my human pen-pal, so there is no reason the same should not be true of my TT-indistinguishable machine pen-pal.

Now, although I too am critical of the TT, I think it is important that its logic -- which was only implicit in Turing's actual writing -- be made explicit, as I have tried to make it here and in my other writings, so that we can see clearly the methodological basis for his proposed criterion. Elsewhere I have gone on to take issue with the TT on the grounds that humans also happen to have a good deal of performance capacity over and above their pen-pal capacity. It is hence arbitrary and equivocal to focus on pen-pal capacity alone; but Turing's basic intuition is still correct: the only available basis for inferring a mind is Turing-indistinguishable performance capacity. For TOTAL performance indistinguishability, however, one needs TOTAL, not partial, performance capacity, and that happens to call for all of our robotic performance capacities too: the Total Turing Test (TTT). And, as a bonus, the robotic capacities can be used to GROUND the pen-pal (symbolic) capacities, thereby solving the "symbol grounding problem" (Harnad 1990), which afflicts the pen-pal version of the TT but not the robotic TTT.[1]

In fact, one of the reasons no computer has yet passed the TT may be that even successful TT capacity has to draw upon robotic capacity. A TT computer pen-pal alone could not even tell you the color of the flower you had enclosed with its birthday letter -- or indeed that you had enclosed a flower at all -- unless you had mentioned it in your letter. An infinity of possible interactions with the real world, interactions of which each of us is capable, is completely missing from the TT (and again, "tricks" have nothing to do with it).

Is the Total Turing Test Total Enough?

Note that all talk about "percentages" in judging TT performance is just numerology. Designing a machine that exhibits 100% Turing-indistinguishable performance capacity is an empirical goal, like designing a plane with the capacity to fly: nothing short of the TTT, or of "total" flight, respectively, meets the goal. For once we recognize that Turing-indistinguishable performance capacity is our mandate, the Totality criterion comes with the territory. Subtotal "toy" efforts are interesting only insofar as they contain the means to scale up to life-size. A "plane" that can only fall, jump, or taxi on the ground is no plane at all; and gliding is pertinent only if it can scale up to autonomous flight.

The Loebner Prize Competition is accordingly trivial from a scientific standpoint. The scientific point is not to fool some judges some of the time, but to design a candidate that REALLY has indistinguishable performance capacities (pen-pal performance [TT] or pen-pal plus robotic performance [TTT], respectively): indistinguishable to any judge, and for a lifetime, just as yours and mine are. No tricks! The real thing!

The only open questions are (1) whether there is more than one way to design a candidate that passes the TTT and, if so, (2) whether we then need a stronger test, the TTTT (neuromolecular indistinguishability), to pick out the one with the mind. My guess is that the constraints on the TTT are tight enough to make the TTTT unnecessary: they are roughly the same constraints that guided the Blind Watchmaker who designed us (evolutionary adaptations -- survival and reproduction -- are largely performance matters; Darwinian selection can no more read minds than we can).

Let me close with the suggestion that the problem under discussion is not one of definition. You don't have to be able to define intelligence (knowledge, understanding) in order to see that people have it and today's machines don't. Nor do you need a definition to see that once you can no longer tell them apart, you will no longer have any basis for denying of one what you affirm of the other.

References

Harnad, S. (ed.) (1987) Categorical Perception: The Groundwork of Cognition. New York: Cambridge University Press.

Harnad, S. (1989) Minds, Machines and Searle. Journal of Theoretical and Experimental Artificial Intelligence 1: 5-25.

Harnad, S. (1990) The Symbol Grounding Problem. Physica D 42: 335-346.

Harnad, S. (1991) Other bodies, other minds: A machine incarnation of an old philosophical problem. Minds and Machines 1: 43-54.

Harnad, S., Hanson, S.J. & Lubin, J. (1991) Categorical Perception and the Evolution of Supervised Learning in Neural Nets. In: Working Papers of the AAAI Spring Symposium on Machine Learning of Natural Language and Ontology (D.W. Powers & L. Reeker, Eds.) pp. 65-74. Presented at the Symposium on Symbol Grounding: Problems and Practice, Stanford University, March 1991; also reprinted as Document D91-09, Deutsches Forschungszentrum für Künstliche Intelligenz GmbH, Kaiserslautern, FRG.

Harnad, S. (1992) Connecting Object to Symbol in Modeling Cognition. In: A. Clark & R. Lutz (Eds.), Connectionism in Context. Springer Verlag.

Footnote

1. In a nutshell, the symbol grounding problem can be stated as follows: Computers manipulate meaningless symbols that are systematically INTERPRETABLE as meaning something. The problem is that the interpretations are not intrinsic to the symbol manipulating system; they are made by the mind of the external interpreter (as when I interpret the letters from my TT pen-pal as meaningful messages). This leads to an infinite regress if we try to assume that what goes on in MY mind is just symbol manipulation too, because the thoughts in my mind do not mean what they mean merely because they are interpretable by someone ELSE's mind: Their meanings are intrinsic. One possible solution would be to ground the meanings of a system's symbols in the system's capacity to discriminate, identify, and manipulate the objects that the symbols are interpretable as standing for (Harnad 1987), in other words, to ground its symbolic capacities in its robotic capacities. Grounding symbol-manipulating capacities in object-manipulating capacities is not just a matter of attaching the latest transducer/effector technologies to a computer, however. Hybrid systems may need to make extensive use of analog components and perhaps also neural nets, in order to connect symbols to their objects (Harnad et al. 1991; Harnad 1992).
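
To make the grounding idea concrete, here is a minimal illustrative sketch in Python. It is mine, not drawn from the papers cited, and every name in it (GroundedSymbol, discriminate, the toy "red" detector) is hypothetical: the point is only that the symbol's "meaning" is carried by the system's own capacity to pick out instances of its category from (simulated) sensory input, rather than by an external interpreter.

    # Illustrative toy only: all names here are hypothetical, not from
    # Harnad's papers. A real hybrid system would use learned categorizers
    # (e.g. neural nets) over analog transducer output.
    from dataclasses import dataclass
    from typing import Callable, Sequence

    Features = Sequence[float]  # crude stand-in for sensor readings

    @dataclass
    class GroundedSymbol:
        """A symbol whose application is fixed by the system's own
        categorization capacity, not by an outside interpreter."""
        name: str
        detector: Callable[[Features], bool]  # stand-in for a categorizer

        def identify(self, percept: Features) -> bool:
            # Absolute judgment: does this percept fall under the category?
            return self.detector(percept)

    def discriminate(a: Features, b: Features) -> float:
        # Relative judgment: how different do two percepts seem?
        # (Harnad 1987 treats discrimination as more primitive than naming.)
        return sum(abs(x - y) for x, y in zip(a, b))

    # A crude "red" detector over (R, G, B) sensor values.
    red = GroundedSymbol("red",
                         lambda rgb: rgb[0] > 0.6 and rgb[1] < 0.4
                                     and rgb[2] < 0.4)

    rose, leaf = (0.9, 0.1, 0.2), (0.1, 0.8, 0.2)
    print(red.identify(rose))        # True: "red" applies via sensor data
    print(red.identify(leaf))        # False
    print(discriminate(rose, leaf))  # nonzero: the percepts are discriminable

Of course this caricature sidesteps everything hard: real grounding would require the categorizers themselves to be learned from sensorimotor interaction with the objects in question, as the footnote indicates. But it does show where the interpretation is meant to live -- inside the system's own robotic capacities, not in the mind of an external reader of its symbols.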