University of Southampton Institutional Repository

Language writ large: LLMs, ChatGPT, meaning, and understanding

Harnad, Stevan (2025) Language writ large: LLMs, ChatGPT, meaning, and understanding. Frontiers in Artificial Intelligence, 7, [1490698]. (doi:10.3389/frai.2024.1490698).

Record type: Article

Abstract

Apart from what (little) OpenAI may be concealing from us, we all know (roughly) how Large Language Models (LLMs) such as ChatGPT work (their vast text databases, statistics, vector representations, and huge number of parameters, next-word training, etc.). However, none of us can say (hand on heart) that we are not surprised by what ChatGPT has proved to be able to do with these resources. This has even driven some of us to conclude that ChatGPT actually understands. It is not true that it understands. But it is also not true that we understand how it can do what it can do. I will suggest some hunches about benign “biases”—convergent constraints that emerge at the LLM scale that may be helping ChatGPT do so much better than we would have expected. These biases are inherent in the nature of language itself, at the LLM scale, and they are closely linked to what it is that ChatGPT lacks, which is direct sensorimotor grounding to connect its words to their referents and its propositions to their meanings. These convergent biases are related to (1) the parasitism of indirect verbal grounding on direct sensorimotor grounding, (2) the circularity of verbal definition, (3) the “mirroring” of language production and comprehension, (4) iconicity in propositions at LLM scale, (5) computational counterparts of human “categorical perception” in category learning by neural nets, and perhaps also (6) a conjecture by Chomsky about the laws of thought. The exposition will be in the form of a dialogue with ChatGPT-4.

Text
Harnad_Lang_Writ_Large_Feb12_2025 - Accepted Manuscript
Available under License Creative Commons Attribution.
Download (113kB)
Text
frai-1-1490698 - Version of Record
Available under License Creative Commons Attribution.
Download (550kB)

More information

Accepted/In Press date: 20 December 2024
Published date: 12 February 2025
Keywords: ChatGPT, LLMs, categorical perception, category learning, Chomsky, cognitive science, deep learning, definitions, dictionaries, language, propositions, symbol grounding, feature abstraction, indirect verbal grounding, meaning and understanding, direct sensorimotor grounding, ChatGPT and LLMs

Identifiers

Local EPrints ID: 499055
URI: http://eprints.soton.ac.uk/id/eprint/499055
ISSN: 2624-8212
PURE UUID: 5b62766a-7938-4964-be8f-94687cb6121f
ORCID for Stevan Harnad: orcid.org/0000-0001-6153-1129

Catalogue record

Date deposited: 07 Mar 2025 17:42
Last modified: 22 Aug 2025 01:39

Contributors

Author: Stevan Harnad

Contact ePrints Soton: eprints@soton.ac.uk

ePrints Soton supports OAI 2.0 with a base URL of http://eprints.soton.ac.uk/cgi/oai2

This repository has been built using EPrints software, developed at the University of Southampton, but available to everyone to use.
