Cross-posted from New Savanna.

Stevan Harnad: AI's Symbol Grounding Problem, The Gradient podcast, August 31, 2023

Stevan Harnad is professor of psychology and cognitive science at Université du Québec à Montréal, adjunct professor of cognitive science at McGill University, and professor emeritus of cognitive science at the University of Southampton. His research is on category learning, categorical perception, symbol grounding, the evolution of language, and animal and human sentience (otherwise known as “consciousness”). He is also an advocate for open access and an activist for animal rights.

Outline:

  • (00:00) Intro
  • (05:20) Professor Harnad’s background: interests in cognitive psychobiology, editing Behavioral and Brain Sciences
    • (07:40) John Searle submits the Chinese Room article
    • (09:20) Early reactions to Searle and Prof. Harnad’s role
  • (13:38) The core of Searle’s argument and the generator of the Symbol Grounding Problem, “strong AI”
  • (19:00) Ways to ground symbols
  • (20:26) The acquisition of categories
  • (25:00) Pantomiming, non-linguistic category formation
  • (27:45) Mathematics, abstraction, and grounding
  • (36:20) Symbol manipulation and interpretation language
  • (40:40) On the Whorf Hypothesis
  • (48:39) Defining “grounding” and introducing the “T3” Turing Test
  • (53:22) Turing’s concerns, AI and reverse-engineering cognition
  • (59:25) Other Minds, T4 and zombies
  • (1:05:48) Degrees of freedom in solutions to the Turing Test, the easy and hard problems of cognition
  • (1:14:33) Over-interpretation of AI systems’ behavior, sentience concerns, T3 and evidence of sentience
  • (1:24:35) Prof. Harnad’s commentary on claims in The Vector Grounding Problem
  • (1:28:05) RLHF and grounding, LLMs’ (ungrounded) capabilities, syntactic structure and propositions
  • (1:35:30) Multimodal AI systems (image-text and robotic) and grounding, compositionality
  • (1:42:50) Chomsky’s Universal Grammar, LLMs and T2
  • (1:50:55) T3 and cognitive simulation
  • (1:57:34) Outro

The podcast site also has links to Harnad’s webpages and to five selected articles. One of them in particular, about the structure of dictionaries, interested me. Here’s the citation, abstract, and a link:

Philippe Vincent-Lamarre, Alexandre Blondin Massé, Marcos Lopes, Mélanie Lord, Odile Marcotte, Stevan Harnad. The Latent Structure of Dictionaries. Topics in Cognitive Science 8 (2016) 625–659. DOI: 10.1111/tops.12211. (Open Access)

Abstract: How many words—and which ones—are sufficient to define all other words? When dictionaries are analyzed as directed graphs with links from defining words to defined words, they reveal a latent structure. Recursively removing all words that are reachable by definition but that do not define any further words reduces the dictionary to a Kernel of about 10% of its size. This is still not the smallest number of words that can define all the rest. About 75% of the Kernel turns out to be its Core, a “Strongly Connected Subset” of words with a definitional path to and from any pair of its words and no word’s definition depending on a word outside the set. But the Core cannot define all the rest of the dictionary. The 25% of the Kernel surrounding the Core consists of small strongly connected subsets of words: the Satellites. The size of the smallest set of words that can define all the rest—the graph’s “minimum feedback vertex set” or MinSet—is about 1% of the dictionary, about 15% of the Kernel, and part-Core/part-Satellite. But every dictionary has a huge number of MinSets. The Core words are learned earlier, more frequent, and less concrete than the Satellites, which are in turn learned earlier, more frequent, but more concrete than the rest of the Dictionary. In principle, only one MinSet’s words would need to be grounded through the sensorimotor capacity to recognize and categorize their referents. In a dual-code sensorimotor/symbolic model of the mental lexicon, the symbolic code could do all the rest through recombinatory definition.
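The Kernel construction the abstract describes is simple enough to sketch. Treating the dictionary as a directed graph with an edge from each defining word to each word it helps define, the Kernel is what remains after recursively stripping words that define nothing further. Here is a minimal sketch in Python, using a tiny invented toy dictionary (not from the paper); the real analysis operates on full dictionaries and goes on to extract the Core and MinSets:

```python
def kernel(defs):
    """Return the Kernel of a toy dictionary.

    `defs` maps each word to the set of words used in its definition.
    A word 'defines' something if it appears in the definition of some
    other word still present; words that define nothing are removed
    recursively until the set stabilizes.
    """
    words = set(defs)
    changed = True
    while changed:
        changed = False
        for w in list(words):
            # Does w appear in the definition of any remaining word?
            if not any(w in defs[v] for v in words if v != w):
                words.discard(w)
                changed = True
    return words


# Invented toy dictionary for illustration only.
toy = {
    "animal": {"living", "thing"},
    "living": {"thing", "animal"},
    "thing":  {"thing"},              # self-referential stub
    "dog":    {"animal", "living"},   # defined, defines nothing
    "cat":    {"animal", "living"},   # defined, defines nothing
}

print(sorted(kernel(toy)))  # "dog" and "cat" are stripped out
```

In this toy case the Kernel is {"animal", "living", "thing"}: "dog" and "cat" are reachable by definition but define nothing, so they fall away. On a real dictionary, per the abstract, this step alone shrinks the graph to roughly 10% of its size.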

Finally, somewhere latish in the conversation Harnad made an incisive remark about the vexed issue of whether or not LLMs really understand language. The issue, he remarked, is not whether or not they understand language as we do, but how they can do so much without such understanding. YES, a thousand times yes.

He also noted that he enjoys working with, what was it? ChatGPT. So do I, so do I. And I haven’t the slightest suspicion, worry, or hope that it might be sentient. It is what it is.

Comments:

Just listened to this.

It sounds like Harnad is stating outright that there's nothing an LLM could do that would make him believe it's capable of understanding.

At that point, when someone is so fixed in their worldview that no amount of empirical evidence could move them, there really isn't any point in having a dialogue.

It's just unfortunate that, being a prominent academic, he'll instill these views into plenty of young people.

Yes, there's an empirical way to make me (or anyone) believe an LLM is understanding: Ground it in the capacity to pass the robotic version of the Turing Test: i.e., walk the walk, not just talk the talk, Turing indistinguishable from a real, understanding person (for a lifetime, if need be). A mere word-bag in a vat, no matter how big, can't do that.

I think he was just talking about ChatGPT at that point, but I don't recall exactly what he said.