ejb - LessWrong

I'm familiar with semiotics and language models, but I don't understand why you're calling this "semiotic physics" instead of "computational linguistics".

Linguistics vs. semiotics - I think you might say that, with text tokens, we're not just talking about natural language; we're talking about arbitrary symbol systems. If an LLM can program, do math, learn Dyck languages, or use icons like emojis, one might say that it's working with more than just natural language. However, I'd argue that (a) this is still a vastly limited subset of sign systems, and any sign system in an LLM is encoded into a completely symbolic format (i.e. text tokens, not icons or indexes), and (b) "linguistics" and "language" are widely accepted terminology for NLP and LLMs, so the new term seems like a barrier to communication and an extra translation cost for many readers. I'd also note that even psycholinguists who study human language production account for non-symbolic sign systems in language use, such as prosody (e.g. voice loudness is, or can be, iconic).

"As linguistically capable creatures, we experience the simulator's outputs as semantic. The tokens in the generated trajectory carry meaning, and serve as semiotic signs. This is why we refer to the simulator's physics-analogue as semiotic physics." Why did we jump from "linguistics" to "sign systems"? It's not because of semantics, and none of the analyses in the simulators seminar sequence seem to rely on terms or distinctions specific to semiotics and not linguistics, e.g. symbol-icon-index, or sign-object-interpretant.

I'm also aware there are deep historical and theoretical connections between linguistics and semiotics, e.g. Saussure's Course in General Linguistics, but you're not mentioning any of that here.

Physics vs. computational - I don't understand what makes this "physics of X" instead of "computational modeling of X". I get that you're talking about learning dynamics, but there's tons of "computational" work that does just this in computational linguistics (e.g. Partha Niyogi's work on computational models of language evolution is particularly "physics-y"), cognitive science, neuroscience, and A.I., and it doesn't seem to lose anything by using "computational" instead of "physics".

You say "In this analogical sense, a simulator such as GPT implements a 'physics' whose 'elementary particles' are linguistic tokens." Linguistics is the subfield of science where the primitive units are linguistic tokens... Maybe I'm missing something here, but it seems like you can have these nice analogies with simulation and multiverses without calling it "physics".

How does this relate to computational semiotics?

A note on 'semiotic physics'

ejb · 1y · 20

semiotic physics alludes to a lower level analysis: more analogous to studying neural firing dynamics on the human side than linguistics

Many classic debates in cognitive science and AI, e.g. between symbolism and connectionism, translate to claims about neural substrates. Most work with LLMs that I've seen abstracts over many such details, and seems in some ways more akin to linguistics, describing structure in high-level behavior, than neuroscience. It seems like there's lots of overlap between what you're talking about and Conceptual Role Semantics - here's a nice, modern treatment of it in computational cognitive science.

I think I kind of get the use of "semiotics" more than "physics". For example, with multi-modal LLMs the symbol/icon barrier begins to dissolve, so GPT-4 can reason about diagrams to some extent. The Wikipedia entry for social physics provides some relevant context:

"More recently there have been a large number of social science papers that use mathematics broadly similar to that of physics, and described as 'Computational social science'."
