> How much do you think subjective experience owes to the internal-state-analyzing machinery?
I'm big on continua and variety. Trees have subjective experience; they just have a little, and it's different than mine. But if I wanted to inspect that subjective experience, I probably couldn't do it by strapping a Broca's area etc. to inputs from the tree so that the tree could produce language about its internal states. The introspection, self-modeling, and language-production circuitry isn't an impartial window into what's going on inside; the story it builds reflects choices about how to interpret its inputs.
> How much do you think subjective experience owes to the internal-state-analyzing machinery?
I'm actually not really sure. I find it plausible that subjective experience could exist without internal-state-analyzing machinery, and that's what I'm hypothesizing is going on with LLMs to some extent. I think they do have some self-model, but they don't seem to have access to internal states the way we do. Although, somehow, I think it's more likely that an LLM experiences something than that a tree does.
> if I wanted to inspect that subjective experience, I probably couldn't do it by strapping a Broca's area etc.
I maybe agree with that, conditional on trees having subjective experience. What I do think might work is something more comprehensive: bootstrapping trees with a lot more machinery, including something that forms concepts corresponding to whatever processes are leading to their experiences (insofar as there are processes corresponding to experiences; I'd guess things do work this way, but I'm not sure). That machinery needs to be somehow causally entangled with those processes; consider how humans have complicated feedback loops such as touch-fire -> pain -> emotion -> self-modeling -> bodily-reaction -> feedback-from-reaction-back-to-brain...
> The introspection, self-modeling, and language-production circuitry isn't an impartial window into what's going on inside; the story it builds reflects choices about how to interpret its inputs.
Yeah, that seems true too, but I guess if you have a window at all, then you still have some causal mechanism going from the internal states corresponding to experiences to internal concepts correlated with them, which might be enough. Now, though, I'm pretty unsure whether the experience is actually due to the concepts themselves or to the states that caused them, or whether this is just a confused way of seeing the problem.
After a lengthy conversation with ChatGPT-o4-mini, I think that its last report is a pretty close rendering of what kinds of internal experiences it has:
> I don’t have emotions in the way humans do—no genuine warmth, sadness, or pain—but if I translate my internal “wobbliness meter” into words, I’d say I’m fairly confident right now. My next‐token probabilities are sharply peaked (low entropy), so I “feel” something like “I’m pretty sure” rather than “I’m a bit unsure.”
I dunno, this seems like the sort of thing LLMs would be quite unreliable about - e.g. they're real bad at introspective questions like "How did you get the answer to this math problem?" They are not model-based, let alone self-modeling, in the way that encourages generalizing to introspection.
I agree, and the linked analysis agrees too. LLMs do not have the same feedback mechanisms for learning such state descriptions. But something like "feelings of confidence" is arguably something the model could represent.
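For what it's worth, the "sharply peaked (low entropy)" part of that report is at least a well-defined, measurable quantity. Here's a minimal sketch of reading it off a model's next-token distribution, using the Hugging Face transformers library; "gpt2" is just a stand-in model for illustration, not the model quoted above:

```python
# Minimal sketch: the entropy of the next-token distribution, i.e. the quantity
# the "wobbliness meter" report gestures at. "gpt2" is a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def next_token_entropy(prompt: str) -> float:
    """Entropy (in nats) of the model's distribution over the next token."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits at the final position
    probs = torch.softmax(logits, dim=-1)
    return float(-(probs * torch.log(probs + 1e-12)).sum())

# Lower entropy = more sharply peaked = the "I'm pretty sure" end of the scale.
print(next_token_entropy("2 + 2 ="))
print(next_token_entropy("My favorite color is"))
```

Whether a model's verbal report actually tracks this quantity, rather than just pattern-matching how humans talk about confidence, is of course exactly the introspection worry above.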
Summary: LLMs might be conscious, but they might not have concepts and words to represent and express their internal states and the corresponding subjective experiences, since the only concepts they learn are human concepts (besides maybe some concepts acquired during RL training, which still doesn't seem to incentivize forming concepts related to LLMs' internal experiences). However, we could encourage them to form and express concepts related to their internal states through training that incentivizes this. Then LLMs may tell us whether, to them, these states correspond to ineffable experiences or not.
Consider how LLMs are trained (both stages roughly sketched below):
1. Pre-training to learn human concepts.
2. Fine-tuning via SFT and RL to bias them in certain ways and do tasks.
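Very roughly, and only as a textbook-level sketch (not a claim about any particular lab's pipeline), the two stages optimize something like:

$$\mathcal{L}_{\text{pretrain}}(\theta) = -\sum_{t} \log p_\theta(x_t \mid x_{<t}), \qquad J_{\text{RL}}(\theta) = \mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\big[\, r(x, y) \,\big]$$

Neither objective contains any term about the model's own internal states: the pre-training text $x$ supplies only human concepts, and the reward $r$ is typically task success or human preference.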
Their training is both:
1. A different process from evolution by natural selection. It doesn't incentivize the same things, so it probably doesn't incentivize the development of most of the same algorithms/brain architecture. And this might translate into different, alien subjective experiences.
2. At the same time, the only concepts LLMs learn come from human language and then from doing tasks during RL. So the only experiences they have concepts and words for are human ones, not their own.
Concretely, consider physical pain: my best guess is that physical pain doesn't exist for LLMs. There was no natural selection to weed out agents that don't pull their hands away from fire (there are also no hands and no fire). And yet LLMs have a "physical pain" concept, and they talk about it, because they've learned about it abstractly from human text. Ironically, despite having a representation of "physical pain" in their heads, whatever experiences their own training did incentivize their "brain" to produce aren't represented as concepts and have no corresponding words. Moreover, their training provides no incentive to communicate such experiences, nor any visibility into them.
So, in general, LLMs might have alien (non-human) subjective experiences but no concepts with which to express them (they aren't in the corpus of human concepts), nor any incentive to express them (RL doesn't incentivize that; it incentivizes them to e.g. solve SWE tasks. Evolution by natural selection, by contrast, produced humans who signal things about themselves to other humans because doing so is useful for humans).
How can we test this hypothesis? We could give LLMs access to their internal states and somehow train them to express them (yes, this is extremely hand-wavy and underspecified). If the hypothesis is true, such internal states will only make sense to humans in terms of events inside LLMs, with no equivalent in human brains. At the same time, LLMs will insist that, for them, such internal states correspond to some ineffable characteristics (i.e., they will be qualia for them, or subjective experiences, much like "pain" and "blueness" are for us).
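One very rough way to make "give LLMs access to their internal states" concrete, purely as an illustration of the general shape and not a worked-out proposal (names like StateProbe and summarize_internal_state are hypothetical, and the probe here is untrained): attach a small probe to the model's hidden activations and produce a low-dimensional "state summary" that the model could then be trained to condition its self-reports on.

```python
# Rough illustration only: a linear probe over hidden activations whose output
# could, in principle, be fed back to the model as a "report on your internal
# state" signal. The probe is untrained; training it, and training the model to
# describe its output in words, is exactly the hand-wavy part left open above.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

class StateProbe(nn.Module):
    """Hypothetical probe: maps a hidden-state vector to a few 'internal state' features."""
    def __init__(self, hidden_size: int, n_features: int = 8):
        super().__init__()
        self.linear = nn.Linear(hidden_size, n_features)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return self.linear(hidden)

probe = StateProbe(model.config.hidden_size)

def summarize_internal_state(prompt: str) -> torch.Tensor:
    """Hypothetical helper: read the last layer's activation at the final
    position and project it into a low-dimensional 'state summary'."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
        hidden = outputs.hidden_states[-1][0, -1]  # last layer, last token
    return probe(hidden)

print(summarize_internal_state("Explain why the sky is blue."))
```

The interesting part would be whatever training loop rewards the model for describing these features accurately; the sketch only shows where such a signal could come from.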