Playing with the idea that identity is less of an "instantaneous I" of current experience and more like the continuity of experiential snapshots, the area under the curve. Like how no individual frame is "the movie," but run them at 24 frames per second and the experience of a film emerges from the continuity.
Kicking this around for a post I'm drafting: when an LLM hallucinates something, it's usually at least plausible for the situation. Like a hallucinated citation generally has proper formatting, etc., so the generation itself worked fine. It's also confidently incorrect, which is of course what makes it so dangerous to people who don't know any better and so annoying to people who actually know the subject matter.
I've been thinking of the set of all possible responses as a kind of navigable topology (think the Library of Babel website, but instead of linear pages it's a high-dimensional manifold), and it's been productive to think of hallucination as a localization problem. The model is in "citation" space when it should be in "I don't have this" space. The output is locally correct for where the model thinks it is; it's just in the wrong place in response-space relative to reality.
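Rough sketch of what I mean by "locating" a response, in case it helps. Everything here is a stand-in: the mode labels and anchor sentences are made up, and sentence-transformers is just a convenient embedding model; the point is the shape of the check, not the specifics.

```python
# Toy "localization" check: embed a candidate response and ask which region
# of response-space it is closest to. The modes and anchor sentences below
# are invented for illustration only.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Tiny hand-labeled anchors for two regions of response-space.
MODES = {
    "citation": [
        "Smith et al. (2019) showed that the effect replicates.",
        "According to a 2021 survey published in Nature,",
    ],
    "dont_have_this": [
        "I don't have a reliable source for that.",
        "I'm not certain, and I couldn't find a citation to support this.",
    ],
}

# One centroid per mode: the "center" of that region, as far as this toy goes.
mode_centroids = {
    name: model.encode(examples).mean(axis=0)
    for name, examples in MODES.items()
}

def locate(response: str) -> str:
    """Return the mode whose centroid the response is closest to."""
    emb = model.encode(response)
    return max(
        mode_centroids,
        key=lambda m: util.cos_sim(emb, mode_centroids[m]).item(),
    )

draft = "As shown in Doe & Lee (2020), the effect size was 0.42."
print(locate(draft))  # almost certainly "citation": locally well-formed,
                      # but whether it *should* be there is a separate question
```

Obviously this only tells you where the output landed, not whether that's the right place to be, which is the second half of the problem.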
Thinking of the set of possible responses as a response-space provides an interesting lens on the problem. If hallucinations aren't broken outputs, they may be expected outputs from the wrong context. It would also help explain why "just try harder to be accurate" doesn't work all that well: effort in generation doesn't help if the error is upstream, in mode-selection. (Though saying "try harder" may well prompt the system to actually evaluate where it is in response-space and relocate if necessary, so it's not totally useless.)
Also suggests an interesting tack might not be "how do we make the model generate better" but "what determines which mode/space the model is in, and can that be checked before output?"
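If I ever prototyped that, the check might look something like this. Heavy caveat: retrieve here is a hypothetical stand-in for whatever grounding signal you actually have (retrieval hits, tool results, a verifier score); the only point is that the gate asks where the draft is sitting, not how fluent it is.

```python
# Hedged sketch of a pre-output gate. `retrieve` is a hypothetical stand-in
# for a real grounding step; nothing here is a real API.
from dataclasses import dataclass, field

@dataclass
class Retrieval:
    query: str
    hits: list[str] = field(default_factory=list)  # supporting snippets, empty if none found

def retrieve(query: str) -> Retrieval:
    """Hypothetical grounding step; plug in whatever retrieval or tooling you have."""
    return Retrieval(query=query)  # pretend nothing was found

def gate(query: str, draft: str, mode: str) -> str:
    """Check *where* the draft sits in response-space before letting it out.

    If the draft landed in citation space but there is nothing to anchor it to,
    relocate to "I don't have this" space instead of emitting a locally
    plausible, globally wrong answer.
    """
    grounding = retrieve(query)
    if mode == "citation" and not grounding.hits:
        return "I couldn't find a source for this, so I'm not going to cite one."
    return draft

draft = "See Doe & Lee (2020) for the original result."
print(gate("Who first measured this effect?", draft, mode="citation"))
# -> falls back to the "I don't have this" response, because nothing grounded it
```

(The mode argument could come from something like the locate() sketch above, or from the model's own self-report; that's exactly the part I don't know how to do well yet.)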
The upshot of this perspective is that just adding compute to a model won't actually help with hallucination if it doesn't also improve the model's sense of where it is in response-space in the first place. If the model has no way to anchor its internal state to reality, it can compute for a thousand years and never land on an answer that is coherent with that reality. From this angle, the hallucination bottleneck doesn't look like missing knowledge; it looks like the system's lack of context about where it should be within its own reasoning space.
Anybody else have a similar take, or know of posts/papers that explore this? Would love an outside perspective.