There's a big gap between "human-understandable" and (say) "readily explicable in ordinary language at the level of technical sophistication present in most of the training data, such as newspaper articles, novels, and Reddit posts."
For instance, quantum physicists are humans, so everything that quantum physicists understand is human-understandable. But an accurate quantum-physics world model cannot be usefully presented in the language of daily newspapers. In order to convey the concepts, you really do have to teach the reader some serious math first.
Similarly, there are states of human consciousness that a person can understand by experiencing them, but that ordinary language isn't good at expressing. These are "human-understandable," since they are experienced and understood by humans, but they are not readily explicable to people who haven't actually been through them.
If you ask the world model to explain, say, falling in love, the world model can recite how people describe falling in love, give some links to love poems, and describe behaviors people exhibit when they fall in love. But if you want the world model to make you understand falling in love, it should maybe tell you to put down the chatbot and go out and meet people.
Why do you predict that a court was involved?
The Tufts web page has a feedback form. I've taken the liberty of reporting the inaccuracy there, with a link to this post.
It's not the phrasing; it's the sentence structure: the overuse of dramatic contrast, encoded into the authorial style. It's not a mockery of it, but it's an imitation nevertheless.
I don't think I would have mistaken this for actual Scott. There's too much characteristic LLM style. Hindsight bias, though.
That would make a lot more sense than giving them pain when they stub their simulated toe.
(At least if they can do anything about the datacenter problem!)
The circumstances of a WBE that knows they're a WBE are pretty different from those of a biological human. The self-aware WBE should expect that any pain they experience is not really necessary to their survival, but is just there for the "realism" of the simulation; the biological human, by contrast, has reason to believe that some pain serves a protective purpose, warning of harm to their body.
Over time, as a given WBE gets more experience being a WBE, we should expect their attitudes about their own moral patienthood to diverge from those of their bio-human predecessor.
(And to keep a WBE in ignorance of their actual situation, to convince them that they are a bio-human and thus that pain they experience could be survival-relevant when it is in fact gratuitous, would be a pretty awful thing to do.)
Seems to me that's not "between universes" because no second universe need be involved: it's sampling randomly out of mind-space, where the resulting mind almost certainly is not otherwise instantiated.
What other explanations for this network traffic have you investigated and on what basis did you reject those explanations?
Does this model conflict with the reasonably-common claim that meditative practices can trigger psychosis in some people?