esthle Amitace

A sentence written by an LLM is said by no one, to no one, for no reason, with no agentic mental state behind it, with no assertor to participate in the ongoing world co-creation that assertions are usually supposed to be part of.

As both janus in Simulators and later nostalgebraist in the void have shown, a text written by an LLM is always written by (a simulated) someone. LLMs cannot write without internally (re)constructing the personality of an author who could have written these words - indeed, often having...

Replying to Simulators

esthle Amitace 2y* Review for 2022 Review

Simulators

This post is not only groundbreaking research into the nature of LLMs but also a perfect meme. Janus's ideas are now widely cited at AI conferences and in papers around the world. Whether or not its assumptions are correct, the Simulators theory has sparked huge interest among a broad audience that extends well beyond AI researchers. Let's also appreciate that this post was written based on the author's interactions with a non-RLHFed GPT-3 model, well before the release of ChatGPT or Bing, and that it accurately predicted some quirks in their behavior.

For me, the most important implication of the Simulators theory is that LLMs are neither agents nor tools. Therefore, the alignment/safety measures developed within the Bostromian paradigm are not applicable to them, a point later beautifully illustrated in the Waluigi Effect post. This leads me to believe that AI alignment has to be a practical discipline and cannot rely purely on theoretical scenarios.

LESSWRONG