Storytelling Makes GPT-3.5 Deontologist: Unexpected Effects of Context on LLM Behavior
TL;DR When prompted to make decisions, do large language models (LLMs) show power-seeking behavior, self-preservation instincts, and long-term goals? Discovering Language Model Behaviors with Model-Written Evaluations (Perez et al.) introduced a set of evaluations for these behaviors, along with other dimensions of LLM self-identity, personality, views, and decision-making. Ideally, we’d...