Amritanshu Prasad — LessWrong

54

9mo

Understanding when and why agents scheme

by Mia Hopman, Jannes Elstner, Maria Avramidou, Amritanshu Prasad, David Lindner, and LASR Labs

TL;DR
* To understand the conditions under which LLM agents engage in scheming behavior, we develop a framework that decomposes the decision to scheme into agent factors (model, system prompt, tool access) and environmental factors (stakes, oversight, outcome influence).
* We systematically vary these factors in four realistic settings, each...

Current LLM agents need strong pressure to engage in scheming behavior

by Mia Hopman, Jannes Elstner, Maria Avramidou, Amritanshu Prasad, David Lindner, and LASR Labs

This is an interim report produced as part of the Summer 2025 LASR Labs cohort, supervised by David Lindner. For the full version, see our paper on the LASR website. As this is an ongoing project, we would like to use this post both as an update, and as a...

Nov 20, 2025 • 24