The goal of this post is to collect feedback on a new project idea. This work would be a collaboration between the AI Objectives Institute (AOI), MetaGov and other partner institutions.
Project PIs: Bruno Marnette (AOI) and Philipp Zahn (MetaGov, 20squares).
Special thanks to Colleen McKenzie, Matija Franklin, Timothy Telleen-Lawton, Gaia Dempsey, Ping Yee, Justin Stimatze, Cleo Nardo, Tushant Jha, Deger Turan and others for feedback and review.
Humans routinely model the incentives of other humans. Historians examine the incentives of powerful individuals to explain why they made specific decisions. Journalists investigate the incentives of large companies to uncover collusion and conflicts of interest. Economists use incentive models to understand markets. Lawmakers study existing incentives before introducing new ones.
There is, however, a natural limit...