LESSWRONG

Experiments in instrumental convergence

Oct 12, 2022 by Edouard Harris

This sequence investigates instrumental convergence and power-seeking through a series of experiments in multi-agent RL.

The key question we explore: If humans build AIs that learn faster than we do, will those AIs compete with us by default?

Instrumental convergence in single-agent systems (Edouard Harris, simonsdsuo)
Misalignment-by-default in multi-agent systems (Edouard Harris, simonsdsuo)
Instrumental convergence: scale and physical interactions (Edouard Harris, simonsdsuo)
POWERplay: An open-source toolchain to study AI power-seeking (Edouard Harris)