Selection Theorems

Edited by DragonGod last updated 25th Dec 2022

A Selection Theorem tells us something about what agent type signatures will be selected for in some broad class of environments. Two important points:

  • The theorem need not directly talk about selection; for example, it could state some general property of optima, of “broad” optima, of “most” optima, or of optima under a particular kind of selection pressure (like natural selection or financial profitability).
  • Any given theorem need not address every question about agent type signatures; it just needs to tell us something about agent type signatures.

For instance, the subagents argument says that, when our “agents” have internal state in a coherence-theorem-like setup, the “goals” will be Pareto optimality over multiple utilities, rather than optimality of a single utility function. This says very little about embeddedness, world models, or internal architecture; it addresses only one narrow aspect of agent type signatures. And, like the coherence theorems, it doesn’t directly talk about selection; it just says that any strategy which doesn’t fit the Pareto-optimal form is strictly dominated by some other strategy (and therefore we’d expect that other strategy to be selected, all else equal).

From: Selection Theorems: A Program For Understanding Agents
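
The dominance claim in the subagents example can be illustrated with a small sketch. This is not code from the cited post: the strategy names, the two utilities, and the payoff numbers are all hypothetical, and the helper functions `dominates` and `pareto_front` are introduced here purely for illustration. The sketch only shows what it means for one strategy to be dominated by another across multiple utilities, and which strategies survive if dominated ones are discarded.

```python
from typing import Dict, List, Tuple

# One payoff per utility function; here, two hypothetical utilities (u1, u2).
Payoffs = Tuple[float, ...]


def dominates(a: Payoffs, b: Payoffs) -> bool:
    """True if a is at least as good as b on every utility and strictly better on at least one."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better_somewhere = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better_somewhere


def pareto_front(strategies: Dict[str, Payoffs]) -> List[str]:
    """Names of strategies that no other strategy dominates."""
    return [
        name
        for name, payoffs in strategies.items()
        if not any(
            dominates(other_payoffs, payoffs)
            for other_name, other_payoffs in strategies.items()
            if other_name != name
        )
    ]


# Hypothetical strategies with made-up payoffs.
strategies = {
    "A": (3.0, 1.0),
    "B": (1.0, 3.0),
    "C": (2.0, 2.0),
    "D": (1.0, 1.0),  # worse than or equal to A, B, and C on both utilities
}

front = pareto_front(strategies)
print(front)
```

In this toy setup, strategy D is dominated, so all else equal we would expect some other strategy to be selected over it, while A, B, and C trade off the two utilities against each other and none dominates the others.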

Posts tagged Selection Theorems
  • Selection Theorems: A Program For Understanding Agents, by johnswentworth (4y; score 128, 28 comments)
  • What Selection Theorems Do We Expect/Want?, by johnswentworth (4y; score 71, 11 comments)
  • Some Existing Selection Theorems, by johnswentworth (4y; score 56, 5 comments)
  • Understanding Selection Theorems, by adamk (3y; score 41, 3 comments)
  • Epistemic Strategies of Selection Theorems, by adamShimi (4y; score 33, 1 comment)
  • Clarifying the Agent-Like Structure Problem, by johnswentworth (3y; score 63, 19 comments)
  • Why The Focus on Expected Utility Maximisers?, by DragonGod, Scott Garrabrant (3y; score 118, 84 comments)
  • Lessons from Convergent Evolution for AI Alignment, by Jan_Kulveit, rosehadshar (2y; score 54, 9 comments)
  • Selection processes for subagents, by Ryan Kidd (3y; score 36, 2 comments)
  • Fixing The Good Regulator Theorem, by johnswentworth (5y; score 155, 39 comments)
  • Project Intro: Selection Theorems for Modularity, by CallumMcDougall, Avery, Lucius Bushnaq (3y; score 74, 20 comments)
  • An Illustrated Summary of "Robust Agents Learn Causal World Model", by Dalcy (9mo; score 67, 2 comments)
  • How Do Selection Theorems Relate To Interpretability?, by johnswentworth (3y; score 60, 14 comments)
  • AXRP Episode 15 - Natural Abstractions with John Wentworth, by DanielFilan (3y; score 34, 1 comment)
  • Proof Explained for "Robust Agents Learn Causal World Model", by Dalcy (8mo; score 28, 0 comments)