AI Control
Posts tagged AI Control (sorted by most relevant)
Relevance | Karma | Title | Mark | Author(s) | Age | Comments
2 | 69 | AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt | Ω | DanielFilan | 4mo | 10
2 | 68 | How useful is "AI Control" as a framing on AI X-Risk? | - | habryka, ryan_greenblatt | 4mo | 4
2 | 47 | Critiques of the AI control agenda | Ω | Jozdien | 5mo | 14
2 | 35 | Protocol evaluations: good analogies vs control | Ω | Fabien Roger | 5mo | 10
2 | 28 | Games for AI Control | Ω | charlie_griffin, Buck | 16d | 0
1 | 47 | How to safely use an optimizer | Ω | Simon Fischer | 4mo | 21
1 | 28 | Auditing LMs with counterfactual search: a tool for control and ELK | - | Jacob Pfau | 5mo | 6
1 | 7 | Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller | Ω | Henry Cai | 1mo | 0
1 | 2 | Would a scope-insensitive AGI be less likely to incapacitate humanity? | Q | Jim Buhler | 6d | 3