LESSWRONG
Aligned AI Proposals
Posts tagged Aligned AI Proposals (sorted by Most Relevant)
Relevance | Karma | Title | Author | Age | Comments
3 | 60 | How to Control an LLM's Behavior (why my P(DOOM) went down) Ω | RogerDearnaley | 5d | 28
2 | 156 | A list of core AI safety problems and how I hope to solve them Ω | davidad | 3mo | 22
2 | 45 | an Evangelion dialogue explaining the QACI alignment plan Ω | Tamsin Leake | 6mo | 15
2 | 26 | We have promising alignment plans with low taxes Ω | Seth Herd | 23d | 9
2 | 23 | The (partial) fallacy of dumb superintelligence Ω | Seth Herd | 1mo | 3
2 | 11 | Two paths to win the AGI transition | Nathan Helm-Burger | 5mo | 8
2 | 8 | Desiderata for an AI | Nathan Helm-Burger | 4mo | 0
1 | 72 | An Open Agency Architecture for Safe Transformative AI Ω | davidad | 1y | 22
1 | 18 | Annotated reply to Bengio's "AI Scientists: Safe and Useful AI?" | Roman Leventov | 7mo | 2
1 | 18 | Proposal: Align Systems Earlier In Training | OneManyNone | 7mo | 0
1 | 16 | An LLM-based “exemplary actor” Ω | Roman Leventov | 6mo | 0
1 | 14 | AI Alignment: A Comprehensive Survey | Stephen McAleer | 1mo | 0
1 | 12 | Aligning an H-JEPA agent via training on the outputs of an LLM-based "exemplary actor" Ω | Roman Leventov | 6mo | 10
1 | 4 | Supplementary Alignment Insights Through a Highly Controlled Shutdown Incentive | Justausername | 4mo | 1
1 | 4 | Is Interpretability All We Need? | RogerDearnaley | 19d | 0