Aligned AI Proposals
Posts tagged Aligned AI Proposals, most relevant first
Relevance | Karma | Title | Author | Age | Comments | Ω
5 | 64 | How to Control an LLM's Behavior (why my P(DOOM) went down) | RogerDearnaley | 4mo | 30 | Ω
5 | 46 | Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor | RogerDearnaley | 3mo | 8 | Ω
5 | 35 | Striking Implications for Learning Theory, Interpretability — and Safety? | RogerDearnaley | 3mo | 4 |
5 | 22 | Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI? | RogerDearnaley | 3mo | 4 |
5 | 20 | Requirements for a Basin of Attraction to Alignment | RogerDearnaley | 1mo | 6 | Ω
5 | 4 | Alignment has a Basin of Attraction: Beyond the Orthogonality Thesis | RogerDearnaley | 2mo | 15 |
4 | 30 | Interpreting the Learning of Deceit | RogerDearnaley | 3mo | 8 | Ω
2 | 157 | A list of core AI safety problems and how I hope to solve them | davidad | 7mo | 23 | Ω
2 | 114 | AI Alignment Metastrategy | Vanessa Kosoy | 3mo | 11 | Ω
2 | 47 | Safety First: safety before full alignment. The deontic sufficiency hypothesis. | Chipmonk | 3mo | 3 | Ω
2 | 45 | an Evangelion dialogue explaining the QACI alignment plan | Tamsin Leake | 10mo | 15 | Ω
2 | 30 | We have promising alignment plans with low taxes | Seth Herd | 5mo | 9 | Ω
2 | 27 | The (partial) fallacy of dumb superintelligence | Seth Herd | 5mo | 5 | Ω
2 | 11 | Two paths to win the AGI transition | Nathan Helm-Burger | 9mo | 8 |
2 | 8 | Desiderata for an AI | Nathan Helm-Burger | 8mo | 0 |