Aligned AI Proposals
Posts tagged Aligned AI Proposals
Most Relevant
6 | 55 | A "Bitter Lesson" Approach to Aligning AGI and ASI | Ω | RogerDearnaley | 2mo | 39
5 | 64 | How to Control an LLM's Behavior (why my P(DOOM) went down) | Ω | RogerDearnaley | 10mo | 30
5 | 47 | Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor | Ω | RogerDearnaley | 8mo | 8
5 | 38 | Requirements for a Basin of Attraction to Alignment | Ω | RogerDearnaley | 7mo | 9
5 | 37 | Striking Implications for Learning Theory, Interpretability — and Safety? | RogerDearnaley | 8mo | 4
5 | 34 | Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI? | RogerDearnaley | 8mo | 4
5 | 13 | Alignment has a Basin of Attraction: Beyond the Orthogonality Thesis | RogerDearnaley | 8mo | 15
4 | 30 | Interpreting the Learning of Deceit | Ω | RogerDearnaley | 9mo | 10
2 | 163 | A list of core AI safety problems and how I hope to solve them | Ω | davidad | 1y | 29
2 | 114 | AI Alignment Metastrategy | Ω | Vanessa Kosoy | 9mo | 13
2 | 51 | an Evangelion dialogue explaining the QACI alignment plan | Ω | Tamsin Leake | 1y | 15
2 | 47 | Safety First: safety before full alignment. The deontic sufficiency hypothesis. | Ω | Chipmonk | 8mo | 3
2 | 32 | We have promising alignment plans with low taxes | Ω | Seth Herd | 10mo | 9
2 | 29 | The (partial) fallacy of dumb superintelligence | Ω | Seth Herd | 1y | 5
2 | 11 | Two paths to win the AGI transition | Nathan Helm-Burger | 1y | 8