AI Success Models

Edited by plex last updated 17th Nov 2021

AI Success Models are proposed paths to an existential win via aligned AI. They are (so far) high-level overviews that do not contain all the details, but they present at least a sketch of what a full solution might look like. They can be contrasted with threat models, which are stories about how AI might lead to major problems.

Posts tagged AI Success Models
63 points · Solving the whole AGI control problem, version 0.0001 · Steven Byrnes · 4y · 7 comments
220 points · An overview of 11 proposals for building safe advanced AI · evhub · 5y · 37 comments
81 points · A positive case for how we might succeed at prosaic AI alignment · evhub · 4y · 46 comments
114 points · Conversation with Eliezer: What do you want the system to do? · Orpheus16 · 3y · 38 comments
58 points · Interpretability's Alignment-Solving Potential: Analysis of 7 Scenarios · Evan R. Murphy · 3y · 0 comments
8 points · Any further work on AI Safety Success Stories? · Krieger · 3y · 6 comments
128 points · AI Safety "Success Stories" · Wei Dai · 6y · 27 comments
112 points · Four visions of Transformative AI success · Steven Byrnes · 2y · 22 comments
85 points · Success without dignity: a nearcasting story of avoiding catastrophe by luck · HoldenKarnofsky · 3y · 17 comments
85 points · Various Alignment Strategies (and how likely they are to work) · Logan Zoellner · 3y · 34 comments
80 points · An Open Agency Architecture for Safe Transformative AI · davidad · 3y · 22 comments
60 points · Conditioning Generative Models for Alignment · Jozdien · 3y · 8 comments
59 points · Gradient Descent on the Human Brain · Jozdien, gaspode · 1y · 5 comments
52 points · Against blanket arguments against interpretability · Dmitry Vaintrob · 8mo · 4 comments
32 points · How Would an Utopia-Maximizer Look Like? · Thane Ruthenis · 2y · 23 comments