"Why Not Just..." — LessWrong
A compendium of rants about alignment proposals, of varying charitability.
Posts in this sequence (title · author · age · score · Ω score, where shown):

- Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc · johnswentworth · 4y · 163 · Ω 56
- Godzilla Strategies · johnswentworth · 4y · 174 · Ω 72
- Rant on Problem Factorization for Alignment · johnswentworth · 4y · 111 · Ω 53
- Interpretability/Tool-ness/Alignment/Corrigibility are not Composable · johnswentworth · 4y · 150 · Ω 13
- How To Go From Interpretability To Alignment: Just Retarget The Search · johnswentworth · 4y · 211 · Ω 34
- Oversight Misses 100% of Thoughts The AI Does Not Think · johnswentworth · 4y · 126 · Ω 49
- Human Mimicry Mainly Works When We’re Already Close · johnswentworth · 4y · 83 · Ω 16
- Worlds Where Iterative Design Fails · johnswentworth · 4y · 229 · Ω 31
- Why Not Just... Build Weak AI Tools For AI Alignment Research? · johnswentworth · 3y · 188 · Ω 18
- Why Not Just Outsource Alignment Research To An AI? · johnswentworth · 3y · 160 · Ω 50
- OpenAI Launches Superalignment Taskforce · Zvi · 3y · 150 · 40
- Why Not Just Train For Interpretability? · johnswentworth · 5mo · 56 · 12