
"Why Not Just..."

Aug 08, 2022 by johnswentworth

A compendium of rants about alignment proposals, of varying charitability.

1. Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc (johnswentworth, 1y; 133 karma, 52 comments)
2. Godzilla Strategies (johnswentworth, 1y; 144 karma, 66 comments)
3. Rant on Problem Factorization for Alignment (johnswentworth, 10mo; 75 karma, 48 comments)
4. Interpretability/Tool-ness/Alignment/Corrigibility are not Composable (johnswentworth, 10mo; 119 karma, 12 comments)
5. How To Go From Interpretability To Alignment: Just Retarget The Search (johnswentworth, 10mo; 157 karma, 32 comments)
6. Oversight Misses 100% of Thoughts The AI Does Not Think (johnswentworth, 10mo; 90 karma, 49 comments)
7. Human Mimicry Mainly Works When We're Already Close (johnswentworth, 10mo; 71 karma, 16 comments)
8. Worlds Where Iterative Design Fails (johnswentworth, 10mo; 175 karma, 27 comments)
9. Why Not Just... Build Weak AI Tools For AI Alignment Research? (johnswentworth, 3mo; 149 karma, 17 comments)
10. Why Not Just Outsource Alignment Research To An AI? (johnswentworth, 3mo; 123 karma, 45 comments)