This is a linkpost for https://banburismus.substack.com/p/safety-as-a-scientific-pursuit
Very much appreciate the link post - I’d been trying to write a summary/contextualisation for LW and this is a much better one than I’d come up with.
I’d be very grateful for the LW community’s thoughts (especially any pushback). I expect this will be the source of the strongest counterarguments.
| Assumptions are | Continuous | Discontinuous |
|---|---|---|
| Inductive[1] | Most of ML | Not sure? Maybe Gould, physicists |
| Deductive | Christiano, Shulman, OpenPhil | Yudkowsky, most of MIRI |
I like it better than "rationalist" and "empiricist" ↩︎
Thanks! I really like "inductive" vs "deductive" and would probably have used them if I'd thought of them.
Tom McGrath, until recently a Research Scientist at DeepMind, has written up why he's not excited about theoretical AI safety. It's similar to Aaronson and Barak's "reform" alignment. It's argued in good faith and is pretty constructive.
The key provocation for you is probably his view on the safety implications of open-sourcing models.
The author knows the field has already moved in this direction, and is trying to establish common knowledge of that shift and encourage more of it. He also knows that it's a sore point to imply that modern rationalists aren't empiricists. For your blood pressure, I recommend mentally prepending "AI-" to every mention of rationalism in the post.
My addition: take the Drexler grey goo story. It's usually told as an own (haha, stupid pessimistic doomers), but I think it reflects very well on Eric Drexler:
1. We identify a possible problem
2. We raise the alarm and do more research
3. We get evidence and update
This seems like the optimal policy to me. McGrath is saying "you have to actually do (3)!"
See also another candidate explanation for intractable disagreement here.