CEO at Redwood Research.
AI safety is a highly collaborative field--almost all the points I make were either explained to me by someone else, or developed in conversation with other people. I'm saying this here because it would feel repetitive to say "these ideas were developed in collaboration with various people" in all my comments, but I want to have it on the record that the ideas I present were almost entirely not developed by me in isolation.
Please contact me via email (bshlegeris@gmail.com) instead of messaging me on LessWrong.
If we are ever arguing on LessWrong and you feel like it's kind of heated and would go better if we just talked about it verbally, please feel free to contact me and I'll probably be willing to call to discuss briefly.
@ryan_greenblatt and I are going to record another podcast together. We'd love to hear topics that you'd like us to discuss. (The questions people proposed last time are here, for reference.)
(I haven't spelled out what I think about decision theory publicly, so presumably this won't be completely informative to you. A quick summary is that I think that questions related to decision theory and anthropics are very important for how the future goes, and relevant to some aspects of the earliest risks from misaligned power-seeking AI.)
My impression is that people at AI companies who have similar opinions to me about risk from misalignment tend to also have pretty similar opinions on decision theory etc. AI company staff whose opinions differ more from mine tend not to have thought much about decision theory etc.
Fwiw I’m not sure this is right; I think that a lot of questions about decision theory become pretty obvious once you start thinking about digital minds and simulations. And my guess is that a lot of FDT-like ideas would have become popular among people like the ones I work with, once people were thinking about those questions.
I really enjoyed speaking! The students had lots of great questions.
If you like to think about this kind of thing, I recommend actually reading Schelling's "The Strategy of Conflict"; I thought it had a bunch of interesting points that haven't made their way into the water supply.
I think I disagree—doing research like this (especially several such projects) is really helpful for getting hired!
Yeah, I think control is unlikely to work for galaxy-brained superintelligences. It's unclear how superintelligent they have to be before control is totally unworkable.
Were you suggesting something other than "remove the parentheses"? Or did it seem like I was thinking about it in a confused way? Not sure which direction you thought the mistake was in.
I think that it is worth conceptually distinguishing AIs that are uncontrollable from AIs that are able to build uncontrollable AIs, because the ways you should handle those two kinds of AI are importantly different.
Note that temperature does not have this property! Objects generically have different heat capacities at different temperatures.
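To sketch the point (assuming the property in question is something like energy scaling linearly with temperature): with a temperature-dependent heat capacity $C(T)$, the energy needed to heat an object from $T_1$ to $T_2$ is

$$\Delta E = \int_{T_1}^{T_2} C(T)\,dT,$$

which only reduces to the familiar $C \cdot (T_2 - T_1)$ in the special case where $C$ is constant over that temperature range.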