I guess the goal is actually just to get something aligned enough to do a pivotal act. I don't see, though, why an approach that tries to maintain a relatively-sufficient level of alignment (relative to current capabilities) as capabilities scale couldn't work for that.
Different views about the fundamental difficulty of inner alignment seem to be a (the?) major driver of differences in views about how likely AI x-risk is overall.
I strongly disagree with inner alignment being the correct crux. It does seem to be true that this is in fact a crux for many people, but I think this is a mistake. It is certainly significant. But I think optimism about outer alignment and global coordination ("Catch-22 vs. Saving Private Ryan") is a much bigger factor, and optimists are badly wrong on both points here.
Are there people in the AI alignment / x-safety community who are still major "Deep Learning skeptics" (in terms of capabilities)? I know Stuart Russell is... who else?
IMO, the outer alignment problem is still the biggest problem in (technical) AI Alignment. We don't know how to write down -- or learn -- good specifications, and people making strong AIs that optimize for proxies is still what's most likely to get us all killed.
I'm torn because I mostly agree with Eliezer that things don't look good, and most technical approaches don't seem very promising. But the attitude of unmitigated doominess seems counter-productive. And there are obviously things worth doing and working on, and people getting on with it.

It seems like Eliezer is implicitly focused on finding an "ultimate solution" to alignment that we can be highly confident solves the problem regardless of how things play out. But this is not where the expected utility is. The expected utility is mostly in buying time and increasing the probability of success in situations where we are not highly confident that we've solved the problem, but we get lucky.

Ideally we won't end up rolling the dice on unprincipled alignment approaches. But we probably will. So let's try to load the dice. But let's also remember that that's what we're doing.
There has been continued progress at about the rate I would've expected -- maybe a bit faster. I think GPT-3 has helped change people's views somewhat, as has further appreciation of other social issues of AI.
Turns out DNNs also can't even necessarily interpolate properly: https://arxiv.org/pdf/2107.08221.pdf
ML prof here. Universal function approximation assumes a bounded domain, which basically means it is about interpolation.
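A toy numpy sketch of the point (my own illustration, not from any cited paper): a one-hidden-layer ReLU "network" with hand-set weights can interpolate f(x) = x² closely on the bounded domain [0, 1], but since a ReLU net is piecewise linear, it extrapolates linearly outside that domain and the error blows up. The knots and weights below are chosen by hand for the example.

```python
import numpy as np

# One-hidden-layer ReLU net: g(x) = sum_i w_i * relu(x - t_i).
# Weights are set by hand so g piecewise-linearly interpolates x^2
# at the grid points 0, 0.25, 0.5, 0.75, 1.0.
relu = lambda z: np.maximum(z, 0.0)

knots = np.array([0.0, 0.25, 0.5, 0.75])   # hidden-unit biases (kink locations)
weights = np.array([0.25, 0.5, 0.5, 0.5])  # output weights (slope changes)

def net(x):
    x = np.asarray(x, dtype=float)
    return relu(x[..., None] - knots) @ weights

xs = np.linspace(0.0, 1.0, 101)
print(np.max(np.abs(net(xs) - xs**2)))  # small on [0, 1]: interpolation works
print(abs(net(3.0) - 9.0))              # large at x = 3: linear extrapolation fails
```

Inside [0, 1] the worst-case error is below 0.02; at x = 3 the net outputs 4.5 instead of 9, because outside the training domain it just continues with its last slope. Universal approximation guarantees say nothing about this extrapolation regime.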
The link doesn't work for me.
Interesting! But I downvoted since it's a comment, not an answer.