capybaralet

Comments

Discussion with Eliezer Yudkowsky on AGI interventions

I guess the goal is actually just to get something aligned enough to do a pivotal act.  I don't see, though, why an approach that tries to maintain a sufficient level of alignment (relative to current capabilities) as capabilities scale couldn't work for that.

Comments on Carlsmith's “Is power-seeking AI an existential risk?”

Different views about the fundamental difficulty of inner alignment seem to be a (the?) major driver of differences in views about how likely AI X risk is overall. 

I strongly disagree that inner alignment is the correct crux.  It does seem to be a crux for many people in fact, but I think that's a mistake.  Inner alignment is certainly significant.

But I think optimism about outer alignment and global coordination ("Catch-22 vs. Saving Private Ryan") is a much bigger factor, and optimists are badly wrong on both points here.

capybaralet's Shortform

Are there people in the AI alignment / x-safety community who are still major "Deep Learning skeptics" (in terms of capabilities)?  I know Stuart Russell is... who else?

capybaralet's Shortform

IMO, the outer alignment problem is still the biggest problem in (technical) AI Alignment.  We don't know how to write down -- or learn -- good specifications, and people making strong AIs that optimize for proxies is still what's most likely to get us all killed.

Discussion with Eliezer Yudkowsky on AGI interventions

I'm torn because I mostly agree with Eliezer that things don't look good, and most technical approaches don't seem very promising. 

But the attitude of unmitigated doominess seems counterproductive.
And there are obviously things worth doing and working on, and people are getting on with them.

It seems like Eliezer is implicitly focused on finding an "ultimate solution" to alignment that we can be highly confident solves the problem regardless of how things play out.  But this is not where the expected utility is. The expected utility is mostly in buying time and increasing the probability of success in situations where we are not highly confident that we've solved the problem, but we get lucky.

Ideally we won't end up rolling the dice on unprincipled alignment approaches.  But we probably will.  So let's try and load the dice.  But let's also remember that that's what we're doing.
 

Yoshua Bengio on AI progress, hype and risks

There has been continued progress at about the rate I would've expected -- maybe a bit faster.  I think GPT-3 has helped change people's views somewhat, as has a further appreciation of other social issues of AI.

The Extrapolation Problem

It turns out DNNs also can't necessarily interpolate properly, even:
https://arxiv.org/pdf/2107.08221.pdf

The Extrapolation Problem

ML prof here.
Universal function approximation assumes a bounded domain, which basically means it is about interpolation.
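
To make the bounded-domain point concrete, here is a minimal sketch (my own illustration with an arbitrary target function and scikit-learn's MLPRegressor, not an experiment from the linked paper): a small MLP fit on inputs confined to [-1, 1] tracks the target inside that interval but typically degrades badly outside it.

```python
# Minimal illustration: an MLP fit on a bounded interval usually fails to
# extrapolate beyond it, even for a smooth 1-D target. (Illustrative choices
# of target, architecture, and library -- not taken from the linked paper.)
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Training inputs confined to [-1, 1]
x_train = rng.uniform(-1.0, 1.0, size=(2000, 1))
y_train = np.sin(3.0 * x_train).ravel()

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(x_train, y_train)

# Inside the training domain (interpolation) the fit is close;
# outside it (extrapolation) the error typically blows up.
for x in (0.5, 1.5, 3.0):
    pred = model.predict(np.array([[x]]))[0]
    print(f"x={x:4.1f}  true={np.sin(3.0 * x):+.3f}  predicted={pred:+.3f}")
```

The universal approximation guarantee only speaks to accuracy on a bounded region like the one the training data covers; it says nothing about what the network does far outside it.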
