while it’s easy to agree with some abstract version of “upgrade” (as in try to channel AI capability gains into our ability to align them), the main bottleneck to physical upgrading is the speed difference between silicon and wet carbon: https://www.lesswrong.com/posts/Ccsx339LE9Jhoii9K/slow-motion-videos-as-ai-risk-intuition-pumps
yup, i tried invoking church-turing once, too. worked about as well as you’d expect :)
looks great, thanks for doing this!
one question i get every once in a while and wish i had a canonical answer to is (probably can be worded more pithily):
"humans have always thought their minds are equivalent to whatever their latest technological achievement is -- eg, see the steam engine. computers are just the latest fad that we currently compare our minds to, so it's silly to think they somehow pose a threat. move on, nothing to see here."
note that the canonical answer has to work for people whose ontology includes neither "computation" nor "simulation". they have seen increasingly universal smartphones and increasingly realistic computer games (things i've been gesturing at in my poor attempts to answer) but have no idea how they work.
the potentially enormous speed difference (https://www.lesswrong.com/posts/Ccsx339LE9Jhoii9K/slow-motion-videos-as-ai-risk-intuition-pumps) will almost certainly be an effective communications barrier between humans and AI. there’s a wonderful scene of a negotiation between AIs and humans in william hertling’s “A.I. apocalypse” that highlights this.
i agree that there's a 3rd alternative future that the post does not consider (unless i missed it!):
3. markets remain in an inadequate equilibrium until the end of time, because those participants (like myself!) who consider short timelines remain too small a minority to "call the bluff".
see the big short for a dramatic depiction of such a situation.
great post otherwise. upvoted.
yeah, this seems to be the crux: what CEV will prescribe spending the altruistic (reciprocal cooperation) budget on. my intuition continues to insist that purchasing the original star systems from UFAIs is pretty high on the shopping list, but i can see arguments (including a few you gave above) against that.
oh, btw, one sad failure mode would be getting clipped by a proto-UFAI that’s too stupid to realise it’s in a multi-agent environment or something.
ETA: and, tbc, just like interstice points out below, my “us/me” label casts a wider net than “us in this particular everett branch where things look particularly bleak”.
roger. i think (and my model of you agrees) that this discussion bottoms out in speculating about what CEV (or equivalent) would prescribe.
my own intuition (as somewhat supported by the moral progress/moral circle expansion in our culture) is that it will have a nonzero component of “try to help out the fellow humans/biologicals/evolved minds/conscious minds/agents with diminishing utility function if not too expensive, and especially if they would do the same in your position”.
yeah, as far as i can currently tell (and influence), we’re totally going to use a sizeable fraction of FAI-worlds to help out the less fortunate ones. or perhaps implement a more general strategy, like a mutual insurance pact of evolved minds (MIPEM).
this, indeed, assumes that human CEV has diminishing returns to resources, but (unlike nate in the sibling comment!) i’d be shocked if that wasn’t true.
sure, this is always a consideration. i'd even claim that the "wait.. what about the negative side effects?" question is a potential expected value spoiler for pretty much all longtermist interventions (because they often aim for effects that are multiple causal steps down the road), and as such not really specific to software.
having seen the “kitchen side” of the letter effort, i endorse almost all of zvi’s points here. one thing i’d add is that one of my hopes in urging the letter along was to create common knowledge that a lot of people (we’re going to get to 100k signatures, it looks like) are afraid of the thing that comes after GPT4. like i am.
thanks to everyone who signed.
EDIT: basically this: https://twitter.com/andreas212nyc/status/1641795173972672512