wdmacaskill

Comments
A Reply to MacAskill on "If Anyone Builds It, Everyone Dies"
wdmacaskill · 7d

Hey Rob, thanks for writing this, and sorry for the slow response. In brief, I think you do misunderstand my views, in ways that Buck, Ryan and Habryka point out. I’ll clarify a little more.

Some areas where the criticism seems reasonable:

  • I think it’s fair to say that I worded the compute governance sentence poorly, in ways Habryka clarified.
  • I’m somewhat sympathetic to the criticism that there was a “missing mood” (cf. e.g. here and here), given that a lot of people won’t know my broader views. I’m very happy to say: “I definitely think it will be extremely valuable to have the option to slow down AI development in the future,” as well as “the current situation is f-ing crazy”. (Though there was also a further vibe on Twitter of “we should be uniting rather than disagreeing”, which I think is a bad road to go down.)

Now, clarifying my position:

Here’s what I take IABI to be arguing (written by GPT5-Pro, on the basis of a PDF, in an attempt not to infuse my own biases):

The book argues that building a superhuman AI would be predictably fatal for humanity and therefore urges an immediate, globally enforced halt to AI escalation—consolidating and monitoring compute under treaty, outlawing capability‑enabling research, and, if necessary, neutralizing rogue datacenters—while mobilizing journalists and ordinary citizens to press leaders to act.

And what readers will think the book is about (again written by GPT5-Pro):

A “shut‑it‑all‑down‑now” manifesto warning that any superintelligent AI will wipe us out unless governments ban frontier AI and are prepared to sabotage or bomb rogue datacenters—so the public and the press must demand it.

The core message of the book is not merely that “AI x-risk is worryingly high” or that “stopping or slowing AI development would be one good strategy among many.” I wouldn’t disagree with the former at all, and my disagreement with the latter would be more about the details.

Here’s a different perspective:

AI takeover x-risk is high, but not extremely high (e.g. 1%-40%). The right response is an “everything and the kitchen sink” approach — there are loads of things we can do that all help a bit in expectation (both technical and governance, including mechanisms to slow the intelligence explosion), many of which are easy wins, and right now we should be pushing on most of them. 

This is my overall strategic picture. If the book had argued for that (or even just the “kitchen sink” approach part) then I might have disagreed with the arguments, but I wouldn’t feel, “man, people will come away from this with a bad strategic picture”.

(I think the whole strategic picture would include:

There are a lot of other existential-level challenges, too (including human coups / concentration of power), and ideally the best strategies for reducing AI takeover risk shouldn’t aggravate these other risks.

But I think that’s fine not to discuss in a book focused on AI takeover risk.)

This is also the broad strategic picture, as I understand it, of e.g. Carl, Paul, Ryan, Buck. It’s true that I’m more optimistic than they are (on the 80k podcast I say 1-10% range for AI x-risk, though it depends on what exactly you mean by that) but I don’t feel deep worldview disagreement with them.

With that in mind, some reasons why I think the promotion of the Y&S view could be meaningfully bad:

  • If it means more people don’t pursue the better strategy of focusing on the easier wins.
  • Or they end up making the wrong tradeoffs. (e.g. intense centralisation of AI development in a way that makes misaligned human takeover risk more likely)
  • Or people might lapse into defeatism: “Ok we’re doomed, then: a decades-long international ban will never happen, so it’s pointless to work on AI x-risk.” (We already see this reaction to climate change, given doomerist messaging there. To be clear, I don’t think that sort of effect should be a reason for being misleading about one’s views.)

Overall, I feel pretty agnostic on whether Y&S shouting their message is on net good for the world. 

I think I’m particularly triggered by all this because of a conversation I had last year with someone who takes AI takeover risk very seriously and could double AI safety philanthropy if they wanted to. I was arguing they should start funding AI safety, but the conversation was a total misfire because they conflated “AI safety” with “stop AI development”: their view was that that will never happen, and they were actively annoyed that they were hearing what they considered to be such a dumb idea. My guess was that EY’s TIME article was a big factor there.

Then, to be clear, here are some cases where you misunderstand me, focusing on the most severe misunderstandings:

he's more or less calling on governments to sit back and let it happen

I really don’t think that!

He thinks feedback loops like “AIs do AI capabilities research” won’t accelerate us too much first.

I vibeswise disagree, because I expect massive acceleration and I think that’s *the* key challenge: see e.g. PrepIE and the 80k podcast.

But there is a grain of truth in that my best guess is a more muted software-only intelligence explosion than some others predict. E.g. a best guess where, once AI fully automates AI R&D, we get 3-5 years of progress in 1 year (at current rates), rather than 10+ years’ worth, or rather than godlike superintelligence. This is the best analysis I know of on the topic. This might well be the cause of much of the difference in optimism between me and e.g. Carl. 

(Note I still take the much larger software explosions very seriously (e.g. 10%-20% probability). And I could totally change my mind on this — the issue feels very live and open to me.)

Will thinks government compute monitoring is a bad idea

Definitely disagree with this one! In general, society having more options and levers just seems great to me. 

he's sufficiently optimistic that the people who build superintelligence will wield that enormous power wisely and well, and won't fall into any traps that fuck up the future

Definitely disagree! 

Like, my whole bag is that I expect us to fuck up the future even if alignment is fine!! (e.g. Better Futures)

He's proposing that humanity put all of its eggs in this one basket

Definitely disagree! From my POV, it’s the IABI perspective that is closer to putting all the eggs in one basket, rather than advocating for the kitchen sink approach.

It seems hard to be more than 90% confident in the whole conjunction, in which case there's a double-digit chance that the everyone-races-to-build-superintelligence plan brings the world to ruin.

But “10% chance of ruin” is not what Y&S, or the book, is arguing for, and isn’t what I was arguing against. (You could logically have the view of “10% chance of ruin and the only viable way to bring that down is a global moratorium”, but I don’t know anyone who has that view.)

a conclusion like “things will be totally fine as long as AI capabilities trendlines don’t change.”

Also not true, though I am more optimistic than many on the takeover side of things. 

to advocate that we race to build it as fast as possible

Also not true - e.g. I write here about the need to slow the intelligence explosion.

There’s a grain of truth in that I’m pretty agnostic on whether speeding up or slowing down AI development right now is good or bad. I flip-flop on it, but I currently lean towards thinking that speeding up at the moment is mildly good, for a few reasons: it stretches out the IE by bringing it forwards, it means there’s more of a compute constraint and so the software-only IE doesn’t go as far, and it means society wakes up earlier, giving more time to invest in alignment of more-powerful AI.

(I think if we’d gotten to human-level algorithmic efficiency at the Dartmouth conference, that would have been good, as compute build-out is intrinsically slower and more controllable than software progress (until we get nanotech). And if we’d scaled up compute + AI to 10% of the global economy decades ago, and maintained it at that level, that also would have been good, as then the frontier pace would be at the rate of compute-constrained algorithmic progress, rather than the rate we’re getting at the moment from both algorithmic progress AND compute scale-up.)

In general, I think that how the IE happens and is governed is a much bigger deal than when it happens. 

like, I still associate Will to some degree with the past version of himself who was mostly unconcerned about near-term catastrophes and thought EA's mission should be to slowly nudge long-term social trends.

Which Will-version are you thinking of? Even in DGB I wrote about preventing near-term catastrophes as a top cause area.

I think Will was being unvirtuously cagey or spin-y about his views

Again really not intended! I think I’ve been clear about my views elsewhere (see previous links).

 

Ok, that’s all just spelling out my views. Going back, briefly, to the review: I said I was “disappointed” in the book — that was mainly because I thought that this was Y&S’s chance to give the strongest version of their arguments (though I understood they’d be simplified or streamlined), and the arguments I read were worse than I expected (even though I didn’t expect to find them terribly convincing).

Regarding your object-level responses to my arguments — I don’t think any of them really support the idea that alignment is so hard that AI takeover x-risk is overwhelmingly likely, or that the only viable response is to delay AI development by decades.  E.g.

As Joe Collman notes, a common straw version of the If Anyone Builds It, Everyone Dies thesis is that "existing AIs are so dissimilar" to a superintelligence that "any work we do now is irrelevant," when the actual view is that it's insufficient, not irrelevant.

But if it’s a matter of “insufficiency”, the question is how one can be so confident that any work we do now (including with ~AGI assistance, including if we’ve bought extra time via control measures and/or deals with misaligned ~AGIs) is insufficient, such that the only thing that makes a meaningful difference to x-risk, even in expectation, is a global moratorium. And I’m still not seeing the case for that. 

(I think I’m unlikely to respond further, but thanks again for the engagement.)

How hard to achieve is eutopia?
wdmacaskill · 2mo

Thanks! I think these questions do matter, though it's certainly exploratory work, and at the moment tractability is still unclear. I talk about some of the practical upshots in the final essay.

Should we aim for flourishing over mere survival? The Better Futures series.
wdmacaskill · 2mo

Thanks. On the EA Forum, Ben West points out the clarification: I'm not assuming x-risk is lower than 10%; in my illustrative example I suggest 20%. (My own views are a little lower than that, but not an OOM lower, especially given that this % is about locking into near-0 value futures, not just extinction.) I wasn't meaning to place a lot of weight on the superforecasters' estimates.

Actually, because I ultimately argue that Flourishing is only 10% or less, for this part of the argument to work (i.e., for Flourishing to be greater in scale than Surviving), I only need x-risk this century to be less than 90%! (Though the argument gets a lot weaker the higher your p(doom).)

The point is that we're closer to the top of the x-axis than we are to the top of the y-axis.
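To spell out the arithmetic (a rough sketch, assuming the simple two-factor decomposition from the series, where the expected value of the future is Surviving $S$ times Flourishing $F$, and the "scale" of each factor is the value gained by pushing it to 100% while holding the other fixed):

$$
\underbrace{(1-S)\,F}_{\text{scale of Surviving}} \;<\; \underbrace{S\,(1-F)}_{\text{scale of Flourishing}} \;\iff\; S > F .
$$

With $F \le 0.1$, that inequality holds whenever $S > 0.1$, i.e. whenever x-risk this century is below 90%.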


Preparing for the Intelligence Explosion
wdmacaskill · 7mo

Ah, by the "software feedback loop" I mean: "At the point of time at which AI has automated AI R&D, does a doubling of cognitive effort result in more than a doubling of output? If yes, there's a software feedback loop - you get (for a time, at least) accelerating rates of algorithmic efficiency progress, rather than just a one-off gain from automation." 

I see now why you could understand "RSI" to mean "AI improves itself at all over time". But even so, the claim would still hold - even if (implausibly) AI gets no smarter than human-level, you'd still get accelerated tech development, because the quantity of AI research effort would grow at a much faster rate than the quantity of human research effort.
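To illustrate the distinction with a toy model (purely illustrative, with a made-up returns parameter r for how output scales with cognitive effort; not meant as a real model of AI R&D):

```python
# Toy model (illustrative only): research effort is proportional to current
# algorithmic efficiency S, and each step's improvement scales as effort**r.
# r > 1 corresponds to "doubling cognitive effort more than doubles output".

def simulate(r, steps=20, s0=1.0, k=0.05):
    """Return the per-step proportional growth rates of S under S += k * S**r."""
    s = s0
    rates = []
    for _ in range(steps):
        ds = k * s ** r
        rates.append(ds / s)
        s += ds
    return rates

for r in (0.7, 1.0, 1.3):
    rates = simulate(r)
    if rates[-1] > rates[0] + 1e-9:
        trend = "accelerating"   # feedback loop: growth rate keeps rising
    elif rates[-1] < rates[0] - 1e-9:
        trend = "decelerating"   # progress continues, but as a fading one-off boost
    else:
        trend = "steady"
    print(f"r={r}: growth rate {rates[0]:.3f} -> {rates[-1]:.3f} per step ({trend})")
```

Only the r > 1 case shows the accelerating, feedback-loop regime; with r ≤ 1 progress continues but the growth rate doesn't compound.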

Preparing for the Intelligence Explosion
wdmacaskill · 7mo

There's definitely a new trend towards custom-website essays. Forethought is a website for lots of research content, though (like Epoch), not just PrepIE.

And I don't think it's because people are getting more productive from reasoning models - AI was helpful for PrepIE, but more like a 10-20% productivity boost than a 100% boost, and I don't think AI was used much for SA, either.

Preparing for the Intelligence Explosion
wdmacaskill · 7mo

Thanks - appreciate that! It comes up a little differently for me, but it's still an issue - we've asked the devs to fix it.

Donating to MIRI vs. FHI vs. CEA vs. CFAR
wdmacaskill · 12y

Argh! Original post didn't go through (probably my fault), so this will be shorter than it should be:

First point:

I know very little about CEA, and a brief check of their website leaves me a little unclear on why Luke recommends them, aside from the fact that they apparently work closely with FHI.

CEA = Giving What We Can, 80,000 Hours, and a bit of other stuff

Reason: donations to CEA predictably increase the size and strength of the EA community, a good proportion of whom take long-run considerations very seriously and will donate to / work for FHI/MIRI, or otherwise pursue careers with the aim of extinction risk mitigation. It's plausible that $1 to CEA generates significantly more than $1's worth of x-risk-value [note: I'm a trustee and founder of CEA].

Second point:

Don't forget CSER. My view is that they are even higher-impact than MIRI or FHI (though I'd defer to Sean_o_h if he disagreed). Reason: marginal donations will be used to fund program management + grantwriting, which would turn ~$70k into a significant chance of ~$1-$10mn, and launch what I think might become one of the most important research institutions in the world. They have all the background (high profile people on the board; an already written previous grant proposal that very narrowly missed out on being successful). High leverage!

Donating to MIRI vs. FHI vs. CEA vs. CFAR
wdmacaskill · 12y

CEA and CFAR don't do anything, to my knowledge, that would increase these odds, except in exceedingly indirect ways.

People from CEA, in collaboration with FHI, have been meeting with people in the UK government, and are producing policy briefs on unprecedented risks from new technologies, including AI (the first brief will go on the FHI website in the near future). These meetings arose as a result of GWWC media attention. CEA's most recent hire, Owen Cotton-Barratt, will be helping with this work.

'Effective Altruism' as utilitarian equivocation.
wdmacaskill · 12y

your account of effective altruism seems rather different from Will's: "Maybe you want to do other things effectively, but then it's not effective altruism". This sort of mixed messaging is exactly what I was objecting too.

I think you've revised the post since you initially wrote it? If so, you might want to highlight that in the italics at the start, as otherwise it makes some of the comments look weirdly off-base. In particular, I took the initial post to aim at the conclusion:

  1. EA is utilitarianism in disguise

which I think is demonstrably false.

But now the post reads more like the main conclusion is:

  1. EA is vague on a crucial issue, which is whether the effective pursuit of non-welfarist goods counts as effective altruism.

which is a much more reasonable thing to say.
'Effective Altruism' as utilitarian equivocation.
wdmacaskill · 12y

I think the simple answer is that "effective altruism" is a vague term. I gave you what I thought was the best way of making it precise. Weeatquince and Luke Muelhauser wanted to make it precise in a different way. We could have a debate about which is the more useful precisification, but I don't think here is the right place for that.

On either way of making the term precise, though, EA is clearly not trying to be the whole of morality, or to give any one very specific conception of morality. It doesn't make a claim about side-constraints; it doesn't make a claim about whether doing good is supererogatory or obligatory; it doesn't make a claim about the nature of welfare. EA is a broad tent, and deliberately so: very many different ethical perspectives will agree, for example, that it's important to find out which charities do the most to improve the welfare of those living in extreme poverty (as measured by QALYs etc.), and then to encourage people to give to those charities. If so, then we've got an important activity that people of very many different ethical backgrounds can get behind - which is great!

Posts

19 · How to make the future better (other than by reducing extinction risk) · 2mo · 1 comment
41 · The trajectory of the future could soon get set in stone · 2mo · 2 comments
22 · Will morally motivated actors steer us towards a near-best future? · 2mo · 0 comments
22 · How hard to achieve is eutopia? · 2mo · 0 comments
17 · How hard to achieve is eutopia? · 2mo · 2 comments
63 · Should we aim for flourishing over mere survival? The Better Futures series. · 2mo · 8 comments
40 · Three Types of Intelligence Explosion (Ω) · 7mo · 8 comments
45 · Intelsat as a Model for International AGI Governance · 7mo · 0 comments
20 · Forethought: a new AI macrostrategy group · 7mo · 0 comments
78 · Preparing for the Intelligence Explosion · 7mo · 17 comments