Rob Bensinger

Communications lead at MIRI. Unless otherwise indicated, my posts and comments here reflect my own views, and not necessarily my employer's.

Sequences

Late 2021 MIRI Conversations

Comments

Christiano, Cotra, and Yudkowsky on AI progress

Transcript error fixed -- the exchange that previously read

[Yudkowsky][17:40]  

I expect it to go away before the end of days

but with there having been a big architectural innovation, not Stack More Layers

[Christiano][17:40]  

I expect it to go away before the end of days

but with there having been a big architectural innovation, not Stack More Layers

[Yudkowsky][17:40]  

if you name 5 possible architectural innovations I can call them small or large

should be

[Yudkowsky][17:40]  

I expect it to go away before the end of days

but with there having been a big architectural innovation, not Stack More Layers

[Christiano][17:40]  

yeah

whereas I expect layer stacking + maybe changing loss (since logprob is too noisy) is sufficient

[Yudkowsky][17:40]  

if you name 5 possible architectural innovations I can call them small or large

Yudkowsky and Christiano discuss "Takeoff Speeds"

It feels like this bet would look a lot better if it were about something that you predict at well over 50% (with people in Paul's camp still maintaining less than 50%).

My model of Eliezer may be wrong, but I'd guess that this isn't a domain where he has many over-50% predictions of novel events at all? See also 'I don't necessarily expect self-driving cars before the apocalypse'.

My Eliezer-model has a flatter prior over what might happen, which therefore includes stuff like 'maybe we'll make insane progress on theorem-proving (or whatever) out of the blue'. Again, I may be wrong, but my intuition is that you're Paul-omorphizing Eliezer when you assume that >16% probability of huge progress in X by year Y implies >50% probability of smaller-but-meaningful progress in X by year Y.

Christiano, Cotra, and Yudkowsky on AI progress

One may ask: why aren't elephants making rockets and computers yet?

But one may ask the same question about any uncontacted human tribe.

Seems more surprising for elephants, by default: elephants have apparently had similarly large brains for about 20 million years, which is far more time than uncontacted human tribes have had to build rockets. (~100x as long as anatomically modern humans have existed at all, for example.)
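(As a rough sanity check on the "~100x" figure, assuming the commonly cited ~200,000-year estimate for anatomically modern humans:

$$\frac{2\times 10^{7}\ \text{yr (large-brained elephants)}}{2\times 10^{5}\ \text{yr (anatomically modern humans)}} = 10^{2} = 100$$

The exact ratio shifts a bit if you use a 300,000-year estimate for Homo sapiens, but the order of magnitude stands.)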

Christiano, Cotra, and Yudkowsky on AI progress

Maybe I'm wrong about her deriving this from the Caplan bet? Ajeya hasn't actually confirmed that; it was just an inference I drew. I'll poke her to double-check.

Christiano, Cotra, and Yudkowsky on AI progress

I think Ajeya is inferring this from Eliezer's 2017 bet with Bryan Caplan. The bet was jokey and therefore (IMO) doesn't deserve much weight, though Eliezer comments that it's maybe not totally unrelated to timelines he'd reflectively endorse:

[T]he generator of this bet does not necessarily represent a strong epistemic stance on my part, which seems important to emphasize. But I suppose one might draw conclusions from the fact that, when I was humorously imagining what sort of benefit I could get from exploiting this amazing phenomenon, my System 1 thought that having the world not end before 2030 seemed like the most I could reasonably ask.

In general, my (maybe-partly-mistaken) Eliezer-model...

  • thinks he knows very little about timelines (per the qualitative reasoning in There's No Fire Alarm For AGI and in Nate's recent post -- though not necessarily endorsing Nate's quantitative probabilities);
  • and is wary of trying to turn 'I don't know' into a solid, stable number for this kind of question (cf. When (Not) To Use Probabilities);
  • but recognizes that his behavior at any given time, insofar as it is coherent, must reflect some implicit probabilities. Quoting Eliezer back in 2016:

[... T]imelines are the hardest part of AGI issues to forecast, by which I mean that if you ask me for a specific year, I throw up my hands and say “Not only do I not know, I make the much stronger statement that nobody else has good knowledge either.” Fermi said that positive-net-energy from nuclear power wouldn’t be possible for 50 years, two years before he oversaw the construction of the first pile of uranium bricks to go critical. The way these things work is that they look fifty years off to the slightly skeptical, and ten years later, they still look fifty years off, and then suddenly there’s a breakthrough and they look five years off, at which point they’re actually 2 to 20 years off.

If you hold a gun to my head and say “Infer your probability distribution from your own actions, you self-proclaimed Bayesian” then I think I seem to be planning for a time horizon between 8 and 40 years, but some of that because there’s very little I think I can do in less than 8 years, and, you know, if it takes longer than 40 years there’ll probably be some replanning to do anyway over that time period.

And then how *long* takeoff takes past that point is a separate issue, one that doesn’t correlate all that much to how long it took to start takeoff. [...]

Yudkowsky and Christiano discuss "Takeoff Speeds"

(... Admittedly, you read fast enough that my 'skimming' is your 'reading'. 😶)

Yudkowsky and Christiano discuss "Takeoff Speeds"

Yeah, even I wasn't sure you'd read those three things, Eliezer, though I knew you'd at least glanced over 'Takeoff Speeds' and 'Biological Anchors' enough to form opinions when they came out. :)

Yudkowsky and Christiano discuss "Takeoff Speeds"

I grimly predict that the effect of this dialogue on the community will be polarization

Beware of self-fulfilling prophecies (and other premature meta)! If both sides in a dispute expect the other side to just entrench, then they're less likely to invest the effort to try to bridge the gap.

This very comment section is one of the main things that will determine the community's reaction, and diverting our focus to 'what will our reaction be?' before we've talked about the object-level claims can prematurely lock in a certain reaction.

(That said, I think you're doing a useful anti-polarization thing here, by showing empathy for people you disagree with, and showing willingness to criticize people you agree with. I don't at all dislike this comment overall; I just want to caution against giving up on something before we've really tried. This is the first proper MIRI response to Paul's takeoff post, and should be a pretty big update for a lot of people -- I don't think people were even universally aware that Eliezer endorses hard takeoff anymore, much less aware of his reasoning.)

Discussion with Eliezer Yudkowsky on AGI interventions

Thank you, Tapatakt! :)

I feel like https://www.lesswrong.com/s/n945eovrA3oDueqtq could be even more useful to have in foreign languages, though that's a larger project.

A general thought about translations: If I were translating this stuff, I'd plausibly go down a list like this, translating later stuff once the earlier stuff was covered:

  1. Rationality: A-Z
  2. Inadequate Equilibria (+ Hero Licensing)
  3. Scout Mindset (if there aren't legal obstacles to distribution)
  4. Superintelligence (if there aren't legal obstacles to distribution)
  5. There's No Fire Alarm for Artificial General Intelligence
  6. AI Alignment: Why It's Hard, and Where to Start (and/or Ensuring Smarter-Than-Human Intelligence Has a Positive Outcome)
  7. Security Mindset and Ordinary Paranoia + Security Mindset and the Logistic Success Curve
  8. The Rocket Alignment Problem
  9. Selections from Arbital's AI alignment "explore" page (unfortunately not well-organized or fully edited)
  10. Late 2021 MIRI Conversations