Rob Bensinger

Communications lead at MIRI. Unless otherwise indicated, my posts and comments here reflect my own views, and not necessarily my employer's.



Rob B's Shortform Feed

Shared with permission, a Google Doc exchange confirming that Eliezer still finds the arguments for alignment optimism, slower takeoffs, etc. unconvincing:

Daniel Filan: I feel like a bunch of people have shifted a bunch in the type of AI x-risk that worries them (representative phrase is "from Yudkowsky/Bostrom to What Failure Looks Like part 2 part 1") and I still don't totally get why.

Eliezer Yudkowsky: My bitter take:  I tried cutting back on talking to do research; and so people talked a bunch about a different scenario that was nicer to think about, and ended up with their thoughts staying there, because that's what happens if nobody else is arguing them out of it.

That is: this social-space's thought processes are not robust enough against mildly adversarial noise, that trying a bunch of different arguments for something relatively nicer to believe, won't Goodhart up a plausible-to-the-social-space argument for the thing that's nicer to believe.  If you talk people out of one error, somebody else searches around in the space of plausible arguments and finds a new error.  I wasn't fighting a mistaken argument for why AI niceness isn't too intractable and takeoffs won't be too fast; I was fighting an endless generator of those arguments.  If I could have taught people to find the counterarguments themselves, that would have been progress.  I did try that.  It didn't work because the counterargument-generator is one level of abstraction higher, and has to be operated and circumstantially adapted too precisely for the social-space to be argued into it using words.

You can sometimes argue people into beliefs.  It is much harder to argue them into skills.  The negation of Robin Hanson's rosier AI scenario was a belief.  Negating an endless stream of rosy scenarios is a skill.

Caveat: this was a private reply I saw and wanted to share (so people know EY's basic epistemic state, and therefore probably the state of other MIRI leadership). This wasn't an attempt to write an adequate public response to any of the public arguments put forward for alignment optimism or non-fast takeoff, etc., and isn't meant to be a replacement for public, detailed, object-level discussion. (Though I don't know when/if MIRI folks plan to produce a proper response, and if I expected such a response soonish I'd probably have just waited and posted that instead.)

"Existential risk from AI" survey results

One-off, though Carlier, Clarke, and Schuett have a similar survey coming out in the next week.

Predict responses to the "existential risk from AI" survey

I've added six prediction interfaces: two for your own answers to the two Qs, two for your guess at the mean survey respondent answers, and two for your guess at the median respondent answers.

Predict responses to the "existential risk from AI" survey

I think it might be more interesting to sketch what you expect the distribution of views to look like, as opposed to just giving a summary statistic. I can add probability Qs, but I avoided it initially so as not to funnel people into doing the less informative version of this exercise.

This Sunday, 12PM PT: Scott Garrabrant on "Finite Factored Sets"

The basic content is available as a transcript and video of a version of this talk that Scott gave at Topos 12 days ago. It heavily overlaps with the LW talk today.

I'm guessing we'll release the LW talk video sometime too.

Peekskill Lyme Incidence says:

The immunization you can now give your puppy is essentially this original [LYMErix] vaccine, says Stanley Plotkin, a professor and consultant who literally wrote the book on vaccines, and whose son almost died from cardiac Lyme disease.

But we'd need a lot more detail and confirmation than just 'a second-hand claim from an expert that they're "essentially" the same'.

Sabien on "work-life" balance

Duncan posted it in a private chat I was in (and on FB); I asked if he'd cross-post to LW, or if he'd prefer that I post it for him; he voted for the latter. He didn't ask for it to be cross-posted, but also didn't object.

MIRI location optimization (and related topics) discussion

Thanks for the suggestion! Not sure why you can't contact me on LinkedIn, but I check email (and LW PM) more anyway; here's my email.

MIRI location optimization (and related topics) discussion

Thanks for the suggestions! SLC didn't make our top 30 list, but it's maybe top 50, and does seem like it has a lot of nice aspects!

From another comment I left here:

Research Triangle Park is on the list of 14+ cities that Alex classified one tier below our top thirty (as of one month ago): "Moved to backburner; had varying levels of initial interest but also flags; investigated and deprioritized; plausible something could bump them back onto our radar, but seems unlikely." Along with places like Ithaca NY, Salt Lake City UT, Pittsburgh / CMU, Princeton NJ, and Providence RI.
