Staying Sane While Taking Ideas Seriously


My ML Scaling bibliography

Is this meant to be a linkpost? I don't see any content except for the comment above.

Eutopia is Scary

The subconscious mind knows exactly what it's flinching away from considering. :-)

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

A secondary concern, in that it's better to have one org with people in different locations but everyone communicating heavily than to have two separate organizations.

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

Sure - and MIRI/FHI are a decent complement to each other, the latter providing a respectable academic face to weird ideas. 

Generally though, it's far more productive to have ten top researchers in the same org rather than having five orgs each with two top researchers and a couple of others to round them out. Geography is a secondary concern to that.

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

Thank you for writing this, Jessica. First, you've had some miserable experiences in the last several years, and regardless of everything else, those times sound terrifying and awful. You have my deep sympathy.

Regardless of my seeing a large distinction between the Leverage situation and MIRI/CFAR, I agree with Jessica that this is a good time to revisit the safety of various orgs in the rationality/EA space.

I almost perfectly overlapped with Jessica at MIRI from March 2015 to June 2017. (Yes, this uniquely identifies me. Don't use my actual name here anyway, please.) So I think I can speak to a great deal of this.

I'll run down a summary of the specifics first (or at least, the specifics I know enough about to speak meaningfully), and then at the end discuss what I see overall.

Claim: People in and adjacent to MIRI/CFAR manifest major mental health problems, significantly more often than the background rate.

I think this is true; I believe I know two of the first cases to which Jessica refers; and I'm probably not plugged-in enough socially to know the others. And then there's the Ziz catastrophe.

Claim: Eliezer and Nate updated sharply toward shorter timelines, other MIRI researchers became similarly convinced, and they repeatedly tried to persuade Jessica and others.

This is true, but non-nefarious in my opinion, because it's a genuine belief and because, given that belief, you'll have better odds of success if the whole team at least takes the hypothesis quite seriously.

(As for me, I've stably been at a point where near-term AGI wouldn't surprise me much, but the lack of it also wouldn't surprise me much. That's all it takes, really, to be worried about near-term AGI.)

Claim: MIRI started getting secretive about their research.

This is true, to some extent. Nate and Eliezer discussed with the team that some things might have to be kept secret, and applied some basic level of secrecy to things we thought at the time might be AGI-relevant instead of only FAI-relevant. I think that here, the concern was less about AGI timelines and more about the multipolar race caused by DeepMind vs OpenAI. Basically any new advance gets deployed immediately in our current world.

However, I don't recall ever being told I'm not allowed to know what someone else is working on, at least in broad strokes. Maybe my memory is faulty here, but it diverges from Jessica's. 

(I was sometimes coy about whether I knew anything secret or not, in true glomarization fashion; I hope this didn't contribute to that feeling.)

There are surely things that Eliezer and Nate only wanted to discuss with each other, or with a specific researcher or two.

Claim: MIRI had rarity narratives around itself and around Eliezer in particular.

This is true. It would be weird if, given MIRI's reason for being, it didn't at least have the institutional rarity narrative—if one believed somebody else were just as capable of causing AI to be Friendly, clearly one should join their project instead of starting one's own.

About Eliezer, there was a large but not infinite rarity narrative. We sometimes joked about the "bus factor": if researcher X were hit by a bus, how much would the chance of success drop? Setting aside that this is a ridiculous and somewhat mean thing to joke about, the usual consensus was that Eliezer's bus factor was the highest one but that a couple of MIRI's researchers put together exceeded it. (Nate's was also quite high.)

(My expectation is that the same would not have been said about Geoff within Leverage.)

Claim: Working at MIRI/CFAR made it harder to connect with people outside the community.

There's an extent to which this is true of any community that includes an idealistic job (e.g. a paid political activist probably has like-minded friends and finds it a bit more difficult to connect outside that circle). Is it true beyond that?

Not for me, at least. I maintained my ties with the other community I'd been plugged into (social dancing) and kept in good touch with my family (it helps that I have a really good family). As with the above example, the social path of least resistance would have been to just be friends with the same network of people in one's work orbit, but there wasn't anything beyond that level of gravity in effect for me.

Claim: CFAR got way too far into Shiny-Woo-Adjacent-Flavor-Of-The-Week.

This is an unfair framing... because I agree with Jessica's claim 100%. Besides Kegan Levels and the MAPLE dalliance, there was the Circling phase and probably much else I wasn't around for.

As for causes, I've been of the opinion that Anna Salamon has a lot of strengths around communicating ideas, but that her hiring has had as many hits as misses. There's massive churn, people come in with their Big Ideas and nobody to stop them, and also people come in who aren't in a good emotional place for their responsibilities. I think CFAR would be better off if Anna delegated hiring to someone else. [EDIT: Vaniver corrects me to say that Pete Michaud has been mostly in charge of hiring for the past several years, in which case I'm criticizing him rather than Anna for any bad hiring decisions during that time.]

Overall Thoughts

Essentially, I think there's one big difference between issues with MIRI/CFAR and issues at Leverage:

The actions of CFAR/MIRI harmed people unintentionally, as evidenced by the result that people burned out and left quickly and with high frequency. The churn, especially in CFAR, hurt the mission, so it was definitely not the successful result of any strategic process.

Geoff Anders and others at Leverage harmed people intentionally, in ways that were intended to maintain control over those people. And to a large extent, that seems to have succeeded until Leverage fell apart.

Specifically, [accidentally triggering psychotic mental states by conveying a strange but honestly held worldview without adding adequate safeties] is different from [intentionally triggering psychotic mental states in order to pull people closer and prevent them from leaving], which is Zoe's accusation. Even if it's possible for a mental breakdown to be benign under the right circumstances, and even if an unplanned one is more likely to result in very very wrong circumstances, I'm far more terrified of a group that strategically plans for its members to have psychosis with the intent of molding those members further toward the group's mission.

Unintentional harm is still harm, of course! It might have even been greater harm in total! But it makes a big difference when it comes to assessing how realistic a project of reform might be.

There are surely some deep reforms along these lines that CFAR/MIRI must consider. For one thing: scrupulosity, in the context of AI safety, seems to be a common thread in several of these breakdowns. I've taken this seriously enough in the past to post extensively on it here. I'd like CFAR/MIRI leadership to carefully update on how scrupulosity hurts both their people and their mission, and think about changes beyond surface-level things like adding a curriculum on scrupulosity. The actual incentives ought to change.

Finally, a good amount of Jessica's post (similarly to Zoe's post) concerns her inner experiences, on which she is the undisputed expert. I'm not ignoring those parts above. I just can't say anything about them, except that as a third-person observer it's much easier to discuss the external realities than the internal ones. (Likewise with Zoe and Leverage.)

Common knowledge about Leverage Research 1.0

My own strong agreement with the content makes it hard to debias my approval here, but I want to generally massively praise edits that explicitly cross out the existing comment, and state that they've changed their minds, and why they've done so.

(There are totally good reasons to retract without comment, of course, and I'm glad that LW now offers this option. I'm just giving Davis credit for putting his update out there like this.)

Common knowledge about Leverage Research 1.0

There's a lot going on in this comment, but I note with interest that this is the first time I've seen someone weigh in on questions of cultish behavior from the perspective of a former cult leader. 

I'm fascinated with the claim that if you take on the outer facade of a cult, you now have a strong incentive gradient to turn up the cultishness (maybe because you're now drawing in people who are looking for more of that, and driving away anyone who's put off by it). Obviously the claim needs more than one person's testimony, but it makes sense.

I wonder if some early red flags with Leverage (living together with your superiors who also did belief reporting sessions with you, believing Geoff's theories were the word of god, etc) were explicitly laughed off as "oh, haha, we know we're not a cult, so we can chuckle about our resemblances to cults".

Whole Brain Emulation: No Progress on C. elegans After 10 Years

On the other hand, sometimes people end up walking right through what the established experts thought to be a wall. The rise of deep learning from a stagnant backwater in 2010 to a dominant paradigm today (crushing the old benchmarks in basically all of the most well-studied fields) is one such case.

In any particular case, it's best to expect progress to take much, much longer than the Inside View indicates. But at the same time, there's some part of the research world where a major rapid shift is about to happen.
