AI Safety's Berkeley Bubble and the Allies We're Not Even Trying to Recruit

by Mr. Counsel
7th Nov 2025
13 min read

Epistemic status: outside view critique based on public discourse, some HQ/location discussion, and a bit of lived experience. I know there are exceptions and counterexamples; I’m arguing about the center of gravity and revealed incentives of the Bay/EA/safety cluster, not claiming omniscience about every individual. 

There’s a scene near the end of Harry Potter and the Methods of Rationality that I have not been able to get out of my head.

Voldemort has Harry fully under his power in the graveyard: stripped, surrounded by Death Eaters, locked in by fresh constraints. Before he moves forward with his plan for Harry and the precautions he’s layered around it, he pauses. He looks at his followers and asks whether anyone can see a flaw in what he’s arranged. Whether he’s overlooked anything important.

And the Death Eaters just stand there.

No one suggests a change. No one points out a flaw. Not because there’s nothing to say, but because they’re in an echo chamber: too similar, too deferential, too scared of contradicting the Dark Lord. Voldemort curses them for it. It’s framed as a core failure mode of having a smart leader surrounded by people who are too similar and too deferential to catch his blind spots when it matters most.

We all read that. Many of us nodded along. Some of us built our identities around not being like those people.

I’m writing this because, from where I’m sitting, the AI safety/rationalist/MIRI cluster has drifted disturbingly close to that exact parable: on the social level, not the math level.

I say this as someone who takes the core worries seriously. I’m not here to mock the cause. I think Yudkowsky, MIRI, and the safety crowd are, sincerely, on the side of the light. But I am saying: you wrote the story about echo chambers. Then you built one.

To be explicit about scope: I’m not claiming that every person in AI safety, or in the Bay, matches this description. I’m talking about the center of gravity you get if you weight by social influence, funding, and HQ location. There are plenty of individuals who are partial counterexamples in good ways; my claim is that the structure and revealed incentives aren’t organized around them.

The Shape of the Bubble

Let me sketch the outline as I see it.

If you look at the center of gravity of the movement’s social graph, the people near the money, the org HQs, and the social hubs, it’s something like:

  • Bay Area/Berkeley/SF
  • young-ish, highly educated, very online
  • overwhelmingly blue tribe on U.S. politics (see LessWrong’s demographic survey and Alameda County's election results)

There are obvious reasons the center of gravity ended up here: talent agglomeration, proximity to labs, social proof, the fact that the first big donors and orgs were already here. I don’t think anyone sat down and said “let’s maximize monoculture.” I’m saying: given where we are now and what we now know, the continuation of that equilibrium looks more like comfort seeking than mission seeking.

The HQ location discussion a few years ago wasn’t “Where is the best place to advance our mission?” It was:

  • Nature, quiet, and walks
  • Uber/UberEats
  • Can we “mesh well with people who already live here”?
  • Aren’t “extremely conservative”
  • Avoid ticks and mosquitoes

Those are understandable human preferences. But in the long, detailed writeup and comment thread, there was almost no explicit discussion of:

  • political diversity as a value in itself
  • proximity to courts, Hill staff, financial markets, boring civil service adults
  • regular contact with people whose lives and priors are nothing like Bay tech/EA

One way to see how skewed the optimization was is to look at the final choice set. We somehow ended up debating Berkeley vs. Bellingham (Berkeley proper versus what is basically Berkeley Jr.) instead of, say, Berkeley vs. somewhere near Boston (or Austin/NYC, as Zvi and others have already suggested on epistemic grounds).

I’m not asking anyone to move to Houston or some random red state exurb. Boston is hardly a right-wing fantasy: it hits most of the same “walkable, educated, LGBTQ-friendly, lots of nerds” desires, but it’s also a city that’s taken seriously by thinkers in the center and on the right, plugged into universities, courts, finance, and policy. If your last-round comparison is Berkeley or a smaller, more remote Berkeley, rather than Berkeley or a place that opens up genuinely different coalitions, that’s a sign the search was pointed at comfort, not coverage.

On public messaging, the default mode for years has been:

  • utilitarian
  • very “space of possible minds”
  • “unaligned optimizers,” “paperclip maximizers,” “lose the light cone”
  • plus a heavy dose of “everyone else is underestimating p(doom)”

Again, those are not wrong frames. But they are very native to one tribe, and it’s not the tribe that actually owns the key constitutional and institutional levers.

From inside this world, it all feels normal: we live where our friends are, we talk how we talk, we optimize for being around other people who get it.

From the outside, it looks uncomfortably like that graveyard scene in HPMOR: one very smart guy plus a room full of people who are very much like him, who share his priors, and who are not great at saying, “My lord, you are missing something enormous.”

It’s important to be clear what I’m claiming here. I’m not merely saying that every community has some insularity and ours is just a bit above average. I’m saying that, given the mission and the stakes this community claims for itself, having the center of gravity anchored in Berkeley/SF produces an unusually bad monoculture:

  • it systematically marginalizes or filters out people whose instincts we need (classical liberals, rule of law conservatives, boring institutionalists, parents with something to lose), and
  • its ambient politics make it socially costly to treat those people as peers rather than enemies.

The Missing Question #1: Who Is Not in the Room?

In that HQ/location thread, people were thoughtful and reflective about a lot of things:

  • Is this a peaceful place to think?
  • Will people want to live here?
  • Is it walkable?
  • Is it LGBTQ-friendly?
  • What about cost of living, weather, mosquitoes, ticks?

Somehow, in 160 comments, almost nobody said:

  • Who isn't here if we do this?
  • Which kinds of people will we almost never encounter at the grocery store or at school pickup or at dinner?
  • Are we okay with a location that is one tribe politically, very rich, and very homogeneous in race, class, and worldview?

If you look at how the discussion actually ran, the revealed objective function of the decision making center seems to be something like: maximize cognitive freedom for current insiders, in a place that matches their cultural tastes, while minimizing discomfort and conflict.

This is a perfectly understandable human goal.

It is not obviously the right goal if your story is "we are trying to steer the entire future of humanity."

In ordinary intellectual communities, a Berkeley-ish monoculture mostly costs you some robustness and creativity. If you’re actually trying to influence state capacity, constitutional norms, and markets at scale, it specifically cuts you off from the people who sit on the veto points: courts, regulators, politicians, serious financial conservatives. Those are exactly the folks who can make change in the real world. 

The Missing Question #2: Who Are Our Natural Allies? (and why EconTalk should have been treated as a test)

Let me start with a concrete case.

When Eliezer went on EconTalk, Russ Roberts' long-running economics podcast, he walked into a room full of:

  • classical liberal and right leaning econ nerds
  • people whose entire intellectual religion is no central planner, no unaccountable sovereign, and rule of law and markets above technocrats

If you translate AI risk into their language, the story looks like this:

"We are on track to build a system that effectively acts as a sovereign or central planner above voters, courts, and markets, and then entangle it with the state and a handful of corporations. Once that happens, we may never be able to unwind it."

This is exactly the kind of scenario classical liberals and rule of law conservatives have been training to hate for centuries:

  • No new sovereign without consent.
  • No unaccountable central planner dictating prices, speech, or association.
  • No delegation of core governmental judgment to opaque mechanisms.

You may disagree with them in a lot of ways, but on the no AI sovereign/no new central planner framing, classical liberals and rule of law conservatives are among your most natural allies.

Classical liberals will fight broad technocratic overreach, but they'll accept narrow, well-targeted constraints when the alternative is creating a de facto sovereign that permanently destroys the very markets, property rights, and rule of law they care about.

So when you get an EconTalk slot and still mostly run the same "unaligned optimizers/paperclips/cosmic stakes" script, the missed opportunity isn't just that we lost some listeners. It's that we didn't even try to pitch the "no new sovereign" part of the story to exactly the people whose professional identity is "we stop new sovereigns." I'm not second guessing the content of what Yudkowsky said there; the core technical worries seem basically right to me. I'm saying that, given these worries, treating EconTalk as just another venue for the usual spiel, rather than as a deliberate attempt to recruit a natural ally tribe, is strong evidence that we weren't in coalition building mode at all.

This isn't just a one-off communication mistake; it's evidence about how the whole ecosystem is pointed: toward talking to itself, in its own dialect, even when the audience is different. 

The Missing Question #3: What Would It Take to Work With Them?

Being based in Berkeley doesn't just fail to help with cross-aisle dialogue; it actively sabotages it. From that vantage point, anyone on the classical liberal/rule of law right usually only shows up as an abstraction or an enemy combatant. They're someone you fly out to visit for a one-off meeting, not someone you bump into at a party or sit next to on a board.

The issue isn't just being outnumbered; it's being treated as socially radioactive. In a lot of Berkeley adjacent spaces, a classical liberal or rule of law conservative isn't just "someone I disagree with," but "someone I'd lose friends for treating as a peer," and that's exactly the wrong incentive gradient if you're trying to build this coalition.

And the asymmetry cuts both ways. From their side, most classical liberal/rule of law types only ever encounter AI doom as either sci-fi metaphor or culture war noise. Their natural reflex is to assume you're just here to regulate away their free markets under a new scary sounding pretext. And because you live in Berkeley, talk like Berkeley, and hire from Berkeley, you will be instantly coded as Berkeley liberals whether or not you ever say that out loud yourselves. 

They don't have the time or background to wade through Sequences, LessWrong, and doom podcasts just to figure out whether there's a real "no new sovereign" problem underneath. Venues like EconTalk are rare precisely because that audience is already listening carefully and is prepared to treat you as a serious mind rather than a meme. If we don't aim our message correctly in those few places, we shouldn't be surprised that cross-aisle engagement mostly fails everywhere else.

The conservative outreach I've seen so far doesn't really reassure me. I haven't done a full review and I don't want to single out individuals. There are honorable exceptions, including Soares' own attempts to take people like J.D. Vance seriously when his friends distrust them. But as a cluster, we still mostly talk as if we're explaining ourselves to a caricature of conservatives. It feels like an early, clumsy draft of the kind of gears-level modelling and back-and-forth we'd actually need. I'd be delighted to be shown examples that do better.

The deeper problem, I think, is that almost nobody in this world has the permission structure to model the right at its best and treat a right of center thinker as an equal. Doing that would mean talking like a serious conservative long enough to get socially recoded as "one of them," admitting they're basically right about some deep things (like the dangers of concentrated, unaccountable power), and maybe giving them real veto power over plans that smell like "new sovereign." In a lot of Berkeley adjacent spaces, that's a good way to lose friends, grants, and status. In practice, this means conservatives show up as a messaging target or a stereotype, not as partners whose instincts can actually change the plan.

From the outside, it looks like this: one of the tribes whose instincts we need has been left strikingly under-addressed because they are outside the Berkeley Overton window.

"But We Can't All Move to DC/Boston/Whatever..."

I can already hear some reasonable pushback:

  • We can't just uproot everyone.
  • I hate winter.
  • We don't have the capacity to rebuild a community from scratch in DC.
  • I'm bad at talking to those people; someone else should do it.

All true. All human. And all of it sounds exactly like the kind of self justification that keeps the graveyard comfortable while Harry plots his escape.

I'm not saying that everyone must move to DC or you are a bad person for not living in Boston right now. 

I am saying that, given the stakes, it is not okay if no one in the room is explicitly responsible for asking:

  • Who are our natural allies if we adjust our framing?
  • Where do those people live and work?
  • How do we talk to them in their language, not ours?

It is also not okay if every high leverage opportunity (EconTalk, FedSoc-ish audiences, WSJ-ish audiences) is treated as just another venue for the usual spiel, instead of "this is a different immune system with a different dialect; optimize for that."

If we really believe the stakes we write about, then where and who we are near is not a neutral aesthetic choice. It is part of the problem statement.

The Parable, Pointed Back at Us

Back to HPMOR.

The problem in that scene isn't that Voldemort is arrogant. It's that no one around him will say, "My lord, here is the flaw you can't see from where you stand."

I don't expect everyone at MIRI, or on LessWrong, or in the broader safety world to agree with my politics or my coalitions.

I do expect, given the stakes we describe, that someone in the room should be able to say:

  • We are dangerously over-indexed to one city, one class, one tribe.
  • We are not seriously speaking to the people who could see AI risk as a risk to individual autonomy.
  • We are optimizing for comfort, not for coalition.

If that conversation is happening, it's happening very quietly. From the outside, the center of gravity of the ecosystem still looks like:

  • Berkeley/Bay as the unquestioned hub
  • repeated missed opportunities with classical liberal audiences
  • and a location search where Lyme disease got a lot more explicit attention than "Will we regularly see or cooperate with people who aren't shaped like us?"

I don't think that we are stupid or evil. I think we are sitting in a room that has become more like that graveyard than any of us want to admit.

This post is me trying not to be another silent Death Eater.

What I Actually Want

This is not a call for purity or self-immolation. It's a call for specific, boring changes in what counts as obviously important:

Change 1: Make "no AI sovereign/no new central planner" a first-class framing.

When talking to classical liberal and rule of law audiences, this should be the headline, not the footnote. Emphasize the potential loss of individual autonomy.

Change 2: Assign someone explicit responsibility for cross-tribe coalition building.

Not vibes outreach, but: 

  • Who are the institutions and people that hate unaccountable sovereigns?
  • Who owns talking to them regularly, in their language?
  • Who owns listening to their constraints?

Change 3: Treat the Berkeley bubble as a liability, not a neutral backdrop.

This doesn't mean torch everything and move tomorrow. It does mean:

  • admitting that this particular monoculture is especially ill suited to the coalitions we need
  • seeking out people whose priors are alien to that monoculture and giving them real voice
  • being suspicious of decisions, like the HQ search, that conveniently maximize comfort for one tribe while minimizing contact with everyone else.

Change 4: Reward people for saying "you missed something," not just for doing more doom math.

If someone walks into the room and says, "You are not talking to these people at all," that should not be a weird social move. It should be a recognized kind of contribution. 

If you're going to take on a de facto leadership role in an existential risk conversation, saying, "I'm bad at that kind of social cognition" can't be the end of the story. You don't need to spin, but you do need to understand how other minds work well enough to move them, or, if you can't do that, then empower people who can.

And if someone buys the core x-risk story and sincerely wants to help, "they're Republican/not our tribe/fail this social litmus test" is not a valid reason to treat them as radioactive. You only get to alienate sincere helpers when the way they want to "help" would actually damage the mission.

I think that there's a very human reason this is all hard. For many people in this world, childhood and early careers came with a steady message, implicit or explicit, that something about them was wrong and had to change. When they finally found a culture where their weirdness was normal and their intensity was valued, of course they clung to it and built up antibodies to "you need to change." I'm not asking anyone to give that up lightly. I'm saying that if we take seriously the job this community has claimed for itself, then some of the change has to happen on our side too: where we live, who we hire, who we treat as peers, and who we share power with.

To close: if that HPMOR scene was anything more than a fun characterization, then the real-world version of that question ("What did I miss?") has to include "Who isn't in this room?" and "Who am I not even trying to enlist?"

Right now, the honest answer looks too much like: classical liberals, rule of law conservatives, people with very different lives, and anyone who doesn't live within easy driving distance of Berkeley.

We wrote the parable. Please let's not live it.

-Mr. Counsel

P.S. Just to be painfully explicit, I do not think Eliezer is Voldemort and I do not think MIRI are Death Eaters. Eliezer is on the side of light; that's the only reason this critique matters at all. The only reason I can even use this parable is because he wrote HPMOR in the first place, and wrote it well enough that it became a shared language for thinking about exactly these failure modes. All credit to Yudkowsky for giving us this story; my claim is that, on this one axis, the people who learned it best haven't pushed it quite far enough in their own lives. This post is just me, very belatedly, trying to ask, "My lord, why are you leaving Harry with his wand?"

P.P.S. I'd be especially interested in pushback on my arguments, particularly whether I'm overestimating the "Berkeley Bubble" effect vs other bottlenecks and whether others have used the "no new sovereign" frame with classical liberal audiences (and if they've been successful).