Research Coordinator of area "Do Not Build Uncontrollable AI" for AI Safety Camp.

See explainer on why AGI could not be controlled enough to stay safe:



Bias in Evaluating AGI X-Risks
Developments toward Uncontrollable AI
Why Not Try Build Safe AGI?


Even if you are focussed on long-term risks, you can still whistleblow on egregious harms caused by these AI labs right now.  Providing this evidence enables legal efforts to restrict these labs.

Whistleblowing is not going to solve the entire societal governance problem, but it will enable others to act on the information you provided.

It is much better than following along until we reach the edge of the cliff.

Are you thinking of blowing the whistle on something in between work on AGI and getting close to actually achieving it?

Good question.  

Yes, this is how I am thinking about it. 

I don't want to wait until competing AI corporations become really good at automating work in profitable ways, also because by then their market and political power would be entrenched. I want society to be well aware, way before then, that the AI corporations are acting recklessly and should be restricted.

We need a bigger safety margin.  Waiting until corporate machinery is able to operate autonomously would leave us almost no remaining safety margin.

There are already increasing harms, and a whistleblower can bring those harms to the surface.  That in turn supports civil lawsuits, criminal investigations, and/or regulator actions.

These harms fall roughly into the following categories – from most directly traceable to least directly traceable:

  1. Data laundering (what personal, copyrighted and illegal data is being copied and collected en masse without our consent).
  2. Worker dehumanisation (the algorithmic exploitation of gig workers;  the shoddy automation of people's jobs;  the criminal conduct of lab CEOs)
  3. Unsafe uses (everything from untested uses in hospitals and schools, to mass disinformation and deepfakes, to hackability and covered-up adversarial attacks, to automating crime and the kill cloud, to knowingly building dangerous designs).
  4. Environmental pollution (research investigations of data centers, fab labs, and so on)

For example: 

  1. If an engineer revealed authors' works in the datasets of ChatGPT, Claude, Gemini or Llama, that would give publishers and creative guilds the evidence they need to ramp up lawsuits against the respective corporations (to the tens or hundreds).
    1. Or if it turned out that the companies collected known child sexual abuse materials (as OpenAI probably did, and a collaborator of mine revealed for StabilityAI and MidJourney).
  2. If the criminal conduct of the CEO of an AI corporation was revealed
    1. Eg. it turned out that there is a string of sexual predation/assault in leadership circles of OpenAI/CodePilot/Microsoft.
    2. Or it turned out that Satya Nadella managed a refund scam company in his spare time.
  3. If managers were aware of the misuses of their technology, eg. in healthcare, at schools, or in warfare, but chose to keep quiet about it.

Revealing illegal data laundering is actually the most direct, and would cause immediate uproar.  
The rest is harder and more context-dependent.  I don't think we're at the stage where environmental pollution is that notable (vs. the fossil fuel industry at large), and investigating it across AI hardware operation and production chains would take a lot of diligent research as an inside staff member.

Someone shared the joke: "Remember the Milgram experiment, where they found out that everybody but us would press the button?"

My response: Right! Expect AGI lab employees to follow instructions, because of…

  • deference to authority
  • incremental worsening (boiling frog problem)
  • peer proof (“everyone else is doing it”)
  • escalation of commitment

You can literally have a bunch of engineers and researchers believe that their company is contributing to AI extinction risk, yet still go with the flow.

They might even think they’re improving things at the margin. Or they have doubts, but all their colleagues seem to be going on as usual.

In this sense, we’re dealing with the problems of having that corporate command structure in place that takes in the loyal, and persuades them to do useful work (useful in the eyes of power-and-social-recognition-obsessed leadership).

I appreciate this comment.

Be careful though that we’re not just dealing with a group of people here.

We’re dealing with artificial structures (ie. corporations) that take in and fire human workers as they compete for profit. With the most power-hungry workers tending to find their way to the top of those hierarchical structures.

When someone is risking the future of the entire human race, we'll see whistleblowers give up their jobs and risk their freedom and fortune to take action.

There are already AGI lab leaders that are risking the future of the entire human race.

Plenty of consensus to be found on that.

So why no whistleblowing?

If you’re smart and specialised in researching capability risks, it would not be that surprising if you come up with new feasible mechanisms that others were not aware of.

That’s my opinion on this.

Capabilities people may have more opportunities to call out risks, both internally and externally (whistleblowing).

I would like to see this. I am not yet aware of a researcher deciding to whistleblow on the AGI lab they work at.

If you are, please meet with an attorney in person first, and preferably get advice from an experienced whistleblower to discuss preserving anonymity – I can put you in touch: remmelt.ellen[a|}protonmail{d07]com

There’s so much that could be disclosed that would help bring about injunctions against AGI labs.

Even knowing what copyrighted data is in the datasets would be a boon for lawsuits.

[cross-posted replies from EA Forum]

Ben, it is very questionable that 80k is promoting non-safety roles at AGI labs as 'career steps'. 

Consider that your model of this situation may be wrong (account for model error). 

  • The upside is that you enabled some people to skill up and gain connections. 
  • The downside is that you are literally helping AGI labs to scale commercially (as well as indirectly supporting capability research).



A range of opinions from anonymous experts about the upsides and downsides of working on AI capabilities

I did read that compilation of advice, and responded to that in an email (16 May 2023):

"Dear [a],

People will drop in and look at job profiles without reading your other materials on the website. I'd suggest just writing a do-your-research cautionary line about OpenAI and Anthropic in the job descriptions themselves.

Also suggest reviewing whether to trust advice on whether to take jobs that contribute to capability research.

  • Particularly advice by nerdy researchers paid/funded by corporate tech. 
  • Particularly by computer-minded researchers who might not be aware of the limitations of developing complicated control mechanisms to contain complex machine-environment feedback loops. 

Totally up to you of course.

Warm regards,



We argue for this position extensively in my article on the topic

This is what the article says: 
"All that said, we think it’s crucial to take an enormous amount of care before working at an organisation that might be a huge force for harm. Overall, it’s complicated to assess whether it’s good to work at a leading AI lab — and it’ll vary from person to person, and role to role." 

So you are saying that people are making a decision about working for an AGI lab that might be (or actually is) a huge force for harm. And that whether it's good (or bad) to work at an AGI lab depends on the person – ie. people need to figure this out for them personally.

Yet you are openly advertising various jobs at AGI labs on the job board. People are clicking through and applying. Do you know how many read your article beforehand?

~ ~ ~
Even if they did read through the article, both the content and framing of the advice seem misguided. Notice what is emphasised in your considerations.

Here are the first sentences of each consideration section:
(ie. as what readers are most likely to read, and what you might most want to convey).

  1. "We think that a leading — but careful — AI project could be a huge force for good, and crucial to preventing an AI-related catastrophe."
    • Is this your opinion about DeepMind, OpenAI and Anthropic? 
  2. "Top AI labs are high-performing, rapidly growing organisations. In general, one of the best ways to gain career capital is to go and work with any high-performing team — you can just learn a huge amount about getting stuff done. They also have excellent reputations more widely. So you get the credential of saying you’ve worked in a leading lab, and you’ll also gain lots of dynamic, impressive connections."
    • Is this focussing on gaining prestige and (nepotistic) connections as an instrumental power move, with the hope of improving things later...?
    • Instead of on actually improving safety?
  3. "We’d guess that, all else equal, we’d prefer that progress on AI capabilities was slower."
    • Why is only this part stated as a guess?
      • I did not read "we'd guess that a leading but careful AI project, all else equal, could be a force of good". 
      • Or inversely:  "we think that continued scaling of AI capabilities could be a huge force of harm."
      • Notice how those framings come across very differently.
    • Wait, reading this section further is blowing my mind.
      • "But that’s not necessarily the case. There are reasons to think that advancing at least some kinds of AI capabilities could be beneficial. Here are a few"
      • "This distinction between ‘capabilities’ research and ‘safety’ research is extremely fuzzy, and we have a somewhat poor track record of predicting which areas of research will be beneficial for safety work in the future. This suggests that work that advances some (and perhaps many) kinds of capabilities faster may be useful for reducing risks."
        • Did you just argue for working on some capabilities because it might improve safety?  This is blowing my mind.
      • "Moving faster could reduce the risk that AI projects that are less cautious than the existing ones can enter the field."
        • Are you saying we should consider moving faster because there are people less cautious than us?  
        • Do you notice how a similarly flavoured argument can be used by and is probably being used by staff at three leading AGI labs that are all competing with each other? 
        • Did OpenAI moving fast with ChatGPT prevent Google from starting new AI projects?
      • "It’s possible that the later we develop transformative AI, the faster (and therefore more dangerously) everything will play out, because other currently-constraining factors (like the amount of compute available in the world) could continue to grow independently of technical progress."
        • How would compute grow independently of AI corporations deciding to scale up capability?
        • The AGI labs were buying up GPUs to the point of shortage. Nvidia was not able to supply them fast enough. How is that not getting Nvidia and other producers to increase production of GPUs?
        • More comments on the hardware overhang argument here.
      • "Lots of work that makes models more useful — and so could be classified as capabilities (for example, work to align existing large language models) — probably does so without increasing the risk of danger"
        • What is this claim based on?
  4. "As far as we can tell, there are many roles at leading AI labs where the primary effects of the roles could be to reduce risks."
    1. As far as I can tell, this is not the case.
      1. For technical research roles, you can go by what I just posted.
      2. For policy, I note that you wrote the following:
        "Labs also often don’t have enough staff... to figure out what they should be lobbying governments for (we’d guess that many of the top labs would lobby for things that reduce existential risks)."
        1. I guess that AI corporations use lobbyists for lobbying to open up markets for profit, and to not get actually restricted by regulations (maybe to move focus to somewhere hypothetically in the future, maybe to remove upstart competitors who can't deal with the extra compliance overhead, but don't restrict us now!).
        2. On prior, that is what you should expect, because that is what tech corporations do everywhere. We shouldn't expect on prior that AI corporations are benevolent entities that are not shaped by the forces of competition. That would be naive.

~ ~ ~
After that, there is a new section titled "How can you mitigate the downsides of this option?"

  • That section reads as thoughtful and reasonable.
  • How about on the job board, you link to that section in each AGI lab job description listed, just above the 'VIEW JOB DETAILS' button?  
  • That would help guide potential applicants to AGI lab positions in thinking through their decision.