Review

Several tech leaders descended upon Capitol Hill last week to discuss the rapid expansion of generative AI. It was a mostly staid meeting until the potential harms from Meta's new Llama 2 model came up.

During the discussion, attended by most of the Senate's 100 members, Tristan Harris, a co-founder of the Center for Humane Technology, said he recently had engineers take Meta's powerful large language model Llama 2 for a "test drive." After some prompting, Harris said that a chat with Llama 2 came back with a detailed walkthrough of how to create anthrax as a biological weapon, according to one person familiar with the forum and two senators present. That prompted a testy exchange between Harris and Mark Zuckerberg, co-founder and CEO of Meta, formerly known as Facebook. Most specifics of the exchange between Harris and Zuckerberg have not been previously reported, although Harris receiving directions from Llama 2 about an unidentified biological weapon was noted by The Washington Post.

Among the two dozen tech leaders at the forum were Elon Musk, owner of Twitter and CEO of Tesla and SpaceX; Sam Altman, CEO of OpenAI; Satya Nadella, CEO of Microsoft; Jensen Huang, CEO of Nvidia; and Sundar Pichai, CEO of Google.

The gathering was led by Senate Majority Leader Chuck Schumer, Democratic Sen. Martin Heinrich, and Republican Sens. Mike Rounds and Todd Young, who together make up a new "artificial intelligence working group." The group formed earlier this year, a few months after OpenAI's ChatGPT bot became known the world over.

During the session, Zuckerberg attempted to downplay Harris' statement that Llama 2 can tell users how to make anthrax, saying anyone who was looking for such a guide could find out how to make anthrax on YouTube, according to both of the senators present. Harris rejected the argument, saying such guides do not come up on YouTube, and even if they did, the level of detail and guidance provided by Llama 2 was unique to such a powerful generative AI model. It's also largely an open-source model, meaning it's freely available to use and adapt.

"It was one of the only moments in the whole thing that was like, 'Oh,'" one of the senators present said, describing the exchange as having caught people's attention. "Twenty-four out of the 26 panelists there basically said exactly the same thing over and over: 'We need to protect AI innovation but with safeguards in place.'"

A Meta spokesperson declined to comment. Harris did not respond to requests for comment.

Beyond the brief snit between Harris and Zuckerberg, there was little in-depth discussion of the issues surrounding AI, according to all three of the people familiar with the meeting. Even the ability of Llama 2 to guide a prospective user on creating anthrax was not cause for any extended probing, the people said.

"It was, 'Ok, next speaker,' it moved right along," one of the senators present said.

Llama 2's power is well-known inside Meta. Its ability to turn up detailed instructions for creating a biological weapon like anthrax is to be expected, two people familiar with the company said.

"Really, this is going to be the case for every LLM of a certain size, unless you kneecap it for certain things," one of the people familiar with Meta said. "There will be edge cases. But the ones that are products, like ChatGPT, as opposed to open source releases, they just nerf it for this and that."

Still, AI tools trained on trillions of pieces of information scraped from the whole of the internet are difficult to control. Earlier this year, a user of a Discord bot created with ChatGPT was able to get the chemical recipe for napalm, a highly flammable liquid used as a military weapon. ChatGPT and Google's Bard are also known for serving up information to users that is incorrect, composed of misinformation, or simply made up, dubbed "hallucinations."

Are you a Meta employee or someone else with insight to share? Contact Kali Hays at khays@insider.com, on secure messaging app Signal at [phone number], or through Twitter DM at @hayskali. Reach out using a non-work device.

Get in touch with Bryan Metzger by email at bmetzger@insider.com or find him on Twitter at @metzgov.


The conjunction of "Llama-2 can give accurate instructions for making anthrax" and "Anthrax recipes are hard to obtain, apart from Llama-2" is almost certainly false.

We know that it's hard to make biological weapons with LLMs, because Dario Amodei testified before the US Congress that the most advanced models that Anthropic has cannot reliably give instructions to make such biological weapons yet. But Anthropic's most advanced models are way, way better than Llama-2 -- so if the most advanced models Anthropic has can't do it, Llama-2 almost certainly cannot. (Either that or anthrax has accurate instructions for it scattered everywhere on the internet and is an unusually easy biological agent to make, such that Llama-2 did pick it up -- but again, that means Llama-2 isn't particularly a problem!)

I'm sure if you asked a delobotomized version of Llama-2 for instructions, it would give you instructions that sound scary, but that's an entirely different matter.

Either that or anthrax has accurate instructions for it scattered everywhere on the internet and is an unusually easy biological agent to make such that Llama-2 did pick it up -- but again that means Llama-2 isn't particularly a problem!


Hard disagree. These techniques are so much more worrying if you don't have to piece together instructions from different locations and assess the reliability of comments on random forums.

Yeah, terrorists are often not very bright, conscientious, or creative.[1] I think rationalist-y types might systematically underestimate how much proliferation of non-novel information can still be bad, via giving scary ideas to scary people.

  1. ^ No offense intended to any members of the terror community reading this comment.

We know that it's hard to make biological weapons with LLMs, because Dario Amodei testified before the US Congress that the most advanced models that Anthropic has cannot reliably give instructions to make such biological weapons yet.

Fwiw I take this as moderate but not overwhelming evidence. (I think I agree with the rest of your comment, just flagging this seemed slightly overstated)


It's a bit ambiguous, but I personally interpreted the Center for Humane Technology's claims here in a way that would be compatible with Dario's comments:

"Today, certain steps in bioweapons production involve knowledge that can’t be found on Google or in textbooks and requires a high level of specialized expertise — this being one of the things that currently keeps us safe from attacks," he added.

He said today’s AI tools can help fill in "some of these steps," though they can do this "incompletely and unreliably." But he said today’s AI is already showing these "nascent signs of danger," and said his company believes it will be much closer just a few years from now.

"A straightforward extrapolation of today’s systems to those we expect to see in two to three years suggests a substantial risk that AI systems will be able to fill in all the missing pieces, enabling many more actors to carry out large-scale biological attacks," he said. "We believe this represents a grave threat to U.S. national security."

If Tristan Harris was, however, making the stronger claim that jailbroken Llama 2 could already supply all the instructions to produce anthrax, that would be much more concerning than my initial read.

Why was "Tristan" qualified to attend but not Eliezer? When is this community going to stop putting up with the denigration of its actual experts and the elevation of imposters?

Ah... well, one perspective is that the world runs substantially on prestige, and rationalists tend not to play that game. There are not many buttons for "but actually you should get serious about listening to the people who have been repeatedly right about very important things way before anyone else was". That is often barely any currency at all in the games about who gets seats at the table.

From this perspective, if one gives up the pursuit of prestige in favor of actually getting things done, and does not focus on signaling that one is the person who got it done, one is often not the person who gets the credit or is listened to about those things.

More broadly, getting angry or bitter about not having the respect and power they think that they have earned seems to me like it can cause people to waste a lot of energy for no results. I would be more willing to lean into parts of me that feel angry or bitter about it if I expected it had a decent shot of paying off in terms of correcting the credit allocation in the long-term. I currently expect it does not.

But for instance, on the current margin it seems to me like people have few good ideas for good AI policies whatsoever. I would be proud for rationalists to figure out and share some actually good policies even if those individuals aren't the people who get the credit for coming up with them or implementing them.

[Disclaimer: there are multiple perspectives on the situation, and to be clear, if a rationalist saw an opportunity to wield more power in an honorable and truthful way that would not warp their epistemic environment and sanity, then I would heartily encourage them to do so.]

"Power corrupts the few, while weakness corrupts the many."

  • Eric Hoffer

This comment confuses me.

  1. Why is Tristan in quotes? Do you not believe it's his real name?
  2. What is the definition of the community you're referring to?
  3. I don't think I see any denigration happening -- what are you referring to?
  4. What makes someone an expert or an imposter in your eyes? In the eyes of the community?

Just speaking pragmatically, the Center for Humane Technology has probably built stronger relations with DC policy people compared to MIRI.

Speaking pragmatically, isn't Tristan aligned with "AI ethics," not AI safety (i.e., X-risk)?