President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence

T_W

Released today (10/30/23). This is crazy: perhaps the most sweeping action taken by a government on AI yet.

Below, I've segmented by x-risk and non-x-risk related proposals, excluding the proposals that are geared towards promoting its use^[1] and focusing solely on those aimed at risk. It's worth noting that some of these are very specific and direct an action to be taken by one of the executive branch organizations (e.g. sharing of safety test results), while others are guidance, which involves "calls on Congress" to pass legislation that would codify the desired action.

[Update]: The official order (this is a summary of the press release) has now been released, so if you want to see how these are codified in greater detail, look there^[2].

Existential Risk Related Actions:

Require that developers of the most powerful AI systems share their safety test results and other critical information with the U.S. government. In accordance with the Defense Production Act, the Order will require that companies developing any foundation model that poses a serious risk to national security, national economic security, or national public health and safety must notify the federal government when training the model, and must share the results of all red-team safety tests.
Develop standards, tools, and tests to help ensure that AI systems are safe, secure, and trustworthy. The National Institute of Standards and Technology will set the rigorous standards for extensive red-team testing to ensure safety before public release. The Department of Homeland Security will apply those standards to critical infrastructure sectors and establish the AI Safety and Security Board. The Departments of Energy and Homeland Security will also address AI systems’ threats to critical infrastructure, as well as chemical, biological, radiological, nuclear, and cybersecurity risks. Together, these are the most significant actions ever taken by any government to advance the field of AI safety.
Protect against the risks of using AI to engineer dangerous biological materials by developing strong new standards for biological synthesis screening. Agencies that fund life-science projects will establish these standards as a condition of federal funding, creating powerful incentives to ensure appropriate screening and manage risks potentially made worse by AI.
Order the development of a National Security Memorandum that directs further actions on AI and security, to be developed by the National Security Council and White House Chief of Staff. This document will ensure that the United States military and intelligence community use AI safely, ethically, and effectively in their missions, and will direct actions to counter adversaries’ military use of AI.
Expand bilateral, multilateral, and multistakeholder engagements to collaborate on AI. The State Department, in collaboration with the Commerce Department, will lead an effort to establish robust international frameworks for harnessing AI’s benefits, managing its risks, and ensuring safety. This will include accelerating development and implementation of AI standards.

Non-Existential Risk Actions:

General

Protect Americans from AI-enabled fraud and deception by establishing standards and best practices for detecting AI-generated content and authenticating official content. The Department of Commerce will develop guidance for content authentication and watermarking to clearly label AI-generated content. Federal agencies will use these tools to make it easy for Americans to know that the communications they receive from their government are authentic—and set an example for the private sector and governments around the world.

Discrimination

Provide clear guidance to landlords, Federal benefits programs, and federal contractors to keep AI algorithms from being used to exacerbate discrimination.
Address algorithmic discrimination through training, technical assistance, and coordination between the Department of Justice and Federal civil rights offices.
Ensure fairness throughout the criminal justice system by developing best practices on the use of AI in sentencing, parole and probation, pretrial release and detention, risk assessments, surveillance, crime forecasting and predictive policing, and forensic analysis.

Healthcare

Advance the responsible use of AI in healthcare and the development of affordable and life-saving drugs. The Department of Health and Human Services will also establish a safety program to receive reports of, and act to remedy, harms or unsafe healthcare practices involving AI.

Jobs

Develop principles and best practices to mitigate the harms and maximize the benefits of AI for workers by addressing job displacement; labor standards; workplace equity, health, and safety; and data collection. These principles and best practices will benefit workers by providing guidance to prevent employers from undercompensating workers, evaluating job applications unfairly, or impinging on workers’ ability to organize.
Produce a report on AI’s potential labor-market impacts, and study and identify options for strengthening federal support for workers facing labor disruptions, including from AI.

Privacy

Protect Americans’ privacy by prioritizing federal support for accelerating the development and use of privacy-preserving techniques, as well as evaluations of the effectiveness of these techniques.
Strengthen privacy-preserving research and technologies, such as cryptographic tools that preserve individuals’ privacy, by funding a Research Coordination Network to advance rapid breakthroughs and development. The National Science Foundation will also work with this network to promote the adoption of leading-edge privacy-preserving technologies by federal agencies.
Evaluate how agencies collect and use commercially available information—including information they procure from data brokers—and strengthen privacy guidance for federal agencies to account for AI risks.

^[1] Out of 26 distinct proposals, 7 (27%) are geared towards increasing use or capabilities, and 2 (8%) are a mixed bag that both encourage development and further safety.

^[2] I can also do a similar post for that if there's interest, but it would be significantly longer.

This was the press release; the actual order has now been published.

One safety-relevant part:

4.2. Ensuring Safe and Reliable AI. (a) Within 90 days of the date of this order, to ensure and verify the continuous availability of safe, reliable, and effective AI in accordance with the Defense Production Act, as amended, 50 U.S.C. 4501 et seq., including for the national defense and the protection of critical infrastructure, the Secretary of Commerce shall require:
   (i) Companies developing or demonstrating an intent to develop potential dual-use foundation models to provide the Federal Government, on an ongoing basis, with information, reports, or records regarding the following:
      (A) any ongoing or planned activities related to training, developing, or producing dual-use foundation models, including the physical and cybersecurity protections taken to assure the integrity of that training process against sophisticated threats;
      (B) the ownership and possession of the model weights of any dual-use foundation models, and the physical and cybersecurity measures taken to protect those model weights; and
      (C) the results of any developed dual-use foundation model’s performance in relevant AI red-team testing based on guidance developed by NIST pursuant to subsection 4.1(a)(ii) of this section, and a description of any associated measures the company has taken to meet safety objectives, such as mitigations to improve performance on these red-team tests and strengthen overall model security. Prior to the development of guidance on red-team testing standards by NIST pursuant to subsection 4.1(a)(ii) of this section, this description shall include the results of any red-team testing that the company has conducted relating to lowering the barrier to entry for the development, acquisition, and use of biological weapons by non-state actors; the discovery of software vulnerabilities and development of associated exploits; the use of software or tools to influence real or virtual events; the possibility for self-replication or propagation; and associated measures to meet safety objectives; and
   (ii) Companies, individuals, or other organizations or entities that acquire, develop, or possess a potential large-scale computing cluster to report any such acquisition, development, or possession, including the existence and location of these clusters and the amount of total computing power available in each cluster.
(b) The Secretary of Commerce, in consultation with the Secretary of State, the Secretary of Defense, the Secretary of Energy, and the Director of National Intelligence, shall define, and thereafter update as needed on a regular basis, the set of technical conditions for models and computing clusters that would be subject to the reporting requirements of subsection 4.2(a) of this section. Until such technical conditions are defined, the Secretary shall require compliance with these reporting requirements for:
   (i) any model that was trained using a quantity of computing power greater than 10²⁶ integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10²³ integer or floating-point operations; and
   (ii) any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10²⁰ integer or floating-point operations per second for training AI.
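
For anyone who wants to sanity-check a particular model or cluster against the interim conditions quoted in 4.2(b), here is a minimal sketch in Python. The class and function names are hypothetical, and it encodes only one plausible reading of the text: the cluster condition is treated as a conjunction (co-located, networking over 100 Gbit/s, and 10^20 operations per second of capacity), and because the order says "having" rather than "greater than" for the capacity figure, the sketch assumes "at least".

```python
from dataclasses import dataclass

# Interim thresholds as quoted in subsection 4.2(b). Constant names are illustrative.
MODEL_OPS_THRESHOLD = 1e26        # total training operations, any model
BIO_MODEL_OPS_THRESHOLD = 1e23    # total training operations, primarily biological sequence data
CLUSTER_NETWORK_GBIT_S = 100      # data center networking, Gbit/s
CLUSTER_PEAK_OPS_PER_S = 1e20     # theoretical maximum operations per second


@dataclass
class Model:
    training_ops: float                      # integer or floating-point operations used in training
    primarily_bio_sequence_data: bool = False


@dataclass
class Cluster:
    colocated_in_single_datacenter: bool
    network_gbit_per_s: float
    peak_ops_per_s: float


def model_is_reportable(m: Model) -> bool:
    """True if the model exceeds either interim compute threshold."""
    if m.training_ops > MODEL_OPS_THRESHOLD:
        return True
    return m.primarily_bio_sequence_data and m.training_ops > BIO_MODEL_OPS_THRESHOLD


def cluster_is_reportable(c: Cluster) -> bool:
    """True if all three cluster conditions hold (assumed reading: a conjunction)."""
    return (
        c.colocated_in_single_datacenter
        and c.network_gbit_per_s > CLUSTER_NETWORK_GBIT_S
        and c.peak_ops_per_s >= CLUSTER_PEAK_OPS_PER_S  # ">=" is an assumption
    )
```

Note that nothing in this check refers to the "dual-use" definition; it looks only at compute and networking figures.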

This requires reporting of plans for training and deployment, as well as ownership and security of weights, for any model with training compute over 10^26 FLOPs. Might be enough of a talking point with corporate leadership to stave off things like hypothetical irreversible proliferation of a GPT-4.5 scale open weight LLaMA 4.

Is there a definition of "dual-use foundation model" anywhere in the text?

(k) The term “dual-use foundation model” means an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters, such as by:

(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;

(ii) enabling powerful offensive cyber operations through automated vulnerability discovery and exploitation against a wide range of potential targets of cyber attacks; or

(iii) permitting the evasion of human control or oversight through means of deception or obfuscation.

Models meet this definition even if they are provided to end users with technical safeguards that attempt to prevent users from taking advantage of the relevant unsafe capabilities.

(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;

Wouldn't this include most, if not all, uncensored LLMs?

And thus any person/organization working on them?

I think the key here is 'substantially'. That's a standard of evidence which must be shown to apply to the uncensored LLM in question. I think it's unclear if current uncensored LLMs would meet this level. I do think that if GPT-4 were to be released as an open source model, and then subsequently fine-tuned to be uncensored, that it would be sufficiently capable to meet the requirement of 'substantially lowering the barrier of entry for non-experts'.

Do you know who would be deciding on orders like this one? Some specialized department in the USG, whichever judge happens to hear the case, or something else?

I do not know. I can say that I'm glad they are taking these risks seriously. The low screening security on DNA synthesis orders has been making me nervous for years, ever since I learned the nitty gritty details while I was working on engineering viruses in the lab to manipulate brains of mammals for neuroscience experiments back in grad school. Allowing anonymous people to order custom synthetic genetic sequences over the internet without screening is just making it too easy to do bad things.

Do you think we need to ban open source LLMs to avoid catastrophic biorisk? I'm wondering if there are less costly ways of achieving the same goal. Mandatory DNA synthesis screening is a good start. It seems that today there are no known pathogens which would cause a pandemic, and therefore the key thing to regulate is biological design tools which could help you design a new pandemic pathogen. Would these risk mitigations, combined with better pandemic defenses via AI, counter the risk posed by open source LLMs?

I think that in the long term, we can make it safe to have open source LLMs, once there are better protections in place. By long term, I mean, I would advocate for not releasing stronger open source LLMs for probably the next ten years or so. Or until a really solid monitoring system is in place, if that happens sooner. We've made a mistake by publishing too much research openly, with tiny pieces of dangerous information scattered across thousands of papers. Almost nobody has time and skill sufficient to read and understand all that, or even a significant fraction. But models can, and so a model that can put the pieces together and deliver them in a convenient summary is dangerous because the pieces are there.

Why do you believe it's, on the whole, a 'mistake' instead of beneficial?

I can think of numerous benefits, especially in the long term.

e.g. drawing the serious attention of decision makers who might have otherwise believed it to be a bunch of hooey, and ignored the whole topic.

e.g. discouraging certain groups from trying to 'win' in a geopolitical contest, by rushing to create a 'super'-GPT, as they now know their margin of advantage is not so large anymore.

Oh, I meant that the mistake was publishing too much information about how to create a deadly pandemic. No, I agree that the AI stuff is a tricky call with arguments to be made for both sides. I'm pretty pleased with how responsibly the top labs have been handling it, compared to how it might have gone.

Edit: I do think that there is some future line, across which AI academic publishing would be unequivocally bad. I also think slowing down AI progress in general would be a good thing.

Edit: I do think that there is some future line, across which AI academic publishing would be unequivocally bad. I also think slowing down AI progress in general would be a good thing.

Okay, I guess my question still applies?

For example, it might be that letting it progress without restriction has more upsides than slowing it down.

An example of something I would be strongly against anyone publishing at this point in history is an algorithmic advance which drastically lowered compute costs for an equivalent level of capabilities, or substantially improved hazardous capabilities (without tradeoffs) such as situationally-aware strategic reasoning or effective autonomous planning and action over long time scales. I think those specific capability deficits are keeping the world safe from a lot of possible bad things.

Yes, it's clear these are your views. Why do you believe so?

I think... maybe I see the world and humanity's existence on it, as a more fragile state of affairs than other people do. I wish I could answer you more thoroughly.

https://www.lesswrong.com/posts/uPi2YppTEnzKG3nXD/nathan-helm-burger-s-shortform?commentId=qmrrKminnwh75mpn5

Not sure, but maybe the new AI institute they're setting up as a result

The temporary technical conditions in 4.2(b), such as 10^26 FLOPs of training compute, seem to apply without further qualification for whether a model is "dual-use" in a more particular sense. So it's unclear if the definition of "dual-use" in 3(k) is relevant to application of reporting requirements in 4.2(a) until updated technical conditions get defined.

Calling mundane risk "near term" sneaks in the implication that extinction risk isn't.

What alternative would you propose? I don't really like mundane risk but agree that an alternative would be better. For now I'll just change to "non-existential risk actions"

This made me wonder about a few things:

How responsible is CSET for this? CSET is the most highly funded longtermist-ish org, as far as I can tell from checking openbook.fyi (I could be wrong), so I've been trying to understand them better, since I don't hear much about them on LW or the EA Forum. I suspected they were having a lot of impact "behind the scenes" (from my perspective), and maybe this is a reflection of that?
Aaron Bergman said on Twitter that for him, "the ex ante probability of something at least this good by the US federal government relative to AI progress, from the perspective of 5 years ago was ~1%[.] Ie this seems 99th-percentile-in-2018 good to me", and many people seemed to agree. Stefan Schubert then said that "if people think the policy response is "99th-percentile-in-2018", then that suggests their models have been seriously wrong." I was wondering, do people here agree with Aaron that this EO appeared unlikely back then, and, if so, what do you think the correct takeaway from the existence of this EO is?

Below, I've segmented by x-risk and non-x-risk related proposals, excluding the proposals that are geared towards promoting its use and focusing solely on those aimed at risk.

Thanks for the work put into the distillation! But I think that the acceleration proposal to safety proposal ratio is highly relevant. British PM Rishi Sunak's speech, for example, was in large part an announcement that the UK would not regulate AI anytime soon. I've argued previously that governments have strong short term incentives to accelerate AI and even lie about it, so my prediction is that omitting the ratio of safety to pro-acceleration points here, by omitting pro-acceleration points entirely, is net harmful.

Hmm, I get the idea that people value succinctness a lot with these sorts of things, because there's so much AI information to take in now, so I'm not so sure about the net effect, but I'm wondering maybe if I could get at your concern here by mocking up a percentage (i.e. what percentage of the proposals were risk oriented vs progress oriented)?

It wouldn't tell you the type of stuff the Biden administration is pushing, but it would tell you the ratio which is what you seem perhaps most concerned with.

[Edit] this is included now

I spent a few hours reading, and parsing out, sections 4 and 5 of the recent White House Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.

The following are my rough notes on each subsection in those two sections, summarizing what I understand each to mean, and my personal thoughts.

My high level thoughts are at the bottom.

Section by section

Section 4 – Ensuring the Safety and Security of AI Technology.

4.1

Summary:
- The secretary of commerce and NIST are going to develop guidelines and best practices for AI systems.
- In particular:
  - “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
    - What does this literally mean? Does this allocate funding towards research to develop these benchmarks? What will concretely happen in the world as a result of this initiative?
- It also calls for the establishment of guidelines for conducting red-teaming.
  - [[quote]]
    - (ii) Establish appropriate guidelines (except for AI used as a component of a national security system), including appropriate procedures and processes, to enable developers of AI, especially of dual-use foundation models, to conduct AI red-teaming tests to enable deployment of safe, secure, and trustworthy systems. These efforts shall include:
      - (A) coordinating or developing guidelines related to assessing and managing the safety, security, and trustworthiness of dual-use foundation models; and
      - (B) in coordination with the Secretary of Energy and the Director of the National Science Foundation (NSF), developing and helping to ensure the availability of testing environments, such as testbeds, to support the development of safe, secure, and trustworthy AI technologies, as well as to support the design, development, and deployment of associated PETs, consistent with section 9(b) of this order.
Commentary:
- I imagine that these standards and guidelines are going to be mostly fake.
- Are there real guidelines somewhere in the world? What process leads to real guidelines?

4.2

Summary:
- a
  - Anyone who has trained or wants to train a foundation model needs to
    - Report their training plans and safeguards.
    - Report who has access to the model weights, and the cybersecurity protecting them.
    - Report the results of red-teaming on those models, and what they did to meet the safety bars.
  - Anyone with a big enough computing cluster needs to report that they have it.
- b
  - The Secretary of Commerce (and some associated agencies) will make (and continually update) some standards for models and computer clusters that are subject to the above reporting requirements. But for the time being,
    - Any models that were trained with more than 10^26 flops
    - Any models that are trained primarily on biology data and trained using greater than 10^23 flops
    - Any datacenter whose co-located machines are connected at greater than 100 gigabits per second and that can train an AI at 10^20 flops per second
- c
  - I don’t know what this subsection is about. Something about protecting cybersecurity for “United States Infrastructure as a Service” products.
  - This includes some tracking of when foreigners want to use US AI systems in ways that might pose a cyber-security risk, using standards identical to the ones laid out above.
- d
  - More stuff about IaaS, and verifying the identity of foreigners.
Thoughts:
- Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second. 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent.
- What do I think about this overall?
  - I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
  - The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOP is about 1-2 orders of magnitude larger than our current biggest models.
  - The interest in red-teaming is promising, but again it depends on the implementation details.
    - I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
      - What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?
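
To make the arithmetic behind the "do those numbers add up?" question easy to check, here is a small sketch in Python. The function name is hypothetical, and it assumes the cluster runs at its full theoretical rate for the whole training run, which real hardware would not sustain, so actual training times would be longer.

```python
SECONDS_PER_DAY = 86_400


def days_to_train(total_ops: float, ops_per_second: float) -> float:
    """Days needed to perform total_ops at a constant rate (assumes 100% utilization)."""
    return total_ops / ops_per_second / SECONDS_PER_DAY


# How long would a 10^26-operation training run take at various cluster speeds?
for rate in (1e20, 1e19, 1e18):
    print(f"{rate:.0e} ops/s -> {days_to_train(1e26, rate):,.1f} days")
```

Under that idealized assumption, a cluster right at the 10^20 threshold covers 10^26 operations in about 11.6 days, a cluster ten times smaller takes about 115.7 days, and one a hundred times smaller takes a bit over three years.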

4.3

Summary:
- They want to protect against AI cyber-security attacks. Mostly this entails government agencies issuing reports.
  - a – Some actions aimed at protecting “critical infrastructure” (whatever that means).
    - Heads of major agencies need to provide an annual report to the Secretary of Homeland security on potential ways that AIs open vulnerabilities to critical infrastructure in their purview.
    - “…The Secretary of the Treasury shall issue a public report on best practices for financial institutions to manage AI-specific cybersecurity risks.”
    - Government orgs will incorporate some new guidelines.
    - The secretary of homeland security will work with government agencies to mandate guidelines.
    - Homeland security will make an advisory committee to “provide to the Secretary of Homeland Security and the Federal Government’s critical infrastructure community advice, information, or recommendations for improving security, resilience, and incident response related to AI usage in critical infrastructure.”
  - b – Using AI to improve cybersecurity
    - One piece that is interesting in that: “the Secretary of Defense and the Secretary of Homeland Security shall…each develop plans for, conduct, and complete an operational pilot project to identify, develop, test, evaluate, and deploy AI capabilities, such as large-language models, to aid in the discovery and remediation of vulnerabilities in critical United States Government software, systems, and networks”, and then report on their results
Commentary
- This is mostly about issuing reports, and guidelines. I have little idea if any of that is real or if this is just an expansion of lost-purpose bureaucracy. My guess is that there will be few people in the systems that have inside views that allow them to write good guidelines for their domains of responsibility regarding AI, and mostly these reports will be epistemically conservative and defensible, with a lot of “X is possibly a risk” where the authors have large uncertainty about how large the risk is.
- Trying to use AI to improve cyber security sure is interesting. I hope that they can pull that off. It seems like one of the things that ~ needs to happen for the world to end up in a good equilibrium is for computer security to get a lot better. Otherwise anyone developing a powerful model will have the weights stolen, and there’s a really vulnerable vector of attack for not-even-very-capable AI systems. I think the best hope for that is using our AI systems to shore up computer security defense, and hope that at higher-than-human levels of competence, cyber warfare is not so offense-dominant. (As an example, someone suggested maybe using AI to write a secure successor to C, and then using AI to “swap out” the lower layers of our computing stacks with that more secure low level language.)
  - Could that possibly happen in government? I generally expect that private companies would be way more competent at this kind of technical research, but maybe the NSA is a notable and important exception? If they’re able to stay ten years ahead in cryptography, maybe they can stay 10 years ahead in AI cyberdefense.
    - This raises the question, what advantage allows the NSA to stay 10 years ahead? I assume that it is a combination of being able to recruit top talent, and that there are things that they are allowed to do that would be illegal for anyone else. But I don’t actually know if that’s true.

4.4 – For reducing AI-mediated CHEMICAL, BIOLOGICAL, RADIOLOGICAL, AND NUCLEAR threats, focusing on biological weapons in particular.

Summary:
- a
  - The Secretary of Homeland Security (with help from other executive departments) will “evaluate” the potential of AI to both increase and to defend against these threats. This entails talking with experts and then submitting a report to the president.
  - In particular, it orders the Secretary of Defense (with the help of some other governmental agencies) to conduct a study that “assesses the ways in which AI can increase biosecurity risks, including risks from generative AI models trained on biological data, and makes recommendations on how to mitigate these risks”, evaluates the risks associated with the biology datasets used to train such systems, and assesses ways to use AI to reduce biosecurity risks.
- b – Specifically to reduce risks from synthetic DNA and RNA.
  - The Office of Science and Technology Policy (with the help of other executive departments) is going to develop a “framework” for synthetic DNA/RNA companies to “implement procurement and screening mechanisms”. This entails developing “criteria and mechanisms” for identifying dangerous nucleotide sequences, and establishing a mechanism for doing at-scale screening of synthetic nucleotides.
  - Once such a framework is in place, all (government?) funding agencies that fund life science research will make compliance with that framework a condition of funding.
  - All of this, once set up, needs to be evaluated and stress tested, and then a report sent to the relevant agencies.
Commentary:
- The part about setting up a framework for mandatory screening of nucleotide sequences, seems non-fake. Or at least it is doing more than commissioning assessments and reports.
  - And it seems like a great idea to me! Even aside from AI concerns, my understanding is that the manufacture of synthetic DNA is one major vector of biorisk. If you can effectively identify dangerous nucleotide sequences (and that is the part that seems most suspicious to me), this is one of the few obvious places to enforce strong legal requirements. These are not (yet) legal requirements, but making this a condition of funding seems like a great step.

4.5

Summary
- Aims to increase the general ability for identifying AI generated content, and mark all Federal AI generated content as such.
- a
  - The Secretary of Commerce will produce a report on the current and likely-future methods for authenticating non-AI content, identifying AI content, watermarking AI content, and preventing AI systems from “producing child sexual abuse material or producing non-consensual intimate imagery of real individuals (to include intimate digital depictions of the body or body parts of an identifiable individual)”
- b
  - Using that report, the Secretary of Commerce will develop guidelines for detecting and authenticating AI content.
- c
  - Those guidelines will be issued to relevant federal agencies
- d
  - Possibly those guidelines will be folded into the Federal Acquisition Regulation (whatever that is)
Commentary
- Seems generally good to be able to distinguish between AI generated material and non-AI generated material. I’m not sure if this process will turn up anything real that meaningfully impacts anyone’s experience of communications from the government.

4.6

Summary
- The Secretary of Commerce is responsible for running a “consultation process on potential risks, benefits, other implications” of open source foundation models, and then for submitting a report to the president on the results.
Commentary
- More assessments and reports.
- This does tell me that someone in the executive department has gotten the memo that open source models mean that it is easy to remove the safeguards that companies try to put in them.

4.7

Summary
- Some stuff about federal data that might be used to train AI Systems. It seems like they want to restrict the data that might enable CBRN weapons or cyberattacks, but otherwise make the data public?
Commentary
- I think I don’t care very much about this?

4.8

Summary
- This orders a National Security Memorandum on AI to be submitted to the president. This memorandum is supposed to “provide guidance to the Department of Defense, other relevant agencies”
Commentary:
- I don’t think that I care about this?

Section 5 – Promoting Innovation and Competition.

5.1 – Attracting AI Talent to the United States.

Summary
- This looks like a bunch of stuff to make it easier for foreign workers with AI relevant expertise to get visas, and to otherwise make it easy for them to come to, live in, work in, and stay in, the US.
Commentary
- I don’t know the sign of this.
- Do we want AI talent to be concentrated in one country?
  - On the one hand, that seems like it accelerates timelines some, especially if there are 99.9th-percentile, top-tier AI researchers that wouldn’t otherwise be able to get visas, but who can now work at OpenAI. (It would surprise me if this is the case? Those people should all be able to get O-1 visas, right?)
  - On the other hand, the more AI talent is concentrated in one country, the smaller the jurisdiction that a regulatory regime needs to cover in order to slow down AI. If enough of the AI talent is in the US, regulations that slow down AI development only in the US have a substantial impact, at least in the short term, before that talent moves, but maybe also in the long term, if researchers care more about continuing to live in the US than they do about making cutting-edge AI progress.

5.2

Summary
- a –
  - The director of the NSF will do a bunch of things to spur AI research.
    - …“launch a pilot program implementing the National AI Research Resource (NAIRR)”. This is evidently something that is intended to boost AI research, but I’m not clear on what it is or what it does.
    - …“fund and launch at least one NSF Regional Innovation Engine that prioritizes AI-related work, such as AI-related research, societal, or workforce needs.”
    - …“establish at least four new National AI Research Institutes, in addition to the 25 currently funded as of the date of this order.”
- b –
  - The Secretary of Energy will make a pilot program for training AI scientists.
- c –
  - The Under Secretary of Commerce for Intellectual Property and Director of the United States Patent and Trademark Office will sort out how generative AI should impact patents, and issue guidance. There will be some similar stuff for copyright.
- d –
  - Secretary of Homeland Security “shall develop a training, analysis, and evaluation program to mitigate AI-related IP risks”
- e –
  - The HHS will prioritize grant-making to AI initiatives.
- f –
  - Something for the veterans.
- g –
  - Something for climate change
Commentary
- Again. I don’t know how fake this is. My guess is not that fake? There will be a bunch of funding for AI stuff, from the public sector, in the next two years.
- Most of this seems like random political stuff.

5.3 – Promoting Competition.

Summary
- a –
  - The heads of various departments are supposed to promote competition in AI, including in the inputs to AI (NVIDIA)?
- b
  - The Secretary of Commerce is going to incentivize competition in the semiconductor industry, via a bunch of methods including
    - “implementing a flexible membership structure for the National Semiconductor Technology Center that attracts all parts of the semiconductor and microelectronics ecosystem”
    - mentorship programs
    - Increasing the resources available to startups (including datasets)
    - Increasing the funding to R&D for semiconductors
- c – The Administrator of the Small Business Administration will support small businesses innovating and commercializing AI
- d
Commentary
- This is a lot of stuff. I don’t know that any of it will really impact how many major players there are at the frontier of AI in 2 years.
- My guess is probably not much. I don’t think the government knows how to create NVIDIAs or OpenAIs.
- What the government can do is break up monopolies, but they’re not doing that here.

My high level takeaways

Mostly, this executive order doesn’t seem to push for much object-level action. Mostly it orders a bunch of assessments to be done, and reports on those assessments to be written, and then passed up to the president.

My best guess is that this is basically an improvement?

I expect something like the following to happen:

The relevant department heads talk with a bunch of experts.
They write up very epistemically conservative reports in which they say “we’re pretty sure that our current models in early 2024 can’t help with making bioweapons, but we don’t know (and can’t really know) what capabilities future systems will have, and therefore can’t really know what risk they’ll pose.”
The sitting president will then be weighing those unknown levels of national security risks against obvious economic gains and competition with China.

In general, this executive order means that the Executive branch is paying attention. That seems, for now, pretty good.

(Though I do remember how excited and optimistic people in the rationality community were in 2015 about Elon Musk “paying attention”, and that ended with him founding OpenAI, which many of those folks consider to be the worst thing that anyone had ever done to date. FTX looked like a huge success worthy of pride, until it turned out that it was a damaging and unethical fraud. I’ve become much more circumspect about which things are wins, especially wins of the form “powerful people are paying attention”.)

My guess is that this comment would be much more readable with the central chunk of it in a google doc, or failing that a few levels fewer of indented bullets.

e.g. Take this section.

Thoughts:
Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent.
What do I think about this overall?
I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOPS is about 1-2 orders of magnitude larger than our current biggest models.
The interest in red-teaming is promising, but again it depends on the implementation details.
I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?

I find it much more readable as the following prose rather than 5 levels of bullets. Less metacognition tracking the depth.

Thoughts
Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent.
What do I think about this overall?
I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOPS is about 1-2 orders of magnitude larger than our current biggest models.
The interest in red-teaming is promising, but again it depends on the implementation details.
I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?

Possibly. I wrote this as personal notes, originally, in full nested list format. Then I spent 20 minutes removing some of the nested-list-ness in WordPress, which was very frustrating. I would definitely have organized it better if WordPress was less frustrating.

I did make a google doc format. Maybe the main lesson is that I should have edited it there.

The actual text of the order is 70 pages long and very hard to navigate. At the request of some DC friends, I made a tool for navigating the text that adds:

Sidebar for navigation
Tooltips for definitions defined in section 3
Deep linking to any section/sentence of the text

Hope it's useful for some of you here! https://www.aijobstracker.com/ai-executive-order

My overall takeaway is all these things are generally good, though insufficient to actually address many X-risk-related concerns.

All of the standards assume we can wait until after a very powerful system is trained to evaluate it, which by my current understanding would not address risks of deception.

Frankly, this is more than I would have expected the White House to do, and thus I think it is a positive update on likely future actions.

Yeah, I think the reference class for me here is other things the executive branch might have done, which leads me to "wow, this was way more than I expected".

Worth noting is that they at least are trying to address deception by including it in the full bill readout. The types of model they hope to regulate here include those that permit "the evasion of human control or oversight through means of deception or obfuscation". The director of the OMB also has to come up with tests and safeguards for "discriminatory, misleading, inflammatory, unsafe, or deceptive outputs".

Wow, that's actually great!

this is crazy, perhaps the most sweeping action taken by government on AI yet.

Seems like too much consulting jargon and "we know it when we see it" vibes, with few concrete bright-lines. Maybe a lot hinges on enforcement of the dual-use foundation model policy... any chance developers can game the system to avoid qualifying as a dual-use model? Watermarking synthetic content does appear on its face a widely-applicable and helpful requirement.

My general impression is for these sorts of things, vagueness is generally positive, since it gives the executive and individual actors who want to make a name for themselves more leeway, and makes companies less able to wriggle out on technicalities. Contrast with vague RSPs, for which the value of vagueness is in the opposite direction.

But of course this is an executive order, so if enough companies aren’t subject to it based on technicalities, it could easily be changed and re-issued. I don’t know how common this is though.

Garrett responded to the main thrust well, but I will say that watermarking synthetic media seems fairly good as a next step for combating misinformation from AI imo. It's certainly widely applicable (not really even sure what the thrust of this distinction was) because it is meant to apply to nearly all synthetic content. Why exactly do you think it won't be helpful?

I agree, I was trying to highlight it as one of the most specific, useful policies from the EO. Understand the confusion given my comment was skeptical overall.

UK’s proposal for a joint safety institute seems maybe more notable:

Sunak will use the second day of Britain's upcoming two-day AI summit to gather “like-minded countries” and executives from the leading AI companies to set out a roadmap for an AI Safety Institute, according to five people familiar with the government’s plans.

The body would assist governments in evaluating national security risks associated with frontier models, which are the most advanced forms of the technology.

The idea is that the institute could emerge from what is now the United Kingdom’s government’s Frontier AI Taskforce, which is currently in talks with major AI companies Anthropic, DeepMind and OpenAI to gain access to their models. An Anthropic spokesperson said the company is still working out the details of access, but that it is “in discussions about providing API access.”

https://www.politico.eu/article/uk-pitch-ai-safety-institute-rishi-sunak/

This was the press release; the actual order has now been published.

One safety-relevant part:

4.2. Ensuring Safe and Reliable AI. (a) Within 90 days of the date of this order, to ensure and verify the continuous availability of safe, reliable, and effective AI in accordance with the Defense Production Act, as amended, 50 U.S.C. 4501 et seq., including for the national defense and the protection of critical infrastructure, the Secretary of Commerce shall require:
   (i) Companies developing or demonstrating an intent to develop potential dual-use foundation models to provide the Federal Government, on an ongoing basis, with information, reports, or records regarding the following:
      (A) any ongoing or planned activities related to training, developing, or producing dual-use foundation models, including the physical and cybersecurity protections taken to assure the integrity of that training process against sophisticated threats;
      (B) the ownership and possession of the model weights of any dual-use foundation models, and the physical and cybersecurity measures taken to protect those model weights; and
      (C) the results of any developed dual-use foundation model’s performance in relevant AI red-team testing based on guidance developed by NIST pursuant to subsection 4.1(a)(ii) of this section, and a description of any associated measures the company has taken to meet safety objectives, such as mitigations to improve performance on these red-team tests and strengthen overall model security. Prior to the development of guidance on red-team testing standards by NIST pursuant to subsection 4.1(a)(ii) of this section, this description shall include the results of any red-team testing that the company has conducted relating to lowering the barrier to entry for the development, acquisition, and use of biological weapons by non-state actors; the discovery of software vulnerabilities and development of associated exploits; the use of software or tools to influence real or virtual events; the possibility for self-replication or propagation; and associated measures to meet safety objectives; and
   (ii) Companies, individuals, or other organizations or entities that acquire, develop, or possess a potential large-scale computing cluster to report any such acquisition, development, or possession, including the existence and location of these clusters and the amount of total computing power available in each cluster.
(b) The Secretary of Commerce, in consultation with the Secretary of State, the Secretary of Defense, the Secretary of Energy, and the Director of National Intelligence, shall define, and thereafter update as needed on a regular basis, the set of technical conditions for models and computing clusters that would be subject to the reporting requirements of subsection 4.2(a) of this section. Until such technical conditions are defined, the Secretary shall require compliance with these reporting requirements for:
   (i) any model that was trained using a quantity of computing power greater than 10²⁶ integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10²³ integer or floating-point operations; and
   (ii) any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10²⁰ integer or floating-point operations per second for training AI.
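The interim thresholds in 4.2(b) can be read as two simple checks. Below is a minimal Python sketch of that reading; the function and parameter names are made up for illustration, and treating the cluster capacity figure as "at least 10^20" is an assumption, since the order's wording does not say whether the bound is inclusive.

```python
# Interim reporting thresholds quoted from section 4.2(b) of the order.
GENERAL_MODEL_OPS = 1e26      # total training operations, 4.2(b)(i)
BIO_MODEL_OPS = 1e23          # total training operations, primarily biological sequence data
CLUSTER_NETWORK_GBIT_S = 100  # data center networking, 4.2(b)(ii)
CLUSTER_PEAK_OPS_PER_S = 1e20 # theoretical maximum capacity, 4.2(b)(ii)


def model_must_report(training_ops: float, primarily_bio_data: bool = False) -> bool:
    """True if a model exceeds the interim compute threshold (hypothetical helper)."""
    if training_ops > GENERAL_MODEL_OPS:
        return True
    return primarily_bio_data and training_ops > BIO_MODEL_OPS


def cluster_must_report(co_located: bool, network_gbit_s: float, peak_ops_per_s: float) -> bool:
    """True only if all three cluster conditions hold together (hypothetical helper).

    Assumption: the capacity bound is treated as inclusive (>=).
    """
    return (
        co_located
        and network_gbit_s > CLUSTER_NETWORK_GBIT_S
        and peak_ops_per_s >= CLUSTER_PEAK_OPS_PER_S
    )
```

Note that the cluster conditions are conjunctive: a fast network alone, or a large but geographically distributed cluster, would not trigger reporting under this reading.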

Is there a definition of "dual-use foundation model" anywhere in the text?

(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;

(ii) enabling powerful offensive cyber operations through automated vulnerability discovery and exploitation against a wide range of potential targets of cyber attacks; or

(iii) permitting the evasion of human control or oversight through means of deception or obfuscation.

Models meet this definition even if they are provided to end users with technical safeguards that attempt to prevent users from taking advantage of the relevant unsafe capabilities.

(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;

Wouldn't this include most, if not all, uncensored LLMs?

And thus any person/organization working on them?

Do you know who would be deciding on orders like this one? Some specialized department in the USG, whatever judge that happens to hear the case, or something else?

Why do you believe it's, on the whole, a 'mistake' instead of beneficial?

I can think of numerous benefits, especially in the long term.

e.g. drawing the serious attention of decision makers who might have otherwise believed it to be a bunch of hooey, and ignored the whole topic.

e.g. discouraging certain groups from trying to 'win' in a geopolitical contest, by rushing to create a 'super'-GPT, as they now know their margin of advantage is not so large anymore.

Edit: I do think that there is some future line, across which AI academic publishing would be unequivocally bad. I also think slowing down AI progress in general would be a good thing.

Edit: I do think that there is some future line, across which AI academic publishing would be unequivocally bad. I also think slowing down AI progress in general would be a good thing.

Okay, I guess my question still applies?

For example, it might be that letting it progress without restriction has more upsides than slowing it down.

Yes, it's clear these are your views. Why do you believe so?

I think... maybe I see the world and humanity's existence on it, as a more fragile state of affairs than other people do. I wish I could answer you more thoroughly.

https://www.lesswrong.com/posts/uPi2YppTEnzKG3nXD/nathan-helm-burger-s-shortform?commentId=qmrrKminnwh75mpn5

Not sure, but maybe the new AI institute they're setting up as a result

Calling mundane risk "near term" sneaks in the implication that extinction risk isn't.

What alternative would you propose? I don't really like mundane risk but agree that an alternative would be better. For now I'll just change to "non-existential risk actions"

This made me wonder about a few things:

How responsible is CSET for this? CSET is the most highly funded longtermist-ish org, as far as I can tell from checking openbook.fyi (I could be wrong), so I've been trying to understand them better, since I don't hear much about them on LW or the EA Forum. I suspected they were having a lot of impact "behind the scenes" (from my perspective), and maybe this is a reflection of that?
Aaron Bergman said on Twitter that for him, "the ex ante probability of something at least this good by the US federal government relative to AI progress, from the perspective of 5 years ago was ~1%[.] Ie this seems 99th-percentile-in-2018 good to me", and many people seemed to agree. Stefan Schubert then said that "if people think the policy response is "99th-percentile-in-2018", then that suggests their models have been seriously wrong." I was wondering, do people here agree with Aaron that this EO appeared unlikely back then, and, if so, what do you think the correct takeaway from the existence of this EO is?

Below, I've segmented by x-risk and non-x-risk related proposals, excluding the proposals that are geared towards promoting its use and focusing solely on those aimed at risk.

It wouldn't tell you the type of stuff the Biden administration is pushing, but it would tell you the ratio which is what you seem perhaps most concerned with.

[Edit] this is included now

I spent a few hours reading, and parsing out, sections 4 and 5 of the recent White House Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.

The following are my rough notes on each subsection in those two sections, summarizing what I understand each to mean, and my personal thoughts.

My high level thoughts are at the bottom.

Section by section

Section 4 – Ensuring the Safety and Security of AI Technology.

4.1

Summary:
- The secretary of commerce and NIST are going to develop guidelines and best practices for AI systems.
- In particular:
  - “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
    - What does this literally mean? Does this allocate funding towards research to develop these benchmarks? What will concretely happen in the world as a result of this initiative?
- It also calls for the establishment of guidelines for conducting red-teaming.
  - [[quote]]
    - (ii) Establish appropriate guidelines (except for AI used as a component of a national security system), including appropriate procedures and processes, to enable developers of AI, especially of dual-use foundation models, to conduct AI red-teaming tests to enable deployment of safe, secure, and trustworthy systems. These efforts shall include:
      - (A) coordinating or developing guidelines related to assessing and managing the safety, security, and trustworthiness of dual-use foundation models; and
      - (B) in coordination with the Secretary of Energy and the Director of the National Science Foundation (NSF), developing and helping to ensure the availability of testing environments, such as testbeds, to support the development of safe, secure, and trustworthy AI technologies, as well as to support the design, development, and deployment of associated PETs, consistent with section 9(b) of this order.
Commentary:
- I imagine that these standards and guidelines are going to be mostly fake.
- Are there real guidelines somewhere in the world? What process leads to real guidelines?

4.2

Summary:
- a
  - Anyone who has trained, or wants to train, a foundation model needs to
    - Report their training plans and safeguards.
    - Report who has access to the model weights, and the cybersecurity protecting them.
    - Report the results of red-teaming on those models, and what they did to meet the safety bars.
  - Anyone with a big enough computing cluster needs to report that they have it.
- b
  - The Secretary of Commerce (and some associated agencies) will make (and continually update) some standards for models and computer clusters that are subject to the above reporting requirements. But for the time being, the requirements apply to:
    - Any models that were trained with more than 10^26 flops
    - Any models that are trained primarily on biology data and trained using greater than 10^23 flops
    - Any computing cluster that is co-located in a single datacenter, connected by networking of more than 100 gigabits per second, and able to train an AI at 10^20 flops per second
- c
  - I don’t know what this subsection is about. Something about cybersecurity protection for “United States Infrastructure as a Service” products.
  - This includes some tracking of when foreigners want to use US AI systems in ways that might pose a cyber-security risk, using standards identical to the ones laid out above.
- d
  - More stuff about IaaS, and verifying the identity of foreigners.
Thoughts:
- Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent.
- What do I think about this overall?
  - I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
  - The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOPS is about 1-2 orders of magnitude larger than our current biggest models.
  - The interest in red-teaming is promising, but again it depends on the implementation details.
    - I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
      - What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?
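
A rough check on the training-time arithmetic in the first bullet above, as a Python sketch. It assumes the cluster sustains its theoretical peak throughput for the whole run, which real training runs do not achieve, so actual times would be longer. The function name is just illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

MODEL_THRESHOLD_OPS = 1e26          # total training operations, from 4.2(b)(i)
CLUSTER_THRESHOLD_OPS_PER_S = 1e20  # operations per second, from 4.2(b)(ii)


def days_to_train(total_ops: float, ops_per_second: float) -> float:
    """Days needed to perform total_ops at a sustained rate of ops_per_second."""
    return total_ops / ops_per_second / SECONDS_PER_DAY


# A cluster exactly at the reporting threshold.
at_threshold = days_to_train(MODEL_THRESHOLD_OPS, CLUSTER_THRESHOLD_OPS_PER_S)
# A cluster ten times smaller, which would not need to be reported.
below = days_to_train(MODEL_THRESHOLD_OPS, 1e19)

print(f"at 1e20 ops/s: {at_threshold:.1f} days")
print(f"at 1e19 ops/s: {below:.1f} days")
```

Under that idealized assumption, a threshold-sized cluster reaches 10^26 operations in under two weeks, and a cluster an order of magnitude below the reporting line gets there in roughly 116 days, which is the gap the bullet is pointing at.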

4.3

Summary:
- They want to protect against AI cyber-security attacks. Mostly this entails government agencies issuing reports.
  - a – Some actions aimed at protecting “critical infrastructure” (whatever that means).
    - Heads of major agencies need to provide an annual report to the Secretary of Homeland Security on potential ways that AIs open up vulnerabilities in critical infrastructure in their purview.
    - “…The Secretary of the Treasury shall issue a public report on best practices for financial institutions to manage AI-specific cybersecurity risks.”
    - Government orgs will incorporate some new guidelines.
    - The secretary of homeland security will work with government agencies to mandate guidelines.
    - Homeland security will make an advisory committee to “provide to the Secretary of Homeland Security and the Federal Government’s critical infrastructure community advice, information, or recommendations for improving security, resilience, and incident response related to AI usage in critical infrastructure.”
  - b – Using AI to improve cybersecurity
    - One piece that is interesting in that: “the Secretary of Defense and the Secretary of Homeland Security shall…each develop plans for, conduct, and complete an operational pilot project to identify, develop, test, evaluate, and deploy AI capabilities, such as large-language models, to aid in the discovery and remediation of vulnerabilities in critical United States Government software, systems, and networks”, and then report on their results
Commentary
- This is mostly about issuing reports, and guidelines. I have little idea if any of that is real or if this is just an expansion of lost-purpose bureaucracy. My guess is that there will be few people in the systems that have inside views that allow them to write good guidelines for their domains of responsibility regarding AI, and mostly these reports will be epistemically conservative and defensible, with a lot of “X is possibly a risk” where the authors have large uncertainty about how large the risk is.
- Trying to use AI to improve cyber security sure is interesting. I hope that they can pull that off. It seems like one of the things that ~ needs to happen for the world to end up in a good equilibrium is for computer security to get a lot better. Otherwise anyone developing a powerful model will have the weights stolen, and there’s a really vulnerable vector of attack for not-even-very-capable AI systems. I think the best hope for that is using our AI systems to shore up computer security defense, and hope that at higher-than-human levels of competence, cyber warfare is not so offense-dominant. (As an example, someone suggested maybe using AI to write a secure successor to C, and then using AI to “swap out” the lower layers of our computing stacks with that more secure low level language.)
  - Could that possibly happen in government? I generally expect that private companies would be way more competent at this kind of technical research, but maybe the NSA is a notable and important exception? If they’re able to stay ten years ahead in cryptography, maybe they can stay 10 years ahead in AI cyberdefense.
    - This raises the question, what advantage allows the NSA to stay 10 years ahead? I assume that it is a combination of being able to recruit top talent, and that there are things that they are allowed to do that would be illegal for anyone else. But I don’t actually know if that’s true.

4.4 – For reducing AI-mediated CHEMICAL, BIOLOGICAL, RADIOLOGICAL, AND NUCLEAR threats, focusing on biological weapons in particular.

Summary:
- a
  - The Secretary of Homeland Security (with help from other executive departments) will “evaluate” the potential of AI to both increase and to defend against these threats. This entails talking with experts and then submitting a report to the president.
  - In particular, it orders the Secretary of Defense (with the help of some other governmental agencies) to conduct a study that “assesses the ways in which AI can increase biosecurity risks, including risks from generative AI models trained on biological data, and makes recommendations on how to mitigate these risks”, evaluates the risks associated with the biology datasets used to train such systems, and assesses ways to use AI to reduce biosecurity risks.
- b – Specifically to reduce risks from synthetic DNA and RNA.
  - The Office of Science and Technology Policy (with the help of other executive departments) is going to develop a “framework” for synthetic DNA/RNA companies to “implement procurement and screening mechanisms”. This entails developing “criteria and mechanisms” for identifying dangerous nucleotide sequences, and establishing a mechanism for doing at-scale screening of synthetic nucleotides.
  - Once such a framework is in place, all (government?) funding agencies that fund life science research will make compliance with that framework a condition of funding.
  - All of this, once set up, needs to be evaluated and stress tested, and then a report sent to the relevant agencies.
Commentary:
- The part about setting up a framework for mandatory screening of nucleotide sequences, seems non-fake. Or at least it is doing more than commissioning assessments and reports.
  - And it seems like a great idea to me! Even aside from AI concerns, my understanding is that the manufacture of synthetic DNA is one major vector of biorisk. If you can effectively identify dangerous nucleotide sequences (and that is the part that seems most suspicious to me), this is one of the few obvious places to enforce strong legal requirements. These are not (yet) legal requirements, but making this a condition of funding seems like a great step.

4.5

Summary
- Aims to increase the general ability for identifying AI generated content, and mark all Federal AI generated content as such.
- a
  - The Secretary of Commerce will produce a report on the current and likely-future methods for authenticating non-AI content, identifying AI content, watermarking AI content, and preventing AI systems from “producing child sexual abuse material or producing non-consensual intimate imagery of real individuals (to include intimate digital depictions of the body or body parts of an identifiable individual)”.
- b
  - Using that report, the Secretary of Commerce will develop guidelines for detecting and authenticating AI content.
- c
  - Those guidelines will be issued to relevant federal agencies
- d
  - Possibly those guidelines will be folded into the Federal Acquisition Regulation (whatever that is)
Commentary
- Seems generally good to be able to distinguish between AI generated material and non-AI generated material. I’m not sure if this process will turn up anything real that meaningfully impacts anyone’s experience of communications from the government.

4.6

Summary
- The Secretary of Commerce is responsible for running a “consultation process on potential risks, benefits, other implications” of open source foundation models, and then for submitting a report to the president on the results.
Commentary
- More assessments and reports.
- This does tell me that someone in the executive department has gotten the memo that open source models mean that it is easy to remove the safeguards that companies try to put in them.

4.7

Summary
- Some stuff about federal data that might be used to train AI Systems. It seems like they want to restrict the data that might enable CBRN weapons or cyberattacks, but otherwise make the data public?
Commentary
- I think I don’t care very much about this?

4.8

Summary
- This orders a National Security Memorandum on AI to be submitted to the president. This memorandum is supposed to “provide guidance to the Department of Defense, other relevant agencies”
Commentary:
- I don’t think that I care about this?

Section 5 – Promoting Innovation and Competition.

5.1 – Attracting AI Talent to the United States.

Summary
- This looks like a bunch of stuff to make it easier for foreign workers with AI relevant expertise to get visas, and to otherwise make it easy for them to come to, live in, work in, and stay in, the US.
Commentary
- I don’t know the sign of this.
- Do we want AI talent to be concentrated in one country?
  - On the one hand, that seems like it accelerates timelines some, especially if there are 99.9th-percentile AI researchers who wouldn’t otherwise be able to get visas, but who can now work at OpenAI. (It would surprise me if this is the case? Those people should all be able to get O-1 visas, right?)
  - On the other hand, the more AI talent is concentrated in one country, the smaller the jurisdiction a regulatory regime needs to cover in order to slow down AI. If enough of the AI talent is in the US, regulations that slow down AI development in the US alone have a substantial impact, at least in the short term, before that talent moves, but maybe also in the long term, if researchers care more about continuing to live in the US than they do about making cutting edge AI progress.

5.2

Summary
- a –
  - The director of the NSF will do a bunch of things to spur AI research.
    - …“launch a pilot program implementing the National AI Research Resource (NAIRR)”. This is evidently something that is intended to boost AI research, but I’m not clear on what it is or what it does.
    - …“fund and launch at least one NSF Regional Innovation Engine that prioritizes AI-related work, such as AI-related research, societal, or workforce needs.”
    - …“establish at least four new National AI Research Institutes, in addition to the 25 currently funded as of the date of this order.”
- b –
  - The Secretary of Energy will make a pilot program for training AI scientists.
- c –
  - Under Secretary of Commerce for Intellectual Property and Director of the United States Patent and Trademark Office will sort out how generative AI should impact patents, and issue guidance. There will be some similar stuff for copyright.
- d –
  - Secretary of Homeland Security “shall develop a training, analysis, and evaluation program to mitigate AI-related IP risks”
- e –
  - The HHS will prioritize grant-making to AI initiatives.
- f –
  - Something for the veterans.
- g –
  - Something for climate change
Commentary
- Again, I don’t know how fake this is. My guess is not that fake? There will be a bunch of funding for AI stuff, from the public sector, in the next two years.
- Most of this seems like random political stuff.

5.3 – Promoting Competition.

Summary
- a –
  - The heads of various departments are supposed to promote competition in AI, including in the inputs to AI (NVIDIA)?
- b
  - The Secretary of Commerce is going to incentivize competition in the semiconductor industry, via a bunch of methods including
    - “implementing a flexible membership structure for the National Semiconductor Technology Center that attracts all parts of the semiconductor and microelectronics ecosystem”
    - mentorship programs
    - Increasing the resources available to startups (including datasets)
    - Increasing the funding to R&D for semiconductors
- c – The Administrator of the Small Business Administration will support small businesses innovating and commercializing AI
- d
Commentary
- This is a lot of stuff. I don’t know that any of it will really impact how many major players there are at the frontier of AI in 2 years.
- My guess is probably not much. I don’t think the government knows how to create NVIDIAs or OpenAIs.
- What the government can do is break up monopolies, but they’re not doing that here.

My high level takeaways

My best guess is that this is basically an improvement?

I expect something like the following to happen:

The relevant department heads talk with a bunch of experts.
They write up very epistemically conservative reports in which they say “we’re pretty sure that our current models in early 2024 can’t help with making bioweapons, but we don’t know (and can’t really know) what capabilities future systems will have, and therefore can’t really know what risk they’ll pose.”
The sitting president will then be weighing those unknown levels of national security risks against obvious economic gains and competition with China.

In general, this executive order means that the Executive branch is paying attention. That seems, for now, pretty good.

My guess is that this comment would be much more readable with the central chunk of it in a google doc, or failing that a few levels fewer of indented bullets.

e.g. Take this section.

- Thoughts:
  - Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent.
  - What do I think about this overall?
    - I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
    - The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOPS is about 1-2 orders of magnitude larger than our current biggest models.
    - The interest in red-teaming is promising, but again it depends on the implementation details.
    - I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
      - What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?

I find it much more readable as the following prose rather than 5 levels of bullets. Less metacognition tracking the depth.

Thoughts
Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent.
What do I think about this overall?
I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOPS is about 1-2 orders of magnitude larger than our current biggest models.
The interest in red-teaming is promising, but again it depends on the implementation details.
I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?
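
For anyone who wants to check the arithmetic in the first point, here is a minimal sketch. It assumes the cluster sustains its stated throughput for the entire run (real training runs get well below theoretical peak, so actual times would be longer), and the function and constant names are my own, not anything from the order.

```python
SECONDS_PER_DAY = 86_400

# Thresholds as quoted in the comment above.
MODEL_THRESHOLD_FLOP = 1e26          # total operations used in training
CLUSTER_THRESHOLD_FLOP_PER_S = 1e20  # cluster throughput, operations per second


def training_days(total_flop: float, flop_per_second: float) -> float:
    """Days needed to perform total_flop operations at a constant rate.

    Assumes perfect, sustained utilization (an idealization).
    """
    return total_flop / flop_per_second / SECONDS_PER_DAY


# A cluster exactly at the reporting threshold.
days_at_threshold = training_days(MODEL_THRESHOLD_FLOP, CLUSTER_THRESHOLD_FLOP_PER_S)

# A cluster ten times smaller, which falls under the cluster threshold.
days_below_threshold = training_days(MODEL_THRESHOLD_FLOP, CLUSTER_THRESHOLD_FLOP_PER_S / 10)

print(f"At 1e20 flop/s: {days_at_threshold:.1f} days")
print(f"At 1e19 flop/s: {days_below_threshold:.1f} days")
```

Under that idealized assumption, a cluster at the 10^20 threshold reaches 10^26 in about a week and a half, and one ten times smaller gets there in roughly 116 days, which matches the "about 4 months" figure.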

The actual text of the order is 70 pages long and very hard to navigate. At the request of some DC friends, I made a tool for navigating the text that adds

Sidebar for navigation
Tooltips for terms defined in section 3
Deep linking to any section/sentence of the text

Hope it's useful for some of you here! https://www.aijobstracker.com/ai-executive-order

My overall takeaway is all these things are generally good, though insufficient to actually address many X-risk-related concerns.

All of the standards assume we can wait until after a very powerful system is trained to evaluate it, which by my current understanding would not address risks of deception.

Frankly, this is more than I would have expected the white house to do, and thus I think a positive update on likely future actions.

Yeah, I think the reference class for me here is other things the executive branch might have done, which leads me to "wow, this was way more than I expected".

Wow, that's actually great!

this is crazy, perhaps the most sweeping action taken by government on AI yet.

But of course this is an executive order, so if enough companies aren’t subject to it based on technicalities, it could easily be changed and re-issued. I don’t know how common this is though.

I agree, I was trying to highlight it as one of the most specific, useful policies from the EO. Understand the confusion given my comment was skeptical overall.

UK’s proposal for a joint safety institute seems maybe more notable:

Sunak will use the second day of Britain's upcoming two-day AI summit to gather “like-minded countries” and executives from the leading AI companies to set out a roadmap for an AI Safety Institute, according to five people familiar with the government’s plans.

The body would assist governments in evaluating national security risks associated with frontier models, which are the most advanced forms of the technology.

The idea is that the institute could emerge from what is now the United Kingdom’s government’s Frontier AI Taskforce, which is currently in talks with major AI companies Anthropic, DeepMind and OpenAI to gain access to their models. An Anthropic spokesperson said the company is still working out the details of access, but that it is “in discussions about providing API access.”

https://www.politico.eu/article/uk-pitch-ai-safety-institute-rishi-sunak/
