AI Governance Needs Technical Work

Mau

Summary and Introduction

People who want to improve the trajectory of AI sometimes think their options for object-level work are (i) technical work on AI alignment or (ii) non-technical work on AI governance. But there is a whole other category of options: technical work in AI governance. This is technical work that mainly boosts AI governance interventions, such as norms, regulations, laws, and international agreements that promote positive outcomes from AI. This piece provides a brief overview of some ways to do this work—what they are, why they might be valuable, and what you can do if you’re interested. I discuss:

Engineering technical levers to make AI coordination/regulation enforceable (through hardware engineering, software/ML engineering, or heat/electromagnetism-related engineering)
Information security: Developing and implementing systems and best practices for securing model weights and other AI technology
Forecasting AI development
Technical standards development
Grantmaking or management to get others to do the above well
Advising on the above
Other work

[Update] Additional categories which the original version of this piece (from 2022) under-emphasized or missed are:

AI control: Developing systems and best practices for overseeing and constraining AI systems that may not be trustworthy (example)
Model evaluations: Developing technical evaluations of the safety of AI systems (discussion, examples)
Forecasting hardware trends (examples)
Cooperative AI: Research in game theory, ML, and decision theory for designing AI systems in ways that avoid costly coordination failures (discussion, examples)

I expect one or more resources providing more comprehensive introductions to many of these topics to appear in early 2024. For now, see the above links to learn more about the topics added in the update, and see below for more discussion of the originally listed topics.

Acknowledgements

Thanks to Lennart Heim, Jamie Bernardi, Gabriel Mukobi, Girish Sastry, and others for their feedback on this post. Mistakes are my own.

Context

What I mean by “technical work in AI governance”

I’m talking about work that:

Is technical (e.g. hardware/ML engineering) or draws heavily on technical expertise; and
Contributes to AI’s trajectory mainly by improving the chances that AI governance interventions succeed^[1] (as opposed to by making progress on technical safety problems or building up the communities concerned with these problems).

Neglectedness

As of writing, there are (by one involved expert’s estimate) ~8-15 full-time equivalents doing this work with a focus on especially large-scale AI risks.^[2]

Personal fit

For a strong personal fit with this type of work, technical skills are of course useful (in ML or in other fields). Interest in the intersection of technical work and governance interventions will presumably also make this work more exciting for you.

Also, whatever it takes to make progress on mostly uncharted problems in a tiny sub-field^[3] is probably pretty important for this work now, since that’s the current nature of these fields. That might change in a few years. (But that doesn’t necessarily mean you should wait; time’s ticking, someone has to do this early-stage thinking, and maybe it could be you.)

What I’m not saying

I’m of course not saying this is the only or main type of work that’s needed. (Still, it does seem particularly promising for technically skilled people, especially under the debatable assumption that governance interventions tend to be more high-leverage than direct work on technical safety problems.)

Types of technical work in AI governance

Engineering technical levers to make AI coordination/regulation enforceable

To help ensure AI goes well, we may need good coordination and/or regulation.^[4] To bring about good coordination/regulation on AI, we need politically acceptable methods of enforcing them (i.e. catching and penalizing/stopping violators).^[5] And to design politically acceptable methods of enforcement, we need various kinds of engineers, as discussed in the next several sections.^[6]

Hardware engineering for enabling AI coordination/regulation

To help enforce AI coordination/regulation, it might be possible to create certain on-chip devices for AI-specialized chips or other devices at data centers. As a non-exhaustive list of speculative examples:

Devices on network switches that identify especially large training runs could be helpful.
- They could help enforce regulations that apply only to training runs above a certain size (which, among other benefits, seem much easier politically than trying to regulate all uses of compute).
If there were on-chip devices tracking the number of computations done on chips, that could help an agency monitor how much compute various data centers and organizations are using.
- That could help enforce regulations whose application depends on the amount of compute being used by an AI developer or data center (which, among other benefits, seems much easier politically than trying to regulate everyone who uses compute).
Dead man’s switches on AI hardware (or other hardware-enabled authorization requirements) could peacefully keep rogue organizations from harmful AI development or deployment (e.g. by interfering early on in a training run).

Part of the engineering challenge here is that, ideally (e.g. for political acceptability), we may want such devices to not only work but to also be (potentially among other desired features):

Secure;
Privacy-preserving;
Cheap;
Tamper-indicating; and
Tamper-proof.^[7]
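As a rough illustration of the "tamper-indicating" property only, here is a minimal software sketch of a hash-chained log of compute counter readings. Everything in it is an assumption made for illustration: the payload fields (`chip_id`, `flop`), the `GENESIS` value, and the function names are hypothetical, and a real on-chip mechanism would look very different. Note also that a hash chain only reveals edits if some other party holds (or signs) the latest hash; otherwise whoever controls the log could rebuild the whole chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)

def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()

def append_entry(log: list, payload: dict) -> None:
    """Append a reading, chaining it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "hash": entry_hash(prev, payload)})

def verify_log(log: list) -> bool:
    """Recompute every hash in order; any edited payload breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True

# Usage: two hypothetical counter readings from one chip.
log = []
append_entry(log, {"chip_id": "chip-0", "flop": 1e15})
append_entry(log, {"chip_id": "chip-0", "flop": 2e15})
print(verify_log(log))
```

This says nothing about the other desired features (privacy, cost, physical tamper resistance), which are where most of the engineering difficulty presumably lies.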

Software/ML engineering for enabling AI coordination/regulation

Software (especially ML) engineering could help enforce AI coordination/regulation in various ways^[8], including the following:

Methods/software for auditing ML models could help determine when and how regulations should be applied (e.g. it could help determine that some model may not be deployed yet because it has capabilities that current safety methods do not address) (see here for an example of such work);
ML applications to satellite imagery (visual and infrared) could help identify secret data centers;
Software (maybe ML) for analyzing hardware devices or perhaps video data could help detect efforts to tamper with the hardware devices discussed in the previous section; and
ML applications to open-source data or other types of data could help identify violations.

Heat/electromagnetism-related engineering for enabling AI coordination/regulation

For enforcing AI coordination/regulation against particularly motivated violators, it could be helpful to be able to identify hidden chips or data centers using their heat and electromagnetic signatures. People who know a lot about heat and electromagnetism could presumably help design equipment or methods that do this (e.g. mobile equipment usable at data centers, equipment that could be installed at data centers, methods for analyzing satellite data, and methods for analyzing data collected about a facility from a nearby road).

Part of the challenge here is that these methods should be robust to efforts to conceal heat and electromagnetic signatures.

Information security

Information security could matter for AI in various ways, including the following:

It would be bad if people steal unsafe ML models and deploy them. It would also be bad if AI developers rush to deploy their own models (e.g. with little testing or use of safety methods) because they are scared that, if they wait too long, someone else will steal their models and deploy them first. Sufficiently good information security at AI developers (including their external infrastructure) would mitigate these problems.
Information security in regulatory agencies might help enable coordination/regulations on AI to be enforced in a politically acceptable way; it could assure AI developers that their compliance will be verified without revealing sensitive information, while assuring a regulator that the data they are relying on is authentic.
- This could include the use of cryptographic techniques in the hardware devices, model evaluation software, and other equipment discussed above.
Information security in hardware companies could help keep the semiconductor supply chain concentrated in a small number of allied countries, which might help enable governance of this supply chain.

See here, here, and here (Sections 3.3 and 4.1), and listen here [podcast] for more information. As these sources suggest, information security overlaps with—but extends beyond—the engineering work mentioned above.
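To make the "assuring a regulator that the data they are relying on is authentic" idea a bit more concrete, here is a minimal sketch using a keyed hash (HMAC) from Python's standard library. The report fields, the key, and the function names are hypothetical. This is an assumption-laden toy: a real scheme would more plausibly use asymmetric signatures with keys held in hardware, and this sketch does nothing for the privacy half of the problem.

```python
import hashlib
import hmac
import json

def sign_report(report: dict, key: bytes) -> str:
    """Return an HMAC-SHA256 tag over a canonical JSON encoding of the report."""
    message = json.dumps(report, sort_keys=True).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_report(report: dict, tag: str, key: bytes) -> bool:
    """Check the tag in constant time; False if the report or tag was altered."""
    expected = sign_report(report, key)
    return hmac.compare_digest(expected, tag)

# Usage: a hypothetical device signs a usage report; the verifier checks it.
shared_key = b"example-key-for-illustration-only"
report = {"site": "datacenter-A", "flop_used": 3.2e21}
tag = sign_report(report, shared_key)
print(verify_report(report, tag, shared_key))
```

A shared secret key is the simplest option to show, but it means the verifier could forge reports too; that is one reason an asymmetric scheme may be preferable in practice.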

Forecasting AI development

AI forecasters answer questions about what AI capabilities are likely to emerge when. This can be helpful in several ways, including:

Helping AI governance researchers account for ways in which near-term advances in AI will change the strategic landscape (e.g. through the introduction of new tools or new threats, or through raising how much attention various actors are paying to AI);
Helping determine the urgency and acceptable timelines for various kinds of work; and
Helping set parameters for (coordinated) AI regulations (e.g. if some regulation would only apply to models trained with at least some amount of compute, precisely how many FLOPs should be treated as highly risky? What are the cost penalties of decentralized training, which might change what regulators need to look for at each data center?)

Typically, this work isn’t engineering or classic technical research; it often involves measuring and extrapolating AI trends, and sometimes it is more conceptual/theoretical. Still, familiarity with relevant software or hardware often seems helpful for knowing what trends to look for and how to find relevant data (e.g. “How much compute was used to train recent state-of-the-art models?”), as well as for being able to assess and make arguments on relevant conceptual questions (e.g. “How analogous is gradient descent to natural selection?”).
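As a small worked example of the kind of measurement involved, the sketch below estimates training compute with the commonly used approximation C ≈ 6·N·D for dense transformers (N parameters, D training tokens), compares it to a compute threshold, and computes the doubling time implied by two data points. The approximation is a rough heuristic that is not taken from this post, the threshold value is a placeholder rather than a recommendation, and the example model size and token count are hypothetical.

```python
import math

# Hypothetical regulatory threshold in FLOP: a placeholder for illustration,
# not a recommendation and not a figure from this post.
THRESHOLD_FLOP = 1e26

def estimate_training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate training compute for a dense transformer as 6 * N * D."""
    return 6.0 * n_params * n_tokens

def exceeds_threshold(flop: float, threshold: float = THRESHOLD_FLOP) -> bool:
    """True if the estimated compute is at or above the threshold."""
    return flop >= threshold

def doubling_time_years(flop_early: float, flop_late: float,
                        years_apart: float) -> float:
    """Doubling time implied by two data points, assuming exponential growth."""
    return years_apart * math.log(2) / math.log(flop_late / flop_early)

# Usage: a hypothetical 70B-parameter model trained on 1.4T tokens.
flop = estimate_training_flop(n_params=70e9, n_tokens=1.4e12)
print(f"{flop:.2e}", exceeds_threshold(flop))
```

Actual forecasting work would fit trends to many data points with uncertainty estimates; two points are used here only to keep the arithmetic visible.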

See here (Section I) and here^[9] for some collections of relevant research questions; see [1], [2], [3], [4], and [5] for some examples of AI forecasting work; and listen here [podcast] for more discussion.

Technical standards development

One AI risk scenario is that good AI safety methods will be discovered, but they won’t be implemented widely enough to prevent bad outcomes.^[10] Translating AI safety work into technical standards (which regulations can then reference, as is often done) might help with this. Relatedly, standard-setting could be a way for AI companies to set guardrails on their AI competition without violating antitrust laws.

Technical expertise (specifically, in AI safety) could help standards developers (i) identify safety methods that it would be valuable to standardize, and (ii) translate safety methods into safety standards (e.g. by precisely specifying them in widely applicable ways, or designing testing and evaluation suites for use by standards^[11]).

Additionally, strengthened cybersecurity standards for AI companies, AI hardware companies, and other companies who process their data could help address some of the information security issues mentioned above.

See here for more information.

Grantmaking or management to get others to do the above well

Instead of doing the above kinds of work yourself, you might be able to use your technical expertise to (as a grantmaker or manager) organize others to do such work. Some of the problems here appear to be standard, legible technical problems, so it might well be possible for you to leverage contractors, grantees, employees, or prize challenge participants to solve these problems, even if they aren’t very familiar with or interested in the bigger picture.

Couldn’t non-experts do this well? Not necessarily; it might be much easier to judge project proposals, candidates, or execution if you have subject-matter expertise. Expertise might also be very helpful for formulating shovel-ready technical problems.

Advising on the above

Some AI governance researchers and policymakers may want to bet on certain assumptions about the feasibility of certain engineering or infosec projects, on AI forecasts, or on relevant industries. By advising them with your relevant expertise, you could help allies make good bets on technical questions. A lot of this work could be done in a part-time or “on call” capacity (e.g. while spending most of your work time on what the above sections discussed, working at a relevant hardware company, or doing other work).

Others?

I’ve probably missed some kinds of technical work that can contribute to AI governance, and across the kinds of technical work I identified, I’ve probably missed many examples of specific ways they can help.

Potential next steps if you’re interested

Contributing in any of these areas will often require you to have significant initiative; there aren’t yet very streamlined career pipelines for doing most of this work with a focus on large-scale risks. Still, there is plenty you can do; you can:

Learn more about these kinds of work, e.g. by following the links in the above sections (as well as this link, which overlaps with several hardware-related areas).
Test your fit for these areas, e.g. by taking an introductory course in engineering or information security, or by trying a small, relevant project (say, on the side or in a research internship).
Build relevant expertise, e.g. by extensively studying or working in a relevant area.
- Grantmakers like the Long-Term Future Fund might be interested in supporting relevant self-education projects.
Learn about and pursue specific opportunities to contribute, especially if you have a serious interest in some of this work or relevant experience, e.g.:
- Reach out to people who work in related areas (e.g. cold-email authors of relevant publications, or reach out at community conferences).
- Apply for funding if you have a project idea.
  - Georgetown’s Center for Security and Emerging Technology (CSET) might be interested in funding relevant projects (though, speculating based on a public announcement from the relevant grantmaker, they might have limited capacity in this area for the next few months).
- Keep an eye out for roles on relevant job boards.
Feel free to reach out to the following email address if you have questions or want to coordinate with some folks who are doing closely related work^[12]:
- technical-ai-governance [ät] googlegroups [döt] com

Notes

1. This includes creating knowledge that enables decision-makers to develop and pursue more promising AI governance interventions (i.e. not just boosting interventions that have already been decided on).
2. Of course, there are significantly more people doing most of these kinds of work with other concerns, but such work might not be well-targeted at addressing the concerns of many on this forum.
3. courage? self-motivation? entrepreneurship? judgment? analytical skill? creativity?
4. To elaborate, a major (some would argue central) difficulty with AI is the potential need for coordination between countries or perhaps labs. In the absence of coordination, unilateral action and race-to-the-bottom dynamics could lead to highly capable AI systems being deployed in (sometimes unintentionally) harmful ways. By entering enforceable agreements to mutually refrain from unsafe training or deployments, relevant actors might be able to avoid these problems. Even if international agreements are infeasible, internal regulation could be a critical tool for addressing AI risks. One or a small group of like-minded countries might lead the world in AI, in which case internal regulation by these governments might be enough to ensure highly capable AI systems are developed safely and used well.
5. To elaborate, international agreements and internal regulation both must be enforceable in order to work. The regulators involved must be able to catch and penalize (or stop) violators as quickly, consistently, and harshly as is needed to prevent serious violations. But agreements and regulations don’t “just” need to be enforceable; they need to be enforceable in ways that are acceptable to relevant decision-makers. For example, decision-makers would likely be much more open to AI agreements or regulations if their enforcement (a) would not expose many commercial, military, or personal secrets, and (b) would not be extremely expensive.
6. After all, we currently lack good enough enforcement methods, so some people (engineers) need to make them. (Do you know of currently existing and politically acceptable ways to tell whether AI developers are training unsafe AI systems in distant data centers? Me neither.) Of course, we also need others, e.g. diplomats and policy analysts, but that is outside the scope of this post. As a motivating (though limited) analogy, the International Atomic Energy Agency relies on a broad range of equipment to verify that countries follow the Treaty on the Non-Proliferation of Nuclear Weapons.
7. Literally “tamper-proof” might be infeasible, but “prohibitively expensive to tamper with at scale” or “self-destroys if tampered with” might be good enough.
8. This overlaps with cooperative AI.
9. Note the author of this now considers it a bit outdated.
10. In contrast, some other interventions appear to be more motivated by the worry that there won’t be time to discover good safety methods before harmful deployments occur.
11. This work might be similar to the design of testing and evaluation suites for use by regulators, mentioned in the software/ML engineering section.
12. I’m not managing this email; a relevant researcher who kindly agreed to coordinate some of this work is. They have a plan that I consider credible for regularly checking what this email account receives.

SteveZ (4y)

I’m finishing my PhD in hardware/ML and I’ve been thinking vaguely about hardware approaches for AI safety recently, so it’s great to see other people are thinking about this too! I hope to have more free time once I finish my thesis in a few weeks, and I’d love to talk more to anyone else who is interested in this approach and perhaps help out if I can.
