I agree with almost all of this analysis, but I’m surprised at any suggestion that government shouldn’t be encouraged to pay more attention to AI.
The common tendency in the tech sphere to downplay government involvement seems maladaptive when applied to AGI. It was a useful instinct when resisting regulation that could stifle harmless innovation; it is an unhelpful one when applied to the dangerous development now taking place.
AGI seems like a scenario that governmental bodies are better calibrated towards handling than corporate ones, as governments are at least partially incentivised to account for the public good. Meanwhile, the entire history of the corporation is one of disregard for negative externalities. From the East India Company’s indifference to millions of deaths in Bengal to modern tobacco and fossil fuel companies, corporations have repeatedly failed to safeguard against even the most catastrophic consequences. This is less due to some moral failing than the fact that they just lack the programming to make them do so. Good actors may start to develop some of this programming in their internal governance structures, but the record overwhelmingly suggests that this will neither be wholly reliable nor see adoption en masse.
Currently, AI development looks likely to continue in a low-oversight environment where companies face little external pressure to weigh up the existential risks they are hurtling us towards. Here, as in other sectors, the government has a part to play in encouraging an environment where AI risk is highly salient to the groups developing these systems. There is a general consensus that we are stepping down a dark path, and a healthy fear of the consequences of each step is beneficial across every institution that can influence our future here.
I think there's truth to what you're saying, but I think the downsides of premature government involvement are big too. I discuss this more in a followup post.
My view on this is that politics/government work is both very important and also wildly intractable, for both fundamental and practical reasons.
It's sad to say that a very important cause should ultimately be left on the table, but unfortunately I think this is the case.
I’ve been writing about tangible things we can do today to help the most important century go well. Previously, I wrote about helpful messages to spread and how to help via full-time work.
This piece is about what major AI companies can do (and not do) to be helpful. By “major AI companies,” I mean the sorts of AI companies that are advancing the state of the art, and/or could play a major role in how very powerful AI systems end up getting used.[1]
This piece could be useful to people who work at those companies, or people who are just curious.
Generally, these are not pie-in-the-sky suggestions - I can name[2] more than one AI company that has at least made a serious effort at each of the things I discuss below (beyond what it would do if everyone at the company were singularly focused on making a profit).[3]
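I’ll cover some basics (alignment research, strong security, safety standards); avoiding hype and acceleration; preparing for difficult decisions ahead; succeeding; and some things I’m less excited about.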
I previously laid out a summary of how I see the major risks of advanced AI, and four key things I think can help (alignment research; strong security; standards and monitoring; successful, careful AI projects). I won’t repeat that summary now, but it might be helpful for orienting you if you don’t remember the rest of this series too well; click here to read it.
Some basics: alignment research, strong security, safety standards
First off, AI companies can contribute to the “things that can help” I listed above:
The challenge of securing dangerous AI
Avoiding hype and acceleration
It seems good for AI companies to avoid unnecessary hype and acceleration of AI.
I’ve argued that we’re not ready for transformative AI, and I generally tend to think that we’d all be better off if the world took longer to develop transformative AI. That’s because:
By default, I generally think: “The fewer flashy demos and breakthrough papers a lab is putting out, the better.” This can involve tricky tradeoffs in practice (since AI companies generally want to be successful at recruiting, fundraising, etc.).
A couple of potential counterarguments, and replies:
First, some people think it's now "too late" to avoid hype and acceleration, given the amount of hype and investment AI is getting at the moment. I disagree. It's easy to forget, in the middle of a media cycle, how quickly people can lose interest and move on to the next story once the bombs stop dropping. And there are plenty of bombs that still haven't dropped (many things AIs still can't do), and the level of investment in AI has tons of room to go up from here.
Second, I’ve sometimes seen arguments that hype is good because it helps society at large understand what’s coming. But unfortunately, as I wrote previously, I'm worried that hype gives people a skewed picture.
I also am generally skeptical that there's much hope of society adapting to risks as they happen, given the explosive pace of change that I expect once we get powerful enough AI systems.
I discuss some more arguments on this point in a footnote.[4]
I don’t think it’s clear-cut that hype and acceleration are bad, but it’s my best guess.
Preparing for difficult decisions ahead
I’ve argued that AI companies might need to do “out-of-the-ordinary” things that don’t go with normal commercial incentives.
Today, AI companies can be building a foundation for being able to do “out-of-the-ordinary” things in the future. A few examples of how they might do so:
Public-benefit-oriented governance. I think typical governance structures could be a problem in the future. For example, a standard corporation could be sued for not deploying AI that poses a risk of global catastrophe - if this means a sacrifice for its bottom line.
I’m excited about AI companies that are investing heavily in setting up governance structures - and investing in executives and board members - capable of making the hard calls well. For example:
It could pay off in lots of ways to make sure the final calls at a company are made by people focused on getting a good outcome for humanity (and legally free to focus this way).
Gaming out the future. I think it’s not too early for AI companies to be discussing how they would handle various high-stakes situations.
Establishing and getting practice with processes for particularly hard decisions. Should the company publish its latest research breakthrough? Should it put out a product that might lead to more hype and acceleration? What safety researchers should get access to its models, and how much access?
AI companies face questions like this pretty regularly today, and I think it’s worth putting processes in place to consider the implications for the world as a whole (not just for the company’s bottom line). This could include assembling advisory boards, internal task forces, etc.
Managing employee and investor expectations. At some point, an AI company might want to make “out of the ordinary” moves that are good for the world but bad for the bottom line. E.g., choosing not to deploy AIs that could be both very dangerous and very profitable.
I wouldn’t want to be trying to run a company in this situation with lots of angry employees and investors asking about the value of their equity shares! It’s also important to minimize the risk of employees and/or investors leaking sensitive and potentially dangerous information.
AI companies can prepare for this kind of situation by doing things like:
Internal and external commitments. AI companies can make public and/or internal statements about how they would handle various tough situations, e.g. how they would determine when it’s too dangerous to keep building more powerful models.
I think these commitments should generally be non-binding (it’s hard to predict the future in enough detail to make binding ones). But in a future where maximizing profit conflicts with doing the right thing for humanity, a previously-made commitment could make it more likely that the company does the right thing.
Succeeding
I’ve emphasized how helpful successful, careful AI projects could be. So far, this piece has mostly talked about the “careful” side of things - how to do things that a “normal” AI company (focused only on commercial success) wouldn’t, in order to reduce risks. But it’s also important to succeed at fundraising, recruiting, and generally staying relevant (e.g., capable of building cutting-edge AI systems).
I don’t emphasize this or write about it as much because I think it’s the sort of thing AI companies are likely to be focused on by default, and because I don’t have special insight into how to succeed as an AI company. But it’s important, and it means that AI companies need to walk a sort of tightrope - constantly making tradeoffs between success and caution.
Some things I’m less excited about
I think it’s also worth listing a few things that some AI companies present as important societal-benefit measures, but which I’m a bit more skeptical are crucial for reducing the risks I’ve focused on.
When an AI company presents some decision as being for the benefit of humanity, I often ask myself, “Could this same decision be justified by just wanting to commercialize successfully?”
For example, making AI models “safe” in the sense that they usually behave as users intend (including things like refraining from toxic language, chaotic behavior, etc.) can be important for commercial viability, but isn’t necessarily good enough for the risks I worry about.
Footnotes
Disclosure: my wife works at one such company (Anthropic) and used to work at another (OpenAI), and has equity in both. ↩
Though I won’t, because I decided I don’t want to get into a thing about whom I did and didn’t link to. Feel free to give real-world examples in the comments! ↩
Now, AI companies could sometimes be doing “responsible” or “safety-oriented” things in order to get good PR, recruit employees, make existing employees happy, etc. In this sense, the actions could be ultimately profit-motivated. But that would still mean there are enough people who care about reducing AI risk that actions like these have PR benefits, recruiting benefits, etc. That’s a big deal! And it suggests that if concern about AI risks (and understanding of how to reduce them) were more widespread, AI companies might do more good things and fewer dangerous things. ↩
You could argue that it would be better for the world to develop extremely powerful AI systems sooner, for reasons including:
A key reason I believe it’s best to avoid acceleration at this time is that it seems plausible (at least 10% likely) that transformative AI will be developed extremely soon - as in, within 10 years of today. My impression is that many people at major AI companies tend to agree with this. I think this is a very scary possibility, and if this is the case, the arguments I give in the main text seem particularly important (e.g., many key interventions seem to be in a pretty embryonic state, and awareness of key risks seems low).
A related case one could make for acceleration is “It’s worth accelerating things on the whole to increase the probability that the particular company in question succeeds” (more here: the “competition” frame). I think this is a valid consideration, which is why I talk about tricky tradeoffs in the main text. ↩
Note that my wife is a former employee of OpenAI, the company I link to there, and she owns equity in the company. ↩