I am also happy the contract prohibits using our models to direct lethal autonomous weapons, though realistically I do not think powering a killer drone via a cloud-based large model was ever a real possibility.
Is this a recent update? The language in the announcement didn't feature a prohibition like that. (Other than saying that edge deployments wouldn't be allowed. But it sounds like you're saying that a prohibition against using cloud-based models for weapons was added on top of the practical difficulty of using a cloud-based model for weapons.)
I am not updating here beyond our blog post. I think LAW is not really a "live issue" for a number of reasons, including the fact that the DoW is not in charge of developing weapons but of procuring them.
I also think that LAW will ultimately be a question of capabilities, and so I view it less as a case where the government has an inherent incentive to deploy something that is unreliable.
On the other hand, any government can have an incentive to spy on and control its people to stay in power, which is why we need all these laws restricting the power of government.
Thanks for writing this; I imagine it's a tricky subject to speak on. I broadly agree with the first and last sections of your post, but I have several questions and quibbles with the section on OpenAI’s deal with the Department of War.
You're placing a lot of faith in the understanding between OpenAI and the DoW:
I feel that too much of the focus has been on the “legalese”, with people parsing every word of the contract excerpts we posted. I do not dispute the importance of the contract, but as Thomas Jefferson said, “The execution of the laws is more important than the making of them.” The importance of a contract lies in the shared understanding between OpenAI and the DoW on what the models will and will not be used to do.
I don't understand why you think the DoW will act in good faith. Their interactions with Anthropic seem outlandishly, dangerously bad faith. Read this tweet from the DoW's director and tell me if that sounds like someone you can come to a reliable shared understanding with. And more broadly, when you look at the conduct of the current administration, do you believe they will not push boundaries, overreach, and interpret statements in disingenuous ways?
While I think shared understanding is valuable, I think the main point of a contract is to have options for legal redress or enforcement if that shared understanding is violated: when I signed a lease with my landlord, we had a shared understanding that he'd fix the dishwasher if it broke. When he didn't actually fix the dishwasher, I was very glad I had a contract with some legal remedies.
For this contract to be meaningful, it seems to me like it at a minimum[1] needs to be airtight enough that the DoW won't be able to weasel out of it in court even when they're arguing hard and trying to exploit every loophole. As I say in my recent post, "As long as one party to the contract insists that they haven’t given up anything beyond what’s already illegal, and their reading is (by a stretch) consistent with the language in the contract, there will be ambiguity about whether anything more is required."
This will involve having to wade through some legalese. My recent LessWrong post has a section where I give some examples of legal language that looks like it does one thing but in fact does another.
If the contract language is never clarified, the resulting ambiguity will be disproportionately effective at preventing OpenAI from asserting its rights. In the announcement, OpenAI writes "As with any contract, we could terminate it if the counterparty violates the terms." But will OpenAI be willing to do that if there’s a 50% chance that courts won’t side with them? What about 20%? If OpenAI terminates a contract and then loses in court, they could be forced to pay extremely high damages. Better legal language would help OpenAI win a court battle if the DoW violates the contract.
It might also not be possible for OpenAI to terminate the contract if the government is caught in breach of the shared understanding, unless the contract language makes clear that the terms were violated:
Jessica Tillipman, a legal expert on government procurement law, writes “I’m also curious about OpenAI’s recourse if the govt crosses a red line. In govt contracts, a contractor can’t just terminate for govt breach (w/ limited exception). If this is an OT [Other Transactions, a particular type of procurement] agreement, they may have negotiated broader termination rights, but we don’t know that.”
Overall, do you disagree? Maybe you think OpenAI has some leverage here other than the courts that I'm not accounting for?
Bear in mind that the DoW reportedly wants to use LLMs to conduct mass domestic surveillance, and its senior officials have repeatedly made statements to the effect of "We will not let ANY company dictate the terms regarding how we make operational decisions."
I also worry you're too optimistic about other parts of this situation as well. For example you mention safeguards:
It allows us to build in our safety stack to ensure the safe operation of the model and enforce our red lines, as well as to have our own forward-deployed engineers (FDEs) in place. No safety stack can be perfect, but given the “mass” nature of mass surveillance, it does not need to be perfect to prevent it.
On technical safeguards in general: to the extent you rely on technical safeguards with no legal backing, it seems like you are setting yourself up for the DoW to try to ‘jailbreak’ your models.
But overall, quibbling over these kinds of contract details isn't as important as giving some external party, or at least a large number of employees, the ability to look at the full contract and decide what it does or doesn't permit. Boaz, did you get to read the full contract? If not, how can you be so confident about what it says or implies, when OpenAI leadership has been mistaken about that with regard to this contract a few times before, and the base rate of contracts containing later clauses that substantially undermine or weaken earlier ones is really high?
Ideally the contract would also include enforcement mechanisms to detect breaches of contract and good remedies if there is a breach of contract!
If you don’t have contractual rights, it’s perfectly legal for the DoW to jailbreak your models. ZDR would prevent you from learning about it, and they wouldn’t tell your forward-deployed engineers.
Hi Tom,
I think you are right that the language of the contract will matter if it comes to court. I think it is highly unlikely that it will end up in court, and if the government did try to do mass surveillance and this ended up in court, that would likely be a good way to expose it.
Issues such as jailbreaks, ZDR, etc. are real but not new to us. We have to deal with them in other catastrophic-risk settings as well, such as bio and cyber, which is why I am advocating that we treat this in the same manner. Note that the "mass" nature of mass surveillance requires not just one jailbreak but deploying jailbreaks at a large scale without being detected. But I agree that, as with any safety stack, we need to measure and understand the risk.
Well. I thought Anthropic being ok with surveillance of foreigners was bad. But here we see an alignment researcher straight up saying "my lab helps the government wage an aggressive war disapproved of by most of the US, and I'm still working there".
What does "AI alignment" even mean at this point? Alignment to all humanity? Clearly not that. All we're achieving is aligning AI to its owners - to the powerful - who remain misaligned with the rest of humanity, and more so as their power increases. We used to disdain folks like Timnit who called out such things early on, but in my eyes she's been vindicated 100%.
To be clear - right now my lab is not helping the government wage the current war in Iran. The OpenAI deployment will be in the future. And I would not say "I am OK" with it. But I would say that if the elected government decides to take an action that I don't agree with, including waging war, that is a whole different matter from the government trying to use my system to undermine the democratic process and stay in power indefinitely.
Right, that's what matters to you. And that's my point - the circle of what matters to "alignment researchers" has been narrowing. You were supposed to work toward a positive singularity for all humanity. Now you're saying you're much more ok with using AI to wage war than with using it to undermine democracy at home. Basically you're working toward giving the US government the power to do anything it wants to me (a non-US person) and calling it "alignment".
What does "AI alignment" even mean at this point? Alignment to all humanity? Clearly not that.
To my understanding - and I'm not endorsing this position; quite the contrary - "AI alignment" has generally been taken to mean "Create a superintelligence that will not eradicate humanity entirely in the process of pursuing its goals".
I raised warnings about this definition earlier, when people were excusing partisan censorship of models on this basis. An AI aligned to, say, half of humanity, might be better than one aligned to none of humanity, but the other half of humanity certainly won't think so, and that severely impacts the probability of even getting the first half to the finish line, since now you've got lots and lots of humans - many of them wealthy, or tech-savvy, or well-armed - who will do whatever it takes to prevent you from winning, because from their position your victory looks the same as every other failure state.
One can argue that the two issues are even more closely intertwined than it would seem. Imagine a world in which Anthropic had gotten out ahead of concerns about their models' racial biases, and allayed those concerns before, for example, the wealthiest man on Earth found out about it, tweeted out a complaint, and immediately caused the half of America on his side - including the man who presently controls the Executive branch - to become substantially less receptive to anything Anthropic has to say.
I realize a large portion of this site has taken the current flareup as cause (often, excuse) to make AI safety more explicitly political, but I don't think that's a winning strategy. All of this shows that either everyone wins or nobody does, because we can't afford to make human enemies when our situation is grim enough even without any.
[These are my own opinions and do not represent OpenAI. Cross-posted on windowsontheory.]
AI has so many applications, and AI companies have limited resources and attention. Hence, if it were up to me, I’d prefer we focus on applications that are purely beneficial (science, healthcare, education), or even commercial ones, before working on anything related to weapons or spying. If someone has to do it, I’d prefer it not to be my own company. Alas, we can’t always get what we want.
This is a long-ish post, but the TL;DR is:
[Also: the possibility of Anthropic’s designation as a supply-chain risk is terrible. I hope it will be resolved asap.]
Country of IRS agents in a datacenter
How can AI destroy democracy? Throughout history, authoritarian regimes have required a large, obedient bureaucracy to spy on and control their citizens. In East Germany, in addition to the full-time Stasi staff, one percent of the population served as informants. The KGB famously had multiple "purges" to ensure loyalty.
AI can potentially produce a government bureaucracy loyal to whoever controls the models’ training or prompting: an army of agents that will not leak, whistleblow, or disobey an illegal order. Moreover, since the government has a monopoly on violence, implementing this requires no advances in the “world of atoms”, nor a “Nobel laureate” level of intelligence.
As an example, imagine that the IRS were replaced with an AI workforce. Arguably, current models are already at or near the capability needed to automate many of those functions. In that case, the leaders of the agency could commence large-scale tax investigations of their political enemies. Furthermore, even if each AI agent were individually aligned, it might have no way of knowing that the person it was ordered to audit was selected for political reasons. A human being goes home, reads the news, and can understand the broader context. A language model is born at the beginning of a task and dies at its end.
Historically, mass surveillance of a country’s own citizens has been key for authoritarian governments, which is why so much of U.S. history is about preventing it, including the Fourth Amendment. AI opens new possibilities for analyzing and de-anonymizing people’s data at a larger scale than ever before. For example, Lermen et al. recently showed that LLMs can be used to perform large-scale autonomous de-anonymization of unstructured data.
While all surveillance is problematic, given the unique power that governments have over their own citizens and residents, restricting domestic surveillance by governments is of particular importance. This is why I personally view preventing it as even more crucial than preventing privacy violations by foreign governments or corporations. But the latter are important too, especially since governments sometimes “launder” surveillance by purchasing commercially available information.
It is not a lost cause - we can implement and regulate approaches for preventing this. AI can scale oversight and monitoring just as it can scale surveillance. We can also build privacy and cryptographic protections into AI to empower individuals. But we urgently need to do this work.
Just like with the encryption debates, there will always be people who propose trading our freedoms for protection against our adversaries. But I hope we have learned our lesson from the PATRIOT Act and the Snowden revelations. While I don’t agree with its most expansive interpretations, I think the Second Amendment is also a good illustration that we Americans have always been willing to trade some safety to protect our freedom. Even in the world of advanced AI, we still have two oceans, thousands of nukes, and a military with a budget larger than China’s and Russia’s combined. We don’t need to give up our freedoms and privacy to protect ourselves.
OpenAI’s deal with the Department of War
While the potential for AI abuse in government is always present, it is amplified in classified settings, which by their nature can make abuse much harder to detect. (E.g., we might never have heard of the NSA overreach if it weren’t for Snowden.) For this reason, I am glad for the heightened scrutiny our deal with the DoW received (even if that scrutiny has not been so easy for me personally).
I feel that too much of the focus has been on the “legalese”, with people parsing every word of the contract excerpts we posted. I do not dispute the importance of the contract, but as Thomas Jefferson said, “The execution of the laws is more important than the making of them.” The importance of a contract lies in the shared understanding between OpenAI and the DoW on what the models will and will not be used to do. I am happy that we are explicit in our understanding that our models will not be used for domestic mass surveillance, including via analysis of commercially available information on U.S. persons. I am even happier that for the time being we will not be working with the intelligence agencies of the DoW, such as the NSA, DIA, etc. Our leadership committed to announcing publicly if this changes, and of course this contract has nothing to do with domestic agencies such as DHS, ICE, or the FBI. The intelligence agencies have the most sensitive workloads, and so I completely agree it is best to start with the easier cases. This also somewhat mitigates my worry about the contract not ruling out mass surveillance of citizens of other countries. (In addition to the fact that spying on one’s own people is inherently more problematic.)
I am also happy the contract prohibits using our models to direct lethal autonomous weapons, though realistically I do not think powering a killer drone via a cloud-based large model was ever a real possibility. A general-purpose frontier model is an extremely poor fit for autonomously directing a weapon; moreover, the main selling point of autonomous drones is evading jamming, which requires an on-device model. Given our current state of safety and alignment, lethal autonomous weapons are a very bad idea. But regardless, that would not have happened through this deal.
That said, there is a possibility that eventually our models will be used to help humans in target selection, as is reportedly happening in Iran right now. This is a very heavy burden, and it is up to us to ensure that we do not scale to this use case without very extensive testing of safety and reliability.
The contract enables the necessary conditions for success, but it is too soon to know if they are sufficient. It allows us to build in our safety stack to ensure the safe operation of the model and enforce our red lines, as well as to have our own forward-deployed engineers (FDEs) in place. No safety stack can be perfect, but given the “mass” nature of mass surveillance, it does not need to be perfect to prevent it. That said, this is going to be a challenging enterprise: building safety for applications we are less familiar with, with the added complexities of clearance. Sam has said that we will deploy gradually, starting with the least risky and most familiar domains. I think this is essential.
Can we make lemonade out of this lemon?
The previous defense contract between the DoW and Anthropic attracted relatively little attention. I hope that the increased salience of this issue can be used to elevate our standards as an industry. Just as we do with other risks such as bioweapons and cybersecurity, we need to build best practices for avoiding the risk of an AI-enabled takeover of democracy, including mass domestic surveillance and high-stakes automated decisions (for example, selective prosecution or “social credit”). These risks are no less catastrophic than bioweapons, and should be tracked and reported as such. While, due to the classified nature of the domain, not everything can be reported, we can and should at least be public about the process.
If there is one thing that AI researchers are good at, it is measuring and optimizing quantities. If we can build the evaluations and turn tracking these risks into a science, we have a much better chance at combatting them. I am confident that it can be done given sufficient time. I am less confident that time will be sufficient.