By the time you have AIs capable of doing substantial work on AI R&D, they will also be able to contribute effectively to alignment research (including, presumably, secret self-alignment).
Even if takeoff is harder than alignment, that problem becomes apparent at the point where the amount of AI labor available to work on those problems begins to explode, so it might still happen quickly from a calendar perspective.
Some relevant pieces on this subject I’ve read somewhat recently:
A central pillar of the Democratic Party has been that Republicans will destroy democracy and take the country down with it (somewhat ditto the Republican line on immigration). Both parties are obsessed with the end of American greatness, and motivate their voters through that narrative. To a lesser extent, they’re also nebulously united on “beating China”.
Where I agree is that there's an absence of a positive vision for the future (something that isn't just the world of today plus better healthcare). I think this is especially true on the American left, which has basically mired itself in an anti-progress position through its natural distrust of billionaires and its reaction to the rising political prominence of the tech right. It's hard to accept that radical change is possible (except through the existing lenses of wealth concentration or environmental impact) when accepting that change means elevating the importance of people in your cultural outgroup. ASI is a silly concern for fringe thinkers in San Francisco; real writers ask the pressing questions about electricity costs, copyright, and corporate influence on the Trump administration.
Compare what writers at places like the Atlantic, the NYT, or the Times have to say about AI with what people like Steve Bannon say. The former is incredibly near-term and sanded down, while the right has generally been more willing to engage with the possibility of superintelligence.
While it wouldn't be ideal for international security, middle powers will also probably feel a lot of pressure to acquire and commit to using weapons of mass destruction. It's probably much cheaper to develop powerful weapons and elaborate fail-deadlies than it is to kickstart your own AI infrastructure (particularly if it's viable to steal Chinese/American models), so it's attractive to bank on staying geopolitically relevant through deterrence instead of having to coordinate.
I think this is particularly likely to happen in worlds where misalignment isn't seen as omnicidal, and where the primary perceived risk is loss of sovereignty. Since being part of a coalition (especially one that carries security commitments) itself erodes some sovereignty, it might seem easier to turn investment inwards.
Those aren't necessarily contradictory: you could have big jumps in unemployment even with increases in average productivity. You already see this happening in software development, where increasing productivity for senior employees has coincided with fewer junior hires. While I expect that this effect is pretty small today and has more to do with the end of ZIRP and previous tech overhiring, you'll probably see it play out in a big way as better AI tools take up spots for new grads in the run-up to AGI.
I particularly like the idea that AI incompetence will become associated with misalignment. The more capable agents become, the more permissions they'll get, which will have a strange "eye of the storm" effect where AIs are making a bunch of consequential decisions poorly enough to sway the public mood.
I think a lot of the potential impact from public opinion is cruxed on what the difference between the publicly available models and the frontier models will be. In my expectation, the most powerful models end up subject to national security controls and are restricted: by the time you have a bunch of stumbling agents, the frontier models are probably legitimately dangerous. The faster AI progress goes, or the more control the USG exerts, the greater the gap between the public perception of AI and its real capabilities will be. And being far removed from public participation, these projects are probably pretty resilient to changes in the public mood.
With that in mind, anything that either gives transparency into what the ceiling of capabilities is (mandatory government evals that are publicly read out in Congress?) or gets the public worried earlier seems pretty important. I particularly like the idea of trying to spark concern about job losses before they happen: maybe this happens by constantly asking politicians what their plans are for a future when people can't work, and pointing to these conversations as examples of the fact that the government isn't taking the threat of AI to you, the voter, seriously.
I feel like this scenario somewhat overplays how easily China is able to push over the U.S with its industrial lead (in much the same way that I feel a lot of other AI predictions, a la Situational Awareness, overplay the U.S's position when it takes the lead).
The ASI directs successful sabotage on the US training runs to keep them from training their own ASI, and in response the US makes threats of kinetic strikes. China doesn’t back down at this because its ASI has created a robust air defense system spanning all of China that scales to ICBMs.
Most of my skepticism has to do with my belief that future AI technologies are going to allow for the development of extremely powerful asymmetric weapons, and that this will negate most of the advantage from having a giant industrial lead (even if you are the first to set off your industrial explosion). Nuclear weapons are a clear example of this: even if you cloud the sky with interceptors, you can negate all of that defensive investment through simple strategies like putting your nuke in a shipping container and docking it next to Beijing/Washington. Defensive investment is a poor strategy for trying to outmatch offense-dominant technology.
Even more importantly, the leading country needs to be robustly defense dominant against all possible asymmetric weapons. From the perspective of MAD, it doesn't matter if your supply chains can protect you against nukes if they can't also protect you against alternative threats like targeted bioweapons. And if the loser is really desperate, they can make defense even harder by employing fail-deadly tools like mirror life or an intentionally misaligned superintelligence. Otherwise, the other side (the U.S in this example) is able to limit the extent of your sabotage and buy time with deterrence, which they can use to build out their own fully-optimized ASI.[1] I'd imagine that the U.S could use its substantially worse, but still superhuman, AIs to develop several of these cheap and highly destructive weapons, and then commit to retaliating with them as leverage.
At that point, even if China has the industrial lead, we still end up in a bipolar world dominated by the promise of MAD between the two parties, with each country needing to respect what the other does in their own sphere of influence.
A historical example I often use for this is the U.S refusal to invade North Korea in 1994, and its subsequent acquisition of nuclear weapons. Although the U.S was and is vastly superior conventionally, once Pyongyang's conventional artillery was sophisticated enough to guarantee it could destroy Seoul at a moment's notice, the immediate costs became too high to justify a military operation to destroy their enrichment facilities. Instead you ended up with an ineffective diplomatic compromise, which was later subverted at the whim of the North Korean government.
Even though Seoul was able to invest enough in the following decades to become mostly secure against a conventional artillery strike, it doesn't matter: North Korea bought enough time to get its nukes online.
That is the hope: that the US and China, out of similar self-interest, agree that only they will be allowed to develop dangerous AGI (or in your framing, to restrict development of AI that can develop dangerous weapons). This currently seems unlikely, but if the dangers of proliferation are really as severe as I fear, it seems possible they'll be recognized in time, and those governments will take rational actions to prevent proliferation while preserving their own abilities to pursue AGI. The uneasy alliance between the US and China might be possible because there isn't really a lot of animosity; neither of us really hates the other (I hope - at least the US seems wary of China but doesn't seem to really hate it, despite it being fairly totalitarian).
There are many reasons that I believe that U.S-China cooperation on nonproliferation is likely, not all of which made it into the post. Specifically:
1. Great power agreements on nonproliferation are how we've handled all the most powerful dual-use technologies in the past. The U.S and Soviet Union could both recognize that the spread of nukes (and later, bioweapons) would be existentially dangerous to them both, and were able to work together to restrict them. And this was despite the fact that the Soviets were much more ideologically and economically distant than the U.S and China are today.
2. The relationship between China and the U.S has been tested before and not broken. Even when the public and most of his own party turned against Bush in 1989 over the response to Tiananmen, the executive branch was able to stay steady enough to continue to secretly work with top Chinese officials (to prevent them from turning to the Soviet Union for allies). Even if animosity between the countries reaches a similar boiling point again, there's probably still room to handle natsec-critical talks privately.
3. The last concern, and where I'm the most uncertain, is on the question of speed: do officials in both countries recognize what's going on fast enough to start doing nonproliferation work? On the one hand, the gov might get completely blindsided by the speed of progress, or perhaps become entirely politically captured by Nvidia's lobbying. On the other hand, we haven't seen any strategically relevant capabilities actually demonstrated yet: once they are, it seems difficult for the natsec apparatus to justify leaving them in private hands and not restricting who has access to them. Any steps the state takes to assert its domestic monopoly of violence will naturally dovetail into trying to do the same overseas.
If the government gets involved at all, asserting control over dual-use AIs seems like the most primal, basic step they can take: if they don't have control of the ASIs/superweapons, then they're probably not in charge enough to do anything else.
Of course this leaves the good reasons for fear of centralized power. But it may be the lesser of two dangers.
Building off that previous point: it seems very difficult to imagine the government doing anything without the monopolization of strategic power, at least at the domestic level. Throughout the entire process of assembling a nuclear weapon, for example, the government always maintains its monopoly on force: hence why we have policies in place like making sure that defense contractors for nukes are never allowed to finish assembling them in-house. The reasoning is obvious: if someone other than the state has nukes, how can the state enforce any rules on them?
In a similar defense-contractor setup with the frontier labs (such as in a national superintelligence project), the buck would have to stop with the government. If you want the state to do anything, the state needs to keep ultimate control of what the ASIs are being used for. If anyone other than the state has the final say on the ASI's commands, then the state isn't in charge anymore and there was no point in bringing the government on to regulate things to begin with.
Rather than solve centralization of power by having multiple ASI projects competing with each other, I think a lot more attention should be on how we can diffuse power within a single ASI initiative. With nukes, the government technically has the capacity to blackmail anyone it wants domestically: if enough people in the chain of command coordinated, they could hold the rest of society hostage. But by making that chain of command wide and complex enough, it becomes difficult to unilaterally use your monopoly on violence in harmful ways.[1] This gives you most of the upsides with fewer risks: the gov is powerful enough to enforce nuclear nonproliferation, but has difficulty using its strategic monopoly to extractively bargain with people.
When you combine this problem (the fact that the government needs to maintain its monopoly on force to be an effective regulator) with the fact that future AI systems will enable all sorts of offense-dominant strategies (such that d/acc-style proposals to build defensive capacity fail because too many people have AIs capable of designing cheap WMDs), it seems like the focus should be on making the government more coup-proof, rather than on increasing the number of intent-aligned superintelligences.[2]
My one nitpick is that the framing in this post seems to leave aside the possibility of general-purpose AI, that is, real AGI or ASI. That presents solutions as well as problems; it can be used to improve security dramatically, and to sabotage other nations' attempts at creating powerful AI in a variety of ways. This may add another factor that goes against proliferation as the default outcome, while adding risk of totalitarian takeover from whoever controls that intent-aligned AGI.
By proliferation as a default outcome, I mean an outcome without much government intervention. Prices get low, more people build/steal strong models. I agree that you could use an ASI to sabotage the AI projects of other countries, but that certainly counts as a huge amount of government intervention. This is a possible "permanent solution" I mentioned in the post though, and I'll spend some time in future articles weighing the merits of strategies like the front-runners sabotaging everyone else, taking over supply chains, etc. to enforce nonproliferation.
Also worth noting that the government has a long cultural tradition and political experience with managing the use of powerful weapons. This might still not be robust enough for superintelligence, but it's a much more likely bet than the AI labs' internal political structures, which immediately collapsed the first time they were seriously tested.
I suppose you’re on the money with distaste for others’ utopias, because I think the idea of allowing people to make choices that destroy most of their future value (without some sort of consultation) is a terrible thing. Our brains and culture are not built to grasp the size of choices like “choosing to not live past 80 instead of living forever” or “choosing a right to boredom vs. an optimized experience machine”. Old cultural values, like “death brings meaning to life” or that the pain of suffering is intrinsically meaningful, will have no instrumental purpose in the future, so it seems harsh to let them continue to guide so many people’s lives.
Without some new education/philosophy/culture around these choices, many people will either be ignorant of their options or have preferences that make them much worse off. You shouldn’t just give the Sentinelese the option of immortality, but provide some sort of education that makes the consequences of their choices clear beforehand.
This is a very difficult problem. I’m not a strict utilitarian, so I wouldn’t support forcing everyone to become hedonium—personal preferences should still matter. But it’s also clear that extrapolating our current preferences leaves a lot of value on the table, relative to how sublime life could be.