If human rights were to become a terminal value for the ASI, then the contingencies at the heart of deterrence theory would become unjustifiable, since they establish conditions under which those rights can be revoked, contradicting the notion of human rights as a terminal value.
I'm a bit unclear on what this means. If you see preserving humans as a priority, why would threatening other humans to ensure strategic stability run against that? Countervalue targeting today works on the same principles, with nations that are ~aligned on human rights but willing to commit to violating them in retaliation to preserve strategic stability.
Presumably superpower B will precommit to using their offense-dominant weapons before the retaliation-proof (or splendid-first-strike-enabling) infrastructure is built. It's technically possible today to saturate space with enough interceptors to blow up ICBMs during boost phase, but it would take so many years to establish full coverage that any opponent you're hoping to disempower has time to threaten you preemptively. It also seems likely to me that AIs will be much better at making binding and verifiable commitments of this sort, which humans could never be trusted to make legitimately.
As for whether the population remains relevant: that probably happens through some value lock-in for the early ASIs, such as minimal human rights. In that case, humans would stay useful countervalue targets, even if their instrumental value to war is gone.
Wait but Why has a two-part article series on the implications of advanced AIs that, although it predates interest in LLMs, is really accessible and easy to read. If they're already familiar with the basics of AI, just the second article is probably enough.
Michael Nielsen's How to be a Wise Optimist is maybe a bit longer than you're looking for, but does a good job of framing safety vs capabilities in (imo) an intuitive way.
An important part of the property story is that it smuggles the assumption of intent-alignment to shareholders into the discussion. I.e., the AI's original developers or the government executives running the project adjust the model spec in such a way that its alignment is "do what my owners want", where "owners" means anyone who owned a share in the AI company.
I find it somewhat plausible that we get intent alignment. [1] But I think the transmutation from "the board of directors/engineers who actually write the model spec are in control" to "voting rights over model values are distributed by stock ownership" is basically nonsense, because most of those shareholders will have no direct way to influence the AI's values during the takeoff period. What property rights do exist would be at the discretion of those influential executives, as well as shaped by differences in hard power if there's a multipolar scenario (ex: US/Chinese division of the lightcone).
--
As a sidenote, Tim Underwood's The Accord is a well-written exploration of what the literal consequences of locking in our contemporary property rights for the rest of time might look like.
It makes sense to expect the groups bankrolling AI development to prefer an AI that's aligned to their own interests, rather than humanity at large. On the other hand, it might be the case that intent alignment is harder/less robust than deontological alignment, at which point you'd expect most moral systems to forbid galactic-level inequality.
Humanity can be extremely unserious about doom - it is frightening how many gambles were made during the Cold War: the US had such a breakdown in communication that it planned to defend Europe with massive nuclear strikes at a time when it only had a few nukes that were barely ready; there were many near misses; hierarchies often hid how bad nuclear security was, resulting in inadequate systems and lost nukes; etc.
It gets worse than this. I've been reading through Ellsberg's recollections about being a nuclear war planner for the Kennedy administration, and it's striking just how many people had effectively unilateral launch authority. The idea that the president is the only person who can launch a nuke has never really been true, but that was especially clear back in the 50s and 60s, when we used to routinely delegate that power to commanders in the field. Hell, MacArthur's plan to win in Korea would have involved nuking the North so severely that it would be impossible for China to send reinforcements, since they'd have to cross through hundreds of miles of irradiated soil.
And this is just in America. Every nuclear state has had (and likely continues to have) its own version of this emergency delegation. What's to prevent a high-ranking Pakistani or North Korean general from taking advantage of the same weaknesses?
My takeaway from this vis-a-vis ASI is that a) having a transparent, distributed chain of command with lots of friction is important, and b) the fewer of these chains of command that have to exist, the better.
You're right that there are ways to address proliferation other than to outright restrict the underlying models (such as hardening defensive targets, bargaining with attackers, or restricting the materials used to make asymmetric weapons). These strategies can look attractive either because we inevitably have to use them (if you think restricting proliferation is impossible) or because they require less concentration of power.
Unfortunately, each of these strategies is probably doomed without an accompanying nonproliferation regime.
1. Hardening - The main limitation of defensive resilience is that future weapons will be very high impact, and you will need to be secure against all of them. Tools like mirror life could plausibly threaten everyone on Earth, and we'd need defense dominance not just against it, but against every possible weapon that superintelligences can cheaply design, before those superintelligences can be allowed to proliferate widely. It strikes me as very unlikely that there will happen to be defense-dominant solutions against every possible superweapon, especially solutions that are decentralized and don't rely on massive central investment anyway.
Although investing in defense against these superweapons is still a broadly good idea because it raises the ceiling on how powerful AIs can become before they have to be restricted (i.e., if there are defense-dominant solutions against mirror life but not against insect-sized drones, you can at least proliferate AIs capable of designing only the former and capture their full benefits), it doesn't do away with the need to restrict the most powerful/general AIs.
And even if universal defense dominance is possible, it's risky to bet on ahead of time, because proliferation is an irreversible choice: once powerful models are out there, there will be no way to remove them. Because it will take time to ensure that proliferation is safe (the absolute minimum being the time it takes to install defensive technologies everywhere), you still inevitably end up with a minimum period where ASIs are monopolized by the government and concentration-of-power risks exist.
2. Bargaining - MAD deterrence only functions for today's superweapons because the number of powerful actors is very small. If general superintelligence democratizes strategic power by making superweapons easier to build, then you will eventually have actors interested in using them (terrorists, misaligned ASIs), or such a large number of rational, self-interested actors that private information, coordination problems, or irreconcilable values ensure superweapons eventually get deployed regardless (see the toy calculation after this list).
3. Input controls - You could also try to limit the inputs to future weapons, like we do today by restricting gene samples and fissile material. Unfortunately, I think future AI-led weapons R&D will not only increase the destructive impact of future weapons (bioweapons -> mirror life) but also make them much cheaper to build. The price of powerful weapons is probably completely orthogonal to their impact: the fact that nukes cost billions and blow up a single city makes no difference to the fact that an engineered bioweapon could much more cheaply kill hundreds of millions or billions of people.
If asymmetric weapons are cheap enough to make, then the effort required to police their inputs might be much greater than the effort of just restricting AI proliferation in the first place (or performing some pivotal act early on). For example, if preventing mirror life from existing requires monitoring every order and wet lab on Earth (including detecting hidden facilities), then you might as well have used that enforcement power to limit access to unrestricted superintelligence in the first place.
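To make the bargaining point a bit more concrete, here's a toy calculation (the numbers and the independence assumption are mine, purely for illustration): if every actor capable of building a superweapon carries even a tiny independent chance of using one in a given year, the cumulative probability of at least one use stays manageable with a handful of such actors, but saturates toward certainty once there are thousands.

```python
# Toy model, not from the article: each capable actor independently deploys a
# superweapon with some small probability per year. Cumulative risk compounds
# with both the number of actors and the time horizon.

def p_at_least_one_use(n_actors: int, p_per_actor_year: float, years: int) -> float:
    """Probability that at least one actor deploys over the horizon,
    assuming independent actors and a constant per-actor annual risk."""
    p_nobody_ever = (1.0 - p_per_actor_year) ** (n_actors * years)
    return 1.0 - p_nobody_ever

# Roughly today's situation: a handful of states, very low per-state annual risk.
print(p_at_least_one_use(n_actors=9, p_per_actor_year=0.0001, years=50))       # ~0.04
# Democratized strategic power: same per-actor risk, vastly more actors.
print(p_at_least_one_use(n_actors=10_000, p_per_actor_year=0.0001, years=50))  # ~1.0
```

The real dynamics (deterrence, bargaining, correlated failures) are obviously messier than an independence assumption, but the scaling is the point: a deterrence regime that works with nine actors doesn't obviously survive ten thousand.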
----
Basically, I think that defensive resilience has a place, but doesn't stand on its own. You'll still need some sort of centralized effort (probably by the early ASI states) to restrict proliferation of the most powerful models, because those models are capable of cheaply designing high-impact, asymmetric weapons that can't be stopped through other means. This nonproliferation effort has to be actively enforced (such as by detecting and disabling unapproved training runs adversarially), which means that the government needs enforcement power. In particular, it needs enough to either a) continually expand its surveillance and policing in response to falling AI training costs, or b) perform an early pivotal act. You can't have this enforcement power without a monopoly/oligopoly over the most powerful AIs, because without it there's no monopoly on violence.
Therefore, the safest path (from a security perspective) is fewer intent-aligned superintelligences. In my view, this ends up being the case pretty much by default: the US and China follow their national-security incentives to prevent terrorism and preserve their hegemony, using their technological lead to box out competitors from ever developing AIs with non-Sino-US alignments.
From there, the key questions for someone interested in gradual disempowerment are:
1. How is control over these ASIs' goals distributed?
2. How bad are the outcomes if that control isn't distributed?
For (1), I think the answer likely involves something like representative democracy, where control over the ASI is grafted onto our existing institutions. Maybe Congress collectively votes on its priorities, or the ASI consults digital proxies of all the voters it represents. Most of the risk of a coup comes from early leadership during the development of an ASI project, so any interventions that increase the insight/control the legislative branch has relative to the executive/company leaders seem likelier to result in an ASI created without secret loyalties. You might also avoid this by training AIs to follow some set of values deontologically, values that then persist through the period where they become superintelligent.
Where I feel more confident is (2), based on my beliefs that future welfare will be incredibly cheap and that s-risks are very unlikely. Even in a worst-case concentration-of-power scenario where one person controls the lightcone, I expect that the amount of altruism they would need to give everyone on Earth a very high-welfare life would be very small, both because productive capacity is so high and because innovation has reduced the price of welfare to an extremely low level. The main risk of this outcome is that it limits upside (i.e., an end to philosophy/moral progress, lock-in of existing views), but it seems to put a high floor under the downside (certainly higher than the downside of unrestricted proliferation, which is mass death through asymmetric weapons).
There are also galaxy-brained arguments that power concentration is fine/good (because it’s the only way to stop AI takeover, or because any dictator will do moral reflection and end up pursuing the good regardless).
I think the most salient argument for this (which is brought up in the full article) is that monopolization of power solves the proliferation problem. If the first ASI actors perform a pivotal act to preemptively disempower unapproved dual-use AIs, we don’t need to worry much about new WMDs or existing ones falling in price.
If AI-enabled offense-dominant tech exists, then you need to do some minimum amount of restriction on the proliferation of general superintelligence, and you need enforcement power to police those restrictions. Therefore, some concentration of power is necessary. What's more, you likely need quite a lot of it, because 1. preventing the spread of ASI would be hard and would get harder as training costs fall, and 2. you need lots of strategic power to prevent extractive bargaining and overcome deterrence against your enforcement.
I think the important question to ask at that point is how we can widen political control over a much smaller number of intent-aligned AIs, as opposed to distributing strategic power directly and crossing our fingers that the world isn’t vulnerable.
I suppose you're on the money with distaste for others' utopias, because I think the idea of allowing people to make choices that destroy most of their future value (without some sort of consultation) is a terrible thing. Our brains and culture are not built to grasp the size of choices like "choosing to live to 80 instead of living forever" or "choosing a right to boredom vs. an optimized experience machine". Old cultural values - that death brings meaning to life, or that the pain of suffering is intrinsically meaningful - will have no instrumental purpose in the future, so it seems harsh to let them continue to guide so many people's lives.
Without some new education/philosophy/culture around these choices, many people will either be ignorant of their options or have preferences that make them much worse off. You shouldn't just hand the Sentinelese the option of immortality, but should also provide some sort of education that makes the consequences of their choices clear beforehand.
This is a very difficult problem. I’m not a strict utilitarian, so I wouldn’t support forcing everyone to become hedonium. Personal preferences should still matter. But it’s also clear that extrapolating our current preferences leaves a lot of value on the table, relative to how sublime life could be.
By the time you have AIs capable of doing substantial work on AI R&D, they will also be able to contribute effectively to alignment research (including, presumably, secret self-alignment).
Even if takeoff is harder than alignment, that difficulty only becomes apparent at the point where the amount of AI labor available to work on those problems begins to explode, so takeoff might still happen quickly from a calendar perspective.
In this case, the problem isn't that superpower A is gaining an unfair fraction of resources; it's that gaining enough of them would (presumably) allow A to assert a DSA over superpower B, threatening what B already owns. Analogously, it makes sense to precommit to nuking an offensive realist that's attempting to build ICBM defenses, because it signals that they are aiming to disempower you in the future. You also wouldn't necessarily need to escalate to the civilian population as a deterrent right away: instead, you could just focus on disabling the defensive infrastructure while it's being built, only escalating further if A undermines your efforts (such as by building their defensive systems next to civilians).
Any plan of this sort would be very difficult to enforce with humans because of private information and commitment problems, but there are probably technical solutions for AIs to verifiably prove their motivations and commitments (ex: co-design).
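(Not part of the point above, just an illustration of the bare mechanics.) The simplest existing version of a binding, verifiable commitment is a cryptographic commit-reveal scheme. Actually proving an AI's motivations would need far stronger machinery (verified weights, interpretability, or genuinely joint design), but the sketch below shows the minimal property such schemes would have to generalize: you can bind yourself to a policy now and let anyone check later that you didn't swap it out.

```python
# Minimal commit-reveal sketch (a standard hash commitment; nothing AI-specific,
# and not a description of "co-design"). It only demonstrates the bare
# "binding and verifiable" property that richer schemes would need to generalize.
import hashlib
import secrets

def commit(message: bytes) -> tuple[bytes, bytes]:
    """Return (commitment, opening). The commitment binds the committer to
    `message` without revealing it; the opening lets anyone verify it later."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + message).digest(), nonce

def verify(commitment: bytes, message: bytes, nonce: bytes) -> bool:
    """Check that the revealed message is the one originally committed to."""
    return hashlib.sha256(nonce + message).digest() == commitment

# Superpower A publishes the commitment up front and reveals the policy later.
policy = b"halt strikes if B stops building interceptor constellation"
c, opening = commit(policy)
assert verify(c, policy, opening)
assert not verify(c, b"some other policy", opening)
```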