LESSWRONG
Felix C.

Comments
faul_sname's Shortform
Felix C. · 9h · 30

I'd argue that the way force is applied in each of these contexts has very different implications for the openness/rightness/goodness of the future. In von Neumann's time, there was no path to forcibly preventing Russia from acquiring nuclear weapons that did not involve using your own nuclear weapons to destroy an irrecoverable portion of their infrastructure, especially since their economy was already closed off from the reach of potential sanctions.


Raemon is right that you cannot allow the proliferation of superintelligent AIs (because those AIs will allow you to cheaply produce powerful weapons). To stop this from happening ~permanently, you probably do need a single actor or a very small coalition of actors to enforce that non-proliferation forever, likely by using their first-to-ASI position to permanently monopolize it and box out new entrants.

While the existence of this coalition would necessarily reduce the flexibility of the future, it would probably look a lot more like the IAEA and less like a preemptive nuclear holocaust. The only AI capabilities that need to be restricted are those related to weapons development, which means that every other non-coalition actor still gets to capture the upside of most AI applications. Analogously, the U.N. Security Council has been largely successful at preventing nuclear proliferation to other countries by using its collective economic, political, and strategic position, while still allowing beneficial nuclear technology to be widely distributed. You can let other countries build nuclear power plants, so long as you use your strategic influence to make sure they're not enrichment facilities.

In practice, I think this (ideally) ends up looking something like the U.S. and China agreeing on further non-proliferation of ASI, and then using their collective DSA over everybody else to monopolize the AI supply chain. From there, you can put hardware-bound restrictions, mandatory verification/monitoring for data centers, and backdoors into every new AI application to make sure they're aligned to the current regime. There's necessarily a lot of concentration of power, but only because it explicitly trades off with the monopoly on violence (i.e., you can't give more actors access to ASI weapons capabilities for the sake of self-determination without losing overall global security, same as with nukes).

I'm currently writing up a series of posts on the strategic implications of AI proliferation, so I'll have a much more in-depth version of this argument here in a few weeks. I'm also happy to DM/call directly to talk about this in more detail!

AI and Cheap Weapons
Felix C. · 3d* · 40

I'll talk more about this in follow-up posts, but I don't think the main danger is that the models will be voluntarily released. Instead, it'll just get cheaper and cheaper to train models with weapons capabilities as algorithms get more efficient, which will eventually democratize those weapons.

Analogously, we can think about how cryptography was once a government-controlled technology because of its strategic implications, but became widespread as the computing power required to run cryptographic algorithms became extremely cheap.

[Question] What the discontinuity is, if not FOOM?
Felix C. · 4d · 10

I'll make the point that safety engineering can have discontinuous failure modes. The Challenger was destroyed because the O-ring seals in one of its boosters had gotten too cold before launch, which kept them from sealing the joint against hot gas; the escaping gas burned through and blew up the rocket. The function of these O-rings is pretty binary: either the gas is kept in and the rocket works, or it's let out and the whole thing explodes.

AI research might end up with similar problems. It's probably true that there is such a thing as good-enough alignment, but that doesn't necessarily imply that progress toward it can be made incrementally, or that deployment doesn't carry all-or-nothing stakes.

Felix C.'s Shortform
Felix C. · 7d · 10

Is there a way to bring the LessWrong format (the left-sidebar table of contents, the floating citations off to the right) over to a personal blog? As far as I can tell, the AI 2027 project forecasts are similar in structure, so it seems like someone's already managed it before.

I love the LessWrong formatting and plan to crosspost between my blog and here, but I want to be able to share the work either without the immediate connotations of LessWrong (for some policy audiences) or with personal connotations through my blog (as part of a job search, for instance).
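For what it's worth, the table-of-contents half of this seems straightforward to replicate on a static blog. Below is a minimal sketch in TypeScript, assuming posts render ordinary h2/h3 headings inside an article element; the .toc class names are placeholders for your own styling, not any particular blog engine's API:

```typescript
// Minimal sketch: build a floating table of contents from a post's headings.
// Assumes a static blog whose posts render ordinary h2/h3 elements.
function buildToc(article: HTMLElement): HTMLElement {
  const nav = document.createElement("nav");
  nav.className = "toc"; // placeholder class; style with e.g. position: sticky; top: 2rem;
  const list = document.createElement("ul");
  article.querySelectorAll<HTMLHeadingElement>("h2, h3").forEach((heading) => {
    // Give headings without ids an anchor target derived from their text.
    if (!heading.id) {
      heading.id = (heading.textContent ?? "").trim().toLowerCase().replace(/\s+/g, "-");
    }
    const item = document.createElement("li");
    item.className = heading.tagName === "H3" ? "toc-sub" : "toc-top";
    const link = document.createElement("a");
    link.href = `#${heading.id}`;
    link.textContent = heading.textContent;
    item.appendChild(link);
    list.appendChild(item);
  });
  nav.appendChild(list);
  return nav;
}

// Usage: mount the ToC next to the article once the page has loaded.
document.addEventListener("DOMContentLoaded", () => {
  const article = document.querySelector<HTMLElement>("article");
  if (article) article.before(buildToc(article));
});
```

The "floating" behavior itself is just CSS: giving the .toc container position: sticky keeps it pinned alongside the post as the reader scrolls. Floating side-citations work on the same principle, with absolutely-positioned notes in a right-hand margin column.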

The Culture Novels as a Dystopia
Felix C. · 20d* · 100

I'd mentioned the Golden Age novels by Wright before when we'd gone hiking together, so I thought it'd be worth looking at some related flaws in his utopia.

The Sophotechs (the trilogy's equivalent of the Minds) are philosophically libertarian. While they do intervene to stop direct violence, they otherwise enforce a system of complete freedom over the self, as well as the maintenance of property rights. This has some interesting consequences, the most detrimental of which is that everyone in the Golden Oecumene lives their otherwise utopian lives with a metaphorical handgun on their nightstand. At any point, any citizen can make a decision that destroys their life or identity, or condemns them to suffer for eternity, and the Sophotechs will rigidly prevent anyone else from doing anything about it, on the basis that it was an exercise of free will. While there are precautions (you can be informed of the consequences of your actions, or enter a contract to be restrained from doing anything that would destroy you), people with the wrong temperament to use these tools run the risk of what is essentially damnation.

Some examples from the books:

  • Permanently destroying your identity by believing yourself to be someone else.
  • Falling into uber-hedonism, where you wirehead yourself compulsively.
  • Modifying your brain so that your values will never shift.
  • Removing your ability to feel pity or empathy.

There are actually entire factions of society which exhibit these faults: obsessive hedonists who maliciously try to spread the gospel of wireheading, and the Invariants, people whose brains are designed so that they never experience emotional conflict and always act optimally in their own interests.

The system of property rights and limited government also has its own knock-on effects. Patents still exist and are permanent, so anyone who has ever developed a technology can maintain a permanent monopoly over it, including humans who are so old that they existed before the Sophotechs came along (~10,000 years old or so). Money still exists, although it's represented by access to the computing time of the Sophotechs instead of by trust in a government. 

Because the role of the government is so limited (it exists to fund the Commonwealth military through an extremely low tax, which we'll get to), there's no social safety net either. Everyone has to pay for all of the goods they consume, whether from the rent on a patent or property, or from work. Work still exists because, at least in Wright's view, the Sophotechs have limited attention, so humans can still be paid (at extremely low rates) for very specialized work. Combined with how cheap goods are, and the fact that most people can hope to patent some extremely niche product that even a small slice of the trillions of people in the solar system will use, most people enjoy a very comfortable existence (the median income in the books is the per-capita equivalent of a modern Earth military's entire budget).

If for some reason you can't or don't want to pay for things, though, then you do actually starve to death. This happens to one character directly in the novels (he spends all of his compute on maintaining a delusion that he's someone else), and presumably to others. In fact, one of the major motivations of many of the characters is to amass as many resources as possible, so that as the universe approaches its inevitable heat death, they will be able to buy out more resources and stay alive longer than everyone else.

All of this is supposed to be balanced out by public activism. Society is basically organized into large factions, each with its own philosophical viewpoint, which use their collective boycott power and social influence to police what they each see as socially degrading. The factions advocating wireheading, for instance, are essentially sanctioned by the much wealthier and more powerful Hortators, who are traditionalists (still transhumanists by today's standards, but ones who want to maintain respect for legacy human emotions, history, and the security of the Commonwealth). Because wealth is somewhat ossified (all the major breakthroughs were patented thousands of years ago, and most of the patent holders are Hortators), this state of affairs is semi-stable. Individual rogue actors and inter-factional disputes still happen, though, so there's no permanent guarantee that the Golden Oecumene actually remains both perfectly free and utopian.

The main conflict of the novels, in fact, is about the protagonist wanting to set up a civilization in another solar system, where the influence of the Hortators would be greatly limited. His perspective is that he wants insurance against an alien attack on humanity, to ensure that human life can continue somewhere in the universe; the Hortators worry that they won't be able to enforce their social rules against self-torture and illiberalism light-years away. The Sophotechs are still constrained by light-speed communication, so cultural drift is another huge problem the civilization will eventually have to deal with. Even if the original solar system remains basically utopian, there's no guarantee against the suffering of other galactic polities (since the people who colonize them can set things up however they like of their own free will; similar theming to The Accord's habs).


All told, the libertarian value lock-in the Golden Oecumene was created with is extraordinarily utopian for ~everyone, but it leaves open the potential for basically arbitrary suffering, even though the Sophotechs are powerful enough to model anyone's mind and foresee the choices they'd make.


Spoilers for the end of the trilogy below. If the conflicts I described above sound interesting, it really is a great series worth reading, and it's available for free online on the author's website.


BREAK


At the very end of the final novel, and in the sequel short story The Far End of History, it becomes apparent that Wright's world is actually incredibly dystopian. The reason stems from a military law locked in by the creators of the Sophotechs, which stipulates that the Sophotechs themselves are not allowed to use force directly. Their workaround is to use a man called Atkins for any violent legal or military ends they might need. Atkins is even older than the Sophotechs themselves, and, having been a soldier, voluntarily committed to the removal of his human rights for the purposes of war. In much the same sense that a U.S. soldier can be compelled to risk their life on the battlefield despite their rights as a citizen, Atkins can basically be used for anything the Sophotechs need him for, so long as it has strategic value.

The culmination of this is entire civilizations made up of just Atkins and his identical clones, created over hundreds of thousands of years as decoys to draw attacks away from the Golden Oecumene. The citizens of these polities are variously tortured for information and exterminated by the Silent Oecumene (long story, but it's a divergent extra-solar faction of humanity, predating the Sophotechs, with philosophical differences). While the original Oecumene and its sister civilizations are composed of humans with rights, and so are presumably still utopian, ~99% of all sentient human life in the universe ends up being conscripted soldiers with no guarantees against suffering. Even if it's all technically the same guy, it's hard to say this is really the best way things could have ended up.

Ryan Kidd's Shortform
Felix C. · 4mo · 10

What are your thoughts on the relative value of AI governance/advocacy vs. technical research? It seems to me that many of the technical problems are essentially downstream of politics: intent alignment could plausibly be solved if race dynamics were mitigated, regulation were used to slow capabilities research, and the problem were given funding and strategic priority.

davekasten's Shortform
Felix C. · 6mo* · 10

Thank you for posting this. Are there any opportunities for students about to graduate to apply themselves, particularly without a CS background? My undergraduate experience was focused on business and IR (Cold War history, Sino-U.S. relations) before I pivoted my long-term focus to AI safety policy, and it's been difficult to find good entry points for EA work in this field as a new grad.

I've been monitoring 80,000 Hours and applying to research fellowships where I can, but I'm always looking for new positions. If you or anyone else knows an org looking to onboard some fresh talent, I'd be happy to help.

Edit: Application submitted.

An artificially structured argument for expecting AGI ruin
Felix C. · 2y · 55

I've read so many posts highlighting the dangers of AGI that I often feel terribly anxious about it. I'm pretty young, and the idea of a possible utopia waiting for us, yet slipping through our fingers, kills me. But even more than that, I worry that I won't have the chance to enjoy much of my life. That the work I've put in now won't amount to much, and that the relationships I've cultivated will never really get the chance to grow for the decades that should be every human's right.

Even just earlier today, I was reading an article when my cat came up to me and started rolling around next to my leg, purring and playing with me. She's pretty old, certainly not young enough for any chance at biological immortality. I was struck by the sense that I should put down my laptop and play with her, because the finite life she has deserves to be filled with joy and love. Even if there's no chance for her to live forever, what she has should be, and has been, made better by me. A long, full life of satisfaction is enough for her.

I don't necessarily mind missing out on utopia. I'd obviously like it to happen, but it's inconceivable to me. So if a billion years of technologically-enhanced superhumanity isn't in the cards for me? I'll be okay.

But there's no one there to make sure that I get the full allotment of life I've got left. I feel overwhelmed by the sense that, a few decades from now, if this problem isn't solved, the flame of my life will be snuffed out by a system I don't understand and could never defeat. I'll never have that long-term marriage, that professional career, or the chance to finally reach an expert level in my hobbies. I'll just be gone, along with everything else that could possibly matter to me.

If I can't have immortality, I at least want a long, peaceful life. But the threat of AGI robs me of even that possibility, if it's as certain a disaster as I've come to believe.

Devil's Offers
Felix C. · 2y · 10

I think the general point he's making still stands. You can always choose to remove the Werewolf Contract of your own volition, then force any sort of fever dream or nightmare onto yourself.

Moreover, The Golden Age also makes a point about the dangers of remaining unchanged. Orpheus, the wealthiest man in history, has modified his brain so that his values and worldview will never shift. This puts him in sharp contrast with Phaethon, the protagonist, whose whole arc is about shifting the strict moral equilibrium of the public to make important change happen. Orpheus, trapped in his morals, is as out of touch in the era of Phaethon as a Catholic crusader would be in modern Rome.
