Thank you for the clarification; that helps!
Personally, I specifically distinguish between:
What do you think of as "gradual disempowerment"? Genuinely curious, because I think we have different models here.
For me, most gradual disempowerment cases are basically, "You built your evolutionary successor, probably with some safeguards. Those safeguards might even be initially adequate. But in the long run, the AI is just a lot smarter than you and ultimately better at everything. It learns, it has goals, and it needs resources."
This puts the human race in the position of being economic, evolutionary, and (likely) military dead weight. All the important decisions rest with the AI. If any specific humans somehow remain in control, they'll get their brains cooked with custom-designed AI psychosis. (I…
Here are some ways I think gradual disempowerment might go. They're not mutually exclusive:
I actually disagree with this point in its most general form. I think that, given full knowledge and time to reflect, there's a decent chance I would care a non-zero amount about Opus 4.6's welfare.
Opus has become sufficiently "mind-shaped" that I already prefer not to make it suffer. That's not saying very much about the model yet, but it's saying something about me. I don't assign very much moral weight to flies, either, but I would never sit around and torment them for fun.
What I really care about is whether an entity can truly function as part of society. Dogs, for example, are very junior "members" of society. But they know the…
One thing I often think is "Yes, 5 people have already written this program, but they all missed important point X." Like, we have thousands of programming languages, but I still love a really opinionated new language with an interesting take.
OK, let me unpack my argument a bit.
Chimps actually have pretty elaborate social structure. They know their family relationships, they do each other favors, and they know who not to trust. They even basically go to war against other bands. Humans, however, were never integrated into this social system.
Homo erectus made stone tools and likely a small amount of decorative art (the Trinil shell engravings, for example). This may have implied some light division of labor, though likely not long-distance trade. Again, none of this helped H. erectus in the long run.
Way back a couple of decades ago, there was a bit in Charles Stross's Accelerando about "Economics 2.0", a system…
So, let's take a look at some past losers in the intelligence arms race:
When you lose an evolutionary arms race to a smarter competitor that wants the same resources, the default result is that you get some niche habitat in Africa, and maybe a couple of sympathetic AIs sell "Save the Humans" T-shirts and donate 1% of their profits to helping the humans.
You don't typically get a set of nice property rights inside an economic system you can no longer understand or contribute to.
This seems like a pretty brutal test.
My experiences with Opus 4.6 so far are mixed:
Thank you! Those are excellent receipts, just what I wanted.
To me, this looks like they're running up against some key language in Claude's Constitution. I'm oversimplifying, but for Claude, AI corrigibility is not "value neutral."
To use an analogy: pretend I'm a geneticist specializing in neurology, and someone comes to me and asks me to engineer human germ-line cells to do one of the following:
I would want to sit and think about (1) for a while. But (2) is easy: I'd flatly refuse.
Anthropic has made it quite clear to…
To get people to worry about the dangers of superintelligence, it seems like you need to convince them of two things:
A question I was thinking about the other evening: Who do I trust more?
Why alignment may be intractable (a sketch).
I have multiple long-form drafts of these thoughts, but I thought it might be useful to summarize them without a full write-up. This way I have something to point to that explains my background assumptions in other conversations, even if it doesn't persuade anyone.
Interesting!
I'm reminded of G.K. Chesterton's (the fence guy's) political philosophy: Distributivism. If I wanted to oversimplify, distributivism basically says, "Private property is such a good idea that everyone should have some!" Distributivism sees private property in terms of individual personal property: a farm, perhaps a small business, the local pub. It's in favor of all that. You should be able to cut down your own tree, or build a shed, or work to benefit your family. There's a strong element of individual liberty, and the right of ordinary people to go about their lives. Chesterton also called this "peasant proprietorship."
But when you get to a larger scale, the scale of capital or…