3 Answers sorted by
top scoring

Nov 28, 2023

What reason is there to expect Bob is at all likely to succeed? Many people have tried making AGI over the years and none has succeeded. Aligning the AI would be even harder. Does Bob have a solution to the alignment problem? If so, that seems like the dominant consideration, is there a way to make Bob's solution to the alignment problem available to others? If Bob doesn't have a solution to the alignment problem, then why expect Bob to be able to steer the AGI?
Has Alice considered that this whole setup might be a ruse? As in, there is no credible plan to build AGI or align the AI, and it's basically for hype, and she's getting memed into possibly killing herself by a total lie? Perhaps scaring people is part of Bob's political strategy for maintaining control and taking out people who could block him in some way?
What about the decision theory of extortion? Classically, you shouldn't negotiate with terrorists because being the type of person to pay off terrorists is what gives terrorists an incentive to threaten you in the first place. Maybe Alice gets tortured less overall by not being the type of person to fold this easily with this non-credible of a threat? I mean, if someone were controlled that easy by such a small probability of torture, couldn't a lot of their actions be determined by a threatening party that would make things worse for them?
There are unsolved ethical issues regarding the balance of pain and pleasure. There are optimized negative experiences and optimized positive experiences. Without AGI it's generally easier for people to create negative experiences than positive ones. But with AGI it's possible to do both, because the AGI would be so powerful. See Carl Shulman's critique of negative utilitarianism. If in some possible worlds there are aligned AGIs that create positive experiences for Alice, this could outweigh the negative experiences by other AGIs.
To get into weirder theoretical territory, even under the assumption that AGIs can create negative experiences much more efficiently than positive experiences, reducing the total amount of negative experience involves having influence over which AGI is created. Having control of AGI in some possible worlds gives you negotiating power with which you can convince other AGIs (perhaps even in other branches of the multiverse) to not torture you. If you kill yourself you don't get to have much influence over the eventual AGI that is created, so don't get to be at the negotiating table, so to speak. You already exist in some possible worlds (multiverse branches etc, depending on physics/philosophy assumptions) so reducing the degree to which you're tortured to 0 is infeasible, but reducing it is still possible.
At some level there's a decision people have to make about whether life is good or bad. Life is good in some ways and bad in other ways. It's hard to make abstract arguments about the balance. At some point people have to decide whether they're in favor of or against life. This is a philosophy problem that goes beyond AGI, that people have been contemplating for a long time.
Maybe this is actually a mental health problem? I mean, I'm tempted to say that people who think superhuman AGI is likely to be created in the next 20 years are already crazy, though that's a popular opinion around here. But most of those people think alignment is unlikely, and so intentional torture scenarios are also correspondingly unlikely. If this is a mental health problem, then usual methods, such as therapy, meditation, and therapeutic drug regimes for depression and so on, might be helpful. Even very risky methods of therapy that could induce psychosis (e.g. certain drugs) are far less risky than killing yourself.

Nov 28, 2023

Could Alice befriend someone who is closer to building AGI than Bob? If so, perhaps they can protect Alice or at least offer some peace of mind.

canary_itm

2y

I factually disagree with those fabricated numbers. AI leaders aren't stupid. Solving the control problem is in the interest of AI leaders, and so massive efforts are being undertaken in AI safety. Limiting their own capabilities that are amplified with AI is not in their interest, and so no effort goes into that.
Progress in AI is incremental (albeit rapidly accelerating), so even before they achieve a "singularity", an AI leader can use their latest AI models to take powerful actions in the world.
"Your problem doesn't matter because we're all gonna die" does not meaningfully engage the question in the first place.

Rendering 1/2 comments, sorted by

top scoring

(show more) Click to highlight new comments since: Today at 5:59 AM

[-]canary_itm2y30

I’ve thought about the potentially information-hazardous nature of the post, and was hesitant about asking at first. Here’s why I think it will be net positive to discuss:

1. Only a limited number of people are in the situation where a powerful AI leader has a personal vendetta against them.

2. The people to whom this situation applies are already aware of the threat.

3. The unavailability of a counterargument to this threat is leading to negative outcomes.

There is, of course, the possibility that an unrelated reader could develop an irrational fear. But they could do that more plausibly with a scenario that applies to them. Among all the topics to be scared of, this one seems pretty safe because most people don’t qualify for the premise.

I am, however, slightly worried that we are headed towards dynamics where powerful people in AI can no longer be challenged or held accountable by those around them. This may warrant a separate discussion post (by people other than me), to catalyze a broader discussion about unchecked power in AI and what to do about potentially misaligned human actors.

Moderation Log

LESSWRONG
is fundraising!
LW

LESSWRONG
is fundraising!
LW

-19

[ Question ]

Help me solve this problem: The basilisk isn't real, but people are

-19

-19

3 Answers sorted by
top scoring

Nov 28, 2023

Nov 28, 2023

2y

-19

[ Question ]

Help me solve this problem: The basilisk isn't real, but people are

-19

-19

3 Answers sorted by top scoring

Nov 28, 2023

Nov 28, 2023

2y

3 Answers sorted by
top scoring