CanYouFeelTheBenefits

Comments

How likely are “s-risks” (large-scale suffering outcomes) from unaligned AI compared to extinction risks?
CanYouFeelTheBenefits · 1mo · 10

Thanks for your detailed and nuanced answer. I really appreciate how you distinguished between different forms of misalignment and showed where s-risks fit within that picture. Your comment clarified a lot for me.

If you have time, I’d love to hear you expand a bit more on the likelihood of s-risks relative to other AGI outcomes. You mentioned that s-risks seem much less likely than extinction or successful alignment, but could you give a rough probability estimate (even if it’s just an intuitive order-of-magnitude guess, like “1 in a thousand” or “1 in a million”)?

It would also be interesting to hear your thoughts on what factors most strongly influence that probability: for example, how much governance or alignment progress would need to fail for s-risks to become plausible; whether you think “instrumental torture” (as opposed to large-scale indifferent suffering) deserves separate consideration; and how much the risk depends on who ends up in control of early AGIs (e.g., sociopathic or sadistic actors).

Basically, I’m trying to understand not just whether s-risks are neglected, but how much weight they deserve compared to extinction in our overall AGI-risk prioritization.
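
To make that weighting question concrete, here is a toy expected-disvalue sketch of the kind of comparison I have in mind. Every number in it is a made-up placeholder, not an estimate from you or anyone else:

```python
# Toy expected-disvalue comparison between extinction and s-risk outcomes.
# All numbers are illustrative placeholders, purely to show the structure
# of the weighting question, not actual probability estimates.

p_extinction = 1e-1    # placeholder: probability of extinction from unaligned AGI
p_s_risk = 1e-4        # placeholder: probability of a large-scale suffering outcome
severity_ratio = 1e3   # placeholder: how many times worse an s-risk outcome is
                       # than extinction, on an arbitrary disvalue scale

ev_extinction = p_extinction * 1.0         # extinction badness normalized to 1
ev_s_risk = p_s_risk * severity_ratio      # s-risk badness relative to extinction

print(f"expected disvalue, extinction: {ev_extinction:g}")
print(f"expected disvalue, s-risk:     {ev_s_risk:g}")
print(f"s-risk / extinction ratio:     {ev_s_risk / ev_extinction:g}")
```

On these placeholder numbers the two terms come out equal, which is exactly why the probability estimate (and the severity ratio one assigns) matters so much for prioritization.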

Thanks again for engaging with these hard questions.

On Wireheading
CanYouFeelTheBenefits · 3mo* · 84

This is indeed an often-overlooked yet potentially very effective approach to alleviating suffering.

Another key research question is the optimal placement of the electrode. The classic rat experiments stimulated areas that induce intense wanting for the stimulation; the rats probably did not experience pleasure so much as a craving for more stimulation.

Pleasure/joy itself is encoded in the medial orbitofrontal cortex, so this area might be worth looking into.

I could also imagine multiple electrodes stimulating all of the brain's hedonic hotspots, e.g. the nucleus accumbens, insula, ventral pallidum, and orbitofrontal cortex. More on this here: https://www.pnas.org/doi/full/10.1073/pnas.1705753114

Electrode stimulation of the insular cortex has already been shown to produce bliss in humans, so this location might also be a good candidate:

"Insular Stimulation Produces Mental Clarity and Bliss": https://pmc.ncbi.nlm.nih.gov/articles/PMC9300149/

Posts

15 · How likely are “s-risks” (large-scale suffering outcomes) from unaligned AI compared to extinction risks? [Question] · 1mo · 2 comments
5 · How likely are “s-risks” (large-scale suffering outcomes) from unaligned AI compared to extinction risks? · 1mo · 0 comments
8 · Against the Inevitability of Habituation to Continuous Bliss · 1mo · 0 comments
5 · Are current LLMs safe for psychotherapy? · 9mo · 4 comments