Wow! Many thanks for posting that link. It's clearly the most important thing I've read on LW in a long time, I'd upvote it ten times if I could.
It seems like an s-risk outcome (even one that keeps some people happy) could be more than a million times worse than an x-risk outcome, while not being a million times more improbable, so focusing on s-risks is correct. The argument wasn't as clear to me before. Does anyone have good counterarguments? Why shouldn't we all focus on s-risk from now on?
(Unsong had a plot point where Peter Singer declared that the most important task for effective altruists was to destroy Hell. Big props to Scott for seeing it before the rest of us.)
I don't buy the "million times worse," at least not if we talk about the relevant E(s-risk moral value) / E(x-risk moral value) rather than the irrelevant E(s-risk moral value / x-risk moral value). See this post by Carl and this post by Brian. I think that responsible use of moral uncertainty will tend to push you away from this kind of fanatical view.
I agree that if you accept the million-to-1 ratio then you should be predominantly concerned with s-risk; I think s-risks are somewhat improbable/intractable, but not that improbable and intractable. I'd guess the probability is ~100x lower, and the available object-level interventions are perhaps 10x less effective. The particular scenarios discussed here seem unlikely to lead to optimized suffering; only "conflict" and "???" really make any sense to me. Even on the negative utilitarian view, it seems like you shouldn't care about anything other than optimized suffering.
The best object-level intervention I can think of is reducing our civilization's expected vulnerability to extortion, which seems poorly-leveraged relative to alignment because it is much less time-sensitive (unless we fail at alignment and so end up committi... (read more)
Paul, thank you for the substantive comment!
Carl's post sounded weird to me, because large amounts of human utility (more
than just pleasure) seem harder to achieve than large amounts of human
disutility (for which pain is enough). You could say that some possible minds
are easier to please, but human utility doesn't necessarily value such minds
enough to counterbalance s-risk.
Brian's post focuses more on possible suffering of insects or quarks. I don't
feel quite as morally uncertain about large amounts of human suffering, do you?
As to possible interventions, you have clearly thought about this for longer
than me, so I'll need time to sort things out. This is quite a shock.
large amounts of human utility (more than just pleasure) seem harder to achieve than large amounts of human disutility (for which pain is enough).
Carl gave a reason that future creatures, including potentially very human-like minds, might diverge from current humans in a way that makes hedonium much more efficient. If you assigned significant probability to that kind of scenario, it would quickly undermine your million-to-one ratio. Brian's post briefly explains why you shouldn't argue "If there is a 50% chance that x-risks are 2 million times worse, then they are a million times worse in expectation." (I'd guess that there is a good chance, say > 25%, that good stuff can be as efficient as bad stuff.)
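To make the ratio-of-expectations vs. expectation-of-ratio point concrete, here is a toy calculation (the specific numbers are my own illustrative assumptions, not figures from Carl's or Brian's posts):

```python
# Toy sketch of why E(ratio) misleads. Assume a 50/50 split between two worlds:
# in one, suffering is 2,000,000x as resource-efficient as happiness; in the
# other, the reverse. All values are per unit of resources.
scenarios = [
    {"p": 0.5, "suffering": 2_000_000.0, "happiness": 1.0},
    {"p": 0.5, "suffering": 1.0,         "happiness": 2_000_000.0},
]

def expected(key):
    return sum(s["p"] * s[key] for s in scenarios)

e_s_over_h = sum(s["p"] * s["suffering"] / s["happiness"] for s in scenarios)
e_h_over_s = sum(s["p"] * s["happiness"] / s["suffering"] for s in scenarios)

print(e_s_over_h)   # ~1,000,000: "suffering is a million times more important"
print(e_h_over_s)   # ~1,000,000: "happiness is a million times more important" (!)
print(expected("suffering") / expected("happiness"))   # 1.0 in a common unit
```

Both expectation-of-ratio calculations come out at "a million times", which cannot both be action-guiding; the ratio of expectations in a fixed common unit is the quantity that actually feeds into a decision.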
I would further say: existing creatures often prefer to keep living even given the possibility of extreme pain. This can be easily explained by an evolutionary story, which suffering-focused utilitarians tend to view as a debunking explanation: given that animals would prefer to keep living regardless of the actual balance of pleasure and pain, we shouldn't infer anything from that preference. But our strong dispreference for intense suffering has a similar evolutionary origin, and is no more reflective of underlying moral facts than is our strong preference for survival.
In support of this, my system 1 reports that if it sees more intelligent people
taking S-risk seriously it is less likely to nuke the planet if it gets the
chance. (I'm not sure I endorse nuking the planet, just reporting an emotional reaction).
4Kaj_Sotala6y
Can you elaborate on what you mean by this? People like Brian or others at FRI
don't seem particularly averse to philosophical deliberation to me...
I support this compromise and agree not to destroy the world. :-)
9Lukas_Gloor6y
Those of us who sympathize with suffering-focused ethics have an incentive to
encourage others to think about their values now, at least in crude enough
terms to take a stance on prioritizing the prevention of s-risks vs. making sure we get
to a position where everyone can safely deliberate about their values further and then
everything gets fulfilled. Conversely, if one (normatively!) thinks the
downsides of bad futures are unlikely to be much worse than the upsides of good
futures, then one is incentivized to promote caution about taking confident
stances on anything population-ethics-related, and instead value deeper
philosophical reflection. The latter also has the upside of being good from a
cooperation point of view: Everyone can work on the same priority (building safe
AI that helps with philosophical reflection) regardless of one's inklings about
how personal value extrapolation is likely to turn out.
(The situation becomes more interesting/complicated for suffering-focused
altruists once we add considerations of multiverse-wide compromise via
coordinated decision-making
[https://casparoesterheld.com/2017/05/16/talk-on-multiverse-wide-cooperation-via-correlated-decision-making/]
, which, in extreme versions at least, would call for being "updateless" about
the direction of one's own values.)
4paulfchristiano6y
People vary in what kinds of values change they would consider drift vs.
endorsed deliberation. Brian has in the past publicly come down unusually far on
the side of "change = drift," I've encountered similar views on one other
occasion from this crowd, and I had heard second hand that this was relatively
common.
Brian or someone more familiar with his views could speak more authoritatively
to that aspect of the question, and I might be mistaken about the views of the
suffering-focused utilitarians more broadly.
2ESRogs6y
Did you mean to say, "if the latter" (such that x-risk and s-risk reduction are
aligned when suffering-hating civilizations decrease s-risk), rather than "if
the former"?
I feel a weird disconnect on reading comments like this. I thought s-risks were a part of conventional wisdom on here all along. (We even had an infamous scandal that concerned one class of such risks!) Scott didn't "see it before the rest of us" -- he was drawing on an existing, and by now classical, memeplex.
It's like when some people spoke as if nobody had ever thought of AI risk until Bostrom wrote Superintelligence -- even though that book just summarized what people (not least Bostrom himself) had already been saying for years.
I guess I didn't think about it carefully before. I assumed that s-risks were much less likely than x-risks (true) so it's okay not to worry about them (false). The mistake was that logical leap.
In terms of utility, the landscape of possible human-built superintelligences might look like a big flat plain (paperclippers and other things that kill everyone without fuss), with a tall sharp peak (FAI) surrounded by a pit that's astronomically deeper (many almost-FAIs and other designs that sound natural to humans). The pit needs to be compared to the peak, not the plain. If the pit is more likely, I'd rather have the plain. Was it obvious to you all along?
Didn't you realize this yourself back in 2012
[http://lesswrong.com/lw/axj/the_ai_design_space_near_the_fai_draft/623p]?
1cousin_it6y
I didn't realize then that disutility of human-built AI can be much larger than
utility of FAI, because pain is easier to achieve than human utility (which
doesn't reduce to pleasure). That makes the argument much stronger.
1Wei_Dai6y
This argument doesn't actually seem to be in the article that Kaj linked to. Did
you see it somewhere else, or come up with it yourself? I'm not sure it makes
sense, but I'd like to read more if it's written up somewhere. (My objection is
that "easier to achieve" doesn't necessarily mean the maximum value achievable
is higher. It could be that it would take longer or more effort to achieve the
maximum value, but the actual maximums aren't that different. For example, maybe
the extra stuff needed for human utility (aside from pleasure) is complex but
doesn't actually cost much in terms of mass/energy.)
3cousin_it6y
The argument somehow came to my mind yesterday, and I'm not sure it's true
either. But do you really think human value might be as easy to maximize as
pleasure or pain? Pain is only about internal states, and human value seems to
be partly about external states, so it should be way more expensive.
7David Althaus6y
One of the more crucial points, I think, is that positive utility is – for most
humans – complex and its creation is conjunctive. Disutility, in contrast, is
disjunctive. Consequently, the probability of creating the former is smaller
than that of creating the latter – all else being equal (of course, all else is not equal).
In other words, the scenarios leading towards the creation of (large amounts of)
positive human value are conjunctive: to create a highly positive future, we
have to eliminate (or at least substantially reduce) physical pain and boredom
and injustice and loneliness and inequality (at least certain forms of it) and
death, etc. etc. etc. (You might argue that getting "FAI" and "CEV" right would
accomplish all those things at once (true) but getting FAI and CEV right is, of
course, a highly conjunctive task in itself.)
In contrast, disutility is much more easily created and essentially disjunctive.
Many roads lead towards dystopia: sadistic programmers or failing AI safety
wholesale (or "only" value-loading or extrapolating, or stable
self-modification), or some totalitarian regime takes over, etc. etc.
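The structural point can be put as a small probability sketch (the counts and probabilities below are purely illustrative assumptions of mine, not claims from the comment):

```python
# Conjunctive goods vs. disjunctive bads: a highly positive future needs many
# requirements to hold at once, while a single failure mode suffices for a very
# bad one. Illustrative numbers only.
n_requirements = 12    # things that must all go right for a highly positive future
p_each_right = 0.9     # assumed probability that each requirement is met

n_failure_modes = 15   # independent routes to a very bad outcome
p_each_failure = 0.05  # assumed probability of each route

p_highly_positive = p_each_right ** n_requirements           # ~0.28
p_very_bad = 1 - (1 - p_each_failure) ** n_failure_modes     # ~0.54

print(f"P(all {n_requirements} requirements met): ~{p_highly_positive:.2f}")
print(f"P(at least one of {n_failure_modes} failure modes): ~{p_very_bad:.2f}")
```

All else being equal (which, as noted above, it is not), the conjunction shrinks and the disjunction grows as the number of components increases.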
It's also not a coincidence that even the most untalented writer with the most
limited imagination can conjure up a convincing dystopian society. Envisioning a
true utopia in concrete detail, on the other hand, is nigh impossible for most
human minds.
Footnote 10 of the above-mentioned s-risk article
[https://foundational-research.org/reducing-risks-of-astronomical-suffering-a-neglected-priority/]
makes a related point (emphasis mine):
"[...] human intuitions about what is valuable are often complex and fragile
(Yudkowsky, 2011), taking up only a small area in the space of all possible
values. In other words, the number of possible configurations of matter
constituting anything we would value highly (under reflection) is arguably
smaller than the number of possible configurations that constitute some sort of
strong suffering or disvalue, making the inciden
2cousin_it6y
Yeah, I also had the idea about utility being conjunctive and mentioned it in a
deleted reply to Wei, but then realized that Eliezer's version (fragility of
value) already exists and is better argued.
On the other hand, maybe the worst hellscapes can be prevented in one go, if we
"just" solve the problem of consciousness and tell the AI what suffering means.
We don't need all of human value for that. Hellscapes without suffering can also
be pretty bad in terms of human value, but not quite as bad, I think. Of course
solving consciousness is still a very tall order, but it might be easier than
solving all philosophy that's required for FAI, and it can lead to other
shortcuts like in my recent post
[http://lesswrong.com/lw/p56/thought_experiment_coarsegrained_vr_utopia/] (not
that I'd propose them seriously).
4Lukas_Gloor6y
Some people at MIRI might be thinking about this under nonperson predicate
[https://arbital.com/p/nonperson_predicate/]. (Eliezer's view on which
computations matter morally is different from the one endorsed by Brian
[http://reducing-suffering.org/which-computations-do-i-care-about/], though.)
And maybe it's important to not limit FAI options too much by preventing
mindcrime at all costs – if there are benefits against other very bad failure
modes (or – cooperatively – just increased controllability for the people who
care a lot about utopia-type outcomes), maybe some mindcrime in the early stages
to ensure goal-alignment would be the lesser evil.
-1dogiv6y
Human disutility includes more than just pain too. Destruction of humanity
(the flat plain you describe) carries a great deal of negative utility for me,
even if I disappear without feeling any pain at all. There's more disutility if
all life is destroyed, and more if the universe as a whole is destroyed... I
don't think there's any fundamental asymmetry. Pain and pleasure are the most
immediate ways of affecting value, and probably the ones that can be achieved
most efficiently in computronium, so external states probably don't come into
play much at all if you take a purely utilitarian view.
0[anonymous]6y
Our values might say, for example, that a universe filled with suffering insects
is very undesirable, but a universe filled with happy insects isn't very
desirable. More generally, if our values are a conjunction of many different
values, then it's probably easier to create a universe where one is strongly
negative and the rest are zero, than a universe where all are strongly positive.
I haven't seen the argument written up, I'm trying to figure it out now.
4Kaj_Sotala6y
Huh, I feel very differently. For AI risk specifically, I thought the
conventional wisdom was always "if AI goes wrong, the most likely outcome is
that we'll all just die, and the next most likely outcome is that we get a
future which somehow goes against our values even if it makes us very happy
[http://lesswrong.com/lw/xu/failed_utopia_42/]." And besides AI risk, other
x-risks haven't really been discussed at all on LW. I don't recall seeing any
argument for s-risks being a particularly plausible category of risks, let alone
one of the most important ones.
It's true that there was That One Scandal, but the reaction to that was quite
literally Let's Never Talk About This Again - or alternatively Let's Keep
Bringing This Up To Complain About How It Was Handled, depending on the person
in question - and even then, people only ever seemed to be talking about that
specific incident and argument. I never saw anyone draw the conclusion that
"hey, this looks like an important subcategory of x-risks that warrants separate
investigation and dedicated work to avoid".
8Wei_Dai6y
There was some discussion back in 2012
[http://lesswrong.com/lw/ajm/ai_risk_and_opportunity_a_strategic_analysis/5ylx]
and sporadically since
[http://lesswrong.com/lw/hzs/three_approaches_to_friendliness/9e9g] then
[http://lesswrong.com/lw/e97/stupid_questions_open_thread_round_4/7ba4]. (ETA:
You can also do a search for "hell simulations" and get a bunch more results.)
I've always thought that in order to prevent astronomical suffering, we will
probably want to eventually (i.e., after a lot of careful thought) build an FAI
that will colonize the universe and stop any potential astronomical suffering
arising from alien origins and/or try to reduce suffering in other universes via
acausal trade etc., so the work isn't very different from other x-risk work. But
now that the x-risk community is larger, maybe it does make sense to split out
some of the more s-risk specific work?
2paulfchristiano6y
It seems like the most likely reasons to create suffering come from the
existence of suffering-hating civilizations. Do you think that it's clear/very
likely that it is net helpful for there to be more mature suffering-hating
civilizations? (On the suffering-focused perspective.)
7Wei_Dai6y
My intuition is that there is no point in trying to answer questions like these
before we know a lot more about decision theory, metaethics, metaphilosophy, and
normative ethics, so pushing for a future where these kinds of questions
eventually get answered correctly (and the answers make a difference in what
happens) seems like the most important thing to do. It doesn't seem to make
sense to try to lock in some answers (i.e., make our civilization
suffering-hating or not suffering-hating) on the off chance that when we figure
out what the answers actually are, it will be too late. Someone with much less
moral/philosophical uncertainty
[http://reducing-suffering.org/summary-beliefs-values-big-questions/] than I do
would perhaps prioritize things differently, but I find it difficult to motivate
myself to think really hard from their perspective.
1paulfchristiano6y
This question seems like a major input into whether x-risk reduction is useful.
1Wei_Dai6y
If we try to answer the question now, it seems very likely we'll get the answer
wrong (given my state of uncertainty about the inputs that go into the
question). I want to keep civilization going until we know better how to answer
these types of questions. For example if we succeed in building a correctly
designed/implemented Singleton FAI, it ought to be able to consider this
question at leisure, and if it becomes clear that the existence of mature
suffering-hating civilizations actually causes more suffering to be created,
then it can decide to not make us into a mature suffering-hating civilization,
or take whatever other action is appropriate.
Are you worried that by the time such an FAI (or whatever will control our
civilization) figures out the answer, it will be too late? (Why? If we can
decide that x-risk reduction is bad, then so can it. If it's too late to alter
or end civilization at that point, why isn't it already too late for us?) Or are
you worried more that the question won't be answered correctly by whatever will
control our civilization?
1paulfchristiano6y
If you are concerned exclusively with suffering, then increasing the number of
mature civilizations is obviously bad and you'd prefer that the average
civilization not exist. You might think that our descendants are particularly
good to keep around, since we hate suffering so much. But in fact almost all
s-risks occur precisely because of civilizations that hate suffering, so it's
not at all clear that creating "the civilization that we will become on
reflection" is better than creating "a random civilization" (which is bad).
To be clear, even if we have modest amounts of moral uncertainty I think it
could easily justify a "wait and see" style approach. But if we were committed
to a suffering-focused view then I don't think your argument works.
2Wei_Dai6y
It seems just as plausible to me that suffering-hating civilizations reduce the
overall amount of suffering in the multiverse, so I think I'd wait until it
becomes clear which is the case, even if I was concerned exclusively with
suffering. But I haven't thought about this question much, since I haven't had a
reason to assume an exclusive concern with suffering, until you started asking
me to.
Earlier in this thread I'd been speaking from the perspective of my own moral
uncertainty, not from a purely suffering-focused view, since we were discussing
the linked article, and Kaj had written
[http://lesswrong.com/lw/p5v/srisks_why_they_are_the_worst_existential_risks/duas]
:
What's your reason for considering a purely suffering-focused view? Intellectual
curiosity? Being nice to or cooperating with people like Brian Tomasik by
helping to analyze one of their problems?
1paulfchristiano6y
Understanding the recommendations of each plausible theory seems like a useful
first step in decision-making under moral uncertainty.
0Lukas_Gloor6y
Perhaps this, in case it turns out to be highly important but difficult to get
certain ingredients – e.g. priors or decision theory – exactly right. (But I
have no idea, it's also plausible that suboptimal designs could patch themselves
well, get rescued somehow, or just have their goals changed without much fuss.)
0komponisto6y
That sort of subject is inherently implicit in the kind of decision-theoretic
questions that MIRI-style AI research involves. More generally, when one is
thinking about astronomical-scale questions, and aggregating utilities, and so
on, it is a matter of course that cosmically bad outcomes are as much of a
theoretical possibility as cosmically good outcomes.
Now, the idea that one might need to specifically think about the bad outcomes,
in the sense that preventing them might require strategies separate from those
required for achieving good outcomes, may depend on additional assumptions that
haven't been conventional wisdom here.
0Kaj_Sotala6y
Right, I took this idea to be one of the main contributions of the article, and
assumed that this was one of the reasons why cousin_it felt it was important and
novel.
3[anonymous]6y
Thanks for voicing this sentiment I had upon reading the original comment. My
impression was that negative utilitarian viewpoints / things of this sort had
been trending for far longer than cousin_it's comment might suggest.
2Kaj_Sotala6y
The article isn't specifically negative utilitarian, though - even classical
utilitarians would agree that having astronomical amounts of suffering is a bad
thing. Nor do you have to be a utilitarian in the first place to think it would
be bad: as the article itself notes, pretty much all major value systems
probably agree on s-risks being a major Bad Thing:
3AlexMennen6y
Yes, but the claim that that risk needs to be taken seriously is certainly not
conventional wisdom around here.
0komponisto6y
Decision theory (which includes the study of risks of that sort) has long been a
core component of AI-alignment research.
0AlexMennen6y
No, it doesn't. Decision theory deals with abstract utility functions. It can
talk about outcomes A, B, and C where A is preferred to B and B is preferred to
C, but doesn't care whether A represents the status quo, B represents death, and
C represents extreme suffering, or whether A represents gaining lots of wealth
and status, B represents the status quo, and C represents death, so long as the
ratios of utility differences are the same in each case. Decision theory has
nothing to do with the study of s-risks.
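A minimal sketch of the invariance Alex is pointing at (the outcomes, utilities, and lotteries are hypothetical examples of mine): an expected-utility maximizer's choice depends only on the relative structure of the utilities, not on what the outcomes actually are.

```python
# Expected-utility decisions are invariant under positive affine transformations
# of the utility function, so the formalism is indifferent to whether C stands
# for "death" or "extreme suffering" as long as the ratios of utility
# differences match.
def expected_utility(lottery, utility):
    return sum(p * utility[outcome] for outcome, p in lottery.items())

u1 = {"A": 1.0, "B": 0.4, "C": 0.0}
u2 = {o: 3 * u + 7 for o, u in u1.items()}   # same preferences, rescaled and shifted

sure_thing = {"B": 1.0}             # outcome B for certain
gamble = {"A": 0.6, "C": 0.4}       # 60% chance of A, 40% chance of C

for u in (u1, u2):
    print(expected_utility(gamble, u) > expected_utility(sure_thing, u))  # True both times
```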
0komponisto6y
The first and last sentences of the parent comment do not follow from the
statements in between.
0Kaj_Sotala6y
That doesn't seem to refute or change what Alex said?
0komponisto6y
What Alex said doesn't seem to refute or change what I said.
But also: I disagree with the parent. I take conventional wisdom here to include
support for MIRI's agent foundations agenda, which includes decision theory,
which includes the study of such risks (even if only indirectly or implicitly).
0[anonymous]6y
Fair enough. I guess I didn't think carefully about it before. I assumed that
s-risks were much less likely than x-risks (true) and so they could be
discounted (false). It seems like the right way to imagine the landscape of
superintelligences is a vast flat plain (paperclippers and other things that
kill everyone without fuss) with a tall thin peak (FAIs) surrounded by a pit
that's astronomically deeper (FAI-adjacent and other designs). The right
comparison is between the peak and the pit, because if the pit is more likely,
I'd rather have the plain.
4casebash6y
I think the reason why cousin_it's comment is upvoted so much is that a lot of
people (including me) weren't really aware of S-risks or how bad they could be.
It's one thing to just make a throwaway line that S-risks could be worse, but
it's another thing entirely to put together a convincing argument.
Similar ideas have appeared in other articles, but they've been framed in terms of
energy-efficiency
[http://reflectivedisequilibrium.blogspot.com.au/2012/03/are-pain-and-pleasure-equally-energy.html]
while invoking unfamiliar terms such as computronium or the two-envelopes problem
[http://reducing-suffering.org/two-envelopes-problem-for-brain-size-and-moral-uncertainty/]
, which makes them much less clear. I don't think I saw the links for either of
those articles before, but if I had, I probably wouldn't have read them.
I also think that the title helps as well. S-risks is a catchy name, especially
if you already know x-risks. I know that this term has been used before
[https://foundational-research.org/reducing-risks-of-astronomical-suffering-a-neglected-priority/]
, but it wasn't used in the title. Further, while being quite a good article,
you can read the summary, introduction and conclusion without encountering the
idea that the author believes that s-risks are much greater than x-risks, as
opposed to being just yet another risk to worry about.
I think there's definitely an important lesson to be drawn here. I wonder how
many other articles have gotten close to an important truth, but just failed to
hit it out of the park for some reason or another.
2Lukas_Gloor6y
Interesting!
I'm only confident about endorsing this conclusion conditional on having values
where reducing suffering matters a great deal more than promoting happiness. So
we wrote the "Reducing risks of astronomical suffering" article in a
deliberately 'balanced' way, pointing out the different perspectives. This is
why it didn't come away making any very strong claims. I don't find the
energy-efficiency point convincing at all, but for those who do, x-risks are
likely (though not with very high confidence) still more important, mainly
because more futures will be optimized for good outcomes rather than bad
outcomes, and this is where most of the value is likely to come from. The "pit"
around the FAI-peak is in expectation extremely bad compared to anything that
exists currently, but most of it is just accidental suffering that is still
comparatively unoptimized. So in the end, whether s-risks or x-risks are more
important to work on at the margin depends on how suffering-focused or not
someone's values are.
Having said that, I totally agree that more people should be concerned about
s-risks and it's concerning that the article (and the one on suffering-focused
AI safety) didn't manage to convey this point well.
0[comment deleted]6y
2Jiro6y
That sounds like a recipe for Pascal's Mugging.
3cousin_it6y
Only if you think one-in-a-million events are as rare as meeting god in person.
2David Althaus6y
The article that introduced the term "s-risk"
[https://foundational-research.org/reducing-risks-of-astronomical-suffering-a-neglected-priority/]
was shared on LessWrong in October 2016
[http://lesswrong.com/lw/o0t/reducing_risks_of_astronomical_suffering_srisks_a/]
. The content of the article and the talk seem similar.
Did you simply not come across it or did the article just (catastrophically)
fail to explain the concept of s-risks and its relevance?
3cousin_it6y
I've seen similar articles before, but somehow this was the first one that shook
me. Thank you for doing this work!
1ignoranceprior6y
And the concept is much older than that. The 2011 Felicifia post "A few dystopic
future scenarios"
[https://web.archive.org/web/20141201202015/http://felicifia.org/viewtopic.php?p=4454]
by Brian Tomasik outlined many of the same considerations that FRI works on
today (suffering simulations, etc.), and of course Brian has been blogging about
risks of astronomical suffering since then. FRI itself was founded in 2013.
0Lumifer6y
Iain Banks' Surface Detail published in 2010 featured a war over the existence
of virtual hells (simulations constructed explicitly to punish the ems of
sinners).
2tristanm6y
The only counterarguments I can think of would be:
 * The claim that the likelihood of s-risks is close to that of x-risks seems
not well argued to me. In particular, conflict seems to be the most plausible
scenario (and one which has a high prior placed on it as we can observe that
much suffering today is caused by conflict), but it seems to be less and less
likely of a scenario once you factor in superintelligence, as multi-polar
scenarios seem to be either very short-lived or unlikely to happen at all.
* We should be wary of applying anthropomorphic traits to hypothetical
artificial agents in the future. Pain in biological organisms may very well
have evolved as a proxy to negative utility, and might not be necessary in
"pure" agent intelligences which can calculate utility functions directly.
It's not obvious to me that implementing suffering in the sense that humans
understand it would be cheaper or more efficient for a superintelligence to
do instead of simply creating utility-maximizers when it needs to produce a
large number of sub-agents.
* High overlap between approaches to mitigating x-risk and approaches to
mitigating s-risks. If the best chance of mitigating future suffering is
trying to bring about a friendly artificial intelligence explosion, then it
seems that the approaches we are currently taking should still be the correct
ones.
* More speculatively: If we focus heavily on s-risks, does this open us up to
issues regarding utility-monsters? Can I extort people by creating a
simulation of trillions of agents and then threaten to minimize their
utility? (If we simply value the sum of utility, and not necessarily the
complexity of the agent having the utility, then this should be relatively
cheap to implement).
2cousin_it6y
I think the most general response to your first three points would look
something like this: Any superintelligence that achieves human values will be
adjacent in design space to many superintelligences that cause massive
suffering, so it's quite likely that the wrong superintelligence will win, due
to human error, malice, or arms races.
As to your last point, it looks more like a research problem than a
counterargument, and I'd be very interested in any progress on that front :-)
0Lumifer6y
Why so? Flipping the sign doesn't get you "adjacent", it gets you "diametrically
opposed".
If you really want chocolate ice cream, "adjacent" would be getting strawberry
ice cream, not having ghost pepper extract poured into your mouth.
3Good_Burning_Plastic6y
They said "adjacent in design space". The Levenshtein distance between
"return val;" and "return -val;" is 1.
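As a toy illustration of that kind of adjacency (entirely my own example, not a claim about any actual system): two objective functions one character apart can send the same optimizer to opposite extremes.

```python
# Two objectives at Levenshtein distance 1; a greedy hill-climber maximizing
# each ends up in opposite corners of the search space.
def intended_utility(x: float) -> float:
    return -(x - 1.0) ** 2   # wants x near 1

def sign_flipped_utility(x: float) -> float:
    return +(x - 1.0) ** 2   # one-character change: wants x as far from 1 as possible

def hill_climb(utility, x=0.0, step=0.1, iters=1000, lo=-10.0, hi=10.0):
    """Greedy maximization of `utility` within [lo, hi]."""
    for _ in range(iters):
        x = max((x, min(hi, x + step), max(lo, x - step)), key=utility)
    return x

print(round(hill_climb(intended_utility), 2))       # ~1.0
print(round(hill_climb(sign_flipped_utility), 2))   # -10.0, pinned at a boundary
```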
-1Lumifer6y
So being served a cup of coffee and being served a cup of pure capsaicin are
"adjacent in design space"? Maybe, but funny how that problem doesn't arise or
even worry anyone...
7dogiv6y
More like driving to the store and driving into the brick wall of the store are
adjacent in design space.
0cousin_it6y
That's a twist on a standard LW argument, see e.g. here
[https://wiki.lesswrong.com/wiki/Complexity_of_value]:
It seems to me that fragility of value can lead to massive suffering in many
ways.
0Lumifer6y
You're basically dialing that argument up to eleven. From "losing a small part
could lead to unacceptable results" you are jumping to "losing any small part
will lead to unimaginable hellscapes":
0cousin_it6y
Yeah, not all parts. But even if it's a 1% chance, one hellscape might balance
out a hundred universes where FAI wins. Pain is just too effective at creating
disutility. I understand why people want to be optimistic, but I think being
pessimistic in this case is more responsible.
0Lumifer6y
So basically you are saying that the situation is asymmetric: the
impact/magnitude of possible bad things is much much greater than the
impact/magnitude of possible good things. Is this correct?
3cousin_it6y
Yeah. One sign of asymmetry is that creating two universes, one filled with
pleasure and the other filled with pain, feels strongly negative rather than
symmetric to us. Another sign is that pain is an internal experience, while our
values might refer to the external world (though it's very murky), so the former
might be much easier to achieve. Another sign is that in our world it's much
easier to create a life filled with pain than a life that fulfills human values.
3dogiv6y
Yes, many people intuitively feel that a universe of pleasure and a universe of
pain add to a net negative. But I suspect that's just a result of experiencing
(and avoiding) lots of sources of extreme pain in our lives, while sources of
pleasure tend to be diffuse and relatively rare. The human experience of
pleasure is conjunctive because in order to survive and reproduce you must
fairly reliably avoid all types of extreme pain. But in a pleasure-maximizing
environment, removing pain will be a given.
It's also true that our brains tend to adapt to pleasure over time, but that
seems simple to modify once physiological constraints are removed.
2CarlShulman6y
"one filled with pleasure and the other filled with pain, feels strongly
negative rather than symmetric to us"
Comparing pains and pleasures of similar magnitude? People have a tendency not
to do this, see the linked thread
[https://www.facebook.com/groups/effective.altruists/permalink/1117549958301360/?comment_id=1118800308176325&comment_tracking=%7B%22tn%22%3A%22R%22%7D]
.
"Another sign is that pain is an internal experience, while our values might
refer to the external world (though it's very murky"
You accept pain and risk of pain all the time to pursue various pleasures,
desires and goals. Mice will cross electrified surfaces for tastier treats.
If you're going to care about hedonic states as such, why treat the external
case differently?
Alternatively, if you're going to dismiss pleasure as just an indicator of true
goals (e.g. that pursuit of pleasure as such is 'wireheading') then why not
dismiss pain in the same way, as just a signal and not itself a goal?
0cousin_it6y
My point was comparing pains and pleasures that could be generated with similar
amount of resources. Do you think they balance out for human decision making?
For example, I'd strongly object to creating a box of pleasure and a box of
pain; do you think my preference would go away after extrapolation?
1CarlShulman6y
"My point was comparing pains and pleasures that could be generated with similar
amount of resources. Do you think they balance out for human decision making?"
I think with current tech it's cheaper and easier to wirehead to increase pain
(i.e. torture) than to increase pleasure or reduce pain. This makes sense
biologically, since organisms won't go looking for ways to wirehead to maximize
their own pain, evolution doesn't need to 'hide the keys' as much as with
pleasure or pain relief (where the organism would actively seek out easy means
of subverting the behavioral functions of the hedonic system). Thus when
powerful addictive drugs are available, such as alcohol, human populations
evolve increased resistance over time. The sex systems evolve to make
masturbation less rewarding than reproductive sex under ancestral conditions,
desire for play/curiosity is limited by boredom, delicious foods become less
pleasant when full or the foods are not later associated with nutritional
sensors in the stomach, etc.
I don't think this is true with fine control over the nervous system (or a
digital version) to adjust felt intensity and behavioral reinforcement. I think
with that sort of full access one could easily increase the intensity (and ease
of activation) of pleasures/mood such that one would trade them off against the
most intense pains at ~parity per second, and attempts at subjective comparison
when or after experiencing both would put them at ~parity.
People will willingly undergo very painful jobs and undertakings for money,
physical pleasures, love, status, childbirth, altruism, meaning, etc. Unless you
have a different standard for the 'boxes' than is used in subjective comparison
with rich experience of the things to be compared, I think we're just haggling over
the price re intensity.
We know the felt caliber and behavioral influence of such things can vary
greatly. It would be possible to alter nociception and pain receptors to amp up
or damp down any particular
0cousin_it6y
We could certainly make agents for whom pleasure and pain would use equal
resources per util. The question is if human preferences today (or extrapolated)
would sympathize with such agents to the point of giving them the universe.
Their decision-making could look very inhuman to us. If we value such agents
with a discount factor, we're back at square one.
2CarlShulman6y
That's what the congenital deafness discussion was about.
You have preferences over pain and pleasure intensities that you haven't
experienced, or over new durations of experiences you do know. Otherwise you wouldn't
have anything to worry about re torture, since you haven't experienced it.
Consider people with pain asymbolia
[https://link.springer.com/referenceworkentry/10.1007%2F978-0-387-79948-3_762]:
Suppose you currently had pain asymbolia. Would that mean you wouldn't object to
pain and suffering in non-asymbolics? What if you personally had only happened
to experience extremely mild discomfort while having lots of great positive
experiences? What about for yourself? If you knew you were going to get a cure
for your pain asymbolia tomorrow would you object to subsequent torture as
intrinsically bad?
We can go through similar stories for major depression and positive mood.
Seems it's the character of the experience that matters.
Likewise, if you've never experienced skiing, chocolate, favorite films, sex,
victory in sports, and similar things that doesn't mean you should act as though
they have no moral value. This also holds true for enhanced experiences and
experiences your brain currently is unable to have, like the case of congenital
deafness followed by a procedure to grant hearing and listening to music.
0cousin_it6y
Music and chocolate are known to be mostly safe. I guess I'm more cautious about
new self-modifications that can change my decisions massively, including
decisions about more self-modifications. It seems like if I'm not careful, you
can devise a sequence that will turn me into a paperclipper. That's why I
discount such agents for now, until I understand better what CEV means.
0Kaj_Sotala6y
This seems plausible but not obvious to me. Humans are superintelligent as
compared to chimpanzees (let alone, say, Venus flytraps), but humans have still
formed a multipolar civilization.
0tristanm6y
When thinking about whether s-risk scenarios are tied to or come about by
similar means as x-risk scenarios (such as a malign intelligence explosion), the
relevant issue to me seems to be whether or not such a scenario could result in
a multi-polar conflict of cosmic proportions. I think the chance of that
happening is quite low, since intelligence explosions seem to be most likely to
result in a singleton.
0[anonymous]6y
Due to complexity and fragility of human values, any superintelligence that
fulfills them will probably be adjacent in design space to many other
superintelligences that cause lots of suffering (which is also much cheaper), so
a wrong superintelligence might take over due to human error or malice or arms
races. That's where most s-risk is coming from, I think. The one in a million
number seems optimistic, actually.
2turchin6y
I agree that preventing s-risks is important, but I will try to look at possible
counterarguments:
 1. A benevolent AI will be able to fight an acausal war against evil AIs in other
    branches of the multiverse by creating more happy copies of me, or more paths
    from suffering observer-moments to happy observer-moments. So creating a
    benevolent superintelligence will help against suffering everywhere in the
    multiverse.
 2. Non-existence is the worst form of suffering if we define suffering as an
    action against our most important values. Thus x-risks are s-risks. Pain is
    not always suffering, as masochists exist.
 3. If we pay too much attention to animal suffering, we give ground to projects
    like the Voluntary Human Extinction Movement, and so increase the chances of
    human extinction, since it is humans who created animal farms. Moreover, if we
    agree that non-existence is not suffering, we could kill all life on Earth and
    stop all suffering - which is not right.
 4. A benevolent AI will be able to resurrect all possible sentient beings and
    animals and provide them with an infinite paradise, thus compensating for any
    current suffering of animals.
 5. Only infinite and unbearable suffering is bad. We should distinguish
    unbearable suffering, like agony, from ordinary suffering, which is just a
    reinforcement-learning signal for the wetware of our brains, informing us
    about past wrong decisions or the need to call a doctor.
5cousin_it6y
I think all of these are quite unconvincing and the argument stays intact, but
thanks for coming up with them.
2turchin6y
 1. I think a longer explanation is needed to show how a benevolent AI could save
    observers from an evil AI. It is not just compensation for suffering; it is
    based on the idea of indexical uncertainty between identical observers. If two
    identical observer-moments exist, neither knows which one it is. So a
    benevolent AI creates 1000 copies of an observer-moment which is in the jail
    of the evil AI, and constructs a pleasant next moment for each copy. From the
    point of view of the jailed observer-moment, there are 1001 expected future
    moments, and only 1 of them consists of continued suffering. So the expected
    duration of its suffering will be less than a second (a small numerical sketch
    of this follows after this comment). However, to win such a game the
    benevolent AI needs an overwhelming advantage in computing power, and some
    other assumptions about the nature of personal identity need to be resolved.
 2. I agree that some outcomes, like eternal and very strong suffering, are
    worse, but it is important to think about non-existence as a form of
    suffering, as it will help us in utilitarian calculations and will help to
    show that x-risks are a type of s-risk.
 3. There are more people in the world who care about animal suffering than about
    x-risks, and giving them a new argument increases the probability of x-risks.
 4. What do you mean by "Also it's about animals for some reason, let's talk
    about them when hell freezes over."? We could provide happiness to all
    animals and provide infinite survival to their species, which otherwise will
    go completely extinct within millions of years.
 5. Do you mean finite but unbearable suffering, like intense pain for one year?
EDITED: It looks like you changed your long reply while I was writing a long
answer to all your counterarguments.
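As referenced in point 1 above, here is a small numerical sketch of the copy argument, taking its assumptions about indexical uncertainty at face value (whether those assumptions hold is exactly the open question flagged in the comment):

```python
# If a benevolent AI adds n pleasant-successor copies of a jailed observer-moment,
# and the observer weights all successor moments equally, the chance of continued
# suffering at each step is 1/(n + 1).
n_rescue_copies = 1000
p_continue_suffering = 1 / (n_rescue_copies + 1)

# If the trick is repeated at every subsequent moment, the expected number of
# further suffering moments is given by a geometric series:
expected_extra_suffering_moments = p_continue_suffering / (1 - p_continue_suffering)

print(p_continue_suffering)               # ~0.000999 per step
print(expected_extra_suffering_moments)   # ~0.001 of a moment in total
```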
1RomeoStevens6y
X-risk is still plausibly worse in that we need to survive to reach as much of
the universe as possible and eliminate suffering in other places.
Edit: Brian talks about this here:
https://foundational-research.org/risks-of-astronomical-future-suffering/#Spread_of_wild_animals-2
[https://foundational-research.org/risks-of-astronomical-future-suffering/#Spread_of_wild_animals-2]
Interesting to see another future philosophy.
I think my own rough future philosophy is making sure that the future has an increase in autonomy for humanity. I think it transforms into S-risk reduction assuming that autonomous people will choose to reduce their suffering and their potential future suffering if they can. It also transforms the tricky philosophical question of defining suffering into the tricky philosophical question of defining autonomy, which might be a trade that is preferred.
I think I prefer the autonomy increase because I do not have to try... (read more)
So I don't have much experience with philosophy; this is mainly a collection of my thoughts as I read through.
1) S-risks seem to basically describe hellscapes, situations of unimaginable suffering. Is that about right?
2) Two assumptions here seem to be valuing future sentience and the additive nature of utility/suffering. Are these typical stances to be taking? Should there be some sort of discounting happening here?
3) I'm pretty sure I'm strawmanning here, but I can't but feel like there's some sort of argument by definition here where we first defined s-... (read more)
To maximize human suffering per unit of space-time, you need a good model of
human values, just like a Friendly AI.
But to create an astronomical amount of human suffering (without really maximizing
it), you only need to fill an astronomical amount of space-time with humans living
in bad conditions, and prevent them from escaping those conditions. Relatively
easier.
Instead of Thamiel, imagine immortal Pol Pot with space travel.
We could also add a-risks: the risk that human civilisation will destroy alien life and alien civilisations. For example, an LHC false-vacuum catastrophe or a UFAI could dangerously affect the whole visible universe and kill an unknown number of alien civilisations or prevent their existence.
Preventing risks to alien life is one of the main reasons for the sterilisation of Mars rovers and for sinking Galileo and Cassini into Jupiter and Saturn after the end of their missions.
The flip side of this idea is "cosmic rescue missions" (term coined by David
Pearce), which refers to the hypothetical scenario in which human civilization
helps to reduce the suffering of sentient extraterrestrials (in the original
context, it referred to the use of technology to abolish suffering
[https://wiki.lesswrong.com/wiki/Abolitionism]). Of course, this is more
relevant for simple animal-like aliens and less so for advanced civilizations,
which would presumably have already either implemented a similar technology or
decided to reject such technology. Brian Tomasik
[https://foundational-research.org/risks-of-astronomical-future-suffering/#Spread_of_wild_animals-2]
argues that cosmic rescue missions are unlikely.
Also, there's an argument that humanity conquering alien civs would only be
considered bad if you assume that either (1) we have
non-universalist-consequentialist reasons to believe that preventing alien
civilizations from existing is bad, or (2) the alien civilization would produce
greater universalist-consequentialist value than human civilizations with the
same resources. If (2) is the case, then humanity should actually be willing to
sacrifice itself to let the aliens take over (like in the "utility monster"
thought experiment), assuming that universalist consequentialism is true. If
neither (1) nor (2) holds, then human civilization would have greater value than
ET civilization. Seth Baum's paper
[http://sethbaum.com/ac/2010_ET-Encounter.pdf] on universalist ethics and alien
encounters goes into greater detail.
1turchin6y
Thanks for the links. My thought was that we may give higher negative utility to
those x-risks which are also able to become a-risks, that is, the LHC and AI.
If you know Russian science fiction by the Strugatsky brothers, there is an idea in it of
"Progressors" - people who are implanted into other civilisations to help
them develop more quickly. In the end, the main character concluded that such actions
violate the right of any civilisation to determine its own way, and he returned to
Earth to search for and stop possible alien progressors here.
2ignoranceprior6y
Oh, in those cases, the considerations I mentioned don't apply. But I still
thought they were worth mentioning.
In Star Trek, the Federation has a "Prime Directive" against interfering with
the development of alien civilizations.
3Lumifer6y
The main role of which is to figure in this recurring dialogue:
-- Captain, but the Prime Directive!
-- Screw it, we're going in.
0Lumifer6y
Iain Banks has similar themes in his books -- e.g. Inversions. And generally
speaking, in the Culture universe, the Special Circumstances are a meddlesome
bunch.
Feedback: I had to scroll a very long way until I found out what "s-risk" even was. By then I had lost interest, mainly because generalizing from fiction is not useful.
You might like this better:
https://foundational-research.org/reducing-risks-of-astronomical-suffering-a-neglected-priority/
[https://foundational-research.org/reducing-risks-of-astronomical-suffering-a-neglected-priority/]
2Max_Daniel6y
Thank you for your feedback. I've added a paragraph at the top of the post that
includes the definition of s-risk and refers readers already familiar with the
concept to another article.
2Brian_Tomasik6y
Thanks for the feedback! The first sentence below the title slide says: "I’ll
talk about risks of severe suffering in the far future, or s-risks." Was this an
insufficient definition for you? Would you recommend a different definition?
Is it still a facepalm given the rest of the sentence? "So, s-risks are roughly
as severe as factory farming, but with an even larger scope." The word "severe"
is being used in a technical sense (discussed a few paragraphs earlier) to mean
something like "per individual badness" without considering scope.
1[anonymous]6y
I think the claim that s-risks are roughly as severe as factory farming "per
individual badness" is unsubstantiated. But it is reasonable to claim that
experiencing either would be worse than death, "hellish". Remember, Hell has
circles.
1fubarobfusco6y
The section presumes that the audience agrees wrt veganism. To an audience who
isn't on board with EA veganism, that line comes across as the "arson, murder,
and jaywalking" trope.
2Lukas_Gloor6y
A lot of people who disagree with veganism agree that factory farming is
terrible. Like, more than 50% of the population I'd say.
-1Lumifer6y
Notably, the great majority of them don't have the slightest clue about farming
in general or factory farming in particular. Don't mistake social signaling for
actual positions.
5komponisto6y
As the expression about knowing "how the sausage is made" attests, generally the
more people learn about it, the less they like it.
Of course, veganism is very far from being an immediate consequence of disliking
factory farming. (Similarly, refusing to pay taxes is very far from being an
immediate consequence of disliking government policy.)
1Lumifer6y
That's not obvious to me.
I agree that the more people are exposed to anti-factory-farming propaganda, the
more they are influenced by it, but that's not quite the same thing, is it?
-1Lumifer6y
Facepalm was a severe understatement, this quote is a direct ticket to the loony
bin. I recommend poking your head out of the bubble once in a while -- it's a
whole world out there. For example, some horrible terrible no-good people --
like me -- consider factory farming to be an efficient way of producing a lot of
food at reasonable cost.
This sentence reads approximately as "Literal genocide (e.g. Rwanda) is roughly
as severe as using a masculine pronoun with respect to a nonspecific person, but
with an even larger scope".
The steeliest steelman that I can come up with is that you're utterly out of
touch with the Normies.
6Lukas_Gloor6y
I sympathize with your feeling of alienation at the comment, and thanks for
offering this perspective that seems outlandish to me. I don't think I agree
with you re who the 'normies' are, but I suspect that this may not be a fruitful
thing to even argue about.
Side note: I'm reminded of the discussion here
[http://lesswrong.com/r/discussion/lw/mwr/the_triumph_of_humanity_chart/curi].
(It seems tricky to find a good way to point out that other people are
presenting their normative views in a way that signals an unfair consensus,
without getting into/accused of identity politics or having to throw around
words like "loony bin" or fighting over who the 'normies' are.)
2Lumifer6y
Yes, we clearly have very different worldviews. I don't think alienation is the
right word here, it's just that different people think about the world
differently and IMHO that's perfectly fine (to clarify, I mean values and
normative statements, not facts). And, of course, you have no obligation at all
to do something about it.
If it makes sense to continue adding letters to different risks, l-risks could be identified, that is, risks that kill all life on Earth. The main difference for us humans is that in that case there would be zero chance of a new civilisation arising on Earth.
As usual, xkcd is relevant.
Want to improve the wiki page on s-risk? I started it a few months ago but it could use some work.
Direct quote: "So, s-risks are roughly as severe as factory farming"
/facepalm
But the term "y-risks" is free. What could it be?
The S is for "Skitter"