Richard Hollerith, 15 miles north of San Francisco. hruvulum@gmail.com

Someone concerned about this possibility has posted to this site and used the term "s-risk".

It is approximately as difficult to create an AI that wants people to suffer as it is to create one that wants people to flourish, and humanity is IMO very far from being able to do the latter, so my main worry is that an AGI will kill us all painlessly.

I suggest making sure to get a lot of near-infrared radiation during covid and covid recovery. For most people, the sun will be the preferred source. If you don't want at the same time to get a lot of UV light, you can wear sunscreen or high-SPF clothing and (most of) the near-infrared radiation will go right through the clothing or sunscreen.

This suggestion is supported by an RCT, summarized in this next video. Also note that the author of the video is a COVID doctor.


The RCT described in the video used LEDs to produce the near-infrared radiation, but the sun produces a lot of radiation at the same frequency. The LEDs produce all their radiation at a specific frequency whereas the sun produces radiation at all infrared frequencies, but the sun is such an intense source that even though it produces most of its radiation at the "wrong" frequencies it produces enough at the right frequencies (namely, the frequencies absorbed by the cytochrome C molecules in your mitochondria).

TropicalFruit and I have taken this discussion private (in order to avoid flooding this comment section with discussion on a point only very distantly related to the OP.) However if you have any interest in the discussion, ask one of us for a copy. (We have both agreed to provide a copy to whoever asks.)

In your new scenario, if I understand correctly, you have postulated that one box always explodes and one never explodes; I must undergo 2 experiences: the first experience is with one of the boxes, picked at random; then I get to choose whether my second experience is with the same box or whether it is with the other box. But I don't need to know the outcome of the first experience to know that I want to limit my exposure to just one of these dangerous boxes: I will always choose to undergo the second experience with the same box as I underwent the first one with. Note that I arrived at this choice without doing the thing that I have been warning people not to do, namely, to update on observation X when I know it would have been impossible for me to survive (or more precisely for my rationality, my ability to have and to refine a model of reality, to survive) the observation not X.

That takes care of the first of your two new scenarios. In your second new scenario, I have a .5 chance of dying during my first experience. Then I may choose whether my second experience is with the same box or a new one. Before I make my choice, I would dearly love to experiment with either box in a setting in which I could survive the box's exploding. But by your postulate as I understand it, that is not possible, so I am indifferent about which box I have my second experience with: either way I choose, my probability that I will die during the second experience is .5.

Note that in the scenario from your previous comment, in which there was some P such that each time a box is used, it has a probability P of exploding, there is no benefit to my being able to experiment with a box in a setting in which I could survive an explosion, but in the scenario we are considering now there is a huge benefit.

Suppose my best friend is observing the scenario from a safe distance: he can see what is happening, but is protected from any exploding box. My surviving the first experience changes his probability that the box used in the first experience will explode the next time it is used from .5 to .333. Actually, I am not sure of that number (because I am not sure the law of succession applies here -- it has been a long time since I read my E.T. Jaynes) but I am sure that his probability changes from .5 to something less than .5. And my best friend can communicate that fact to me: "Richard," he can say, "stick with the same box used in your first experience." But his message has the same defect that my directly observing the behavior of the box has: namely, since I cannot survive the outcome that would have led him to increase his probability that the box will explode the next time it is used, I cannot update on the fact that his probability has decreased.
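For reference, the .333 figure is what Laplace's rule of succession gives for the safely placed observer, assuming a uniform prior over the box's per-use explosion probability. That prior is an assumption added here for illustration, and the comment itself is unsure the rule applies. With k explosions seen in n uses:

```latex
P(\text{explodes on use } n+1 \mid k \text{ explosions in } n \text{ uses}) = \frac{k+1}{n+2}
\qquad
\begin{aligned}
n = 0,\ k = 0 &: \quad \tfrac{0+1}{0+2} = \tfrac{1}{2} = .5 \\
n = 1,\ k = 0 &: \quad \tfrac{0+1}{1+2} = \tfrac{1}{3} \approx .333
\end{aligned}
```

Under that assumption, one observed non-explosion moves the friend's probability from .5 to about .333, which is consistent with the weaker claim that it ends up somewhere below .5.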

Students of E.T. Jaynes know that observer A's probability of hypothesis H can differ from observer B's probability: this happens when A has seen evidence for or against H that B has not seen yet. Well, here we have a case where A’s probability can differ from B’s even though A and B have seen the same sequence of evidence about H: namely, that happens when one of the observers could not have survived having observed a sequence of events (different from the sequence that actually happened) that the other observer could have survived.

[This comment is no longer endorsed by its author]

If I know nothing about the boxes except that they have the same a priori probability of exploding and killing me, then I am indifferent between the two black boxes.

It is not terribly difficult to craft counter-intuitive examples of the principle. I anticipated I would be presented with such examples (because this is not my first time discussing this topic), which is why in my original comment I wrote, "its counter-intuitiveness is not by itself a strong reason to disbelieve it," and the rest of that paragraph.

In practice it would have to take at least eight minutes

We don't need to consider that here because any evidence of the explosion would also take at least eight minutes to arrive, so there is approximately zero minutes during which you are able to observe the evidence of the explosion before you are converted into a plasma that has no ability to update on anything. That is when observational selection effects are at their strongest: namely, when you are vanishingly unlikely to be in one of those intervals between your having observed an event and that event's destroying your ability to maintain any kind of mental model of reality.

We 21st-century types have so much causal information about reality that I have been unable during this reply to imagine any circumstance where I would resort to Laplace's law of succession to estimate any probability in anger where observational selection effects also need to be considered. It's not that I doubt the validity of the law; it's just that I have been unable to imagine a situation in which the causal information I have about an "event" does not trump the statistical information I have about how many times the event has been observed to occur in the past and I also have enough causal information to entertain real doubts about my ability to survive if the event goes the wrong way while remaining confident in my survival if the event goes the right way.

Certainly we can imagine ourselves in the situation of the physicists of the 1800s who had no solid guess as to the energy source keeping the sun shining steadily. But even they had the analogy with fire. (The emissions spectra of the sun and of fire are both I believe well approximated as blackbody radiation and the 1800s had prisms and consequently at least primitive spectrographs.) A fire doesn't explode unless you suddenly give it fuel -- and not any fuel will do: adding logs to a fire will not cause an explosion, but adding enough gasoline will. "Where would the fuel come from that would cause the sun to explode?" the 1800s can ask. Planets are made mostly of rocks, which don't burn, and comets aren't big enough. Merely what I have written in this short paragraph would be enough to trump IMO statistical considerations of how many days the sun has gone without exploding.

If I found myself in a star-trek episode in which every night during sleep I find myself transported into some bizarre realm of "almost-pure sensation" where none of my knowledge of reality seems to apply and where a sun-like thing rises and sets, then yeah, I can imagine using the law of succession, but then for observational selection effects to enter the calculation, I'd have to have enough causal information about this sun-like thing (and about my relationship to the bizarre realm) to doubt my ability to survive if it sets and never rises again, but that seems to contradict the assumption that none of my knowledge of reality applies to the bizarre realm.

My probability of the sun's continuing to set and rise without exploding is determined exclusively by (causal) knowledge created by physicists and passed down to me in books, etc: how many times the sun has risen so far is in comparison of negligible importance. This knowledge is solid and "settled" enough that it is extremely unlikely that any sane physicist would announce that, well, actually, the sun is going to explode -- probably within our lifetimes! But if a sane physicist did make such an announcement, I would focus on the physicist's argument (causal knowledge) and pay almost no attention to the statistical information of how long there have been reliable observations of the sun's not exploding -- and this is true even if I were sure I could survive if the sun exploded -- because the causal model is so solid (and the facts the model depends on, e.g., the absorption spectra of hydrogen and helium, are so easily checked). Consequently, the explosion of the sun is not a good example of where observational selection effects become important.

By the way, observational selection effects are hairy enough that I basically cannot calculate anything about them. Suppose for example that if Russia attacked the US with nukes, I would survive with p = .4 (which seems about right). (I live in the US.) Suppose further that my causal model of Russian politics puts my probability that Russia will attack the US with nukes some time in the next 365 days at .003 if Russia had deployed nukes for the first time today (i.e., if Russia didn't have any nukes till right now). How should I adjust my probability (i.e., the .003) to take into account the fact that Russia's nukes were in fact deployed starting in 1953 (year?) and so far Russia has never attacked the US with nukes? I don't know! (And I have practical reasons for wanting to do this particular calculation, so I've thought a lot about it over the years. I do know that my probability should be greater than it should be if I and my ability to reason were impervious to nuclear attacks. In contrast to the solar-explosion situation, here is a situation in which the causal knowledge is uncertain enough that it would be genuinely useful to employ the statistical knowledge we have; it is just that I don't know how to employ it in a calculation.) But things that are almost certain to end my life are much easier to reason about -- when it comes to observational selection effects -- than something that has a .4 chance of ending my life.

In particular, most of the expected negative utility from AGI research stems from scenarios in which without warning -- more precisely, without anything that the average person would recognize as a warning -- an AGI kills every one of us. The observational selection effects around such a happening are easier to reason about than those around a nuclear attack: specifically, the fact that the predicted event hasn't happened yet is not evidence at all that it will not happen in the future. If a powerful magician kills everyone who tries to bring you the news that the Red Sox have won the World Series of Baseball, and if that magician is extremely effective at his task, then your having observed that the Yankees win the World Series every time it occurs (which is strangely not every year, but some years have no World Series as far as you have heard) is not evidence at all about how often the Red Sox have won the World Series.

And the fact that Eliezer has been saying for at least a few months now that AGI could kill us all any day now -- that the probability that it will happen 15 years from now is greater than the probability that it will happen today, but the probability it will happen today is nothing to scoff at -- is very weak evidence against what he's been saying if it is evidence against it at all. A sufficiently rational person will assign what he has been saying the same or very nearly the same probability he would have if Eliezer had started saying it today. In both cases, a sufficiently rational person will focus almost entirely on Eliezer's argument (complicated though it is) and counterarguments and will give almost no weight to how long Eliezer's been saying it or how long AGIs have been in existence. Or more precisely, that is what a sufficiently rational person would do if he or she believed that he or she is unlikely to receive any advance warning of a deadly strike by the AGI beyond the warnings given so far by Eliezer and other AGI pessimists.

Eliezer's argument is more complicated than the reasoning that tells us that the sun will not explode any time soon. More complicated means more likely to contain a subtle flaw. Moreover, it has been reviewed by fewer experts than the solar argument. Consequently, here is a situation in which it would be genuinely useful to use statistical information (e.g., the fact that research labs have been running AGIs for years (ChatGPT is an AGI for example) combined with the fact that we are still alive) but the statistical information is in fact IMO useless because of the extremely strong observational selection effects.

Because a person has a significant chance of surviving a bullet wound -- or more relevantly, of surviving an assault with a gun -- your not having been assaulted by the first thief is evidence that you will not be assaulted in future encounters with him, but it is weaker evidence than it would be if you could be certain of your ability to survive (and your ability to retain your rationality skills and memories after) every encounter with him.
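The "weaker evidence" claim can be given one possible toy formalization, sketched below. It is not the author's calculation (elsewhere he says he does not know how to do such calculations in general), and every name and number in it is a hypothetical chosen for illustration. The model assumes each encounter brings an assault with fixed probability `h`, each assault is survived with probability `s`, and the observer conditions on still being alive.

```python
def p_no_assault_given_alive(h: float, s: float, n: int) -> float:
    """P(never assaulted in n encounters | observer still alive).

    h: assumed per-encounter probability of assault (hypothetical parameter)
    s: assumed probability of surviving one assault (hypothetical parameter)
    """
    p_alive = (1.0 - h) + h * s  # per-encounter chance of remaining alive
    if p_alive == 0.0:
        raise ValueError("survival is impossible under h=1, s=0")
    return ((1.0 - h) / p_alive) ** n


def likelihood_ratio(h_safe: float, h_dangerous: float, s: float, n: int) -> float:
    """How strongly 'alive and never assaulted' favors h_safe over h_dangerous."""
    return (p_no_assault_given_alive(h_safe, s, n)
            / p_no_assault_given_alive(h_dangerous, s, n))


if __name__ == "__main__":
    # Compare a "safe" thief (h=0.1) with a "dangerous" one (h=0.5)
    # after 3 uneventful encounters, for several survival probabilities.
    for s in (1.0, 0.5, 0.0):
        print(f"s={s}: likelihood ratio = {likelihood_ratio(0.1, 0.5, s, 3):.3f}")
```

In this toy model the ratio is largest when survival is certain (`s = 1`), shrinks as `s` falls, and is exactly 1 (no evidence either way) when an assault is never survivable (`s = 0`).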

Humans are very good at reading the "motivational states" of the other people in the room with them. If for example the thief's eyes are glassy and he looks like he is staring at something far away even though you know it is unlikely that there is anything of interest in his visual field far away, well that is a sign he is in a dissociated state, which makes it more likely he'll do something unpredictable and maybe violent. If when he looks at you he seems to look right through you, that is a sign of a coldness that also makes it more likely he will be violent if he can thereby benefit himself personally by doing so. So, what is actually doing most of the work of lowering your probability about the danger to you posed by the first thief? The mere fact that you escaped all the previous encounters without having been assaulted or your observations of his body language, tone of voice and other details that give clues about his personality and his mental state?

I’m not sure a human biased economy can actually function on a deflationary standard

There were long stretches of time during which the supply of gold and silver did not keep up with economic growth when gold and silver and currencies backed by gold and silver were the only currencies. Isn't that proof that an economy can function using a deflationary currency?

Yes, IMO the reasoning is wrong: if you definitely cannot survive an event, then observing that the event did not happen is not evidence at all that it will not happen in the future -- and it continues to not be evidence as long as you continue to observe the non-explosion.

Since we can survive at least for a little while the sudden complete darkening of the sun, the sun's not having gone dark is evidence that it will not go dark in the future, but it is less strong evidence than it would be if we could survive the darkening of the sun indefinitely.

The law of the conservation of expected evidence requires us to take selection effects like those into account -- and the law is a simple consequence of the axioms of probability, so to cast doubt on it is casting doubt on the validity of the whole idea of probability (in which case, Cox's theorems would like to have a word with you).
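For reference, the law can be written as follows, where H is a hypothesis and E is an observation that may or may not occur (the symbols are chosen here for illustration):

```latex
P(H) = P(H \mid E)\,P(E) + P(H \mid \neg E)\,P(\neg E)
```

In the limiting case P(E) = 1 the second term vanishes and P(H | E) = P(H), so an observation that was certain to be made cannot move the probability of H. One reading of the argument above, offered as an interpretation rather than a settled result, is that "I observe no explosion" is such an observation for anyone who could not have survived to observe the alternative.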

This is not settled science: there is not widespread agreement among scholars or on this site on this point, but its counter-intuitiveness is not by itself a strong reason to disbelieve it because there are parts of settled science that are as counterintuitive as this is: for example, the twin paradox of special relativity and "particle identity in quantum physics".

When you believe that the probability of a revolution in the US is low because the US government is 230 or so years old and hasn't had a revolution yet, you are doing statistical reasoning. In contrast, noticing that if the sun exploded violently enough, we would immediately all die and consequently we would not be having this conversation -- that is causal reasoning. Judea Pearl makes this distinction in the intro to his book Causality. Taking into account selection effects is using causal reasoning (your knowledge of the causal structure of reality) to modify a conclusion of statistical reasoning. You can still become confident that the sun will explode soon if you have a refined-enough causal model of the sun.
