Bounding the impact of AGI

Is there fulltext anywhere?

To paraphrase Kornai's best idea (which he's importing from outside the field):

A reasonable guideline is limiting the human caused xrisk to several orders of magnitude below the natural background xrisk level, so that human-caused dangers are lost in the noise compared to the pre-existing threat we must live with anyway.

I like this idea (as opposed to foolish proposals like driving risks from human made tech down to zero), but I expect someone here could sharpen the xrisk level that Kornai suggests. Here's a disturbing note from the appendix where he does his calculation:

Here we take the “big ﬁve” extinction events that occurred within the past half billion years as background. Assuming a mean time of 10^8 years between mass extinctions and 10^9 victims in the next one yields an annualized death rate of 10, comparing quite favorably to the reported global death rate of ~500 for contact with hornets, wasps, and bees (ICD-9-CM E905.3). [emphasis added]

Obviously, this is a gross mis-understanding of xrisks and why they matter. No one values human lives linearly straight down to 0 or assumes no expansion factors for future generations.

A motivated internet researcher could probably just look up the proper citations from Bostrom's "Global Catastrophic Risks" and create a decomposed model that estimated the background xrisk level from only nature (and then only nature + human risks w/o AI), and develop a better safety margin that would be lower than the one in this paper (implying that AGI could afford to be a few orders of magnitude riskier than Kornai's rough estimates).

[-]timtyler13y30

A reasonable guideline is limiting the human caused xrisk to several orders of magnitude below the natural background xrisk level, so that human-caused dangers are lost in the noise compared to the pre-existing threat we must live with anyway.

I like this idea [...]

We are well above there right now - and that's very unlikely to change before we have machine superintelligence.

[-]Louie13y00

Martel (1997) estimates a considerably higher annualized death rate of 3,500 from meteorite impacts alone (she doesn’t consider continental drift or gamma-ray bursts), but the internal logic of safety engineering demands we seek a lower bound, one that we must put up with no matter what strides we make in redistribution of food, global peace, or healthcare.

Is this correct? I'd expect that this lower-bound was superior to the above (10 deaths / year) for the purpose of calculating our present safety factor... unless we're currently able to destroy earth-threatening meteorites and no one told me.

[-]DaFranker13y10

unless we're currently able to destroy earth-threatening meteorites and no one told me.

Well, we do have the technological means to build something to counter one of them, if we were to learn about it tomorrow and it had ETA 2-3 years. Assuming the threat is taken seriously and more resources and effort are put into this than they were / are in killing militant toddlers in the middle-east using drones, that is.

But if one shows up now and it's about to hit Earth on the prophecy-filled turn of the Mayan calendar? Nope, GG.

[-]timtyler13y20

It is well known that conservatism and caution can result in greater risks under some circumstances. That was the point of my "The risks of caution" and Max's The Perils of Precaution. However, the message does not seem to be sinking in very well - and instead we have statements like: "Not only do we have to prove that the planned AGI will be friendly, the proof itself has to be short enough to be verifiable by humans". I don't think that is a sensible conclusion. It looks more like a naive and unrealistic approach to the issue.

[-][anonymous]13y10

In some very old publication, Eliezer made an xrisk engineering analogy with transistor failure in chip fabrication. The idea was that you often can't reduce transistor failure to acceptably minuscule levels, because rare catastrophic events are more common than acceptable levels. Things like trees falling on houses and destroying laptops, while rare, still contribute relatively huge amounts to transistor failure rates. Despite that, chips are reliable in the absence of catastrophic events. That reliability isn't a consequence of driving down transistor failure rates, it's a consequence of shoving all the failure probability into worlds where all the transistors on a chip fail at once.

Since that was an old publication, I have to take it with salt, but I still wonder, does designing "systems with a failure rate below 10^−63 per logical operation" miss some crucial point about conditional component failure?

[-]timtyler13y00

Regarding: "We are absolutely certain about this."

That is not a good sign in a paper about risk analysis :-(

[-]magfrump13y20

Insofar as it is intended to bring out the emotional parallels surrounding certainty of 63 orders of magnitude, I think it does a good job and brings a good example. I would, for one, go into that room with everyone I know for the free $100 (assuming it was conditional on the theorem being true, or I had at least a week to peruse the specific proof it checked and knew the exact specifications of what it judged to mean "bug-free")

[-]timtyler13y60

My confidences don't differ by "63 orders of magnitude" about anything. If they did, I would know I was being overconfident about something.

[-]John_Maxwell13y-10

Automated theorem proving will be required so that the proof can reasonably be checked by multiple human minds

Who will prove the correctness of the theorem provers?

[-]Louie13y80

I believe Coq is already short and proven using other proving programs that are also short and validated. So I believe the tower of formal validation that exists for these techniques is pretty well secured. I could be wrong about that though... would be curious to know the answer to that.

Relatedly, there are a lot of levels you can go with this. For instance, I wish someone would create other programming languages like CompCert for programming formally validated programs.

LESSWRONG
LW

LESSWRONG
LW

27

Bounding the impact of AGI

27

27