Jonah, I agree with what you say at least in principle, even if you would claim I don't follow it in practice. A big advantage of being Bayesian is that you retain probability mass on all the options rather than picking just one. (I recall many times being dismayed with hacky approximations like MAP that let you get rid of the less likely options. Similarly when people conflate the Solomonoff probability of a bitstring with the shortest program that outputs it, even though I guess in that case, the shortest program necessarily has at least as much probability as all the others combined.)
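To make the MAP complaint concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the three coin-bias hypotheses, the uniform prior, and the 3-heads-1-tail data are assumptions, not anything from the original post. It only shows how a MAP prediction throws away the mass on the less likely hypotheses, while averaging over the posterior keeps it.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a finite hypothesis set: normalize prior * likelihood."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical setup: three candidate values for a coin's P(heads).
biases = [0.3, 0.5, 0.8]
priors = [1 / 3, 1 / 3, 1 / 3]

# Hypothetical data: 3 heads and 1 tail.
heads, tails = 3, 1
likelihoods = [b**heads * (1 - b) ** tails for b in biases]

post = posterior(priors, likelihoods)

# MAP: keep only the single most probable hypothesis.
map_index = max(range(len(post)), key=post.__getitem__)
map_prediction = biases[map_index]

# Full Bayesian prediction: average P(heads) over every hypothesis,
# weighted by its posterior probability.
averaged_prediction = sum(p * b for p, b in zip(post, biases))

print("posterior:", [round(p, 3) for p in post])
print("MAP prediction of P(heads):", map_prediction)
print("posterior-averaged prediction:", round(averaged_prediction, 3))
```

With these toy numbers the MAP hypothesis holds only a bit over half the posterior mass, so the averaged prediction sits noticeably below the MAP one.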

My main comment on your post is that it's hard to keep track of all of these things computationally. Probably you should try, but it can get messy. It's also possible that in keeping track of too many details, you introduce more errors than if you had kept the analysis simple. On many questions in physics, ecology, etc., there's a single factor that dominates all the rest. Maybe this is less true in human domains because rational agents tend to produce efficiencies due to eating up the free lunches.

So, I'm in favor of this approach if you can do it and make it work, but don't let the best be the enemy of the good. Focus on the strong arguments first, and only if you have the bandwidth go on to think about the weak ones too.

I used to eat a lot of chicken and eggs before I read Peter Singer. After that, I went cold turkey (pardon the expression).

Some really creative ideas, ChristianKl. :)

Even with what you describe, humans wouldn't become extinct, barring other outcomes like really bad nuclear war or whatever.

However, since the AI wouldn't be destroyed, it could bide its time. Maybe it could ally with some people and give them tech/power in exchange for carrying out its bidding. They could help build the robots, etc. that would be needed to actually wipe out humanity.

Obviously there's a lot of conjunction here. I'm not claiming this scenario specifically is likely. But it helps to stimulate the imagination to work out an existence proof for the extinction risk from AGI.

It's not at all clear that an AGI will be human-like, any more than humans are dog-like.

Ok, bad wording on my part. I meant "more generally intelligent."

How do you fight the AGI past that point?

I was imagining people would destroy their computers, except the ones not connected to the Internet. However, if the AGI is hiding itself, it could go a long way before people realized what was going on.

Interesting scenarios. Thanks!

As we begin seeing robots/computers that are more human-like, people will take the possibility of AGIs getting out of control more seriously. These things will be major news stories worldwide, people will hold national-security summits about them, etc. I would assume the US military is already looking into this topic at least a little bit behind closed doors.

There will probably be lots of not-quite-superhuman AIs / AGIs that cause havoc along the road to the first superhuman ones. Yes, it's possible that FOOM will take us from roughly a level like where we are now to superhuman AGI in a matter of days, but this scenario seems relatively unlikely to me, so any leverage you want to make on it has to be multiplied by that small probability of it happening.

--

BTW, I'm curious to hear more about the mechanics of your scenario. The AGI hacks itself onto every (Internet-connected) computer in the world. Then what? Presumably this wouldn't cause extinction, just a lot of chaos and maybe years' worth of setback to the economy? Maybe it would increase chances of nuclear war, especially if the AGI could infect nuclear-warhead-related computer systems.

This could be an example of the non-extinction-level AGI disasters that I was referring to. Let me know if there are more ways in which it might cause total extinction, though.

This is a good point. :) I added an additional objection to the piece.

As an empirical matter, extinction risk isn't being funded as much as you suggest it should be if almost everyone has some incentives to invest in the issue.

There's a lot of "extinction risk" work that's not necessarily labeled as such: Biosecurity, anti-nuclear proliferation, general efforts to prevent international hostility by nation states, general efforts to reduce violence in society and alleviate mental illnesses, etc. We don't necessarily see huge investments in AI safety yet, but this will probably change in time, as we begin to see more AIs that get out of control and cause problems on a local scale. 99+% of catastrophic risks are not extinction risks, so as the catastrophes begin happening and affecting more people, governments will invest more in safeguards than they do now. The same can be said for nanotech.

In any event, even if budgets for extinction-risk reduction are pretty low, you also have to look at how much money can buy. Reducing risks is inherently difficult, because so much is out of our hands. It seems relatively easier to win over hearts and minds to utilitronium (especially at the margin right now, by collecting the low-hanging fruit of people who could be persuaded but aren't yet). And because so few people are pushing for utilitronium, it seems far easier to achieve a 1% increase in support for utilitronium than a 1% decrease in the likelihood of extinction.

As you suggest with your "some" qualifier, my essay that benthamite shared doesn't make any assumptions about negative utilitarianism. I merely inserted parentheticals about my own views into it to avoid giving the impression that I'm personally a positive-leaning utilitarian.

Thanks, Jabberslythe! You got it mostly correct. :)

The one thing I would add is that I personally think people don't usually take suffering seriously enough -- at least not really severe suffering like torture or being eaten alive. Indeed, many people may never have experienced something that bad. So I put high importance on preventing experiences like these relative to other things.

Interesting story. Yes, I think our intuitions about what kinds of computations we want to care about are easily bent and twisted depending on the situation at hand. In analogy with Dennett's "intentional stance," humans have a "compassionate stance" that we apply to some physical operations and don't apply to others. It's not too hard to manipulate these intuitions by thought experiments. So, yes, I do fear that other people may differ (perhaps quite a bit) in their views about what kinds of computations are suffering that we should avoid.

I bet there are a lot more people who care about animals' feelings and who care a lot more, than those who care about the aesthetics of brutality in nature.

Well, at the moment, there are hundreds of environmental-preservation organizations and basically no organizations dedicated to reducing wild-animal suffering. Environmentalism as a cause is much more mainstream than animal welfare. Just like the chickens that go into people's nuggets, animals suffering in nature "are out of sight, and the connection between [preserving pristine habitats] and animals living terrible lives elsewhere is hard to visualize."

It's encouraging that more LessWrongers are veg than average, although I think 12.4% is pretty typical for elite universities and the like as well. (But maybe that underscores your point.)

The biggest peculiarity of Brian Tomasik's utility function, that is least likely to ever be shared by the majority of humanity, is probably not that he cares about animals (even that he cares about insects) but that he cares so much more about suffering than happiness and other good things.

An example post. I care a lot about suffering, a little about happiness, and none about other things.

The exchange rate in your utility function between good things and bad things is pretty relevant to whether you should prefer CEV or paperclipping (and to what the changes in the probabilities of each, based on actions you might take, would have to be in order to justify them) and whether you think lab universes would be a good thing.

Yep!