Lumifer

The map of "Levels of defence" in AI safety

There seems to be a complexity limit to what humans can build. A full AGI is likely to be somewhere beyond that limit.

The usual solution to that problem -- see EY's fooming scenario -- is to make the process recursive: let a mediocre AI improve itself, and as it gets better it can improve itself more rapidly. Exponential growth can go fast and far.

This, of course, gives rise to another problem: you have no idea what the end product is going to look like. If you're looking at the gazillionth iteration, your compiler flags were probably lost around the thousandth iteration and your chained monitor system mutated into a cute puppy around the millionth iteration...

Probabilistic safety systems are indeed more tractable, but that's not the question. The question is whether they are good enough.

The map of "Levels of defence" in AI safety

Are you reinventing Asimov's Three Laws of Robotics?

Happiness Is a Chore

I suspect the solution is this.

Announcing the AI Alignment Prize

tomorrow

That's not conventionally considered to be "in the long run".

We don't have any theory that would stop AI from doing that

The primary reason is that we don't have any theory about what a post-singularity AI might or might not do. Doing some pretty basic decision theory focused on the corner cases is not "progress".

Why Bayesians should two-box in a one-shot

It seems weird that you'd deterministically two-box against such an Omega

Even in the case when the random noise dominates and the signal is imperceptibly small?
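For concreteness, here's a minimal sketch of how the expected values compare against an Omega whose prediction is right with probability p. The payoffs are the classic assumed ones ($1,000,000 in the opaque box, $1,000 in the transparent one); the function names are just illustrative.

```python
BIG = 1_000_000    # assumed payoff in the opaque box
SMALL = 1_000      # assumed payoff in the transparent box

def ev_one_box(p):
    # Omega correctly predicts one-boxing with probability p,
    # so the opaque box contains BIG with probability p.
    return p * BIG

def ev_two_box(p):
    # Omega correctly predicts two-boxing with probability p,
    # so the opaque box is empty with probability p; you always
    # get the transparent box's SMALL.
    return SMALL + (1 - p) * BIG

# Break-even accuracy: p * BIG == SMALL + (1 - p) * BIG
p_star = (BIG + SMALL) / (2 * BIG)  # 0.5005
```

The point of the sketch: when the noise dominates and p is close to chance (0.5), even this evidential comparison favors two-boxing; one-boxing only pulls ahead once Omega's accuracy exceeds roughly 0.5005, i.e. an almost imperceptibly small signal is already enough to flip it.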

Why Bayesians should two-box in a one-shot

So the source-code of your brain just needs to decide whether it'll be a source-code that will be one-boxing or not.

First, in the classic Newcomb problem, meeting Omega comes as a surprise to you. You don't get to precommit to deciding one way or the other, because you had no idea such a situation would arise: you just get to decide now.

You can decide however whether you're the sort of person who accepts their decisions can be deterministically predicted in advance with sufficient certainty, or whether you'll be claiming that other people predicting your choice must be a violation of causality (it's not).

Why would you make such a decision if you don't expect to meet Omega and don't care much about philosophical head-scratchers?

And, by the way, predicting your choice is not a violation of causality, but believing that your choice (of the boxes, not of the source code) affects what's in the boxes is.

Second, you are assuming that the brain is free to reconfigure and rewrite its own software, which is clearly not true of humans or of any existing agent.

Why Bayesians should two-box in a one-shot

Old and tired, maybe, but clearly there is not much consensus yet (even if, ahem, some people consider it to be as clear as day).

Note that who makes the decision is a matter of control and has nothing to do with freedom. A calculator controls its display, and so the "decision" to output 4 in response to 2+2 is its own, in a way. But applying decision theory to a calculator is nonsensical, and there is no free choice involved.

Welcome to Less Wrong! (11th thread, January 2017) (Thread B)

LW is kinda dead (not entirely -- there is still some shambling around happening, but brains are in short supply) and is supposed to be replaced by a shinier reincarnated version, referred to as LW 2.0, which is now in open beta at www.lesserwrong.com

LW 1.0 is still here, but if you're looking for active discussion, LW 2.0 might be a better bet.

Re qualia, I suggest that you start by trying to set up hard definitions for the terms "qualia" and "exists". Once you do, you may find the problem disappears -- see e.g. this.

Re simulation, let me point out that the simulation hypothesis is conventionally known as "creationism". As to the probability not being calculable, I agree.

The Critical Rationalist View on Artificial Intelligence

The truth that curi and myself are trying to get across to people here is... it is the unvarnished truth... know far more about epistemology than you. That again is an unvarnished truth

In which way are all these statements different from the claim that Jesus is Life Everlasting and that Jesus dying for our sins is an unvarnished truth?

Lots of people claim to have access to Truth -- what makes you special?
