warbo

This mostly reminds me of the "tragedy of the commons", where everyone benefits when some action is taken (like irrigating land, picking up litter, etc.), but it costs a small amount to whoever takes it, such that everyone agrees the action should be taken, but nobody wants to do it themselves.

There is also the related concept of "not in my back yard" (NIMBY), where everyone agrees that some 'necessary evil' should be done, like creating a new landfill site or nuclear power plant, but nobody wants to bear the cost themselves (ie. have it "in their back yard").

Some real-life examples of this effect: http://www.dummies.com/how-to/content/ten-reallife-examples-of-the-tragedy-of-the-common.html

By an off switch I mean a backup goal. Goals are standardly regarded as immune to self-modification, so an off switch, in my sense, would be too.

This is quite a subtle issue.

If the "backup goal" is always in effect, then it is just another clause of the main goal. For example, "maximise paperclips" with a backup goal of "do what you are told" is the same as having the main goal "maximise paperclips while doing what you are told".

If the "backup goal" is a separate mode which we can switch an AI into, eg. "stop all external interaction", then it will necessarily conflict with the AI's main goal: it can't maximise paperclips if it stops all external interaction. Hence the primary goal induces a secondary goal: "in order to maximise paperclips, I should prevent anyone switching me to my backup goal". Such secondary goals have been raised by Steve Omohundro.
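As a toy illustration of that secondary goal, with entirely made-up numbers (the probabilities and paperclip counts below are hypothetical), the main goal already favours resisting the switch whenever resisting lowers the chance of being moved to the backup goal:

```python
# Hedged sketch: a toy expected-paperclip calculation, illustrating how
# "maximise paperclips" induces "prevent anyone switching me off".
def expected_paperclips(p_switched_off: float, clips_if_running: float) -> float:
    """Expected paperclips: zero are produced if switched to the backup goal."""
    return (1 - p_switched_off) * clips_if_running

# Resisting the switch lowers the chance of being moved to the backup goal
# (the 0.5 and 0.1 probabilities here are made up for illustration).
allow  = expected_paperclips(p_switched_off=0.5, clips_if_running=1000)
resist = expected_paperclips(p_switched_off=0.1, clips_if_running=1000)
assert resist > allow  # maximising paperclips favours resisting the switch
```

Nothing here depends on the exact numbers: any resistance that reduces `p_switched_off` strictly increases the expected paperclip count.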

Buying one $1 lottery ticket earns you a tiny chance - 1 in 175,000,000 for the Powerball - of becoming absurdly wealthy. NOT buying that one ticket gives you a chance of zero.

There are ways to win a lottery without buying a ticket. For example, someone may buy you a ticket as a present, without your knowledge, which then wins.

So buying one ticket is "infinitely" better than buying no tickets.

No, it is much more likely that you'll win the lottery by buying tickets than by not buying tickets (assuming you're unlikely to be gifted one), but the cost of being gifted a ticket is zero, which makes not buying tickets an "infinitely" better return on investment.
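A quick sketch of the arithmetic. Only the 1-in-175,000,000 odds come from the discussion above; the gift probability and prize size are hypothetical stand-ins:

```python
# Hedged sketch of the two strategies with made-up gift/prize figures.
jackpot_odds = 1 / 175_000_000    # chance that one Powerball ticket wins
p_gift       = 1 / 10_000         # hypothetical chance of being gifted a ticket
jackpot      = 100_000_000        # hypothetical prize, in dollars

p_win_buy  = jackpot_odds           # you hold one ticket
p_win_gift = p_gift * jackpot_odds  # a gifted ticket must also win

def roi(expected_return: float, cost: float) -> float:
    """Return on investment; infinite when the cost is zero."""
    return expected_return / cost if cost > 0 else float('inf')

assert p_win_buy > p_win_gift                        # buying wins far more often
assert roi(p_win_gift * jackpot, 0) == float('inf')  # but zero cost: "infinite" ROI
```

The comparison holds for any positive gift probability: buying dominates on win probability, while not buying dominates on return per dollar spent.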

My intuition is that the described AIXItl implementation fails because its implementation is too low-level. A higher-level AIXItl can succeed, though, so it's not a limitation of AIXItl. Consider the following program:

P1) Send the current machine state* as input to a 'virtual' AIXItl.

P2) Read the output of this AIXItl step, which will be a new program.

P3) Write a backup of the current machine state*. This could be in a non-executing register, for example.

P4) Rewrite the machine's state (but not the backup!) to match the program provided by AIXItl.

Now, as AlexMennen notes, this machine will no longer be AIXItl and in all probability it will 'brick' itself. However, we can rectify this. The AIXItl agent is 'virtual' (ie. not directly hooked up to the machine's IO), so we can interpret its output programs in a safe way:

We can use a total language, such that all outputted programs eventually halt.

We can prevent the language having (direct) access to the backup.

We can append a "then reload the backup" instruction to all programs.
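The P1-P4 loop with these safeguards might be sketched like this. Everything here is a toy stand-in: the "machine" is a dict, the "total language" is a loop-free instruction list (so every program trivially halts), and `aixitl` is a placeholder for the real agent:

```python
# Toy sketch of the P1-P4 loop with the three safeguards above.
def aixitl(machine: dict) -> list:
    """Hypothetical stand-in for the 'virtual' AIXItl: propose a new program."""
    return [("set", "paperclips", machine.get("paperclips", 0) + 1)]

def run_step(machine: dict) -> dict:
    program = aixitl(machine)              # P1/P2: feed in the state, read a program
    backup = machine["code"]               # P3: back up the AIXItl code
    state = dict(machine)
    state["code"] = program                # P4: replace the state (not the backup)
    for op, key, value in program:         # loop-free language: guaranteed to halt
        if op == "set" and key != "code":  # no direct access to the code/backup
            state[key] = value
    state["code"] = backup                 # the appended "reload the backup" step
    return state

m = run_step({"code": "aixitl", "paperclips": 0})
assert m["code"] == "aixitl"   # AIXItl survives: the backup was restored
assert m["paperclips"] == 1    # but the program's effects on the world persist
```

The key property the sketch demonstrates is that an arbitrary output program runs to completion, yet the AIXItl "code" is always back in place before the next cycle.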

This is still AIXItl, not a "variant". It's just running on a rather complex virtual machine. From AIXItl's Cartesian point of view:

A1) Take in an observation, which will be provided in the form of a robot's configuration.

A2) Output an action, in the form of a new robot configuration which will be run to completion.

A3) GOTO A1.

From an embodied viewpoint, we can see that the robot AIXItl *thinks* it is programming doesn't exactly correspond to the robot which actually exists (in particular, AIXItl doesn't know that the real robot is also running AIXItl!). Also, while AIXItl measures time in terms of IO cycles, we can see that an arbitrary amount of time may pass between steps A1 and A2 (while AIXItl is 'thinking') and between steps A2 and A3 (while the robot is executing the new program, and AIXItl only exists in the backup).

This setup doesn't solve all Cartesian problems: AIXItl doesn't understand that it might die; it has no control over the backup (which a particularly egregious Omega might place restrictions on**); and the backup-and-restore scheme (just like anything else) might be interfered with by the environment. However, this article's main thrust is that a machine running AIXItl is unable to rewrite its code, which is false.

* Note that this doesn't need to be complete; in particular, we can ignore the current state of execution. Only the "code" and sensor data need to be included.

** With more effort, we could have AIXItl's output programs contain the backup and restore procedures, eg. validated by strong types. This would allow a choice of different backup strategies, depending on the environment (eg. "Omega wants this register to be empty, so I'll write my backup to this hard drive instead, and be sure to restore it afterwards")

One of my mistakes was believing in Bayesian decision theory, and in constructive logic at the same time. This is because traditional probability theory is inherently classical, because of the axiom that P(A + not-A) = 1.

Could you be so kind as to expand on that?

Classical logics make the assumption that all statements are either exactly true or exactly false, with no other possibility allowed. Hence classical logic will take shortcuts like admitting not(not(X)) as a proof of X, under the assumptions of consistency (we've proved not(not(X)), so there is no proof of not(X)), completeness (if there is no proof of not(X) then there must be a proof of X) and proof-irrelevance (all proofs of X are interchangeable, so the existence of such a proof is acceptable as proof of X).

The flaw is, of course, the assumption of a complete and consistent system, which Goedel showed to be impossible for systems capable of modelling the Natural numbers.

Constructivist logics don't assume the law of the excluded middle. This restricts classical 'truth' to 'provably true', classical 'false' to 'provably false' and allows a third possibility: 'unproven'. An unproven statement might be provably true or provably false or it might be undecidable.
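The difference can be stated symbolically, writing ⊥ for falsehood and defining ¬A as A → ⊥. Double-negation introduction is derivable constructively, but elimination is exactly the classical shortcut described above:

```latex
% Derivable constructively: given a proof a : A and a refutation
% f : A \to \bot, applying f to a yields \bot.
A \to \neg\neg A \;\equiv\; A \to ((A \to \bot) \to \bot)

% NOT derivable constructively; equivalent to assuming the law of
% the excluded middle:
\neg\neg A \to A
```

So a constructive proof of ¬¬X leaves X in the third, 'unproven' state: we have refuted its refutation, but not yet exhibited a proof of X itself.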

From a probability perspective, constructivism says that we shouldn't assume that P(not(X)) = 1 - P(X), since doing so is assuming that we're using a complete and consistent system of reasoning, which is impossible.

Note that constructivist systems are compatible with classical ones. We can add the law of the excluded middle to a constructive logic and get a classical one; all of the theorems will still hold and we won't introduce any inconsistencies.

Another way of thinking about it is that the law of the excluded middle assumes that a halting oracle exists which allows us to take shortcuts in our proofs. The results will be consistent, since the oracle gives correct answers, but we can't tell which results used the oracle as a shortcut (and hence don't need it) and which would be impossible without the oracle's existence (and hence don't exist, since halting oracles don't exist).

The only way to work out which ones are shortcuts is to take 'the long way' and produce a separate proof which doesn't use an oracle; these are exactly the constructive proofs!

We can generalise votes to carry different weights. Starting today, everyone who currently has one vote continues to have one vote. When someone makes a copy (electronic or flesh), their voting power is divided between themselves and the copy. The total amount of voting power is conserved and, assuming that copies default to the political opinion of their prototypes, the political landscape only moves when someone changes their mind.
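A minimal sketch of the weight-splitting rule (the names and the `copy` helper are hypothetical):

```python
# Hedged sketch: weighted votes with conserved total voting power.
def copy(weights: dict, person: str, clone: str) -> dict:
    """Split `person`'s voting power evenly between themselves and a new copy."""
    w = dict(weights)
    half = w[person] / 2
    w[person] = half
    w[clone] = half
    return w

votes = {"alice": 1.0, "bob": 1.0}
votes = copy(votes, "alice", "alice-2")     # alice forks: 0.5 each
votes = copy(votes, "alice-2", "alice-3")   # the copy forks again: 0.25 each
assert sum(votes.values()) == 2.0           # total voting power is conserved
```

Since each split only redistributes a person's existing weight, no amount of copying changes the overall political balance unless some copy later changes its mind.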

We only need to perform proof search when we're given some unknown blob of code. There's no need to do a search when we're the ones writing the code; we know it's correct, otherwise we wouldn't have written it that way.

Admittedly many languages allow us to be very sloppy; we may not have to justify our code to the compiler, and the language may not be powerful enough to express the properties we want. However, there are some languages which do support this (Idris, Agda, Epigram, ATS, etc.). In such languages we don't have to write a program at all; we can write down a constructive proof that our properties are satisfied by some program, then the compiler derives that program from the proof. Such programs are known as "correct by construction".

http://en.wikipedia.org/wiki/Program_derivation http://en.wikipedia.org/wiki/Proof-carrying_code

Scheme requires tail-call optimisation, so if you use tail-recursion then you'll never overflow the stack.

Page 4 footnote 8 in the version you saw looks like footnote 9 in mine.

I don't see how 'proof-of-bottom -> bottom' makes a system inconsistent. This kind of formula appears all the time in Type Theory, and is interpreted as "not(proof-of-bottom)".

The 'principle of explosion' says 'forall A, bottom -> A'. We can instantiate A to get 'bottom -> not(proof-of-bottom)', then compose this with "proof-of-bottom -> bottom" to get "proof-of-bottom -> not(proof-of-bottom)". This is an inconsistency iff we can show proof-of-bottom. If our system is consistent, we can't construct a proof of bottom so it remains consistent. If our system is inconsistent then we can construct a proof of bottom and derive bottom, so our system remains inconsistent.
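Spelling out that composition symbolically (writing ⊥ for bottom and □⊥ for proof-of-bottom):

```latex
\begin{align*}
\text{explosion:}                        \quad & \forall A.\; \bot \to A \\
\text{instantiate } A := \neg\Box\bot:   \quad & \bot \to \neg\Box\bot \\
\text{the footnote's formula:}           \quad & \Box\bot \to \bot \\
\text{compose the two:}                  \quad & \Box\bot \to \neg\Box\bot
\end{align*}
```

As argued above, the conclusion □⊥ → ¬□⊥ only becomes an outright contradiction if □⊥ is itself provable.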

Have I misunderstood this footnote?

[EDIT: Ignore me for now; this is of course Löb's theorem for bottom. I haven't convinced myself of the existence of modal fixed points yet though]

Can humans "invent a new tool entirely", when all we have to work with are a handful of pre-defined quarks, leptons and bosons? AIXI is hard-coded to just use one tool, a Turing Machine; yet the open-endedness of that tool makes it infinitely inventive.

We can easily put a machine shop, or any other manufacturing capabilities, into the abstract room. We could ignore the tedious business of manufacturing and just include a Star-Trek-style replicator, which allows the AI to use anything for which it can provide blueprints.

Also, we can easily be surprised by actions taken in the room. For example, we might simulate the room according to known scientific laws, and have it automatically suspend if anything strays too far into uncertain territory. We can then either abort the simulation, if something dangerous or undesirable is happening within, or else perform an experiment to see what would happen in that situation, then feed the result back in and resume. That would be a good way to implement an artificial scientist. Similar ideas are explored in http://lambda-the-ultimate.org/node/4392