Often, when talking about the Modified Demski prior, I get the question: "Why is this any different from the original? The original could have sampled logical sentences claiming that all outputs of a given Turing machine are true, with only a bounded complexity penalty." This post answers that question.

At first it may seem that there is no significant difference between the Demski prior and the Modified Demski prior. The former samples logical sentences, while the latter samples Turing machines (which can be represented as logical sentences). However, there is a subtle difference that causes the two to have radically different behavior.

Consider a Turing machine M which outputs an infinite string of sentences. The Modified Demski prior might sample M, and the Demski prior might sample the sentence "Every sentence output by M is true." (There is a little complication here with needing a slightly different language, but that is not what makes the big difference.)
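To make the contrast concrete, here is a minimal toy sketch of the two sampling loops. Every name in it (negate, is_consistent, demski_step, modified_demski_step) is an illustrative stand-in rather than the actual construction: sentences are plain strings, a "machine" is reduced to a finite list of outputs, and consistency checking is stubbed out, since real consistency is only semi-decidable and the actual approximations use bounded proof search.

```python
def negate(phi: str) -> str:
    # Toy negation on string-encoded sentences.
    return phi[1:] if phi.startswith("~") else "~" + phi

def is_consistent(sentences: set[str]) -> bool:
    # Stub: flag only direct contradictions like {"p", "~p"}.
    return not any(negate(phi) in sentences for phi in sentences)

def demski_step(accepted: set[str], phi: str) -> None:
    """Demski prior: a sampled sentence is kept iff it is
    consistent with everything accepted so far."""
    if is_consistent(accepted | {phi}):
        accepted.add(phi)

def modified_demski_step(accepted: set[str], machine_outputs: list[str]) -> None:
    """Modified Demski prior: a sampled machine is kept iff its
    *entire* output set is jointly consistent with everything
    accepted so far."""
    outputs = set(machine_outputs)
    if is_consistent(accepted | outputs):
        accepted |= outputs
```

The structural point is in the last function: a machine's outputs are accepted or rejected as a single block, and the block can only be rejected by a concrete clash with sentences already accepted.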

The big difference between the two priors is that in the Demski prior, you can sample a sentence of the form "Not every sentence output by M is true." If you do, then if you later sample "Every sentence output by M is true," you will have to throw it out. There is no analogous Turing machine you can sample in the Modified Demski prior. M can only be contradicted by an earlier Turing machine which directly contradicts one (or a finite number) of the sentences output by M. If you sample a machine that outputs a single sentence saying that M outputs at least one false sentence, and then later sample M, the resulting model will believe that M outputs a false sentence at some nonstandard time, while all the actual outputs of M are true.
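Schematically (a sketch only: here φ_n stands for the n-th sentence output by M, and True for the truth predicate of the slightly different language mentioned above), the asymmetry looks like this:

```latex
% Demski prior: these two samples clash directly, so whichever
% comes second gets thrown out.
\neg\forall n\,\mathrm{True}(\ulcorner\phi_n\urcorner)
\qquad\text{vs.}\qquad
\forall n\,\mathrm{True}(\ulcorner\phi_n\urcorner)

% Modified Demski prior: the strongest available blocker is a machine
% asserting an existential, and the combined set below is consistent,
% since a model can witness the existential at a nonstandard index
% while every standard output remains true.
\{\,\exists n\,\neg\mathrm{True}(\ulcorner\phi_n\urcorner)\,\}
\;\cup\;
\{\,\phi_1,\ \phi_2,\ \phi_3,\ \dots\,\}
```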

This makes a huge difference when M only outputs tautologies. Note that this does not change the end prior, but it does change the approximation procedure. If M is a machine that only outputs tautologies, then eventually the Modified Demski algorithm will sample M, and will believe that any sentence output by M is true (even if it does not see a proof yet). This happens eventually with probability 1, since there is no consistent set of sentences which could contradict any output of M. This was important for showing that the Modified Demski prior is uniformly coherent.
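For intuition, here is a rough version of that probability-1 claim, under the simplifying assumption that each stage of the approximation draws a machine independently and gives M some fixed sampling probability p_M > 0 (on the order of 2^{-|M|} for length-based sampling):

```latex
% Since no consistent accepted set can contradict a tautology, every
% draw of M is accepted, so M fails to be accepted by stage k only if
% it is never drawn in the first k stages:
\Pr[M \text{ not yet accepted by stage } k] \le (1 - p_M)^k \xrightarrow[k \to \infty]{} 0.
```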

On the other hand, the first sentence sampled by the Demski prior may have been "Not every sentence output by M is true." This sentence is consistent. This stops us from using the same proof for the Demski prior.

In a sense, the Modified Demski prior provides us with a tool for picking out the standard natural numbers. This is because the limits we take when considering the asymptotic properties of the approximation of the prior are limits within the standard natural numbers. We can therefore sample an infinite collection of sentences at once without hiding the collection behind a universal quantifier that could reference nonstandard numbers.
