
Part of the Asymptotic Logical Uncertainty series. Abram Demski proposed a computably approximable coherent probability assignment in 2012. In this post, I will present a modification developed last year by Benja Fallenstein, Abram Demski, and myself, which samples Turing machines rather than logical sentences. I will further give a concrete computable approximation of it which is a logical predictor.

Fix a UTM $U$, which has a read-only input tape with an infinite string of 0s and 1s, and a one-way write-only output tape which outputs a possibly infinite string of 0s, 1s, and #s. We build a coherent probability distribution as follows.

Start with an empty set $S$ of logical sentences. (Or a set containing some logical axioms you wish to start with.) One at a time, sample a random infinite string of 0s and 1s and use it as input for $U$. Interpret the output $W$ as a finite or infinite string of logical sentences separated by # symbols. If $W$ is consistent with all the sentences in $S$ simultaneously, then modify $S$ by adding in all the sentences in $W$. Resample and repeat infinitely many times.

The probability assigned to each logical sentence $\phi$ is the probability that $\phi$ is eventually added to $S$ in the above process. Let $P(\phi)$ denote the probability assigned to $\phi$.
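
To fix the control flow, here is a minimal Python sketch of the process just described. It is only a sketch: `run_utm_forever` and `is_consistent_with` are hypothetical placeholders of mine for the two steps that cannot actually be computed (running $U$ on an infinite input, and deciding consistency, which, as noted below, needs a halting oracle), and the infinite repetition is truncated to finitely many rounds.

```python
import random

def run_utm_forever(bits):
    """Placeholder for running the fixed UTM U on the infinite input `bits`
    and splitting its (possibly infinite) output at '#' into sentences."""
    raise NotImplementedError("uncomputable in general")

def is_consistent_with(W, S):
    """Placeholder for the consistency check; deciding it needs a halting oracle."""
    raise NotImplementedError("requires a halting oracle")

def sample_S(rounds, axioms=()):
    """One sample of the limiting set S, truncated to finitely many rounds."""
    S = set(axioms)
    for _ in range(rounds):
        # Lazy infinite stream of fair random bits (the sentinel 2 never occurs).
        bits = iter(lambda: random.randrange(2), 2)
        W = run_utm_forever(bits)
        if is_consistent_with(W, S):   # keep W only when S plus W stays consistent
            S.update(W)
    return S

# P(phi) is the probability, over the randomness above, that phi ends up in S.
```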

The step where we kept or threw out $W$ based on whether or not it was consistent with $S$ requires a halting oracle. However, we can approximate the above process. The following procedure $M_1$ takes in a time $n$ and outputs a list of sentences. It has the property that the limit as $n$ goes to infinity of the probability that $M_1(n)$ outputs $\phi$ converges to $P(\phi)$.

On input $n$, sample $n$ binary words $w_1,\ldots,w_n$ of length $n$ (length $n$ suffices, since $U$ reads at most $n$ input bits in $n$ time steps). Run $U$ for $n$ time steps on each word to get $n$ finite lists of sentences, $W_1,\ldots,W_n$. Let $L$ be a proof checker which takes a list of sentences and searches for a contradiction in those sentences. Let $S$ start as an empty set of sentences. For $i=1,\ldots,n$, run $L$ for $n$ time steps on $S\cup W_i$. If it does not find a contradiction, modify $S$ by adding in all the sentences in $W_i$.
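
Below is a minimal runnable sketch of $M_1(n)$, under toy stand-ins of my own rather than the real objects: a "sentence" is just a propositional literal `(atom, sign)`, `run_utm` plays the role of running the fixed UTM $U$ for $n$ steps and splitting its output at #, and `find_contradiction` plays the role of the time-bounded proof checker $L$. A real implementation would substitute an actual UTM and an actual bounded proof search over arithmetic sentences.

```python
import random

def run_utm(word, steps):
    """Toy stand-in for U run for `steps` steps on input `word`: bit i of the
    word simply asserts or denies the toy atom 'p_i'."""
    return [("p%d" % i, bit == 1) for i, bit in enumerate(word[:steps])]

def find_contradiction(sentences, steps):
    """Toy stand-in for the proof checker L run for `steps` steps: report a
    contradiction iff some atom occurs both asserted and denied."""
    seen = {}
    for atom, sign in sentences:
        if seen.setdefault(atom, sign) != sign:
            return True
    return False

def M1(n, rng=random):
    """One random run of M_1 on input n; returns the final set S."""
    # Length-n words suffice: U reads at most n input bits in n steps.
    words = [[rng.randrange(2) for _ in range(n)] for _ in range(n)]
    outputs = [run_utm(w, n) for w in words]        # W_1, ..., W_n
    S = set()
    for W_i in outputs:
        # Keep W_i only if L finds no contradiction in S plus W_i within n steps.
        if not find_contradiction(list(S) + W_i, n):
            S.update(W_i)
    return S

print(M1(4))    # e.g. {('p0', True), ('p1', False), ('p2', True), ('p3', True)}
```

The toy checker ignores its time bound; a real $L$ would need the assumptions discussed next.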

We make three assumptions about $L$. First, we assume that for any fixed list of sentences, $L$ eventually finds a contradiction if and only if a contradiction exists. Second, we assume that for sufficiently large $n$, $L$ observes any propositional contradictions in the time we give it. By propositional contradictions, I mean that $L$ should observe a contradiction if it can be proven using only the propositional calculus. Third, we assume that if $L$ observes a contradiction in $A$ and $A\subseteq B$, then $L$ should observe a contradiction in $B$ in the same amount of time. Only the first assumption is necessary for the approximation to converge to the correct thing, but the other two will be useful for uniform coherence.

This approximation procedure only gives a random process which either contains or does not contain a sentence $\phi$ with probability converging to $P(\phi)$. We can turn this into a procedure which outputs probabilities. Let $M_2$ be a machine which, on input $n$, for each sentence $\phi$ which $M_1(n)$ outputs with probability at least $\frac{1}{n}$, computes the probability that $M_1(n)$ outputs $\phi$ (within an error of $\frac{1}{n}$), and outputs the ordered pair containing $\phi$ and this probability. This can be done by going through all possible execution paths for $M_1(n)$ until the remaining paths have total probability less than $\frac{1}{n}$.
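
Here is a corresponding sketch of $M_2(n)$, again under the toy stand-ins of the previous sketch (repeated here so the snippet runs on its own). Since $M_1(n)$ reads at most $n$ random bits from each of its $n$ words, its randomness ranges over finitely many equally likely execution paths, so for tiny $n$ the sketch simply enumerates all of them exactly rather than stopping once the unexplored paths fall below the error tolerance.

```python
from itertools import product

def run_utm(word, steps):
    """Toy stand-in for U: bit i of the word asserts or denies atom 'p_i'."""
    return [("p%d" % i, bit == 1) for i, bit in enumerate(word[:steps])]

def find_contradiction(sentences, steps):
    """Toy stand-in for L: a contradiction is an atom with both signs."""
    seen = {}
    for atom, sign in sentences:
        if seen.setdefault(atom, sign) != sign:
            return True
    return False

def M2(n):
    """Map each sentence that M_1(n) outputs with probability at least 1/n
    to (an exact version of) that probability."""
    counts, paths = {}, 0
    # Each execution path is a choice of n binary words of length n, all equally likely.
    for words in product(product((0, 1), repeat=n), repeat=n):
        paths += 1
        S = set()
        for w in words:                          # replay M_1(n) on this path
            W_i = run_utm(w, n)
            if not find_contradiction(list(S) + W_i, n):
                S.update(W_i)
        for sentence in S:
            counts[sentence] = counts.get(sentence, 0) + 1
    return {s: c / paths for s, c in counts.items() if c / paths >= 1.0 / n}

print(M2(2))   # each toy literal ('p0', True), ('p0', False), ... gets probability 0.5
```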

Finally, we can turn this into a logical predictor, as defined in the previous post, by running $M_2$ on input $n$ for all natural numbers $n$ in increasing order, and outputting a single # symbol between each pair of successive values of $n$. In the next post, we will show that this logical predictor is uniformly coherent.
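
As a last illustration, here is a sketch of the resulting predictor as a stream, reusing the toy `M2` sketch above. The exact path enumeration makes it impractical beyond very small $n$, but it shows the intended output format: the pairs produced by $M_2(n)$, then a single #, then the pairs for $n+1$, and so on.

```python
from itertools import count

def logical_predictor():
    """Stream the outputs of M_2(n) for n = 1, 2, 3, ... with a single '#'
    between successive values of n.  Never halts, by design."""
    for n in count(1):
        for sentence, prob in sorted(M2(n).items()):
            print(sentence, prob)
        print("#")

# logical_predictor()   # the toy M2 enumerates 2^(n*n) paths, so only tiny n are feasible
```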
