# 29

I'm pleased to announce a new paper from MIRI: Questions of Reasoning Under Logical Uncertainty.

Abstract:

A logically uncertain reasoner would be able to reason as if they know both a programming language and a program, without knowing what the program outputs. Most practical reasoning involves some logical uncertainty, but no satisfactory theory of reasoning under logical uncertainty yet exists. A better theory of reasoning under logical uncertainty is needed in order to develop the tools necessary to construct highly reliable artificial reasoners. This paper introduces the topic, discusses a number of historical results, and describes a number of open problems.

Following Corrigibility and Toward Idealized Decision Theorythis is the third in a series of six papers motivating MIRI's technical research agenda. This paper mostly motivates and summarizes the state of the field, and contains one very minor new technical result. Readers looking for more technical meat can find it in Paul Christiano's paper Non-Omniscience, Probabilistic Inference, and Metamathematics, published mid-2014. This paper is instead intended to motivate the study of logical uncertainty as relevant to the design of highly reliable smarter-than-human systems. The introduction runs as follows:

Consider a black box with one input chute and two output chutes. The box is known to take a ball in the input chute and then (via some complex Rube Goldberg machine) deposit the ball in one of the output chutes.

An environmentally uncertain reasoner does not know which Rube Goldberg machine the black box implements. A logically uncertain reasoner may know which machine the box implements, and may understand how the machine works, but does not (for lack of computational resources) know how the machine behaves.

Standard probability theory is a powerful tool for reasoning under environmental uncertainty, but it assumes logical omniscience: once a probabilistic reasoner has determined precisely which Rube Goldberg machine is in the black box, they are assumed to know which output chute will take the ball. By contrast, realistic reasoners must operate under logical uncertainty: we often know how a machine works, but not precisely what it will do.

General intelligence, at the human level, mostly consists of reasoning that involves logical uncertainty. Reasoning about the output of a computer program, the behavior of other actors in the environment, or the implications of a surprising observation are all done under logical (in addition to environmental) uncertainty. This would also be true of smarter-than-human systems: constructing a completely coherent Bayesian probability distribution in a complex world is intractable. Any artificially intelligent system writing software or evaluating complex plans must necessarily perform some reasoning under logical uncertainty.

When constructing smarter-than-human systems, the stakes are incredibly high: superintelligent machines could have an extraordinary impact upon humanity (Bostrom 2014), and if that impact is not beneficial, the results could be catastrophic (Yudkowsky 2008). If that system is to attain superintelligence by way of self-modification, logically uncertain reasoning will be critical to its reliability. The initial system's ability must reason about the unknown behavior of a known program (the contemplated self-modification) in order to understand the result of modifying itself.

In order to pose the question of whether a practical system reasons well under logical uncertainty, it is first necessary to gain a theoretical understanding of logically uncertain reasoning. Yet, despite significant research started by Los (1995) and Gaifman (1964), and continued by Halpern (2003), Hutter (2013), Demski (2012), Christiano (2014a) and many, many others, this theoretical understanding does not yet exist.

It is natural to consider extending standard probability theory to include the consideration of worlds which are "logically impossible" (such as where a deterministic Rube Goldberg machine behaves in a way that it doesn't). This gives rise to two questions: What, precisely, are logically impossible possibilities? And, given some means of reasoning about impossible possibilities, what is a reasonable prior probability distribution over them?

This paper discusses the field of reasoning under logical uncertainty. At present, study into logically uncertain reasoning is largely concerned with the problem of reasoning probabilistically about sentences of logic. Sections 2 and 3 discuss the two problems posed above in that context. Ultimately, our understanding of logical uncertainty will need to move beyond the domain of logical sentences; this point is further explored in Section 4. Section 5 concludes by relating these problems back to the design of smarter-than-human systems which are reliably aligned with human interests.

# 29

New Comment

Nate hello.

Do you have an opinion on my complexity theoretic approach?

It looks like an interesting question, but I don't quite see how you plan to use this to generate a theory of reasoning under logical uncertainty. Spell it out for me? :-)

The optimal "A" in my question is a way of assigning probabilities to hypotheses of the form "word x belongs to NP-language X" which are efficiently computable (also, I believe it is possible to extend this at least to languages in L^NP).

Consider a sentence phi in some formal theory T. For any given k, the hypothesis that phi is provable in T by a proof of length at most k is of the above form. The same is true of not-phi. This way we can assign phi the probability

P_k(phi) = P(phi has a proof of length <= k) / (P(phi has a proof of length <= k) + P(not-phi has a proof of length <= k))

It is not hard to see that the properties of A imply that P_k(phi) -> 0/1 when k->infinity for consistent T and phi refutable/provable in T.

I think this establishes some relevance to logical uncertainty in the "classical" sense but it doesn't seem to solve tiling agents since it is still restricted to reasoning within a fixed theory T. However, there are other reasons why NP-estimators might be able to solve tiling agents.

One way to think of the tiling agents problem is the computational impossibility of simulating the successor agent in reasonable time. Now, suppose that we put some restrictions on the form of agents that we allow, namely that our agents are search algorithms that divide the strategy space into "chunks" and perform optimization within each chunk in some way (which can be anything). In the end, the best out of all found strategies is selected. If the parent agent knew which chunk the child agent would select it would be able to simulate it by running it on one chunk only. However, our NP-estimator allows getting rid of exactly the type of exponential slow-downs that the multitude of chunks creates enabling the parent to perform a probabilistic "pseudosimulation".

Possibly this logic can be generalized to show that the estimator can be used to estimate in logarithmic time any computation within NC, thus allowing any agent algorithm as long as it's sufficiently parallelizable. This requires further thought.

It also should be possible to use A directly to construct an efficiently computable approximation of Solomonoff induction. Namely, once we introduce a cutoff on the runtime of the hypotheses we allow within our Solomonoff ensemble, computing Solomonoff induction amounts to a counting problem. However, counting problems have efficient approximations modulo an NP-oracle and it seems that it's possible to replace the oracle by a "probability assigner" to yield an efficient (but obviously less accurate) approximation in P.

This computable induction can be used to remove another obstruction of "pseudosimulating" the child agent, namely the unknown behavior of the environment in the future.

What do you think of / how do you classify the "counterfactual regret minimization" strategy used to solve the Heads-Up Limit Hold’em poker game?

A good popular writeup is here.

This looks like a powerful heuristic for resolving environmental uncertainty; I don't think it's relevant to the study of logical uncertainty in particular. (Correct me if I'm wrong, though.)

What I do not understand is the difference between a Rube Goldberg machine inside a box and cards dealt to the other player. Why is one "logical" and the other "environmental" uncertainty?

Environmental uncertainty occurs when you don't know which Rube Goldberg machine is in the box. Logical uncertainty occurs when you can't deduce how a specific Rube Goldberg machine behaves (for lack of resources, not for lack of understanding of the rules).

In poker, not knowing which cards the opponent has is environmental uncertainty, and not knowing which hands they can make from those cards (because you lack computation, despite knowing the rules) would be logical uncertainty.

I am aware of it, but I haven't read it. Have you read it, and do you think it's worthwhile? (Note that I have read some of Russell's other recent work on logical uncertainty, such as this short paper.)

It's actually Stuart Russell putting out some work his very smart friend Eric Wefald did (Eric tragically lost his life in I think a car accident before the book went to print). It is a bit dated, some of the value is in the keyword "limited rationality" (which is how some people think about what you call logical uncertainty).

I remember Stuart talking about limited rationality back when I was an undergrad (so this idea was on people's radar for quite a long while).

The link to Non-Omniscience, Probabilistic Inference, and Metamathematics isn't right. Also, 'published earlier this year' is now wrong, it should be 'midway through last year' :D

[-]gjm120

It should be "in mid-2014" so that it doesn't go out of date another year from now.

Fixed, thanks.

Your description of "true arithmetic" seems to be unusually realist. Do you mean that, or is it meant to illustrate a way of thinking about what a logical probability on models is a 'probability of'?

Pg. 5, 2^|−ψ| should be 2^-|ψ|.

I'm not sold on your discussion of desiderata for priors. Specifically, on the properties that a prior can have without its computable approximation having them - I almost feel like these should be split off into a separate section on desiderata for uncomputable priors over models. Because the probability distribution over models should update on deduction as the agent expends computational resources, I don't think one can usefully talk about the properties of a (actually computed) prior alone in the limit of large resources. The entire algorithm for assigning logical probability is, I think, the thing whose limit should be checked for the desiderata for uncomputable priors.

Yes, the description of true arithmetic is a bit realist; I'm not particularly sold on the realism of true arithmetic, and yes, it's mostly meant to illustrate a way of thinking about logical uncertainty. Basically, the question of logical uncertainty gets more interesting when you try to say "I have one particular model of PA in mind, but I can't compute which one; what should my prior be?"

Typo fixed; thanks.

The entire algorithm for assigning logical probability is, I think, the thing whose limit should be checked for the desiderata for uncomputable priors.

I agree.

I don't think one can usefully talk about the properties of a (actually computed) prior alone in the limit of large resources.

I see these desiderata more as a litmus test. I expect the approximation algorithm and the prior it's approximating will have to be developed hand-in-hand; these desiderata provide good litmus tests for whether an idea is worth looking into. I tend to think about the problem by looking for desirable limits with approximation in mind, but I agree that you could also attack the problem from the other direction.

[-][anonymous]00

Hi, I come from a regular science background (university grad student) so I may be biased, but I still have some questions.

Reasoning under uncertainty sounds a lot like Fuzzy Logic. Can you elaborate on the difference of your approach?

By contrast, realistic reasoners must operate under logical uncertainty: we often know how a machine works, but not precisely what it will do.

What do you exactly mean with "we know how it works"? Is there a known or estimated probability for what the machine will do?

Logically uncertain reasoning, then, requires the consideration of logically impossible possibilities.

"impossible possibilities" sounds like a contradiction (an Oxymoron ?) which I think is unusual for science writing. Does this add something to the paper or wouldn't another term be better?

Why do you consider that the black box implements a "Rube Goldberg machine"? I looked up Rube Goldberg machine on Wikipedia and for me it sounds more like a joke than something that requires scientific assessment. Is there other literature on that?

Have you considered sending your work to a peer-reviewed conference or journal? This could give you some feedback and add more credibility to what you are doing.

Best regards and I hope this doesn't sound too critical. Just want to help.

Hmm, you seem to have missed the distinction between environmental uncertainty and logical uncertainty.

Imagine a black box with a Turing machine inside. You don't know which Turing machine is inside; all you get to see are the inputs and the outputs. Even if you had unlimited deductive capability, you wouldn't know how the black box behaved: this is because of your environmental uncertainty, of not knowing which Turing machine the box implemented.

Now imagine a python computer program. You might read the program and understand it, but not know what it outputs (for lack of deductive capability). As a simple concrete example, imagine that the program searches for a proof of the Riemann hypothesis using less than a googol symbols: in this case, the program may be simple, but the output is unknown (and very difficult to determine). Your uncertainty in this case is logical uncertainty: you know how the machine works, but not what it will do.

Existing methods for reasoning under uncertainty (such as standard Bayesian probability theory) all focus on environmental uncertainty: they assume that you have unlimited deductive capability. A principled theory of reasoning under logical uncertainty does not yet exist.

"impossible possibilities" sounds like a contradition

Consider that python program that searches for a proof of the Riemann hypothesis: you can imagine it outputting either "proof found" or "no proof found", but one of these possibilities is logically impossible. The trouble is, you don't know which possibility is logically impossible. Thus, when you reason about these two possibilities, you are considering at least one logically impossible possibility.

1. Fuzzy logic is only loosely related. It's traditionally used in scenarios where the objects themselves can be "partly true", whereas in most simple models of logical uncertainty we consider cases where the objects are always either true or false but you haven't been able to deduce which is which yet. That said, probabilistic logics (such as this one) bear some resemblance to fuzzy logics.
2. When we say that you know how the machine works, we mean that you understand all the construction of the macihne and all the physical rules governing it. That is, we assert that you could write a computer program which would output "0" if the machine drops the ball into the top slot, and "1" if it drops the ball into the bottom slot. (Trouble is, while you could write the program, you may not know what it will output.)
3. The Rube Goldberg machine thought experiment is used to demonstrate the difference between environmental uncertainty (not knowing which machine is being used) and logical uncertainty (not being able to deduce how the machine acts). I'm sorry if the reference to physical Rube Goldberg machines confused you.
4. This is an overview document, it mostly just describes the field (which many are ignorant of) and doesn't introduce any particularly new results. Therefore, I highly doubt we'll put it through peer review. But rest assured, I really I don't think we're hurting for credibility in this domain :-)

(The field was by no means started by us. If it's arguments from authority that you're looking for, you can trace this topic back to Boole in 1854 and Bernoulli in 1713, picked up in a more recent century by Los, Gaifman, Halpern, Hutter, and many many more in modern times. See also the intro to this paper, which briefly overviews the history of the field, and which is on the same topic, and which is peer reviewed. See also many of the references in that paper; it contains a pretty extensive list.)

[-][anonymous]00

Thanks a lot, that clears up a lot of things. I guess I have to read up about the Riemann hypothesis, etc.

Maybe your Introduction could benefit from discussing the previous work of Huttner, etc. instead of putting all references in one sentence. Then the lay person would know that you are not making all the terms up.

A more precise way to avoid the oxymoron is "logically impossible epistemic possibility". I think 'Epistemic possibility' is used in philosophy in approximately the way you're using the term.