I'm pleased to announce a new paper from MIRI: *Questions of Reasoning Under Logical Uncertainty*.

Abstract:

A logically uncertain reasoner would be able to reason as if they know both a programming language and a program, without knowing what the program outputs. Most practical reasoning involves some logical uncertainty, but no satisfactory theory of reasoning under logical uncertainty yet exists. A better theory of reasoning under logical uncertainty is needed in order to develop the tools necessary to construct highly reliable artificial reasoners. This paper introduces the topic, discusses a number of historical results, and describes a number of open problems.

Following *Corrigibility *and *Toward Idealized Decision Theory, *this is the third in a series of six papers motivating MIRI's technical research agenda. This paper mostly motivates and summarizes the state of the field, and contains one very minor new technical result. Readers looking for more technical meat can find it in Paul Christiano's paper *Non-Omniscience, Probabilistic Inference, and Metamathematics**, *published mid-2014. This paper is instead intended to motivate the study of logical uncertainty as relevant to the design of highly reliable smarter-than-human systems. The introduction runs as follows:

Consider a black box with one input chute and two output chutes. The box is known to take a ball in the input chute and then (via some complex Rube Goldberg machine) deposit the ball in one of the output chutes.

An *environmentally uncertain* reasoner does not know which Rube Goldberg machine the black box implements. A *logically uncertain* reasoner may know which machine the box implements, and may understand how the machine works, but does not (for lack of computational resources) know how the machine behaves.

Standard probability theory is a powerful tool for reasoning under environmental uncertainty, but it assumes logical omniscience: once a probabilistic reasoner has determined precisely which Rube Goldberg machine is in the black box, they are assumed to know which output chute will take the ball. By contrast, realistic reasoners must operate under logical uncertainty: we often know how a machine works, but not precisely what it will do.

General intelligence, at the human level, mostly consists of reasoning that involves logical uncertainty. Reasoning about the output of a computer program, the behavior of other actors in the environment, or the implications of a surprising observation are all done under logical (in addition to environmental) uncertainty. This would also be true of smarter-than-human systems: constructing a completely coherent Bayesian probability distribution in a complex world is intractable. Any artificially intelligent system writing software or evaluating complex plans must necessarily perform some reasoning under logical uncertainty.

When constructing smarter-than-human systems, the stakes are incredibly high: superintelligent machines could have an extraordinary impact upon humanity (Bostrom 2014), and if that impact is not beneficial, the results could be catastrophic (Yudkowsky 2008). If that system is to attain superintelligence by way of self-modification, logically uncertain reasoning will be critical to its reliability. The initial system's ability must reason about the unknown behavior of a known program (the contemplated self-modification) in order to understand the result of modifying itself.

In order to pose the question of whether a practical system reasons well under logical uncertainty, it is first necessary to gain a theoretical understanding of logically uncertain reasoning. Yet, despite significant research started by Los (1995) and Gaifman (1964), and continued by Halpern (2003), Hutter (2013), Demski (2012), Christiano (2014a) and many, many others, this theoretical understanding does not yet exist.

It is natural to consider extending standard probability theory to include the consideration of worlds which are "logically impossible" (such as where a deterministic Rube Goldberg machine behaves in a way that it doesn't). This gives rise to two questions: What, precisely, are logically impossible possibilities? And, given some means of reasoning about impossible possibilities, what is a reasonable prior probability distribution over them?

This paper discusses the field of reasoning under logical uncertainty. At present, study into logically uncertain reasoning is largely concerned with the problem of reasoning probabilistically about sentences of logic. Sections 2 and 3 discuss the two problems posed above in that context. Ultimately, our understanding of logical uncertainty will need to move beyond the domain of logical sentences; this point is further explored in Section 4. Section 5 concludes by relating these problems back to the design of smarter-than-human systems which are reliably aligned with human interests.