Where does uncertainty come from?

by paulfchristiano · 2nd Apr 2011 · 41 comments



Idealized rational agents are often discussed as operating in an uncertain environment, or as using probabilities to reason correctly about anthropics (especially in MWI or Tegmark-style multiverses). I believe that both of these sources of uncertainty are essential for a rational agent, but focusing on them obscures a third (and, I suspect, ultimately more important) type of uncertainty: that faced by computationally bounded agents making predictions about a fixed, perfectly understood, deterministic universe. This type of uncertainty is described by Eliezer in his description of TDT, but he doesn't deal there with any of the mathematical theory that would govern such uncertainties. I am somewhat surprised that I have not encountered more discussion of this form of uncertainty and of the difficulty of dealing with it mathematically.

Consider a system consisting of two interacting algorithms: a predictor and a universe. Periodically the universe sends observations to the predictor, or asks the predictor to make a prediction (i.e., to answer some question about the state of the universe). The predictor must answer these queries using some bounded resource as dictated by the universe; typically we imagine that the predictor is allotted a fixed length of time. We rate the quality of a predictor by measuring how often its predictions agree with what the universe expects.
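A toy version of this interaction can be sketched in code. Everything here is hypothetical and chosen only for illustration: the universe's law, the predictor's strategy, and the scoring rule are stand-ins, not anyone's actual proposal.

```python
class Universe:
    """A toy deterministic universe: the fact to be predicted at step t
    is the parity of t**2, a fixed and fully determined quantity."""
    def __init__(self, steps):
        self.steps = steps

    def run(self, predictor, time_limit=1000):
        correct = 0
        for t in range(self.steps):
            truth = (t * t) % 2                       # the universe's fixed law
            guess = predictor.predict(t, budget=time_limit)
            if guess == truth:
                correct += 1
            predictor.observe(t, truth)               # the universe reveals the answer
        return correct / self.steps                   # fraction of agreed predictions


class FrequencyPredictor:
    """A bounded predictor that never deduces the law; it just tracks
    the empirical frequency of 1s and guesses the majority outcome."""
    def __init__(self):
        self.ones = 0
        self.total = 0

    def predict(self, t, budget):
        # With no time to reason about the law, guess the most common outcome so far.
        return 1 if self.total and self.ones * 2 > self.total else 0

    def observe(self, t, truth):
        self.ones += truth
        self.total += 1


score = Universe(steps=100).run(FrequencyPredictor())  # → 0.5 on this alternating sequence
```

The parity of t**2 alternates 0, 1, 0, 1, …, so this crude predictor gets exactly half of its predictions right; a predictor that spent its budget inferring the law would do better, which is the gap the post is about.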

Now suppose I give you the description of the universe and ask for the description of a good predictor. In some universes, I believe that there are very good predictors which perform something like Bayesian inference over mathematical statements---despite the fact that any consistent assignment of probabilities gives most statements of interest probability either 0 or 1.

For example, suppose that a particular theorem X would be useful to make future predictions. I would like to know whether or not X is true, but the proof may be too complex to deal with (or X may be independent of whatever axioms I have programmed the predictor to believe, or finding a perfectly valid proof may be too hard a problem...). To this end, the predictor may develop much more powerful rules of inference which allow it to establish the truth of X more directly, but which are only valid heuristically. That is, it may use rules of the form "If A and B are true, C is probably true."
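To make the flavor of such heuristic confidence concrete, here is a minimal sketch of Bayesian updating on the truth of a statement X, where each heuristic check is treated as evidence with a known likelihood ratio. The numbers (0.9, 0.5) are purely illustrative assumptions, not part of any real inference system.

```python
def update(prior, likelihood_true, likelihood_false):
    """One Bayesian update on P(X is true), given evidence that occurs
    with probability likelihood_true if X holds and likelihood_false if not."""
    numer = prior * likelihood_true
    return numer / (numer + (1 - prior) * likelihood_false)

# Suppose X is verified on 5 independent random instances, and such a
# check passes with probability 0.9 if X is true, 0.5 if X is false
# (hypothetical numbers).  Starting from a 50% prior:
p = 0.5
for _ in range(5):
    p = update(p, 0.9, 0.5)
# p is now about 0.95: heuristic evidence, never a proof.
```

This is of course far short of the rule "If A and B are true, C is probably true"---the hard part, which the post is pointing at, is where the likelihoods themselves could come from.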

This immediately raises two questions: how can we formally describe such a rule, and how could a predictor ever acquire this sort of probabilistic knowledge?

One way to formalize this sort of knowledge is to assign statements---for example, the statement "A and B => C"---a probability. This works well for many classes of statements, such as "The algorithm A will always output a prime factorization of its input." But we may think that a particular implication is unlikely to hold universally, while still suspecting that it does hold for a "typical" invocation in some appropriate sense. For example, we may use the rule "From A(x), B(x), conclude C(x)" and believe that the inference is valid with probability 95% for a random large x. I don't really have any idea how you should represent such knowledge, and I imagine that you could only say that a particular representation was good if it came along with a satisfactory theory for inference.
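One concrete instance of a rule that holds for "typical" x but not universally (my example, not the post's) is the base-2 Fermat test: "if 2^(x-1) ≡ 1 (mod x), conclude x is prime." It fails only on the rare Fermat pseudoprimes, so its validity on a random large x can be measured empirically:

```python
import random

def is_prime(n):
    """Exact primality by trial division (adequate for small n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def rule_applies(x):
    """Premise of the heuristic rule: x passes the Fermat test base 2."""
    return x > 2 and pow(2, x - 1, x) == 1

# Calibrate the rule "Fermat-pseudoprime(x) => prime(x)" on a random
# sample.  The 95% figure in the text is illustrative; here we simply
# measure the observed frequency of validity.
random.seed(0)
hits, applicable = 0, 0
for _ in range(20000):
    x = random.randrange(3, 10**6)
    if rule_applies(x):
        applicable += 1
        hits += is_prime(x)

validity = hits / applicable   # well above 0.99 in this range
```

The rule is false as a universal statement (Carmichael numbers defeat it), yet a predictor that trusts it on random inputs is almost always right---exactly the kind of probabilistic-but-not-universal belief at issue.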

This does not answer the question of how such knowledge is acquired. It can be included as an axiom in a system: for example, I can believe the axioms of PA with probability 1, that ZFC is consistent with probability 99%, that the RH is true with probability 95%, that the RH is provable in ZFC with probability 90%, or that the results of my inference algorithm are "well-calibrated" in a precise sense with probability 50%. I can also derive probabilistic knowledge from exact knowledge: for example, I can believe that random strings probably have high Kolmogorov complexity, or that if you randomly choose whether to negate a statement, the result is true with probability 50%. I can also acquire new beliefs by performing inference, once I already have some beliefs which are probabilistic. This is particularly important when combined with access to observations from the universe. For example, if I believe that algorithm A probably plays Go optimally, then observing A's play alters many of my beliefs: it will affect my confidence that A actually plays optimally and all of the beliefs which shaped my belief that A plays optimally, it will affect my beliefs about the optimal play in that position, it may affect my beliefs about other Go-playing algorithms I have considered, etc.
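The Kolmogorov complexity example above is exactly a case of deriving a probabilistic belief from exact knowledge: a counting argument gives a hard upper bound on how many strings can be compressed, so a uniformly random string is incompressible with high probability. A sketch of the arithmetic:

```python
from fractions import Fraction

def compressible_fraction(n, k):
    """Upper bound on the fraction of n-bit strings with Kolmogorov
    complexity below n - k: there are at most 2**(n - k) - 1 programs
    (binary strings) shorter than n - k bits, versus 2**n strings total."""
    return Fraction(2**(n - k) - 1, 2**n)

# Exact combinatorial knowledge yields a probabilistic belief: a random
# 100-bit string is compressible by 10 or more bits with probability
# strictly under 2**-10 = 1/1024.
bound = compressible_fraction(100, 10)
```

Nothing probabilistic went into the premise---the uncertainty enters only through the uniform choice of string, which is what makes this the easy case compared with beliefs like "ZFC is consistent with probability 99%."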

The possibility of probabilistic rather than exact beliefs may also allow us to get around hard problems with self-reference. For example, it may be that I can be "quite confident" that "almost everything" I am "quite confident" about is true, while it can never be the case that I am certain that everything I am certain about is true. I feel that questions of this form are currently of fundamental importance to our efforts to understand automated reasoning (especially for agents who are actually embedded in the universe they are reasoning about) which makes me particularly interested in this form of uncertainty. Unfortunately, even to evaluate the possibility of an end-run around incompleteness requires a much deeper understanding of mathematical uncertainty than anyone seems to have yet developed.

I haven't said anything of mathematical substance. What I mean to argue is that:

1. This is an interesting and important mathematical question.

2. There are a lot of mathematical subtleties involved; moreover, they don't just govern weird corner cases. It is generally unclear how to perform such inference, or how to evaluate the correctness of a particular inference or the reasonableness of a set of beliefs.
