Confusions Concerning Pre-Rationality

by abramdemski, 23rd May 2018


Robin Hanson's Uncommon Priors Require Origin Disputes is a short paper with, according to me, a surprisingly high ratio of interesting ideas per character. It is not clearly right, but it merits some careful consideration. If it is right, it offers strong reason in support of the common prior assumption, which is a major crux of certain modest-epistemology-flavored arguments.

Wei Dai wrote two posts reviewing the concepts in the paper and discussing problems/implications. I recommend reviewing those before reading the present post, and possibly the paper itself as well.

Robin Hanson's notion of pre-rationality is: an agent's counterfactual beliefs should treat the details of its creation process like an update. If the agent is a Bayesian robot with an explicitly programmed prior, then the agent's distribution after conditioning on any event "programmer implements prior p" should be exactly p.

These beliefs are "counterfactual" in that agents are typically assumed to know their priors already, so that the above conditional probability is not well-defined for any choice of p other than the agent's true prior. This fact leads to a major complication in the paper; the pre-rationality condition is instead stated in terms of hypothetical "pre-agents" which have "pre-priors" encoding the agent's counterfactual beliefs about what the world would have been like if the agent had had a different prior. (I'm curious what happens if we drop that assumption, so that we can represent pre-rationality within the agent's same prior.)
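In symbols, the condition described above can be sketched as follows. The notation is an assumption introduced here for illustration: q stands for the pre-prior, \hat{p} for the random variable "the prior the agent ends up with", P for a particular candidate prior, and E for an arbitrary event.

```latex
% Pre-rationality, sketched in illustrative notation:
% conditioning the pre-prior q on "the agent was given prior P" returns P itself.
\[
  q\bigl(E \mid \hat{p} = P\bigr) = P(E)
  \qquad \text{for every event } E \text{ and every } P \text{ with } q(\hat{p} = P) > 0 .
\]
```

Read this way, the question in the parenthetical above is whether q and P can be collapsed into a single distribution without the left-hand side becoming undefined.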

Wei Dai offers an example in which a programmer flips a coin to determine whether a robot believes coin-flips to have probability 2/3rds or 1/3rd. Pre-rationality seems like an implausible constraint to put on this robot, because the programmer's coin-flip is not good reason to form such expectations about other coins.
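To make the tension concrete, here is a minimal sketch of the coin example in Python. It rests on assumptions that are not in the original: the pre-prior treats the programmer's coin and the future coin as fair and independent, and the names `pre_prior`, `assigned_heads_prob`, `q_heads_given_assignment`, and `is_pre_rational` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Hypothetical pre-prior q over (programmer's coin, some future coin).
# Assumption: both coins are fair and independent of each other.
pre_prior = {
    (prog, coin): HALF * HALF
    for prog, coin in product(("H", "T"), repeat=2)
}

# Assumption: if the programmer's coin lands heads, the robot is given a prior
# with P(heads) = 2/3; if it lands tails, a prior with P(heads) = 1/3.
assigned_heads_prob = {"H": Fraction(2, 3), "T": Fraction(1, 3)}


def q_heads_given_assignment(prog_outcome):
    """Return q(future coin = heads | programmer's coin = prog_outcome)."""
    joint = pre_prior[(prog_outcome, "H")]
    marginal = sum(
        prob for (prog, _), prob in pre_prior.items() if prog == prog_outcome
    )
    return joint / marginal


def is_pre_rational():
    """Check whether conditioning q on each assignment reproduces that prior."""
    return all(
        q_heads_given_assignment(prog) == assigned_heads_prob[prog]
        for prog in assigned_heads_prob
    )


if __name__ == "__main__":
    for prog in ("H", "T"):
        print(prog, assigned_heads_prob[prog], q_heads_given_assignment(prog))
    print("pre-rational:", is_pre_rational())
```

Under this pre-prior, the conditional probability of heads is 1/2 whichever way the programmer's coin lands, so it matches neither assigned prior and the pre-rationality check fails. A pre-prior that did satisfy the check would have to treat the programmer's coin as evidence about other coins, which is the implausible part.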

Wei Dai seems to be arguing against a position which Robin Hanson isn't quite advocating. Wei Dai's accusation is that pre-rationality implies a belief that the process which created you was itself a rational process, which is not always plausible. Indeed, it's easy to see this interpretation in the math. However, Robin Hanson's response indicates that he doesn't see it:

I just don't see pre-rationality being much tied to whether you in fact had a rational creator. The point is, as you say, to consider the info in the way you were created.

Unfortunately, the discussion in the comments doesn't go any further on this point. However, we can make some inferences about Robin Hanson's position from the paper itself.

Robin Hanson does not discuss the robot/programmer example; instead, he discusses the possibility that people have differing priors due to genetic factors. Far from claiming people are obligated by rationality principles to treat inherited priors as rational, Robin Hanson says that because we know some randomness is involved in Mendelian inheritance, we can't both recognize the arbitrariness of our prior's origin and stick with that prior. Quoting the paper on this point:

Mendel’s rules of genetic inheritance, however, are symmetric and random between siblings. If optimism were coded in genes, you would not acquire an optimism gene in situations where optimism was more appropriate, nor would your sister’s attitude gene track truth any worse than your attitude gene does.
Thus it seems to be a violation of pre-rationality to, conditional on accepting Mendel’s rules, allow one’s prior to depend on individual variations in genetically-encoded attitudes. Having your prior depend on species-average genetic attitudes may not violate pre-rationality, but this would not justify differing priors within a species.

Robin Hanson suggests that pre-rationality is only plausible conditional on some knowledge we have gained throughout our lifetime about our own origins. He posits a sentence B which contains this knowledge, and suggests that the pre-rationality condition can be relativized to B. In the above-quoted case, B would consist of Mendelian inheritance and the genetics of optimism. Robin Hanson is not saying that genetic inheritance of optimism or pessimism is a rational process, but rather, he is saying that once we know about these genetic factors, we should adjust our pessimism or optimism toward the species average. After performing this adjustment, we are pre-rational: we consider any remaining influences on our probability distribution to have been rational.
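One way to write the relativized condition, again in illustrative notation that is an assumption of this sketch (q for the pre-prior, \hat{p} for the prior the agent ends up with, P for a candidate prior, E for an event, and B for the background knowledge about origins):

```latex
% Pre-rationality relativized to background knowledge B (illustrative notation):
% the agreement between pre-prior and prior is only required after both condition on B.
\[
  q\bigl(E \mid \hat{p} = P,\; B\bigr) = P(E \mid B)
  \qquad \text{for every event } E .
\]
```

On this reading nothing is demanded of P before B is learned; the constraint only bites on the posterior P(· | B), which fits the "adjust toward the species average" description above.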

Wei Dai's argument might be charitably interpreted as objecting to this position by offering a concrete case in which a rational agent does not update to pre-rationality in this way: the robot has no motivation to adjust for the random noise in its prior, despite its recognition of the irrationality of the process by which it inherited this prior. However, I agree with Robin Hanson that this is intuitively quite problematic, even if no laws of probability are violated. There is something wrong with the robot's position, even if the robot lacks cognitive tools to escape this epistemic state.

However, Wei Dai does offer a significant response to this: he complains that Robin Hanson says too little about what the robot should do to become pre-rational from its flawed state. The pre-rationality condition provides no guidance for the robot. As such, what guidance can pre-rationality offer to humans? Robin Hanson's paper admits that we have to condition on B to become pre-rational, but offers no account whatsoever of the structure of this update. What normative structure should we require of priors so that an agent becomes pre-rational when conditioned on the appropriate B?

Here is the text of Wei Dai's sage complaint:

Assuming that we do want to be pre-rational, how do we move from our current non-pre-rational state to a pre-rational one? This is somewhat similar to the question of how do we move from our current non-rational (according to ordinary rationality) state to a rational one. Expected utility theory says that we should act as if we are maximizing expected utility, but it doesn't say what we should do if we find ourselves lacking a prior and a utility function (i.e., if our actual preferences cannot be represented as maximizing expected utility).
The fact that we don't have good answers for these questions perhaps shouldn't be considered fatal to pre-rationality and rationality, but it's troubling that little attention has been paid to them, relative to defining pre-rationality and rationality. (Why are rationality researchers more interested in knowing what rationality is, and less interested in knowing how to be rational? Also, BTW, why are there so few rationality researchers? Why aren't there hordes of people interested in these issues?)

I find myself in the somewhat awkward position of agreeing strongly with Robin Hanson's intuitions here, but also having no idea how it should work. For example, suppose that we have a robot whose probabilistic beliefs are occasionally modified by cosmic rays. These modification events can be thought of as the environment writing a new "prior" into the agent. We cannot perfectly safeguard the agent against this, but we can write the agent's probability distribution such that so long as it is not too damaged, it can self-repair when it sees evidence that its beliefs have been modified by the environment. This seems like an updating-to-pre-rationality move, with "a cosmic ray hit you in this memory cell" playing the role of B.

Similarly, it seems reasonable to do something like average beliefs with someone if you discover that your differing beliefs are due only to genetic chance. Yet, it does not seem similarly reasonable to average values, despite the distinction between beliefs and preferences being somewhat fuzzy.

This is made even more awkward by the fact that Robin Hanson has to create the whole pre-prior framework in order to state his new rationality constraint.

The idea seems to be that a pre-prior is not a belief structure which an actual agent has, but rather, is a kind of plausible extrapolation of an agent's belief structure which we layer on top of the true belief structure in order to reason about the new rationality constraint. If so, how could this kind of rationality constraint be compelling to an agent? The agent itself doesn't have any pre-prior. Yet, if we have an intuition that Robin Hanson's argument implies something about humans, then we ourselves are agents who find arguments involving pre-priors to be relevant.

Alternatively, pre-priors could be capturing information about counterfactual beliefs which the agent itself has. This seems less objectionable, but it brings in tricky issues of counterfactual reasoning. I don't think this is likely to be the right path to properly formalizing what is going on, either.

I see two clusters of approaches:

  • What rationality conditions might we impose on a Bayesian agent such that it updates to pre-rationality given "appropriate" B? Can we formalize this purely within the agent's own prior, without the use of pre-priors?
  • What can we say about agents becoming rational from irrational positions? What should agents do when they notice Dutch Books against their beliefs, or money-pumps against their preferences? (Logical Induction is a somewhat helpful story about the former, but not the latter.) Can we characterize the receiver of decision-theoretic arguments such as the VNM theorem, who would find such arguments interesting? If we can produce anything in this direction, can it say anything about Robin Hanson's arguments concerning pre-rationality? Does it give a model, or can it be modified to give a model, of updating to pre-rationality?

It seems to me that there is something interesting going on here, and I wish that there were more work on Hansonian pre-rationality and Wei Dai's objection.