The first part of this post describes a way of interpreting the basic mathematics of Bayesianism. Eliezer already presented one such view at http://lesswrong.com/lw/hk/priors_as_mathematical_objects/, but I want to present another one that has been useful to me, and also show how this view is related to the standard formalism of probability theory and Bayesian updating, namely the probability space.

The second part of this post will build upon the first, and try to explain the math behind Aumann's agreement theorem. Hal Finney had suggested this earlier, and I'm taking on the task now because I recently went through the exercise of learning it, and could use a check of my understanding. The last part will give some of my current thoughts on Aumann agreement.

#### Probability Space

In http://en.wikipedia.org/wiki/Probability_space, you can see that a probability space consists of a triple:

- Ω – a non-empty set – usually called sample space, or set of states
- F – a set of subsets of Ω – usually called sigma-algebra, or set of events
- P – a function from F to [0,1] – usually called probability measure

F and P are required to have certain additional properties, but I'll ignore them for now. To start with, we’ll interpret Ω as a set of possible world-histories. (To eliminate anthropic reasoning issues, let’s assume that each possible world-history contains the same number of observers, who have perfect memory, and are labeled with unique serial numbers.) Each “event” A in F is formally a subset of Ω, and interpreted as either an actual event that occurs in every world-history in A, or a hypothesis which is true in the world-histories in A. (The details of the events or hypotheses themselves are abstracted away here.)

To understand the probability measure P, it’s easier to first introduce the probability mass function p, which assigns a probability to each element of Ω, with the probabilities summing to 1. Then P(A) is just the sum of the probabilities of the elements in A. (For simplicity, I’m assuming the discrete case, where Ω is at most countable.) In other words, the probability of an observation is the sum of the probabilities of the world-histories that it doesn't rule out.
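
As a minimal sketch of the discrete case just described, the snippet below builds P from a mass function p. The four world-histories and their weights are made up purely for illustration; only the "sum the masses" rule comes from the text.

```python
from fractions import Fraction

# Hypothetical toy sample space: four world-histories with made-up weights.
p = {
    "w1": Fraction(1, 2),
    "w2": Fraction(1, 4),
    "w3": Fraction(1, 8),
    "w4": Fraction(1, 8),
}
assert sum(p.values()) == 1  # the masses must sum to 1


def P(event):
    """Probability measure: sum the masses of the world-histories in the event."""
    return sum((p[w] for w in event), Fraction(0))


# An observation that rules out w1 and w4 corresponds to the event {w2, w3}.
observation = {"w2", "w3"}
prob_observation = P(observation)
```

Exact fractions are used instead of floats so that equalities between probabilities can be checked without rounding error.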

A payoff of this view of the probability space is a simple understanding of what Bayesian updating is. Once an observer sees an event D, he can rule out all possible world-histories that are not in D. So, he can get a posterior probability measure by setting the probability masses of all world-histories not in D to 0, and renormalizing the ones in D so that they sum up to 1 while keeping the same relative ratios. You can easily verify that this is equivalent to Bayes’ rule: P(H|D) = P(D ∩ H)/P(D).
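
The "zero out and renormalize" recipe can be checked against Bayes' rule on a small example. This is only a sketch: the mass function, the event D, and the hypothesis H below are invented for illustration, and `update` is a hypothetical helper name.

```python
from fractions import Fraction

# Hypothetical prior masses over four world-histories.
p = {
    "w1": Fraction(1, 2),
    "w2": Fraction(1, 4),
    "w3": Fraction(1, 8),
    "w4": Fraction(1, 8),
}


def P(event, mass):
    """Sum the masses of the world-histories in the event."""
    return sum((mass[w] for w in event), Fraction(0))


def update(mass, D):
    """Zero out world-histories outside D and renormalize the ones inside D."""
    total = P(D, mass)
    if total == 0:
        raise ValueError("cannot condition on an event of probability 0")
    return {w: (m / total if w in D else Fraction(0)) for w, m in mass.items()}


D = {"w1", "w2", "w3"}  # what the observer saw (assumed for the example)
H = {"w2", "w4"}        # some hypothesis (assumed for the example)

posterior = update(p, D)
via_renormalizing = P(H, posterior)        # probability of H under the new weights
via_bayes_rule = P(D & H, p) / P(D, p)     # P(H|D) = P(D ∩ H) / P(D)
```

Both routes give the same number, and the surviving world-histories keep their relative ratios (w1 is still twice as heavy as w2).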

To sum up, the mathematical objects behind Bayesianism can be seen as

- Ω – a set of possible world-histories
- F – information about which events occur in which possible world-histories
- P – a set of weights on the world-histories that sum up to 1

#### Aumann's Agreement Theorem

Aumann's agreement theorem says that if two Bayesians share the same probability space but possibly different information partitions, and have common knowledge of their information partitions and posterior probabilities of some event A, then their posterior probabilities of that event must be equal. So what are information partitions, and what does "common knowledge" mean?

The information partition I of an observer-moment M divides Ω into a number of subsets that are non-overlapping, and together cover all of Ω. Two possible world-histories w1 and w2 are placed into the same subset if the observer-moments in w1 and w2 have the exact same information. In other words, if w1 and w2 are in the same element of I, and w1 is the actual world-history, then M can't rule out either w1 or w2. I(w) is used to denote the element of I that contains w.
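
One way to make this concrete is to derive the partition from what the observer sees in each world-history. The sketch below assumes a hypothetical `observe` function that maps a world-history to the observer's information; the six numbered world-histories and the odd/even observation are invented for illustration.

```python
def partition_from_observations(omega, observe):
    """Group world-histories that give the observer exactly the same information."""
    cells = {}
    for w in omega:
        cells.setdefault(observe(w), set()).add(w)
    return [frozenset(c) for c in cells.values()]


def cell_of(partition, w):
    """I(w): the element of the partition that contains w."""
    for cell in partition:
        if w in cell:
            return cell
    raise KeyError(w)


omega = range(1, 7)  # six hypothetical world-histories, labeled 1..6

# Assumed for the example: the observer only learns whether the label is odd or even.
I = partition_from_observations(omega, lambda w: w % 2)

I_of_3 = cell_of(I, 3)  # everything the observer cannot rule out if 3 is actual
```

Here world-histories 1, 3 and 5 land in the same element, so if 3 is the actual world-history the observer cannot rule out 1 or 5.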

Common knowledge is defined as follows: If w is the actual world-history and two agents have information partitions I and J, an event E is common knowledge if E includes the member of the meet I∧J that contains w. The operation ∧ (meet) means to take the two sets I and J, form their union, then repeatedly merge any of its elements (which you recall are subsets of Ω) that overlap until it becomes a partition again (i.e., no two elements overlap).
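
The merge procedure described above can be sketched directly. The function name `meet` and the two example partitions are assumptions made for illustration; the algorithm is just "pool the elements of I and J, then keep merging any two that overlap".

```python
def meet(I, J):
    """Pool the elements of I and J, then merge overlapping ones until none overlap."""
    cells = [set(c) for c in list(I) + list(J)]
    merged = True
    while merged:
        merged = False
        for i in range(len(cells)):
            for j in range(i + 1, len(cells)):
                if cells[i] & cells[j]:      # the two elements overlap
                    cells[i] |= cells[j]     # merge them into one
                    del cells[j]
                    merged = True
                    break
            if merged:
                break                        # restart the scan after every merge
    return [frozenset(c) for c in cells]


# Hypothetical partitions of the world-histories 1..6.
I = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5}), frozenset({6})]
J = [frozenset({1}), frozenset({2, 3}), frozenset({4}), frozenset({5, 6})]

I_meet_J = meet(I, J)
```

In this example the chain {1,2}, {2,3}, {3,4} collapses into a single element {1,2,3,4}, and {5}, {6}, {5,6} collapse into {5,6}. Note that the result is coarser than both I and J, not finer.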

It may not be clear at first what this meet operation has to do with common knowledge. Suppose the actual world-history is w. Then agent 1 knows I(w), so he knows that agent 2 must know one of the elements of J that overlaps with I(w). And he can reason that agent 2 must know that agent 1 knows one of the elements of I that overlaps with one of these elements of J. If he carries out this inference to infinity, he'll find that both agents know that the actual world-history is in (I∧J)(w), both know that the other knows this, both know that the other knows that the other knows, and so on. In other words, it is common knowledge that the actual world-history is in (I∧J)(w). Since event E occurs in every world-history in (I∧J)(w), it's common knowledge that E occurs in the actual world-history.
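
The chain of "he knows that she knows that..." can be mimicked by repeatedly growing a set of world-histories: start from w, add every element of I or J that overlaps the set so far, and stop when nothing new is added. This is only a sketch; `reachable` and `is_common_knowledge` are hypothetical helper names, and the partitions and the event E are invented for the example.

```python
def reachable(I, J, w):
    """World-histories reached from w by hopping between overlapping elements of I and J."""
    known = {w}
    while True:
        grown = set(known)
        for cell in list(I) + list(J):
            if cell & known:        # this element overlaps what we have so far
                grown |= cell
        if grown == known:          # fixed point: this is (I∧J)(w)
            return frozenset(known)
        known = grown


def is_common_knowledge(E, I, J, w):
    """E is common knowledge at w if E contains the whole of (I∧J)(w)."""
    return reachable(I, J, w) <= set(E)


# Hypothetical partitions of the world-histories 1..6.
I = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5}), frozenset({6})]
J = [frozenset({1}), frozenset({2, 3}), frozenset({4}), frozenset({5, 6})]

E = {1, 2, 3, 4, 5}  # an event, assumed for the example

ck_at_2 = is_common_knowledge(E, I, J, 2)
ck_at_5 = is_common_knowledge(E, I, J, 5)
```

At w = 5, agent 1 knows E (since I(5) = {5} lies inside E), but agent 2 only knows {5, 6}, so E is not common knowledge there. At w = 2 the whole element {1, 2, 3, 4} lies inside E, so it is.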

The proof of the agreement theorem then goes like this. Let E be the event that agent 1 assigns a posterior probability (conditioned on everything it knows) of q1 to event A and agent 2 assigns a posterior probability of q2 to event A. If E is common knowledge at w, then both agents know that P(A | I(v)) = q1 and P(A | J(v)) = q2 for every v in (I∧J)(w). But this implies P(A | (I∧J)(w)) = q1 and P(A | (I∧J)(w)) = q2, and therefore q1 = q2. (To see this, note that (I∧J)(w) is a disjoint union of elements of I, so P(A | (I∧J)(w)) is a weighted average of the values P(A | I(v)). Put differently: suppose you currently know only (I∧J)(w), and you know that no matter what additional information I(v) you obtain, your posterior probability will be the same q1. Then your current probability must already be q1. The same argument applies to J and q2.)
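
The averaging step in the proof can be checked numerically. The sketch below uses an invented four-element sample space with a uniform prior, chosen so that each agent's posterior for A is the same on every element of their partition; none of these numbers come from the post.

```python
from fractions import Fraction

# Hypothetical uniform prior over four world-histories.
p = {w: Fraction(1, 4) for w in (1, 2, 3, 4)}


def P(event):
    return sum((p[w] for w in event), Fraction(0))


def cond(A, B):
    """P(A | B) = P(A ∩ B) / P(B)."""
    return P(A & B) / P(B)


I = [frozenset({1, 2}), frozenset({3, 4})]   # agent 1's partition (assumed)
J = [frozenset({1, 3}), frozenset({2, 4})]   # agent 2's partition (assumed)
M = frozenset({1, 2, 3, 4})                  # here the meet has a single element
A = frozenset({1, 4})                        # the event being estimated

# Posteriors each agent could hold somewhere inside M.
q1_values = {cond(A, c) for c in I if c <= M}
q2_values = {cond(A, c) for c in J if c <= M}

# P(A | M) as a weighted average over the elements of I (resp. J) inside M.
avg_over_I = sum(P(c) / P(M) * cond(A, c) for c in I if c <= M)
avg_over_J = sum(P(c) / P(M) * cond(A, c) for c in J if c <= M)
```

Because each agent's posterior is constant across M, the weighted average collapses to that constant, and both constants must equal P(A | M).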

#### Is Aumann Agreement Overrated?

Having explained all of that, it seems to me that this theorem is less relevant to a practical rationalist than I thought before I really understood it. After looking at the math, it's apparent that "common knowledge" is a much stricter requirement than it sounds. The most obvious way to achieve it is for the two agents to simply tell each other I(w) and J(w), after which they share a new, common information partition. But in that case, agreement itself is obvious and there is no need to learn or understand Aumann's theorem.

There are some papers that describe ways to achieve agreement in other ways, such as iterative exchange of posterior probabilities. But in such methods, the agents aren't just moving closer to each other's beliefs. Rather, they go through convoluted chains of deduction to infer what information the other agent must have observed, given his declarations, and then update on that new information. (The process is similar to the one needed to solve the second riddle on this page.) The two agents essentially still have to communicate I(w) and J(w) to each other, except they do so by exchanging posterior probabilities and making logical inferences from them.
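
To illustrate what such an exchange involves, here is a rough sketch of one simple variant, assuming both agents announce their posteriors simultaneously each round and then split their own partitions according to what the other's announcement would have been in each world-history. The function names, the nine-element sample space, the partitions, and the event A are all assumptions made for the example, not a reconstruction of any particular paper's protocol.

```python
from fractions import Fraction


def P(event, p):
    return sum((p[w] for w in event), Fraction(0))


def posterior(A, cell, p):
    return P(A & cell, p) / P(cell, p)


def cell_of(partition, w):
    return next(c for c in partition if w in c)


def refine(partition, signal):
    """Split every element according to the value signal(v) takes on it."""
    out = []
    for cell in partition:
        groups = {}
        for v in cell:
            groups.setdefault(signal(v), set()).add(v)
        out.extend(frozenset(g) for g in groups.values())
    return out


def exchange(A, I, J, p, w, max_rounds=20):
    """Return the (q1, q2) announcements made at the actual world-history w."""
    history = []
    for _ in range(max_rounds):
        # What each agent WOULD announce in every world-history v.
        q1 = {v: posterior(A, cell_of(I, v), p) for v in p}
        q2 = {v: posterior(A, cell_of(J, v), p) for v in p}
        history.append((q1[w], q2[w]))
        # Each agent infers what the other's announcement rules out.
        new_I = refine(I, lambda v: q2[v])
        new_J = refine(J, lambda v: q1[v])
        if set(new_I) == set(I) and set(new_J) == set(J):
            break  # nothing left to learn from the announcements
        I, J = new_I, new_J
    return history


p = {w: Fraction(1, 9) for w in range(1, 10)}  # uniform prior (assumed)
I = [frozenset({1, 2, 3}), frozenset({4, 5, 6}), frozenset({7, 8, 9})]
J = [frozenset({1, 2, 3, 4}), frozenset({5, 6, 7, 8}), frozenset({9})]
A = frozenset({3, 4})

history = exchange(A, I, J, p, w=1)
```

In this toy run the announcements at w = 1 are (1/3, 1/2), then (1/3, 1/2) again, then (1/3, 1/3). The repeated second round still carries information: agent 2 does not drift toward agent 1, but jumps to 1/3 once the partitions have been refined enough. Note also that `refine` needs the other agent's full announcement table, which is exactly the requirement of knowing the other's partition.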

Is this realistic for human rationalist wannabes? It seems wildly implausible to me that two humans can communicate all of the information they have that is relevant to the truth of some statement just by repeatedly exchanging degrees of belief about it, except in very simple situations. You need to know the other agent's information partition exactly in order to narrow down which element of the information partition he is in from his probability declaration, and he needs to know that you know so that he can deduce what inference you're making, in order to continue to the next step, and so on. One error in this process and the whole thing falls apart. It seems much easier to just tell each other what information the two of you have directly.

Finally, I now see that until the exchange of information completes and common knowledge/agreement is actually achieved, it's rational for even honest truth-seekers who share common priors to disagree. Therefore, two such rationalists may persistently disagree just because the amount of information they would have to exchange in order to reach agreement is too great to be practical. This is quite different from the understanding of Aumann agreement I had before I read the math.

I think there's another, more fundamental reason why Aumann agreement doesn't matter in practice. It requires each party to assume the other is completely rational and honest.

Acting as if the other party is rational is good for promoting calm and reasonable discussion. Seriously considering the possibility that the other party is rational is certainly valuable. But assuming that the other party is in fact totally rational is just silly. We know we're talking to other flawed human beings, and either or both of us might just be totally off base, even if we're hanging around on a rationality discussion board.

One question on your objections: how would you characterize the state of two human rationalist wannabes who have failed to reach agreement? Would you say that their disagreement is common knowledge, or instead are they uncertain if they have a disagreement?

ISTM that people usually find themselves rather certain that they are in disagreement and that this is common knowledge. Aumann's theorem seems to forbid this even if we assume that the calculations are intractable.

The rational way to characterize the situation, if in fact intractability is a practical o...

I too found my understanding changed dramatically when I looked into Aumann's original paper. Basically, the result has a misleading billing - and those citing the result rarely seemed to bother explaining much about the actual result or its significance.

I also found myself wondering why people remained puzzled about the high observed levels of disagreement. It seems obvious to me that people are poor approximations of truth-seeking agents - and instead promote their own interests. If you understand that, then the existence of many real-world disagreements is explained: people disagree in order to manipulate the opinions and actions of others for their own benefit.

Sure, all by itself this first paper doesn't seem very relevant for real disagreements, but there is a whole literature beyond this first paper, which weakens the assumptions required for similar results. Keep reading.

Should people really adopt the "common knowledge" terminology? Surely that terminology is highly misleading and is responsible for many misunderstandings. If people take common English words and give them an esoteric technical meaning that differs dramatically from a literal reading, then shouldn't they at least capitalise them?

Sorry, I think I got a bit confused about the "meet" operation, mind clarifying?

Is (I∧J)(w) equal to the intersection of I(w) and J(w) (which seems to be the implied way it works based on the overall description here) or something else? (Since the definition of meet you gave involved unions rather than intersections, and some sort of merging operation.)

Thanks.

EDIT: whoops. am stupid today. Meant to say intersection, not disjunction

I'm not sure I understand how $\Omega$ represents the set of world histories. If world histories were to live anywhere, they'd live in the sigma algebra — as collections of events, per the definition. If not, and every element of $\Omega$ truly is a world history, then how can $F$ represent "information about which events occur in which possible world-histories", when each $f \in F$ is made up of atoms from $\Omega$, that is, when every element in $F$ is a collection of world histories? One of these definitions ought to be recast, I believe. It might be most sensible to make $\Omega$ the set of all possible events across all possible histories, that way you can largely keep your other definitions as-is

Interesting that the problems with Aumann's theorem were pointed out ten years ago, but belief in it continues to be prevalent.

Diagrams would be wonderful, anyone up to drawing them?

I think that I understand this proof now. Does the following dialogue capture it?

AGENT 1: My observations establish that our world is in the world-set S. However, as far as I can tell, any world in S could be our world.

AGENT 2: My observations establish that our world is in the world-set T. However, as far as I can tell, any world in T could be our world.

TOGETHER: So now we both know that our world is in the world-set S∩T, though as far as we can tell, any world in S∩T could be our world. Therefore, since we share the same priors, we both arriv...

Efforts to illuminate Aumann's disagreement result do seem rather rare - thanks for your efforts here.

It appears to me that reducing this to an equation is totally irrelevant, in that it obscures the premises of the argument, and an argument is only as good as the reliability of the premises. Moreover, the theorem appears faulty based on inductive logic, in that the premises can be true and the conclusion false. I'm really interested in why this thought process is wrong.

While I see your point, I wouldn't say that the agreement issue is overrated at all.

There are many disagreements that don't change at all over arbitrarily many iterations, which sure don't look right given AAT. Even if the beliefs don't converge exactly, I don't think it's too much to ask for some motion towards convergence.

I think the more important parts are the parts that talk about predicting disagreements.

The main problem I have always had with this is that the reference set is "actual world history" when in fact that is the exact thing that observers are trying to decipher.

We all realize that there is in fact an "actual world history"; however, if it were known, this wouldn't be an issue. Using it as a reference set, then, seems spurious in all practicality.

I think that summatio...