Kurros

10y00

Hmm, thanks. Seems similar to my description above, though as far as I can tell it doesn't deal with my criticisms. It is rather evasive when it comes to the question of what status models have in Bayesian calculations.

10y00

I am curious: what is the general LessWrong philosophy about what truth "is"? Personally I so far lean towards accepting an operational subjective Bayesian definition, i.e. the truth of a statement is defined only so far as we agree on some (in principle) operational procedure for determining its truth. That is, we have to agree on what observations make it true or false.

For example "it will rain in Melbourne tomorrow" is true if we see it raining in Melbourne tomorrow (trivial, but also means that the truth of the statement doesn't depend on rain being "real", or just a construction of Descartes' evil demon or the matrix, or a dream, or even a hallucination). It is also a bit disturbing because the truth of "the local speed of light is a constant in all reference frames" can never be determined in such a way. We could go to something like Popper's truthlikeness, but then standard Bayesianism gets very confusing, since we then have to worry about the probability that a statement has a certain level of "truthlikeness", which is a little mysterious. Truthlikeness is nice in how it relates to the map-territory analogy though.

I am inclined to think that standard Bayesian-style statements about operationally defined things based on our "maps" make sense, e.g. "If I go and measure how long it takes light to travel from the Earth to Mars, the result will be the distance divided by c" (with this being influenced by the abstraction that is general relativity). It still remains unclear to me precisely what this means in terms of Bayes' theorem, though: the probability P("measure c" | "general relativity") implies that P("general relativity") makes sense somehow, though the operational criteria cannot be where its meaning comes from. In addition, we must somehow account for the fact that "general relativity" is strictly False, in the "all models are wrong" sense, so we need to somehow rejig that proposition into something that might actually be true, since it makes no sense to condition our beliefs on things we know to be false.

I suppose we might be able to imagine some kind of super-representation theorem, in the style of de Finetti, in which we show that degrees of belief in operational statements can be represented as the model average of the predictions from all computable theories, hoping to provide an operational basis for Solomonoff induction. But actually I am still not 100% sure what de Finetti's usual representation theorem really means. We can behave "as if" we had degrees of belief in these models weighted by some prior? Huh? Does this mean we don't really have such degrees of belief in models, but they are a convenient fiction? I am very unclear on the interpretation here.

The map-territory analogy does seem correct to me, but I find it hard to reconstruct ordinary Bayesian-style statements via this kind of thinking...

10y00

Lol that is a nice story in that link, but it isn't a Dutch book. The bet in it isn't set up to measure subjective probability either, so I don't really see what the lesson in it is for logical probability.

Say that instead of the digits of pi, we were betting on the contents of some boxes. For concreteness let there be three boxes, one of which contains a prize. Say also that you have looked inside the boxes and know exactly where the prize is. For me, I have some subjective probability P( X_i | I_mine ) that the prize is inside box i. For you, all your subjective probabilities are either zero or one, since you know perfectly well where the prize is. However, if my beliefs about where the prize is follow the probability calculus correctly, you still cannot Dutch book me, even though you know where the prize is and I don't.
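To make the box example concrete, here is a minimal sketch. It assumes illustrative beliefs of 0.5, 0.3 and 0.2 for the three boxes (numbers not given above) and a hypothetical helper `net_for_quoter`. The opponent may buy or sell any number of $1 bets at my quoted prices. A Dutch book would need me to lose in every possible world, so the code checks my best-case outcome over the three boxes for many random sets of stakes.

```python
import random

def net_for_quoter(prices, stakes, winning_box):
    """Net gain for the person quoting `prices` on $1 bets, one bet per box.

    stakes[i] > 0: the opponent buys that many bets on box i from us.
    stakes[i] < 0: the opponent sells that many bets on box i to us.
    """
    premiums = sum(q * p for q, p in zip(stakes, prices))
    payout = stakes[winning_box]  # only bets on the winning box pay $1 each
    return premiums - payout

# Assumed, illustrative beliefs: coherent because they sum to 1.
prices = [0.5, 0.3, 0.2]

random.seed(0)
worst_best_case = min(
    max(net_for_quoter(prices, stakes, box) for box in range(3))
    for stakes in (
        [random.uniform(-10, 10) for _ in range(3)] for _ in range(10_000)
    )
)
# Never meaningfully below zero: no set of stakes loses in every world.
print("smallest best-case net over random stakes:", worst_best_case)

# An opponent who has seen the prize in box 2 simply bets on box 2.
informed_stakes = [0, 0, 1]
for box in range(3):
    print("prize in box", box, "-> my net:", net_for_quoter(prices, informed_stakes, box))
```

The informed opponent beats me in the world that is actually the case, but I still come out ahead in the other two worlds, so the trade is not a sure loss from where I stand.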

So, how is the scenario about the digits of pi different to this? Do you have some example of an actual Dutch book that I would accept if I were to allow logical uncertainty?

edit:

Ok well I thought of what seems to be a typical Dutch book scenario, but it has made me yet more confused about what is special about the logical uncertainty case. So, let me present two scenarios, and I wonder if you can tell me what the difference is:

Consider two propositions, A and B. Let it be the case that A->B. However, say that we do not realise this, and say we assign the following probabilities to A and B:

P(A) = 0.5

P(B) = 0.5

P(B|A) = P(B)

P(A & B) = 0.25

indicating that we think A and B are independent. Based on these probabilities, we should accept the following arrangement of bets:

Sell bet for $0.50 that A is false, payoff $1 if correct

Sell bet for $0.25 that A & B are both true, payoff $1 if correct

The expected amount we must pay out is 0.5*$1 + 0.25*$1 = $0.75, which is how much we are selling the bets for, so everything seems fair to us.

Someone who understands that A->B will happily buy these bets from us, since they know that "not A" and "A & B" are actually equivalent to "not A" and "A", i.e. they know P(not A) + P(A & B) = 1. They pay $0.75 and win $1 from us no matter what is the case, making a profit of $0.25. So that seems to show that we are being incoherent if we don't know that A->B.
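The arithmetic can be checked by enumerating the four truth assignments for A and B. This is only a sketch using the prices quoted above; the function name `seller_net` is hypothetical.

```python
from itertools import product

PRICE_NOT_A = 0.50    # price we charge for the $1 bet on "not A"
PRICE_A_AND_B = 0.25  # price we charge for the $1 bet on "A & B"

def seller_net(a, b):
    """Our net gain in the world where A=a and B=b, having sold both bets."""
    income = PRICE_NOT_A + PRICE_A_AND_B
    payout = (1.0 if not a else 0.0) + (1.0 if (a and b) else 0.0)
    return income - payout

all_worlds = list(product([True, False], repeat=2))
# Worlds that respect the implication A -> B, i.e. (not A) or B.
consistent = [(a, b) for a, b in all_worlds if (not a) or b]

for a, b in all_worlds:
    tag = "allowed by A->B" if (a, b) in consistent else "excluded by A->B"
    print(f"A={a!s:5} B={b!s:5} seller net = {seller_net(a, b):+.2f}  ({tag})")
```

The only world in which we come out ahead is "A and not B", which is exactly the one that A->B rules out.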

But now consider the following scenario: instead of having the logical relation that A->B, say that our opponent just has some extra empirical information D that we do not, so that for him P(B|A,D) = 1. He would then still say that

P(not A | D) + P(A & B | D) = P(not A | D) + P(B|A,D)*P(A|D) = P(not A|D) + P(A|D) = 1

so that we, who do not know D, could still be screwed by the same kind of trade as in the first example. But then, this is sort of obviously possible, since having more information than your opponent *should* give you a betting advantage. But both situations seem equivalently bad for us, so why are we being incoherent in the first example, but not in the second? Or am I still missing something?

10y30

That sounds to me more like an argument for needing lower p-values, not higher ones. If there are many confounding factors, you need a higher threshold of evidence for claiming that you are seeing a real effect.

Physicists need low p-values for a different reason, namely that they do very large numbers of statistical tests. If you choose p=0.05 as your threshold then it means that you are going to be claiming a false detection about one time in twenty when there is no real effect (roughly speaking), so if physicists did this they would be claiming false detections every other day and their credibility would drop like a rock.
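As a rough illustration of how quickly this adds up, the sketch below assumes every test is independent and that no real effect is present in any of them (both simplifying assumptions; the function names are hypothetical).

```python
def prob_any_false_detection(alpha, n_tests):
    """Chance of at least one false detection among n independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def expected_false_detections(alpha, n_tests):
    """Average number of false detections among n null tests."""
    return alpha * n_tests

for n in (1, 20, 100, 1000):
    print(
        f"n={n:5d}  P(at least one false detection)="
        f"{prob_any_false_detection(0.05, n):.3f}  "
        f"expected false detections={expected_false_detections(0.05, n):.1f}"
    )
```

With a 0.05 threshold the chance of at least one false detection is already well over one half by twenty tests, which is why a field that runs huge numbers of tests needs a much stricter threshold.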

10y00

Is there any more straightforward way to see the problem? I argued with you about this for a while and I think you convinced me, but it is still a little foggy. If there is a consistency problem, surely this means that we must be vulnerable to Dutch books, doesn't it? I.e. they would not seem to be Dutch books to us, with our limited resources, but a superior intelligence would know that they were and would use them to con us out of utility. Do you know of some argument like this?

10y00

Very well, then I will wait for the next entry. But I thought the fact that we were explicitly discussing things the robot could not compute made it clear that resources were limited. There is clearly no such thing as logical uncertainty to the magic logic god of the idealised case.

10y00

No we aren't, we're discussing a robot with finite resources. I obviously agree that an omnipotent god of logic can skip these problems.

10y-10

It was your example, not mine. But you made the contradictory postulate that P("wet outside"|"rain")=1 follows from the robot's prior knowledge and the probability axioms, and simultaneously that the robot was unable to compute this. To correct this I alter the robot's probabilities such that P("wet outside"|"rain")=0.5 until such time as it has obtained a proof that "rain" correlates 100% with "wet outside". Of course the axioms don't determine this; it is part of the robot's prior, which is not determined by any axioms.

You haven't convinced me or shown me that this violates Cox's theorem. I admit I have not tried to follow the proof of this theorem myself, but my understanding was that the requirement you speak of is that the probabilistic logic reproduces classical logic in the limit of certainty. Here, the robot is not in the limit of certainty because it cannot compute the required proof. So we should not expect to get the classical logic until updating on the proof and achieving said certainty.

10y-20

You haven't been very specific about what you think I'm doing incorrectly so it is kind of hard to figure out what you are objecting to. I corrected your example to what I think it should be so that it satisfies the product rule; where's the problem? How do you propose that the robot can possibly set P("wet outside"|"rain")=1 when it can't do the calculation?

Keynes, in his "A Treatise on Probability", talks a lot about analogies in the sense you use it here, particularly in Part III, "Induction and Analogy". You might find it interesting.