P: 0 <= P <= 1

DragonGod

by DragonGod

27th Aug 2017

Part of The Contrarian Sequences.

Reply to "Infinite Certainty" and "0 and 1 Are Not Probabilities".

Introduction

In "Infinite Certainty", Eliezer argues that you can never be absolutely sure of a proposition. I have disagreed with that argument for a long time, but due to acedia, I never got around to writing a reply. I think I have a more coherent counterargument now, and I present it below. Because "Infinite Certainty" and "0 and 1 Are Not Probabilities" are linked, I address both in this post.

"This doesn't mean, though, that I have absolute confidence that 2 + 2 = 4. See the previous discussion on how to convince me that 2 + 2 = 3, which could be done using much the same sort of evidence that convinced me that 2 + 2 = 4 in the first place. I could have hallucinated all that previous evidence, or I could be misremembering it. In the annals of neurology there are stranger brain dysfunctions than this."

This is true. That a statement is true does not mean that you have absolute confidence in the veracity of the statement. It is possible that you may have hallucinated everything.

"Suppose you say that you're 99.99% confident that 2 + 2 = 4. Then you have just asserted that you could make 10,000 independent statements, in which you repose equal confidence, and be wrong, on average, around once."

I am not so sure of this. If I have X% confidence in a belief and I am well calibrated, then out of K statements to which I assigned X% confidence, you would expect ((100-X)/100)*K to be wrong and the remainder to be right. It does not follow that, because I have X% confidence in a belief, I can make K statements in which I repose equal confidence and be wrong only ((100-X)/100)*K times.

It's something like: X% confidence implies that if you made K such statements, then ((100-X)/100)*K of them would be wrong.

A well calibrated agent does not have to be able to make K statements, with only ((100-X)/100)*K of them wrong, in order to possess X% confidence in a proposition. Calibration only indicates that in a hypothetical world in which they did make K statements, only ((100-X)/100)*K of those statements would be wrong. To assert that a well calibrated agent must be able to make those statements before they can have X% confidence is to treat the hypothetical as a given fact: either an honest mistake or deliberate malice.
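To make the arithmetic concrete, here is a minimal Python sketch of the ((100-X)/100)*K formula used above. The function name `expected_wrong` is my own, and exact fractions are used only to avoid floating-point noise; it describes what calibration predicts in the hypothetical, not what any agent must actually be able to do.

```python
from fractions import Fraction

def expected_wrong(confidence_percent, k):
    """Expected number of wrong statements among k statements,
    each held at confidence_percent, for a well calibrated agent.

    confidence_percent is passed as a string (e.g. "99.99") so that
    Fraction can represent it exactly.
    """
    x = Fraction(confidence_percent)
    return (100 - x) / 100 * k

# 99.99% confidence over 10,000 statements: one expected error.
print(expected_wrong("99.99", 10_000))      # 1
# 99.9999% confidence over a million statements: also one expected error.
print(expected_wrong("99.9999", 10**6))     # 1
```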

"As for the notion that you could get up to 100% confidence in a mathematical proposition—well, really now! If you say 99.9999% confidence, you're implying that you could make one million equally fraught statements, one after the other, and be wrong, on average, about once. That's around a solid year's worth of talking, if you can make one assertion every 20 seconds and you talk for 16 hours a day.

"Assert 99.9999999999% confidence, and you're taking it up to a trillion. Now you're going to talk for a hundred human lifetimes, and not be wrong even once?

"Assert a confidence of (1 - 1/googolplex) and your ego far exceeds that of mental patients who think they're God.

"And a googolplex is a lot smaller than even relatively small inconceivably huge numbers like 3^^^3."

All based on the same flawed premise, and equally flawed.
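For reference, the "solid year's worth of talking" figure in the quoted passage can be checked directly. This is a small sketch under the quote's own assumptions (one assertion every 20 seconds, 16 hours of talking per day); the function name `days_of_talking` is hypothetical.

```python
def days_of_talking(statements, seconds_per_statement=20, hours_per_day=16):
    """Days needed to utter `statements` assertions at the quoted pace."""
    total_seconds = statements * seconds_per_statement
    return total_seconds / (hours_per_day * 3600)

million_days = days_of_talking(10**6)
print(round(million_days, 1))  # 347.2
```

A million statements comes to roughly 347 days, which matches "around a solid year".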

I am Infinitely Certain

There is one proposition that I would start with and assign a probability of 1: not 1 - 1/googolplex, not 1 - 1/3^^^^3, not 1 - epsilon (where epsilon is an arbitrarily small positive number), but a probability of 1.

I exist.

René Descartes presents a wonderful argument for the veracity of this statement:

"Accordingly, seeing that our senses sometimes deceive us, I was willing to suppose that there existed nothing really such as they presented to us; And because some men err in reasoning, and fall into Paralogisms, even on the simplest matters of Geometry, I, convinced that I was as open to error as any other, rejected as false all the reasonings I had hitherto taken for Demonstrations; And finally, when I considered that the very same thoughts (presentations) which we experience when awake may also be experienced when we are asleep, while there is at that time not one of them true, I supposed that all the objects (presentations) that had ever entered into my mind when awake, had in them no more truth than the illusions of my dreams. But immediately upon this I observed that, whilst I thus wished to think that all was false, it was absolutely necessary that I, who thus thought, should be something; And as I observed that this truth, I think, therefore I am, was so certain and of such evidence that no ground of doubt, however extravagant, could be alleged by the Sceptics capable of shaking it, I concluded that I might, without scruple, accept it as the first principle of the philosophy of which I was in search."

Eliezer quotes Rafal Smigrodski:

"I would say you should be able to assign a less than 1 certainty level to the mathematical concepts which are necessary to derive Bayes' rule itself, and still practically use it. I am not totally sure I have to be always unsure. Maybe I could be legitimately sure about something. But once I assign a probability of 1 to a proposition, I can never undo it. No matter what I see or learn, I have to reject everything that disagrees with the axiom. I don't like the idea of not being able to change my mind, ever."

I am alright with accepting as an axiom that I exist. I see no reason why I should be cautious of assigning a probability of 1 to this statement. I am infinitely certain that I exist.

If you accept Descartes' argument, then this is very important. You're accepting that we can be infinitely certain about a proposition, and moreover that it is sensible to be infinitely certain about a proposition. Usually, only one counterexample is necessary, but there are several other statements to which you may assign a probability of 1.

I believe that I exist.

I believe that I believe that I exist.

I believe that I believe that I believe that I exist.

And so on, ad infinitum: an infinite chain of statements, all of which are true. I have satisfied Eliezer's (fatuous) requirements for assigning a certain level of confidence to a proposition. If you feel that it is not sensible to assign probability 1 to the first statement, then consider this argument. I assign a probability of 1 to the proposition "I exist". This means that the proposition "I exist" exists (pun intended) in my mental map of the world, and is therefore a belief of mine. By deduction, if I assign a probability of 1 to the statement "I exist", then I must assign a probability of 1 to the proposition "I believe that I exist". By induction, I must assign a probability of 1 to every statement in the infinite chain, and all of them are true.

(I assign a probability of 1 to deduction being true).

Generally, using recursion, we can pick any statement to which we assign a probability of 1 and generate infinitely many more statements to which we (by deduction) also assign a probability of 1.

Let X be a proposition to which we assign a probability of 1.

define f(var, n=0)
    if type(n) != int or n < 0
        return -1
    end if
    n = (n < 2) ? 2 : n
    str = "I believe that " + var
    print str
    i = 0
    while i < n
        str = "I believe that " + str
        print str
        i++
    end while
    f(str, n**n)
end

For any X to which we assign a probability of 1, and any valid n, f(X, n) prints an infinite number of statements to which we also assign a probability of 1.
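The recursion above never terminates, so it cannot be run to completion. As a rough runnable stand-in, here is a minimal Python sketch that produces the same chain lazily; the generator name `belief_chain` is my own, and using a generator instead of unbounded recursion is an assumption made so the example can actually execute.

```python
from itertools import islice

def belief_chain(proposition):
    """Yield 'I believe that P', 'I believe that I believe that P', ...
    without end. Callers decide how many statements to take."""
    statement = proposition
    while True:
        statement = "I believe that " + statement
        yield statement

# Take only the first three statements of the infinite chain.
first_three = list(islice(belief_chain("I exist"), 3))
for s in first_three:
    print(s)
```

Any finite prefix of the chain can be inspected this way; the chain itself has no last element.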

While I'm at it, I can show that there are an uncountably infinite number of such statements with a probability of 1.

Let S be the array of all propositions produced by f(X, n) (for some valid X to which we assign a probability of 1, and a valid n).

define g(var)
    k = rand(#S)
    i = 0
    j = rand(#S)
    str = "I believe " + S[j]
    delete(S[j])
    while i < k
        j = rand(#S)
        str += " and " + S[j]
        delete(S[j])
        i++
    end while
    print(str)
    f(str, 2)
end

Assuming #S = Aleph_null, and allowing str to conjoin any subset of S, infinite subsets included (finite conjunctions alone yield only countably many strings), there are 2^#S possible values for str, and each of them can be used to generate an infinite sequence of true propositions. By Cantor's theorem, the propositions to which we assign a probability of 1 are then uncountable. For each of those propositions, we assign a probability of 0 to its negation. That is, if you accept Descartes' argument, or accept any single proposition as having a probability of 1 (or 0), then you accept uncountably many propositions as having a probability of 1 (or 0). Either we can never be certain of any proposition, or we can be certain of uncountably many propositions (you can also use the outlined method to construct K statements with arbitrary accuracy).
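The subset-counting step can be illustrated on a finite stand-in for S. The sketch below is only that: the function name `conjunctions` and the three-element list are hypothetical, and a finite example shows the 2^#S growth pattern without saying anything about the infinite case.

```python
from itertools import combinations

def conjunctions(statements):
    """Return one conjunction string per non-empty subset of `statements`."""
    result = []
    for size in range(1, len(statements) + 1):
        for subset in combinations(statements, size):
            result.append("I believe " + " and ".join(subset))
    return result

S = ["A", "B", "C"]
all_conjunctions = conjunctions(S)
print(len(all_conjunctions))  # 7, i.e. 2**3 - 1 non-empty subsets
```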

Personally, I see no problem with accepting "I exist" (and deduction) as having P of 1.

"When you work in log odds, the distance between any two degrees of uncertainty equals the amount of evidence you would need to go from one to the other. That is, the log odds gives us a natural measure of spacing among degrees of confidence.

"Using the log odds exposes the fact that reaching infinite certainty requires infinitely strong evidence, just as infinite absurdity requires infinitely strong counterevidence."

This ignores the fact that you can assign priors of 0 and 1. In fact, it is for this very reason that I argue that 0 and 1 are probabilities. Eliezer is right that we can never update upwards (or downwards, as the case may be) to 1 or 0 without starting from priors of 0 or 1, but we can (and I argue we should) sometimes start with priors of 0 and 1.
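Both halves of this can be seen in a few lines of Python. This is a sketch, not a formal treatment: `log_odds` and `bayes_update` are names I have chosen, log odds are measured in bits (base 2) as an assumption, and the update function assumes the evidence has nonzero total probability.

```python
import math

def log_odds(p):
    """Log odds of p in bits; infinite at the endpoints 0 and 1."""
    if p == 0:
        return -math.inf
    if p == 1:
        return math.inf
    return math.log2(p / (1 - p))

def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of a hypothesis after one observation.
    Assumes the observation has nonzero probability overall."""
    numerator = prior * likelihood_if_true
    denominator = numerator + (1 - prior) * likelihood_if_false
    return numerator / denominator

print(log_odds(0.5))                    # 0.0
print(log_odds(1))                      # inf
# A prior of 1 (or 0) is not moved by evidence, however lopsided.
print(bayes_update(1.0, 0.001, 0.999))  # 1.0
print(bayes_update(0.0, 0.999, 0.001))  # 0.0
```

No finite amount of evidence moves an intermediate prior to infinite log odds, and no evidence moves a prior of 0 or 1 off its endpoint.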

0 and 1 as Priors

Consider Pascal's Mugging. Pascal's Mugging is a breaker ("breaker" is a name I coined for a decision problem which breaks decision theories). Let us reconceive the problem such that the person doing the mugging is me.

I walk up to Eliezer and tell him that he should pay me $10,000 or I will grant him infinite negative utility.

Now, I cannot (as a matter of fundamental physical law) inflict infinite negative utility on Eliezer. However, if Eliezer is rational (maximising his expected utility), then Eliezer must pay me the money. No matter how much money I demand from Eliezer, Eliezer must pay me, because Eliezer does not assign a probability of 0 to me carrying out my threat, and no matter how small the probability is, as long as it's not 0, paying me the ransom I demanded is the choice which maximises expected utility.

(If you claim that it is impossible for me to grant you infinite negative utility, that infinite negative utility is incoherent, or that it is a category error, then you are assigning a probability of 0 to the existence of infinite negative utility, and implicitly assigning a probability of 0 to me granting you infinite negative utility. This is because P(A) >= P(A and B), where A is "infinite negative utility exists" and B is "I can grant infinite negative utility".)

I have no problem with decision problems which break decision theories, but when a problem breaks the very formulation of rationality itself, then I'm pissed. There is a trivial way to resolve Pascal's mugging using classical decision theory: accept the objective definition of probability. Once you do so, the probability of me carrying out my threat becomes zero and the problem disappears. Only the insistence on clinging to an (unfounded) subjective probability that forbids 0 and 1 as probabilities leads to this mess.

If anything, Pascal's mugging should be a definitive demonstration that 0 and 1 are perfectly legitimate priors (if you accept a prior of 0 that I will grant you infinite negative utility, then trivially you accept a prior of 1 that I will not). Pascal's mugging only "breaks" expected utility theory if you forbid priors of 0 and 1, an inane commandment.
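The structure of the mugging can be sketched as a small expected utility calculation. The names `expected_utility` and `best_action` are hypothetical, and the code assumes the usual measure-theoretic convention that an outcome with probability 0 contributes nothing even when its utility is infinite.

```python
import math

def expected_utility(outcomes):
    """Sum of p * u over (probability, utility) pairs.
    Outcomes with probability 0 are skipped (assumed convention: 0 * inf = 0)."""
    total = 0.0
    for p, u in outcomes:
        if p == 0:
            continue
        total += p * u
    return total

def best_action(p_threat, ransom):
    """Compare paying the ransom against refusing, given the probability
    assigned to the threat of infinite negative utility being carried out."""
    pay = expected_utility([(1.0, -ransom)])
    refuse = expected_utility([(p_threat, -math.inf), (1 - p_threat, 0.0)])
    return "pay" if pay > refuse else "refuse"

print(best_action(1e-30, 10_000))  # pay
print(best_action(0.0, 10_000))    # refuse
```

Any nonzero probability on the threat, however tiny, makes refusing infinitely bad; only a probability of exactly 0 lets the agent refuse.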

I'll expand more on breakers, rationality, etc. in my upcoming paper, which will run to several tens of pages.

Conclusion

"So I propose that it makes sense to say that 1 and 0 are not in the probabilities; just as negative and positive infinity, which do not obey the field axioms, are not in the real numbers.

"The main reason this would upset probability theorists is that we would need to rederive theorems previously obtained by assuming that we can marginalize over a joint probability by adding up all the pieces and having them sum to 1.

"However, in the real world, when you roll a die, it doesn't literally have infinite certainty of coming up some number between 1 and 6. The die might land on its edge; or get struck by a meteor; or the Dark Lords of the Matrix might reach in and write "37" on one side.

"If you made a magical symbol to stand for "all possibilities I haven't considered", then you could marginalize over the events including this magical symbol, and arrive at a magical symbol "T" that stands for infinite certainty.

"But I would rather ask whether there's some way to derive a theorem without using magic symbols with special behaviors. That would be more elegant. Just as there are mathematicians who refuse to believe in double negation or infinite sets, I would like to be a probability theorist who doesn't believe in absolute certainty."
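The marginalization step in the quoted passage can be made concrete with a die and a catch-all outcome. This is a minimal sketch: the function name `die_distribution`, the label "other", and the choice of 1/1000000 for the catch-all are all illustrative assumptions, not figures from either post.

```python
from fractions import Fraction

def die_distribution(p_other):
    """Distribution over the six faces plus a catch-all 'other' outcome
    (edge landing, meteor, and so on) with probability p_other."""
    p_other = Fraction(p_other)
    face = (1 - p_other) / 6
    dist = {str(n): face for n in range(1, 7)}
    dist["other"] = p_other
    return dist

dist = die_distribution("1/1000000")
p_faces = sum(p for outcome, p in dist.items() if outcome != "other")
print(p_faces)             # 999999/1000000
print(sum(dist.values()))  # 1
```

The six faces alone fall short of 1; the pieces sum to exactly 1 only once the catch-all outcome is included.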

Eliezer presents a shaky basis for rejecting 0 and 1 as probabilities. His model leads to absurd conclusions (a proof by contradiction that 0 and 1 are indeed probabilities), he offers no benefit from rejecting the standard model and replacing it with his own (only multiple demerits), and he does not formalise an alternative model of probability that is free of absurdities and has more benefits than the standard model.

"0 and 1 Are Not Probabilities" is a solution in search of a problem.

Epistemic Hygiene

This article may have come across as overly vicious and confrontational; I adopted such an attitude to minimise the bias in my perception of the original article based on the halo effect.

jimrandomh (9y)

"I am alright with accepting as an axiom that I exist. I see no reason why I should be cautious of assigning a probability of 1 to this statement. I am infinitely certain that I exist."

The search keyword for reasoning that uses "I exist" as a derivation step in arguments is "anthropic reasoning". This is squarely in the middle of a thicket of very hard, mostly unsolved research problems, and unless you have a research-level understanding of that field, you probably shouldn't assign ordinary confidence, let alone axiom-level confidence, in anything whatsoever about it.

DragonGod (9y)

I choose to accept as an axiom that I exist. For if I do not exist, everything else is meaningless. If I doubt my own existence, what then can I believe in? What belief can I have if not my own existence?

I exist.

(I've taken note of anthropic reasoning, and may read up on it later, but it wouldn't change the axiom).

jimrandomh (9y)

There are decision-theoretic contexts in which you don't exist, but your (counterfactual) actions still matter because you're being simulated or reasoned about. These push on corner cases of the definitions of "I" and of "exist", and as far as I know are mostly not written up and published because they're still poorly understood. But I'm pretty sure that, for the most obvious ways of defining "I" and "exist", adding your own existence as an axiom will lead to incorrect results.

Friendly-HI (9y)

I'm tempted to agree with DragonGod on a weaker form (or phrasing) of the "I exist" proposition:

I would defend the proposition that my feeling of subjective experience (independent of whether or not I am mistaken about literally everything I think and believe) really does exist with a probability of 1. And even if my entire experience was just a dream or simulated on some computer inside a universe where 2+2=3 actually holds true, the existence of my subjective experience (as opposed to whatever "I" might mean) seems beyond any possible doubt.

Even if every single one of my senses and my entire map of reality (even including the concept of reality itself) was entirely mistaken in every possible aspect, there would still be such a thing as having/being my subjective experience. It's the one and only true axiom in this world that I think we can assign P=1 to.

Especially if you don't conceive of the word "exist" as meaning "is a thing within the base level of reality as opposed to a simulation."

DragonGod (9y)

A simulation exists. A simulation of me is me.

I am my information, and a simulation of me is still me.

math_viking (9y)

I think this version of Pascal's mugging could be rejected if you think that "infinite negative utility" is not a phrase that means anything, without appealing to probability of 0.

However, I still accept 0 and 1 as valid probabilities, because that is how probability is defined in the mathematical structures and proofs that underpin all of the probability theory we use, and as far as I know no other foundation of probability (up to isomorphism) has been rigorously defined and explored.

The fact that a measure is nonnegative, instead of positive, is relevant, and if you're going to claim 0 and 1 are not probabilities, you had better be ready to re-define all of the relevant terms and re-derive all of the relevant results in probability theory in this new framework. Since no such exposition exists, you should feel free to treat any claims that 0 and 1 are not probabilities as, at best, speculation.

Now, I know those of you who have read Eliezer's post are about to go "But wait! What about Cox's Theorem! Doesn't that imply that odds have to be finite?" No, it does no such thing. If you look at the Wikipedia article on Cox's Theorem, you will see that probability must be represented by real numbers, and that this is an assumption, rather than a result. In other words, any "way of representing uncertainties" must map them to real numbers in order for Cox's Theorem to apply, and so Cox's Theorem only applies to odds or log odds if you assume that odds and log odds are finite to begin with. Obviously, this is circular reasoning, and no more of an argument than simply asserting that probability must be in (0,1) and stopping there.

Moreover, if you look down the page, you will see that the article explicitly states that one of Cox's results is that probability is in... wait for it... [0,1].

abramdemski (9y)

"This article was overly vicious and confrontational; I adopted such an attitude to minimise the bias in my perception of the original article based on the halo effect."

I accept that this is likely the best thing for you to do for debugging your own world-view, but there's a problematic group-epistemic question: it would be bad if a person could always justify arguing in a way that's biased against X by saying "I'm biased toward X, so I have to argue in a way that's biased against X."

To the extent that you can, I'd suggest that you steer toward de-biasing in a way that's closer to "re-deriving things from first principles"; i.e., try to figure out how one would actually answer the question involved, and then do that, without particularly steering toward X or against X.

With respect to the object-level question: the same type of argument which supports the laws of probability also supports non-dogmatism (see theorem 4), i.e., the rejection of probabilities zero or one for non-logical facts. So, I put this principle on the same level as the axioms of probability theory, but I do not extend it to things like "P(A or not A)=1", which don't fall to the same arguments.

DragonGod (9y)

I reject 0 and 1 for non-logical facts as well.
"I think therefore I am" is a logical proof of my own existence, and as such, I assign a probability of 1 to the proposition: "I exist".

gjm (9y)

"I think therefore I am" is a logical proof of my own existence, and as such, I assign a probability of 1 to the proposition: "I exist".

I'm not at all sure it's any such thing. It depends a little on how broadly you're prepared to construe "my own existence".

You aren't really entitled to say "I think". You know that some thought is happening, but you don't really know that what's having that thought is the right sort of thing to be labelled "I" because that word carries a lot of baggage (e.g., the assumption of persistence through time) that you aren't entitled to when all you have is the knowledge that some thinking is going on. So, for instance, if you go on -- as Descartes does -- to draw further inferences involving "I" from "I exist", and if you assume at different times that you're referring to the same "I", then you are cheating.

For more about this stuff, see the Stanford Encyclopedia of Philosophy.

entirelyuseless (9y)

Also I mentioned that even if you actually have a logical proof of something, you cannot assign a probability of 1 to the conclusion, because you might have made a mistake in the argument. You are pointing out some ways that might have happened here. Even if it did not, no one can reasonably assign a probability of 1 to the claim that they did not make such a mistake, and hence to the conclusion.

abramdemski (9y)

Right; sorry for not phrasing that in a way that sounded like agreement with you. We should be less than totally certain about mathematical statements in real life, but when setting up the formalism for probability, we're "inside" math rather than outside of it; there isn't going to be a good argument for assigning less than probability 1 to logical truths. Only bad things happen when you try.

This does change a bit when we take logical uncertainty into account, but although we understand logical uncertainty better these days, there's not a super strong argument one way or the other in that setting -- you can formulate versions of logical induction which send probabilities to zero immediately when things get ruled out, and you can also formulate versions in which probabilities rapidly approach zero once something has been logically ruled out. The version which jumps to zero is a bit better, but no big theoretical advantage comes out of it afaik. And, in some abstract sense, the version which merely rapidly approaches zero is more prepared for "mistakes" from the deductive system -- it could handle a deductive system which occasionally withdrew faulty proofs.

Rossin_duplicate0.6898194309641386 (9y)

I found the fact that Eliezer did not mention the classic "I think, therefore I am" argument in these essays odd as well. It does seem as though nothing I could experience could convince me that I do not exist because by experiencing it, I am existing. Therefore, assigning a probability of 1 to "I exist" seems perfectly reasonable.