New Answer

New Comment

9 Answers sorted by
top scoring

Sep 16, 2024

210

The problem is update procedure.

When you are conditioning on empirical fact, you are imaging set of logically consistent worlds where this empirical fact is true and ask yourself about frequency of other empirical facts inside this set.

But it is very hard to define update procedure for fact "P=NP", because one of the worlds here is logically inconsistent, which implies all other possible facts which makes notion of "frequency of other facts inside this set" kinda undefined.

[-]Ape in the coat1y40

Could you explain it using specifically the examples that I brought up?

Are you claiming that:

There is no logically consistent world where all the physics is exactly the same and yet, the googolth digit of pi is different?
There is a logically consistent world where all the physics of the universe is the same and yet, the outcome of a particular coin toss is different?

[-]quetzal_rainbow1y134

There is a logically consistent world, where you made all the same observations, and coin came up tail. It may be a world with different physics than the world with coin coming up head, which means that result of coin toss is an evidence in favor of particular physical theory.

And yeah, there are no worlds with different pi.

EDIT: Or, to speak more precise, maybe there is some sorta-cosistent sorta-sane notion of the "world with different pi", but we currently don't know how to build it and if we knew, we would have solved logical uncertainty problem.

3Ape in the coat1y

Neither we know how to build worlds with different physiscs, don't we? If this was the necessary condition for being able to use probability theory then we shouldn't be able to use it neither for empiric nor for logical uncertanity. On the other hand, if all we need is the vague idea of how its possible to make the same observations even when situation is different, well I definetely could've misheard that the question is about googolth digit of pi, while in actuality the question was about some other digit or maybe some other constant. Frankly, I'm not sure what does this speculation about alternative worlds has to do with probability theory and updating in the first place. We have a probability space that describes a particular state of our knowledge. We have Bayes theorem, that formalizes the update procedure. So what's the problem?

7Aleksander1y

We know how to construct worlds with different physics. We do it all the time. Video games, or if you don’t accept that example we can construct a world consisting of 1 bit of information and 1 time dimension. This bit flips every certain increment of time. This universe obviously has different physics than ours. Also as the other person mentioned, a probability space is the space of all possibilities organized based on whether statement Q is true, which is isomorphic to the space all universes consistent with your previous observations. There is, as far as I am aware, no way to logic yourself into the belief that pi could somehow have a different digit in a different universe, given you use a sufficiently exclusive definition of pi(specifying the curve of the plane the circle is upon being the major example)

5Ape in the coat1y

Technically true, but irrelevant to the point I'm making. I was talking about constructing alternative worlds similar to ours to such degree that 1. I can inhabit either of them 2. I can be reasonably uncertain which one I inhabit 3. Both worlds are compatible with all my observations of a particular probability experiment - a coin toss And yet, despite all of that in one world the coin comes Heads and in the other it comes Tails. These are the type of worlds relevant for the discussions of probability experiments. We have no idea how to construct them, when talking about empiric uncertainty, and yet we don't mind, only demanding such level of constructivism when dealing with logical uncertainty, for some reason. Accent on sufficiently exclusive definition. Likewise, we can sufficiently exculisvily define a particular coin toss in a particular world and refuse to entertain the framing of different possible worlds : "No, the question is not about an abstract coin toss that could've ended differently in different possible worlds, the question is about this coin toss in this world". It's just pretty clear in case of empirical uncertainty, that we should not be doing it, because such level of precision doesn't capture our knowledge state. So why are we insisting on this level of exclusivity when talking about logical uncertainty? In other words, this seems as an isolated demand for rigor to me.

2quetzal_rainbow1y

We don't??? Probability space literally defines set of considered worlds.

1Ape in the coat1y

Probability space consists of three things: sample space, event space and probability function. Sample space defines a set of possible outcomes of probability experiment, representing the knowledge state of the person participating in it. In this case its: {Odd, Even} For event space we can just take a superset of the sample space. And as our measure function we just need to assign probabilities to the elementary events: P(Odd) = P(Even) = 1/2 Do I understand correctly that the apparent problem is in defining the probability experiment in such a way so that we could talk about Odd and Even as outcomes of it?

2quetzal_rainbow1y

The problem is "how to define P(P=NP|trillionth digit of pi is odd)".

1Ape in the coat1y

It's an interesting question, but its a different, more complex problem than simply not knowing googolth digit of pi and trying to estimate whether it's even or odd.

2quetzal_rainbow1y

The reason why logical uncertainty was brought up in the first place is decision theory, to make crisp formal expression for intuitive "I cooperate with you conditional on you cooperating with me", where "you cooperating with me" is result of analysis of probability distribution over possible algorithms which control actions of your opponent and you can't actually run these algorithms due to computational constraints, and you want to do all this reasoning in non-arbitrary ways.

1Canaletto1y

Interesting. Is there an obvious way to do that for toy examples like P(1 = 2 | 7 = 11), or something like that

keith_wynroe

Sep 17, 2024

104

I can reason as follows: There is 0.5 chance that it is Heads. Let P represent the actual, unknown, state of the outcome of the toss (Heads or Tails); and let Q represent the other state. If Q, then anything follows. For example, Q implies that I will win $1 billion. Therefore the value of this bet is at least $500,000,000, which is 0.5 * $1,000,000, and I should be willing to pay that much to take the bet.

This doesn't go through, what you have are two separate propositions "H -> (T -> [insert absurdity here]" and "T -> (H -> [insert absurdity here]" ^[1], and actually deriving a contradiction from the consequent requires proving which antecedent obtains, which you can't do since neither is a theorem.

The distinction then with logical uncertainty is supposedly that you do already have a proof of the analogue of H or T, so you can derive the consequent that either H or T derives a contradiction

^{^}
You don't really have these either, unless you can prove NOT(H AND T) i.e. can you definitively rule out a coin landing both heads and tails? But that's kinda pedantic

[-]Ape in the coat1y32

Thank you for addressing specifically the example I raised!

This doesn't go through, what you have are two separate propositions "H -> (T -> [insert absurdity here]" and "T -> (H -> [insert absurdity here]" ^[1], and actually deriving a contradiction from the consequent requires proving which antecedent obtains, which you can't do since neither is a theorem.

So what changes if H and T are theorems? Let O mean "googolth digit of pi is odd" and E mean "googolth digit of pi is even". I have two separate propositions:

O -> ( E -> Absur... (read more)

JBlack

Sep 17, 2024

Yes, both of these credences should obey the axioms of a probability space.

This sort of thing is applied in cryptography with the concept of "probable primes", which are numbers (typically with many thousands of decimal digits) that pass a number of randomized tests. The exact nature of the tests isn't particularly important, but the idea is that for every composite number, most (at least 3/4) of the numbers less than it are "witnesses" such that when you apply a particular procedure using that number, the composite number fails the test but primes have no such failures.

So the idea is that you pick many random numbers, and each pass gives you more confidence that the number is actually prime. The probability of any composite number passing (say) 50 such tests is no more than 4^-50, and for most composite numbers it is very much less than that.

No such randomized test is known for parity of the googolth digit of pi, but we also don't know that there isn't one. If there was one, it would make sense to update credence using the results of such tests using probability axioms.

Vladimir_Nesov

Sep 17, 2024

With empirical uncertainty, it's easier to abstract updating from reasoning. You can reason without restrictions, and avoid the need to update on new observations, because you are not making new observations. You can decide to make new observations at the time of your own choosing, and then again freely reason about how to update on them.

With logical uncertainty, reasoning simultaneously updates you on all kinds of logical claims that you didn't necessarily set out to observe at this time, so the two processes are hard to disentangle. It would be nice to have better conceptual tools for describing what it means to have a certain state of logical uncertainty, and how it should be updated. But that doesn't quite promise to solve the problem of reasoning always getting entangled with unintended logical updating.

cubefox

Sep 20, 2024*

In theories of Bayesianism, the axioms of probability theory are conventionally assumed to say that all logical truths have probability one, and that the probability of a disjunction of logically inconsistent statements is the sum of their probabilities. Corresponding to the second and third Kolmogorov axiom.

If one then e.g. regards the Peano axioms as certain, then all theorems of Peano arithmetic must also be certain, because those are just logical consequences. And all statements which can be disproved in Peano arithmetic then must have probability zero. So the above version of the Kolmogorov axioms is assuming we are logically omniscient. So this form of Bayesianism doesn't allow us to assign anything like 0.5 probability to the googolth digit of pi being odd: We must assign 1 if it's odd, or 0 if it's even.

I think the simple solution is to not talk about logical tautologies and contradictions when expressing the Kolmogorov axioms for a theory of subjective Bayesianism. Instead talk about what we actually know a priori, not about tautologies which we merely could know a priori (if we were logically omniscient). Then the second Kolmogorov axiom says that statements we actually know a priori have to be assigned probability 1, and disjunctions of statements actually known a priori to be mutually exclusive have to be assigned the sum of their probabilities.

Then we are allowed to assign probabilities less than 1 to statements where we don't actually know that they are tautologies, e.g. 0.5 to "the googolth digit of pi is odd" even if this happens to be, unbeknownst to us, a theorem of Peano arithmetic.

[-]Ape in the coat1y20

I think the simple solution is to not talk about logical tautologies and contradictions when expressing the Kolmogorov axioms for a theory of subjective Bayesianism. Instead talk about what we actually know a priori, not about tautologies which we merely could know a priori (if we were logically omniscient).

Yes, absolutely. When I apply probability theory it should represent my state of knowledge, not state of knowledge of some logically omniscient being. For me it seems such an obvious thing that I struggle to understand why it's still not a standard a... (read more)

2Noosphere8910mo

Basically, because it screws with update procedures, since formally speaking, only 1 answer is correct, and quetzal rainbow pointed this out: https://www.lesswrong.com/posts/H229aGt8nMFQsxJMq/what-s-the-deal-with-logical-uncertainty#yHC8EuR76FE3tnuk6

2Ape in the coat10mo

There always is only one correct answer for what outcome from the sample space is actually realised in this particular iteration of the probability experiment. This doesn't screw up our update procedure, because probability update represent changes in our knowledge state about which iteration of probability experiment could be this one, not changes in what has actually happened in any particular iteration.

2Noosphere8910mo

The point is that if you consider all iterations in parallel, you can realize all possible outcomes of the sample space, and assign a probability to each outcome occurring for a Bayesian superintelligence, while in a consistent proof system, not all possible outcomes/statements can be proved, no matter how many iterations are done, and if you could do this, you have proved the logic/theory inconsistent, which is the problem, because for logical uncertainty, there is only 1 possible outcome no matter the amount of iterations for searching for a proof/disproof of a statement (for consistent logics. If not the logic can prove everything) This is what makes logical uncertainty non-Bayesian, and is why Bayesian reasoning assumes logical omniscience, so this pathological outcome doesn't happen, but as a consequence, you have basically trivialized learning/intelligence.

2Ape in the coat10mo

Likewise if I consider every digit of pi in parallel, some of them are odd and some of them are even. And likewise I can assign probabilities based on how often an unknown to me digit of pi is even or odd. Not sure what does a superintelligence has to do here. The same applies to a coin toss. I can't prove both "This particular coin toss is Heads" and "This particular coin toss is Tails", no more than I can simultaneously prove both "This particular digit of pi is odd" and "This particular digit of pi is even" You just need to define you probability experiment more broadly, talking about not a particular digit of pi but a random one, the same way we are doing it for a toss of the coin.

1Markvy1y

Yeah, I think it’s that one

Cole Wyeth

Sep 16, 2024

I agree that there should not be a fundamental difference. Actually, I think that when an A.I. is reasoning about improving its reasoning ability some difficulties arise that are tricky to work out with probability theory, but similar to themes that have been explored in logic / recursion theory. But that only implies we haven't worked out the versions of the logical results on reflectivity for uncertain reasoning, not that logical uncertainty is in general qualitatively different than probability. In the example you gave I think it is perfectly reasonable to use probabilities, because we have the tools to do this.

See also my comment on a recent interesting post from Jessica Taylor: https://www.lesswrong.com/posts/Bi4yt7onyttmroRZb/executable-philosophy-as-a-failed-totalizing-meta-worldview?commentId=JYYqqpppE7sFfm9xs

kqr

Sep 17, 2024

If Q, then anything follows. (By the Principle of Explosion, a false statement implies anything.) For example, Q implies that I will win $1 billion.

I'm not sure even this is the case.

Maybe there's a more sophisticsted version of this argument, but at this level, we only know the implication Q=>$1M is true, not that $1M is true. If Q is false, the implication being true says nothing about $1M.

But more generally, I agree there's no meaningful difference. I'm in the de Finetti school of probability in that I think it only and always expresses our personal lack of knowledge of facts.

Jiro

Sep 16, 2024

If you make a lot of educated guesses about the googolth digit of pi based on chains of reasoning that are actually possible for humans, around 50% of them will get its parity right.

(Of course that's reversed, in a way, since it isn't stating that the digit is uncertain.)

JeffJo

Oct 10, 2024

-1-4

Contrary to what too many want to believe, probability theory does not define what "the probability" is. It only defines these (simplified) rules that the values must adhere to:

Every probability is greater than, or equal to, zero.
The probability of the union of two distinct outcomes A and B is Pr(A)+Pr(B).
The probability of the universal event (all possible outcomes) is 1.

Let A="googolth digit of pi is odd", and B="googolth digit of pi is even." These required properties only guarantee that Pr(A)+Pr(B)=1, and that each is a non-zero number. We only "intuitively" say that Pr(A)=Pr(B)=0.5 because we have no reason to state otherwise. That is, we can't assert that Pr(A)>Pr(B) or Pr(A)<Pr(B), so we can only assume that Pr(A)=Pr(B). But given a reason, we can change this.

The point is that there are no "right" or "wrong" statements in probability. Only statements where the probabilities adhere to these requirements. We can never say what a "probability is," but we can rule out some sets of probabilities that violate these rules.

[-]TAG4mo44

You have set out some criteria for being a probability statement at all. There are further criteria for being a true probability statement. It's obviously possible to check probability claims.

1JeffJo3mo

My criteria are what is required for the Laws of Probability to apply. There are no criteria for being a "true probability statement," whatever it is that you think that means. There is only one criterion - it is more of a guideline than a rule - for how to assign values to probabilities. It's called he Principle of Indifference. If you have a set of events, and no reason to think that any one of them is more, or less, likely than any of the other events in the set (this is what "indifferent" means), then you should assign the same probability to each. You don't have to, of course, but that makes for unreasonable (not "false") probabilities if you don't. This is why, in a simple coin flip, both "Heads" and "Tails" are assigned a 50% probability. Even if I tell you that I have determined experimentally, with 99% confidence, that the coin is biased. Unless I tell you which face is favored, you should assign 50% to each until you learn enough to change that (it's called Baysean Updating). I suppose that is what you would call a "false probability statement," but it is the correct thing to do. and you will not find a "true probability statement," just one that you have more confidence in. +++++ Now, in the Sleeping Beauty Problem, there are ways to test it. And they get met with what I consider to be unfounded, and unreasonable, objections because some don't like the answer. Start with four volunteers instead of one. Assign each a different combination from the set {H+Mon, H+Tue, T+Mon, T+Tue}. Using the same coin and setup as in the popular version, wake three of them on each day of the experiment. Exclude the one assigned the current day and coin result. Ask each for the probability that this is the only time she will be wakened. This was actually the question in the original version of the SB problem, but it is equivalent to "the probability of Heads" for the first two combinations in that set, and "the probability of Tails" for the last two. This problem is ide

[-]Ape in the coat1y20

Even if this was true, I don't see how it answers my question.

1JeffJo4mo

Well, it is true, and your confusion demonstrates what you don't want to understand about probability. It is not about what what can be shown, deterministically, to be true when we have perfect knowledge of a system. It is about the different outcomes that could be true when your knowledge is imperfect. Consider your "opaque box" problem. You are not "assigning non-zero probability to a false statement," you are assigning it to a possibility that could be true based on the incomplete knowledge you have. This is what probability is supposed to be - a measure of your lack of knowledge about a result, not a measure of absolute truth. Even if there are others who possess more knowledge. So yes, it does provide an answer to the question you asked. That question was just not what you thought you were asking. Like "If a plane traveling from New York to Montreal, carrying 50 Americans and 50 Canadians, crashes exactly on the border, where do we bury the survivors?" Answer: you don't bury the survivors.

4Ape in the coat4mo

Were it true, then the correct answer to "What is a probability of a fair coin toss, about which you know nothing else, to come Heads?", would be: "Any real number between zero and one". And the same would go for any other probability theory problem, turning the whole domain pretty useless. I have no idea how you managed to come to the conclusion that I don't understand that probability is about lack of knowledge. On the other hand, this distinction: Makes little sense. Things that we consider to be possible while our knowledge is imperfect can very well be shown to be either true or false, when our knowledge is perfect. That's kind of the point of the third axiom. After all the knowledge is acquired and all the conditionalizations are done, the resulting sample space consists of a single outcome and its probability is one, which, if you want to be dramatic about, you can call "absolute truth". Anyway, back to the actual question. And how is it different from the initial example with googolth digit of pi? Why can't we say that we are not assigning non-zero probability to a false statement but to a possibility that could be true based on the incomplete knowledge that we have?

-1JeffJo3mo

Yes, you can make a valid probability space where the probability of Heads is any such number. What you can't do, is say that probability space accurately represents the system. Probability Theory only determines the requirements for such a space. It makes no mention of how you should satisfy them. And I assumed you would know the difference, and not try to misrepresent it. Because of responses like that last one, and this: Okay, I misspoke. Because I also assumed you would at least try to understand. You need complete knowledge of an assumed (Baysean) or actual (Frequentist) system that can produce different results. But lack at least some knowledge of the one instance of the system in question; that is, one particular result. The system in the SB problem is that it is a two-day experiment, where the individual (single) days are observed separately. There are four combinations, that depend on the separation, and the four are equally likely to apply to any single day in the system. An "instance" is not one experiment, it one day's observation by SB. Because it is independent of any other instance, which includes a different day in the same experiment, due to the amnesia. The partial information that she gains is that it is not Tuesday, after Heads. What I keep trying to convince you of, is that that combination is part of the system and has a probability in the probability space. Because the system is "a digit in {0,1,2,3,4,5,6,7,8,9} is selected in a manner that is completely hidden from us at the moment, even if it could be obtained. The probability refers to what the result of calculating the nth digit could be when we do not apply n to the formula for pi, not when we do. If what you are suggesting were true, then the correct answer to "What is a probability of a fair coin toss, about which you know nothing else, to come Heads?", would be: "It is either 0 or 1, but we can't tell. And reason I "[concluded] that [you] don't understand that probability is abo

2Ape in the coat3mo

What kind of misrepresentation you are talking about? You have literally just claimed that there is no way to know whether a particular probabilistic model correctly represents the system. Frankly, I have no idea what is going on with your views. There must be some kind of misunderstanding but to clear it up I've came up with the clearest demonstration of the absurdity of your position and you decided to bite the bullet anyway. Do you really not see that it completely invalidates all epistemology? That if we can only say that a map is valid but never able to claim anything about its soundness towards a particular territory, then the map is useless at navigation? So you came to a wrong conclusion about my views because you misspoke? And you misspoke because you assumed that I will be trying to understand? Okay, look, I think we need to calm down a bit, make a deep breath and one step back. This conversation seems entangled in combatitiveness and it affects your ability to explain yourself clearly and my ability to understand what you are talking about and probably vice versa. I'm happy to assume that you mean something reasonable by the things you are saying and it's just the issue of miscommunication, if you are ready to give the same courtesy to me. I would be glad to understand your position. I promise to give my best effort to do that. But you also need to make your best effort in explaining yourself. I will probably keep misunderstanding you in all kind of ways, anyway, not because of ill intent, but simply because of inferential distances and some unresolved cruxes of disagreement to which we can't get until we understand each other. Okay I definitely agree with "lack at least some knowledge" part. If we had a perfect map, then there would be no point in approximating the territory with probabilistic models. And that's exactly my point. In both logical and empirical uncertainty we do lack some knowledge. Therefore we use the approximate probability theor

1JeffJo3mo

Since what I said was that probability theory makes no restrictions on how to assign values, but that you have to make assignments that are reasonable based on things like the Principle of Indifference, this would be an example of misrepresentation. You claim that "the probability that the nth digit of pi, in base 10, is 1/10 for each possible digit," is assigning a non-zero probability to nine false statements, and probability that is less than 1 to one true statement. I am saying that it such probabilities apply only if we do not apply the formula that will determine that digit. I claim that the equivalent statement, when I flip a coin but place my hand over it before you see the result, is "the probability that the coin landed Heads is 1/2, as is the probability that it landed Tails." And that the truth of these two statements is just as deterministic, if I lift my hand. Or if I looked before I hid the coin. Or if someone has a x-ray that can see it. That the probability in question is about the uncertainty when no method is applied that determine the truth of the statements, even when we know such methods exist. I'm saying that this is not a question in epistemology, not that epistemology is invalid. And the reason the SB problem is pertinent, is because it does not matter if H+Tue can be observed. It is one of the outcomes that is possible.

1Ape in the coat3mo

That's not what you've said at all. Re-read our comment chain. There is not a single mention of principle of indifference by you there. I've specifically brought up an example of a fair coin about which we know nothing else and yet you agreed that any number from 0 to 1 is the right answer to this problem. The first time you've mentioned principle of indifference under this post is in an answer to another person, which happened after my alleged misrepresentation of yours. Now, I'm happy to assume that you've meant this all the time and just was unable to communicate your views properly. Please do better next time. But for now let me try to outline your views as far as I understand them, which do appear less ridiculous now: You believe that probability theory only deals with whether something is a probability space at all - does it fit the axioms or not. And the question which valid probability space is applicable to a given problem based on some knowledge level, doesn't have a "true" answer. Instead there is a completely separate category of "reasonableness" and some probability spaces are reasonable to a given problem based on the principle of uncertainty, but this is a completely separate domain from probability theory. Please correct me, what I got wrong and then I hope we will be able to move forward. No, I don't. I'm ready to entertain this framework, but then I don't see any difference compared to an example with empirical uncertainty. Do you see this difference? Do you think there is some fundamental reason why we can say that probability of a coin to be Heads is 1/2 and yet we can't say the same about the probability of a particular unknown to us digit of pi to be even? You mean, before we apply the formula, right? Suppose we are going to apply the formula in a moment, we can still say that before we applied it the probability for any digit is 1/10, can we? And after we applied the formula we know which digit it is so it's 1 for this particular digit an

1 comment, sorted by

top scoring

Click to highlight new comments since: Today at 6:55 AM

[-]Ape in the coat6mo20

In this post I've described a unified framework that allows to reason about any type of uncertainty be it logical or empirical. I would appreciate engagement from people who think that logical uncertainty is still unsolved.

Moderation Log

LESSWRONG
LW

LESSWRONG
LW

41

[ Question ]

What's the Deal with Logical Uncertainty?

41

41

9 Answers sorted by
top scoring

Sep 16, 2024

Sep 17, 2024

Sep 17, 2024

Sep 17, 2024

Sep 20, 2024*

Sep 16, 2024

Sep 17, 2024

Sep 16, 2024

Oct 10, 2024

41

[ Question ]

What's the Deal with Logical Uncertainty?

41

41

9 Answers sorted by top scoring

Sep 16, 2024

Sep 17, 2024

Sep 17, 2024

Sep 17, 2024

Sep 20, 2024*

Sep 16, 2024

Sep 17, 2024

Sep 16, 2024

Oct 10, 2024

9 Answers sorted by
top scoring