I'm what David Chalmers would call a "Type-A materialist" which means that I deny the existence of "subjective facts" which aren't in some way reducible to objective facts.

The concerns Chalmers wrote about focused on the nature of phenomenal experience, and the traditional dichotomy between subjective and objective in human experience. That distinction draws a dividing line way off to the side of what I'm interested in. My main concern isn't with ineffable consciousness, it's with cognitive processing of information, information def...

The Bayesian calculation only needs to use the event "Tuesday exists"

I can't follow this. If "Tuesday exists" isn't indexical, then it's exactly as true on Monday as it is on Tuesday, and furthermore as true everywhere and for everyone as it is for anyone.

there doesn't seem to be any non-arbitrary way of deriving a distribution over centered worlds from a distribution over uncentered ones.

Indeed, unless you work within the confines of a finite toy model. But why go in that direction? What non-arbitrary reason is there not to ...

213y

Well, in my toy model of the Doomsday Argument, there's only a 1/2 chance that Tuesday exists, and the only way that a person can know that Tuesday exists is to be alive on Tuesday. Do you still think there's a problem?
Even in toy models like Sleeping Beauty we have to somehow choose between SSA and SIA (which are precisely two rival methods for deriving centered from uncentered distributions.)
That's a very good, philosophically deep question! Like many lesswrongers, I'm what David Chalmers would call a "Type-A materialist" which means that I deny the existence of "subjective facts" which aren't in some way reducible to objective facts.
Therefore, I think that centered worlds can be regarded in one of two ways: (i) as nonsense, or (ii) as just a peculiar kind of uncentered world. A "centered world" really just means an "uncentered world" that happens to contain an ontologically basic, causally inert "pointer" towards some being and an ontologically basic, causally inert catalogue of its "mental facts". However, because a "center" is causally inert, we can never acquire any evidence that the world has a "center".
(I'd like to say more but really this needs a lot more thought and I can see I'm already starting to ramble...)
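To make the SSA/SIA contrast mentioned above concrete, here is a minimal sketch of how each rule turns an uncentered prior into a centered one in the Sleeping Beauty toy model. The function names `ssa`, `sia` and `p_world` are made up for illustration, and the two rules are written in their simplest textbook form (no reference-class subtleties), so treat this as a sketch rather than a definitive statement of either assumption.

```python
from fractions import Fraction

# Uncentered prior over worlds, plus the number of awakenings (centers) in each.
# These are the usual Sleeping Beauty numbers: heads -> 1 awakening, tails -> 2.
worlds = {"heads": (Fraction(1, 2), 1), "tails": (Fraction(1, 2), 2)}

def ssa(worlds):
    """SSA-style rule: keep each world's prior and split it evenly among its centers."""
    return {
        (name, i): prior / n
        for name, (prior, n) in worlds.items()
        for i in range(n)
    }

def sia(worlds):
    """SIA-style rule: give every center its world's prior, then renormalise."""
    total = sum(prior * n for prior, n in worlds.values())
    return {
        (name, i): prior / total
        for name, (prior, n) in worlds.items()
        for i in range(n)
    }

def p_world(centered, name):
    """Total centered probability that falls inside the named world."""
    return sum(p for (w, _), p in centered.items() if w == name)

ssa_heads = p_world(ssa(worlds), "heads")
sia_heads = p_world(sia(worlds), "heads")
print(ssa_heads, sia_heads)  # 1/2 1/3
```

Both functions start from the same uncentered distribution. The only difference is where the normalisation happens, which is the arbitrariness the comment is pointing at.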

I suppose I'm being obtuse about this, but please help me find my way through this argument.

- The event "it is Monday today" is indexical. However, an "indexical event" isn't strictly speaking an event. (Because an event picks out a set of possible worlds, whereas an indexical event picks out a set of possible "centered worlds".) Since it isn't an event, it makes no sense to treat it as 'data' in a Bayesian calculation.

Isn't this argument confounded by the observation that an indexical event "It is Tuesday today"...

213y

The Bayesian calculation only needs to use the event "Tuesday exists" which is non-indexical (though you're right - it is entailed by "today is Tuesday").
The problem with indexical events is that our prior is a distribution over possible worlds, and there doesn't seem to be any non-arbitrary way of deriving a distribution over centered worlds from a distribution over uncentered ones. (E.g. Are all people equally likely regardless of lifespan, brain power, state of wakefulness etc.? What if people are copied and the copies diverge from one another? Where does the first 'observer' appear in the tree of life? etc.)

On further reflection, both Ancestor and each Descendant can consider the proposition P(X) = "X is a descendant & X is a lottery winner". Given the setup, Ancestor can quantify over X, and assign probability 1/N to each instance. That's how the statement {"I" will win the lottery with probability 1} is to be read, in conjunction with a particular analysis of personal identity that warrants it. This would be the same proposition each descendant considers, and also assigns probability 1/N to. On this way of looking at it, both Ance...

There need be no information transferred.

I didn't quite follow this. From where to where?

But anyway, yes, that's correct that the referents of the two claims aren't the same. This could stand some further clarification as to why. In fact, Descendant's claim makes a direct reference to the individual who uttered it at the moment it's uttered, but Ancestor's claim is not about himself in the same way. As you say, he's attempting to refer to *all* of his descendants, and on that basis claim identity with whichever particular one of them happens to win th...

013y

I meant from anywhere to the descendant. Perhaps that wasn't the best wording.

113y

On further reflection, both Ancestor and each Descendant can consider the proposition P(X) = "X is a descendant & X is a lottery winner". Given the setup, Ancestor can quantify over X, and assign probability 1/N to each instance. That's how the statement {"I" will win the lottery with probability 1} is to be read, in conjunction with a particular analysis of personal identity that warrants it. This would be the same proposition each descendant considers, and also assigns probability 1/N to. On this way of looking at it, both Ancestor and each descendant are in the same epistemic state, with respect to the question of who will win the lottery.
Ok, so far so good. This same way of looking at things, and the prediction about probability of descendants, is a way of looking at the Sleeping Beauty problem I tried to explain some months ago, and from what I can see is an argument for why Beauty is able to assert on Sunday evening what the credence of her future selves should be upon awakening (which is different from her own credence on Sunday evening), and therefore has no reason to change it when she later awakens on various occasions. It didn't seem to get much traction then, probably because it was also mixed in with arguments about expected frequencies.

I don't think personal identity is a mathematical equivalence relation. Specifically, it's not symmetric: "I'm the same person you met yesterday" actually needs to read "I was the same person you met yesterday"; "I will be the same person tomorrow" is a prediction that may fail (even assuming I survive that long). This yields failures of transitivity: "Y is the same person as X" and "Z is the same person as X" doesn't get you "Y is the same person as Z".

...Given that you know there will be a fut

013y

With respect to the descendant "changing their mind" on the probability of winning the lottery: when the descendant says "I will win the lottery" perhaps that is a different statement to when the ancestor says "I will win the lottery". For the ancestor, "I" includes all the ancestor's descendants. For descendant X, "I" refers to only X (and their descendants, if any). Hence the sense that there is an update occurring is an illusion; the quotation is the same, the referent is not. There need be no information transferred.

013y

Yes, this is the article that covers several attempts to depragmatize the arguments. I highly recommend it. Unfortunately it doesn't explain his own approach in any detail.
Edit: This contains a summary and a claimed refutation of Maher's theory, but it isn't complete enough to let me understand what Maher says.

Did I accuse someone of being incoherent? I didn't mean to do that, I only meant to accuse myself of not being able to follow the distinction between a rule of logic (oh, take the Rule of Detachment for instance) and a syntactic elimination rule. In virtue of what do the latter escape the quantum of sceptical doubt that we should apply to other tautologies? I think there clearly is a distinction between believing a rule of logic is reliable for a particular domain, and knowing with the same confidence that a particular instance of its application has bee...

Ah, thanks for the pointer. Someone's tried to answer the question about the reliability of Bayes' Theorem itself too, I see. But I'm afraid I'm going to have to pass on this, because I don't see how calling something a syntactic elimination rule instead of a law of logic saves you from incoherence.

013y

I'd be interested to hear your thoughts on why you believe EY is incoherent? I thought that what EY said makes sense. Is the probability of a tautology being true 1? You might think that it is true by definition, but what if the concept is not even wrong, can you absolutely rule out that possibility? Your sense of truth by definition might be mistaken in the same way as the experience of a Déjà vu. The experience is real, but you're mistaken about its subject matter. In other words, you might be mistaken about your internal coherence and therefore assign a probability to something that was never there in the first place. This might be on-topic:
Nothing has a probability of 1, including this sentence, as doubt always remains, or does it? It's confusing for sure, someone with enough intellectual horsepower should write a post on it.

Probabilities of 1 and 0 are considered rule violations and discarded.

What should we take for P(X|X) then?

And then what can I put you down for the probability that Bayes' Theorem is actually false? (I mean the theorem itself, not any particular deployment of it in an argument.)

013y

Although I'm not sure exactly what to say about it, there's some kind of connection here to Created Already in Motion and The Bedrock of Fairness - in each case you have an infinite regress of asking for a logical axiom justifying the acceptance of a logical axiom justifying the acceptance of a logical axiom, asking for fair treatment of people's ideas of fair treatment of people's ideas of fair treatment, or asking for the probability that a probability of a ratio of probabilities being correct is correct.

13y19

What should we take for P(X|X) then?

The one that I confess is giving me the most trouble is P(A|A). But I would prefer to call that a syntactic elimination rule for probabilistic reasoning, or perhaps a set equality between events, rather than claiming that there's some specific proposition that has "Probability 1".

and then

...Huh, I must be slowed down because it's late at night... P(A|A) is the simplest case of all. P(x|y) is defined as P(x,y)/P(y). P(A|A) is defined as P(A,A)/P(A) = P(A)/P(A) = 1. The ratio of these tw

We're getting ahead of the reading, but there's a key distinction between the plausibility of a single proposition (i.e. a probability) and the plausibilities of a whole family of related plausibilities (i.e. a probability distribution).

Ok, that sounds helpful. But then my question is this-- if we have a whole family of mutually exclusive propositions, with varying real numbers for plausibilities, about the plausibility of one particular proposition, then the assumption that that one proposition can have one specific real number as its plausibility is ca...

014y

If you have a mutually exclusive and exhaustive set of propositions $A_i$, each of which specifies a plausibility $P(B|A_i)$ for the one proposition B you're interested in, then your total plausibility is $P(B)=\sum_i P(B|A_i)P(A_i)$. (Actually this is true whether or not the A's say anything about B. But if they do, then this can be a useful way to think about P(B).)
I haven't said how to assign plausibilities to the A's (quick, what's the plausibility that an unspecified urn contains one white and three cyan balls?), but this at least should describe how it fits together once you've answered those subproblems.
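As a small worked instance of the sum over the A's, here is a sketch with a made-up urn. The four-ball urn and the uniform prior over its contents are illustrative assumptions only; as the comment says, assigning plausibilities to the A's is the part left unanswered.

```python
from fractions import Fraction

# Hypothetical hypotheses A_i: the urn holds 4 balls, exactly i of them white.
# The uniform prior P(A_i) = 1/5 is an assumption made only for illustration.
prior = {i: Fraction(1, 5) for i in range(5)}        # P(A_i)
likelihood = {i: Fraction(i, 4) for i in range(5)}   # P(B | A_i), B = "the drawn ball is white"

# The A_i must be mutually exclusive and exhaustive, so their priors sum to 1.
assert sum(prior.values()) == 1

# Total plausibility of B: sum over i of P(B | A_i) * P(A_i).
p_b = sum(likelihood[i] * prior[i] for i in prior)
print(p_b)  # 1/2
```

Changing `prior` changes `p_b`, but the way the pieces fit together stays the same.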

Perhaps it would be wiser to use complex numbers for instance.

Perhaps it might be wiser to use measures (distributions), or measures on spaces of measures, or iterate that construction indefinitely. (The concept of hyperpriors seems to go in this direction, for example.)

...But intuitively it seems very likely that if you tell me two different propositions, that I can say either that one is more likely than the other, or that they are the same. Are there any special cases where one has to answer "the probabilities are uncomparable" that makes yo

014y

Very interesting! But I have to read up on the Appendix A4 I think to fully appreciate it...I will come back if I change my mind after it! :-)
My own, current, thoughts are like this: I would bet on the ball being white up to some ratio...if my bet was $1 and I could win $100 I would do it for instance. The probability is simply the border case where ratio between losing and winning is such that I might as well bet or not do it. Betting $50 I would certainly not do. So I would estimate the probability to be somewhere between 1 and 50%...and somewhere there is one and only one border case in between, but my human brain has difficulty thinking in such terms...
The same thing goes for the coin-flip, there is some ratio where it is rational to bet or not to.
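The "border case" idea can be sketched numerically. This assumes the simplest reading of the bets above: you pay a stake, and if you win you receive the stated payout (the stake is not returned separately). The function name `break_even_probability` is hypothetical.

```python
from fractions import Fraction

def break_even_probability(stake, payout):
    """Probability p at which paying `stake` for a chance at `payout` has zero
    expected value, i.e. p * payout - stake == 0."""
    return Fraction(stake, payout)

# Willing to pay $1 for a chance at $100: probability judged to be above this.
lower = break_even_probability(1, 100)
# Unwilling to pay $50 for a chance at $100: probability judged to be below this.
upper = break_even_probability(50, 100)
print(lower, upper)  # 1/100 1/2
```

Under that reading the two bets bracket the probability between 1% and 50%, matching the estimate in the comment. The single border case would be the stake at which accepting and refusing look equally good.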

114y

We're getting ahead of the reading, but there's a key distinction between the plausibility of a single proposition (i.e. a probability) and the plausibilities of a whole family of related plausibilities (i.e. a probability distribution).
Our state of knowledge about the coin is such that if we assessed probabilities for the class of propositions, "This coin has a bias X", where X ranged from 0 (always heads) to 1 (always tails) we would find our prior distribution a sharp spike centered on 1/2. That, technically, is what we mean by "confidence", and formally we will be using things like the variance of the distribution.
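One way to picture "a sharp spike centered on 1/2" is to compare two distributions over the bias X that share the same mean but differ in variance. Using the Beta family here is my own illustrative choice, not something the quoted text specifies, and the parameters are arbitrary.

```python
from fractions import Fraction

def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution over the bias X in [0, 1]."""
    mean = Fraction(a, a + b)
    var = Fraction(a * b, (a + b) ** 2 * (a + b + 1))
    return mean, var

# A flat distribution over the bias (know nothing about the coin) ...
vague = beta_mean_var(1, 1)
# ... versus a sharp spike at 1/2 (an ordinary coin we are confident about).
sharp = beta_mean_var(500, 500)

print(vague)  # (Fraction(1, 2), Fraction(1, 12))
print(sharp)  # (Fraction(1, 2), Fraction(1, 4004))
```

Both assign probability 1/2 to the next toss coming up either way. The difference in "confidence" shows up only in the variance of the distribution over X.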

- Can you think of further desiderata for plausible inference, or find issues with the one Jaynes lays out?

I find desideratum 1) to be poorly motivated, and a bit problematic. This is urged upon us in Chapter 1 mainly by considerations of convenience: a reasoning robot can't calculate without numbers. But just because a calculator can't calculate without numbers doesn't seem a sufficient justification to assume those numbers exist, i.e., that a full and coherent mapping from statements to plausibilities exists. This doesn't seem the kind of thing we ...

014y

It is indeed an extremely interesting question! Perhaps it would be wiser to use complex numbers for instance.
But intuitively it seems very likely that if you tell me two different propositions, that I can say either that one is more likely than the other, or that they are the same. Are there any special cases where one has to answer "the probabilities are uncomparable" that makes you doubt that it is so?

-2[anonymous]14y


Perhaps this is beating a dead horse, but here goes. Regarding your two variants:

...1 Same as SSB except if heads, she is interviewed on Monday, and then the coin is turned over to tails and she is interviewed on Tuesday. There is amnesia and all of that. So, it's either the sequence (heads on Monday, tails on Tuesday) or (tails on Monday, tails on Tuesday). Each sequence has a 50% probability, and she should think of the days within a sequence as being equally likely. She's asked about the current state of the coin. She should answer P(H)=1/4.

Thanks for your response. I should have been clearer in my terminology. By "Iterated Sleeping Beauty" (ISB) I meant to name the variant that we here have been discussing for some time, that repeats the Standard Sleeping Beauty problem some number, say 1000, of times. In 1000 coin tosses over 1000 weeks, the number of Heads awakenings is 1000 and the number of Tails awakenings is 2000. I have no catchy name for the variant I proposed, but I can make up an ugly one if nothing better comes to mind; it could be called Iterated Condensed Sleeping Bea...

Yet one more variant. On my view it's structurally and hence statistically equivalent to *Iterated Sleeping Beauty*, and I present an argument that it is. This one has the advantage that it does not rely on any science fictional technology. I'm interested to see if anyone can find good reasons why it's *not* equivalent.

The *Iterated Sleeping Beauty problem (ISB)* is the original *Standard Sleeping Beauty (SSB)* problem repeated a large number N of times. People always seem to want to do this anyway with all the variations, to use the Law of Large Numbers to gai...

014y

Sorry I was slow to respond .. busy with other things
My answers:
Q1: I agree with you: 1/3, 1/3, 2/3
Q2. ISB is similar to SSB as follows: fair coin; woken up twice if tails, once if heads; epistemic state reset each day
Q3. ISB is different from SSB as follows: more than one coin toss; same number of interviews regardless of result of coin toss
Q4. It makes a big difference. She has different information to condition on. On a given coin flip, the probability of heads is 1/2. But, if it is tails we skip a day before flipping again. Once she has been woken up a large number of times, Beauty can easily calculate how likely it is that heads was the most recent result of a coin flip. In SSB, she cannot use the same reasoning. In SSB, Tuesday&heads doesn't exist, for example.
Consider 3 variations of SSB:
1. Same as SSB except if heads, she is interviewed on Monday, and then the coin is turned over to tails and she is interviewed on Tuesday. There is amnesia and all of that. So, it's either the sequence (heads on Monday, tails on Tuesday) or (tails on Monday, tails on Tuesday). Each sequence has a 50% probability, and she should think of the days within a sequence as being equally likely. She's asked about the current state of the coin. She should answer P(H)=1/4.
2. Same as SSB except if heads, she is interviewed on Monday, and then the coin is flipped again and she is interviewed on Tuesday. There is amnesia and all of that. So, it's either the sequence (heads on Monday, tails on Tuesday), (heads on Monday, heads on Tuesday) or (tails on Monday, tails on Tuesday). The first 2 sequences have a 25% chance each and the last one has a 50% chance. When asked about the current state of the coin, she should say P(H)=3/8.
The 1/2 solution to SSB results from similar reasoning. 50% chance for the sequence (Monday and heads). 50% chance for the sequence (Monday and tails, Tuesday and tails). P(H)=1/2
If you apply this kind of reasoning to ISB, where we are thinking o
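The sequence-based arithmetic in the variants above can be checked mechanically. This is a sketch of that particular style of reasoning only (pick a sequence by its probability, then treat the interviews within it as equally likely). The helper name `p_heads_now` is made up.

```python
from fractions import Fraction

def p_heads_now(sequences):
    """sequences: list of (probability, [coin state shown at each interview]).
    Weights each sequence by its probability and treats the interviews within
    a sequence as equally likely, as in the comment's reasoning."""
    total = Fraction(0)
    for prob, states in sequences:
        share = Fraction(1, len(states))          # each interview in the sequence
        heads_interviews = sum(1 for s in states if s == "H")
        total += prob * share * heads_interviews
    return total

half, quarter = Fraction(1, 2), Fraction(1, 4)

variant_1 = [(half, ["H", "T"]), (half, ["T", "T"])]
variant_2 = [(quarter, ["H", "T"]), (quarter, ["H", "H"]), (half, ["T", "T"])]
ssb_halfer = [(half, ["H"]), (half, ["T", "T"])]

print(p_heads_now(variant_1))   # 1/4
print(p_heads_now(variant_2))   # 3/8
print(p_heads_now(ssb_halfer))  # 1/2
```

The code only reproduces the stated numbers. Whether weighting by sequence first is the right rule is exactly what the thread disputes.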

Two ways to iterate the experiment:

- Replicate the entire experiment 1000 times. That is, there will be 1000 independent tosses of the coin. This will lead to between 1000 and 2000 awakenings, with an expected value of 1500 awakenings.

and

...

- Replicate her awakening-state 1000 times. Because her epistemic state is always the same on an awakening, from her perspective, it could be Monday or Tuesday, it could be heads or tails.
The distinction between 1 and 2 is that, in 2, we are trying to repeatedly sample from the joint probability distributions that she s

This sounds like the continuity argument, but I'm not quite clear on how the embedding is supposed to work, can you clarify? Instead of telling me what the experimenter rightly or wrongly believes to be the case, spell out for me how he behaves.

If the coin comes up Heads, there is a tiny but non-zero chance that the experimenter mixes up Monday and Tuesday.

What does this mean operationally? Is there a nonzero chance, let's call it epsilon or e, that the experimenter will incorrectly behave as if it's Tuesday when it's Monday? I.e., with probability ...

014y

On reflection, my verbal description doesn't match the reply I wanted to give, which was: the experimenter behaves such that the probability mass is allocated as in the spreadsheet.
Make it "on any day when Beauty is scheduled to remain asleep, the experimenter has some probability of mistakenly waking her, and vice-versa".

Your argument is, I take it, that these counts of observations are irrelevant, or at best biased.

No, I was just saying that this, lim N-> infinity n1/(n1+n2+n3), is not actually a probability in the sleeping beauty case.

I maintain that it is. I can guarantee you that it is. What obstacle do you see to accepting that? You've made noises that this is because the counts are correlated, but I haven't seen any argument for this beyond bare assertion. Do you want to claim it is impossible for some reason, or are you just saying you haven't s...

014y

At this point, it is just assertion that it's not a probability. I have reasons for believing it's not one, at least, not the probability that people think it is. I've explained some of that reasoning.
I think it's reasonable to look at a large sample ratio of counts (or ratio of expected counts). The best way to do that, in my opinion, is with independent replications of awakenings (that reflect all possibilities at an awakening). I probably haven't worded this well, but consider the following two approaches. For simplicity, let's say we wanted to do this (I'm being vague here) 1000 times.
1. Replicate the entire experiment 1000 times. That is, there will be 1000 independent tosses of the coin. This will lead to between 1000 and 2000 awakenings, with an expected value of 1500 awakenings. But... whatever the total number of awakenings is, they are not independent. For example, on the first awakening it could be either heads or tails. On the second awakening, it only could be heads if it was heads on the first awakening. So, Beauty's options on awakening #2 are (possibly) different than her options on awakening #1. We do not have 2 replicates of the same situation. This approach will give you the correct ratio of counts in the long run (for example, we do expect the # of heads & Monday to equal the # of tails and Monday and the # of tails and Tuesday).
2. Replicate her awakening-state 1000 times. Because her epistemic state is always the same on an awakening, from her perspective, it could be Monday or Tuesday, it could be heads or tails. She knows that it was a fair coin. She knows that if she's awake it's definitely Monday if heads, and could be either Monday or Tuesday if tails. She knows that 50% of coin tosses would end up heads, so we assign 0.5 to Monday&heads. She knows that 50% of coin tosses would end up tails, so we assign 0.5 to tails, which implies 0.25 to tails&Monday and 0.25 to tails&Tuesday. If we generate observations from this 1000 times, we'll ge
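The two replication schemes can be put side by side in a short simulation. This is only a sketch of the two procedures as described: the function names are made up, the seed and sample size are arbitrary, and the weights in the second function are the 0.5 / 0.25 / 0.25 assignment from approach 2 taken as given.

```python
import random

def replicate_experiment(n_tosses, rng):
    """Approach 1: repeat the whole experiment and count every awakening."""
    counts = {"H&Mon": 0, "T&Mon": 0, "T&Tue": 0}
    for _ in range(n_tosses):
        if rng.random() < 0.5:       # heads: a single Monday awakening
            counts["H&Mon"] += 1
        else:                        # tails: Monday and Tuesday awakenings together
            counts["T&Mon"] += 1
            counts["T&Tue"] += 1
    return counts

def replicate_awakening_state(n_draws, rng):
    """Approach 2: draw independent awakenings from the assumed joint distribution."""
    states = ["H&Mon", "T&Mon", "T&Tue"]
    draws = rng.choices(states, weights=[0.5, 0.25, 0.25], k=n_draws)
    return {s: draws.count(s) for s in states}

rng = random.Random(0)               # fixed seed so the run is repeatable
n = 100_000
c1 = replicate_experiment(n, rng)
c2 = replicate_awakening_state(n, rng)

ratio_1 = c1["H&Mon"] / sum(c1.values())   # n1 / (n1 + n2 + n3) under approach 1
ratio_2 = c2["H&Mon"] / sum(c2.values())   # same ratio under approach 2
print(c1, ratio_1)
print(c2, ratio_2)
```

In approach 1 the two tails counts are always equal by construction, which is the dependence between awakenings described above, and the heads ratio settles near 1/3. In approach 2 the draws are independent and the ratio settles near 1/2, because that is what the input weights say. The simulation shows how the two procedures differ; it does not decide which one deserves to be called Beauty's credence.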

The 1/3 solution makes the assumption that the probability of heads given an awakening is:

lim N-> infinity n1/(n1+n2+n3)

I'd quibble about calling it an assumption. The 1/3 solution notes that this is the ratio of observations upon awakening of heads to the total number of observations, which is one of the problematic facts about the experimental setup. The 1/3 solution assumes that this is relevant to what we should mean by "credence", and makes an argument that this is a justification for the claim that Sleeping Beauty's credence should ...

014y

Thanks for your detailed response. I'll make a few comments now, and address more of it later (short on time).
No, I was just saying that this, lim N-> infinity n1/(n1+n2+n3), is not actually a probability in the sleeping beauty case.
No, I wouldn't say that. My argument is that you should use probability laws to get the answer. If you take ratios of expected counts, well, you have to show that what you get is actually a probability.
I definitely disagree with your bullet points about what halfers think
I said: "Just like Beauty waking up on Monday is the same as Beauty waking up on Tuesday. There is no justification for treating them as separate variables."
You disagreed, and said:
Hm, I think that is what I'm saying. She does have to treat them all identically. They are the same variable. That's why she has to say the same thing on Monday and Tuesday. That's why an awakening contains no new info. If she had new evidence at an awakening, she'd give different answers under heads and tails.

14y0

I don't follow your latest argument against thirders. You claim that the denominator

#(heads & monday) + #(tails & monday) + #(tails & tuesday)

counts events that are not mutually exclusive. I don't see this. They look mutually exclusive to me-- heads is exclusive of tails, and monday is exclusive of tuesday. Could you elaborate this argument? Where does exclusivity fail? Are you saying tails&monday is not distinct from tails&tuesday, or all three overlap, or something else?

You also assert that the denominator is not determined by...

I think the temptation is very strong to notice the distinction between the elemental nature of raw sensory inputs and the cognitive significance they are the bearers of. And this is so, and is useful to do, precisely to the extent that the cognitive significance will vary depending on context and background knowledge, such as light levels, perspective, etc. because those serve as dynamically updated calibrations of cognitive significance. But these calibrations become transparent with use, so that we see, hear and feel vividly and directly in three dime...