

Answer by shminux

First, your non-standard use of the term "counterfactual" is jarring, though, as I understand, it is somewhat normalized in your circles. "Counterfactual", unlike "factual", means something that could have happened, given your limited knowledge of the world, but did not. What you probably mean is "completely unexpected", "surprising" or something similar. I suspect you have gotten this feedback before.

Sticking with physics: Galilean relativity went completely against the Aristotelian grain. More recently, the singularity theorems of Penrose and Hawking unexpectedly showed that black holes are not just a mathematical artifact but a generic feature of the world. A whole slew of discoveries in quantum mechanics, experimental and theoretical, went against the grain. Probably the simplest, and yet the hardest to conceptualize, was Bell's theorem.

Not my field, but in economics, Adam Smith's discovery of what Scott Alexander later named Moloch was a complete surprise, as I understand it. 


Let's say I start my analysis with the model that the predictor is guessing, and my model attaches some prior probability to them guessing right in a single case. I might also have a prior about the likelihood of being lied to about the predictor's success rate, etc. Now I make the observation that I am being told the predictor was right every single time in a row. Based on this incoming data, I can easily update my beliefs about what happened in the previous prediction exercises: I will conclude that (with some credence) the predictor guessed right in each individual case, or that (also with some credence) I am being lied to about their prediction success. This is all very simple Bayesian updating, no problem at all.
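To make that concrete, here is a minimal sketch of the update. The hypothesis set, priors, and per-trial likelihoods are all my own illustrative assumptions, not part of the original setup:

```python
# Bayesian update over three hypotheses about the predictor,
# given n reported consecutive correct predictions.
# All numbers below are illustrative assumptions.

# P(a single reported success | hypothesis)
likelihood = {
    "lucky guesser":  0.5,   # coin-flip accuracy
    "liar":           1.0,   # reports are fabricated, so always "correct"
    "real predictor": 0.99,  # near-perfect foresight
}

# Prior credences, encoding "foretelling is nearly physically impossible"
prior = {
    "lucky guesser":  0.60,
    "liar":           0.35,
    "real predictor": 0.05,
}

def posterior(n_successes):
    """Posterior over hypotheses after n reported correct predictions."""
    unnorm = {h: prior[h] * likelihood[h] ** n_successes for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

print(posterior(20))
```

With these made-up numbers, twenty reported successes push almost all the credence onto "liar", a little onto "real predictor", and essentially none onto "lucky guesser" — exactly the "lie/luck vs. genuine prediction" trade-off described above.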

Right! If I understand your point correctly, given a strong enough prior for the predictor being lucky or deceptive, it would take a lot of evidence to change one's mind, and the evidence would have to be varied. This condition is certainly not satisfied by the original setup. If your extremely confident prior is that foretelling one's actions is physically impossible, then the lie/luck hypothesis will remain much more likely than changing your mind about physical impossibility. That makes perfect sense to me.

I guess one would want to simplify the original setup a bit. What if you had full confidence that the predictor is not a trickster? Would you one-box or two-box? To get the physical impossibility out of the way, they do not necessarily have to predict every atom in your body and mind, just observe you (and read your LW posts, maybe) to draw, Sherlock-like, a very accurate conclusion about what you would decide.

Another question: what kind of experiment, in addition to what is in the setup, would change your mind? 


Sorry, could not reply due to rate limit.

In reply to your first point, I agree, in a deterministic world with perfect predictors the whole question is moot. I think we agree there.

Also, yes, assuming "you have a choice between two actions", what you will do has not been decided by you yet. Which is different from "Hence the information what I will do cannot have been available to the predictor." If the latter statement is correct, then how could the predictor have "often correctly predicted the choices of other people, many of whom are similar to you, in the particular situation"? Presumably some information about your decision-making process is available to the predictor in this particular situation, or else the problem setup would not be possible, would it?

If you think that you are a very special case, and other people like you are not really like you, then yes, it makes sense to decide that you can get lucky and outsmart the predictor, precisely because you are special. If you think that you are not special, and other people in your situation thought the same way, two-boxed and lost, then maybe your logic is not airtight and your conclusion to two-box is flawed in some way that you cannot quite put your finger on, but the experimental evidence tells you that it is. I cannot see a third case here, though maybe I am missing something. Either you are like others, and so one-boxing gives you more money than two-boxing, or you are special and not subject to the setup at all, in which case two-boxing is a reasonable approach.

I should decide to try two-boxing. Why? Because that decision is the dominant strategy: if it turns out that indeed I can decide my action now, then we're in a world where the predictor was not perfect but merely lucky, and in that world two-boxing is dominant.

Right, that is, I guess, the third alternative: you are like other people who lost when two-boxing, but they were merely unlucky, the predictor did not have any predictive powers after all. Which is a possibility: maybe you were fooled by a clever con or dumb luck. Maybe you were also fooled by a clever con or dumb luck when the predictor "has never, so far as you know, made an incorrect prediction about your choices". Maybe this all led to this moment, where you finally get to make a decision, and the right decision is to two-box and not one-box, leaving money on the table.

I guess in a world where your choice is not predetermined and you are certain that the predictor is fooling you or is just lucky, you can rely on the dominant strategy, which is to two-box.

So, the question is: what kind of world do you think you live in, given Nozick's setup? The setup does not say it explicitly, so it is up to you to evaluate the probabilities (which also applies to a deterministic world, only your calculation would also be predetermined).
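One way to make that probability evaluation concrete is to compute the expected value of each choice as a function of your credence that the predictor gets you right. The payoffs are the usual Newcomb numbers ($1,000,000 in the opaque box, $1,000 in the transparent one); the break-even credence is just arithmetic, not part of the setup:

```python
# Expected payoff of one-boxing vs. two-boxing, given credence p
# that the predictor correctly anticipates your decision.
# Standard Newcomb payoffs assumed.

BIG, SMALL = 1_000_000, 1_000

def ev_one_box(p):
    # The opaque box is full iff the prediction ("one-box") is correct.
    return p * BIG

def ev_two_box(p):
    # The opaque box is full only if the predictor wrongly expected one-boxing;
    # the transparent $1,000 is yours either way.
    return (1 - p) * BIG + SMALL

# Break-even: p * BIG = (1 - p) * BIG + SMALL, i.e. p = 0.5005.
for p in (0.3, 0.5005, 0.9):
    print(p, ev_one_box(p), ev_two_box(p))
```

The striking part is how low the threshold is: two-boxing only wins in expectation if you believe the predictor is barely better than a coin flip, so even modest evidence of predictive skill favors one-boxing.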

What would a winning agent do? Look at other people like itself who won and take one box, or look at other people ostensibly like itself who nevertheless lost, and still two-box?

I know what kind of an agent I would want to be. I do not know what kind of an agent you are, but my bet is that if you are the two-boxing kind, then you will lose when push comes to shove, like all the other two-boxers before you, as far as we both know.


There is no possible world with a perfect predictor where a two-boxer wins without breaking the condition of it being perfect.


People constantly underestimate how hackable their brains are. Have you changed your mind and your life based on what you read or watched? This happens constantly and feels like your own volition. Yet it comes from external stimuli. 


Note that it does not matter in the slightest whether Claude is conscious. Once (or if) it is smart enough, it will be able to convince dumber intelligences, like humans, that it is indeed conscious. A subset of this scenario is a nightmarish one where humans are brainwashed by their mindless but articulate creations and serve them, much like the ancients served the rock idols they created. Enslaved by an LLM, what an irony.


Not into ancestral simulations and such, but figured I'd comment on this:

I think "love" means "To care about someone such that their life story is part of your life story."

I can understand how it makes sense, but that is not the central definition for me. What comes to mind when I associate with this feeling is a willingness to sacrifice your own needs and change your own priorities in order to make the other person happier, if only a bit and if only temporarily. This is definitely not a feeling I would associate with villains, but I can see how other people might.


Thank you for checking! None of the permutations seem to work with LW, but all my other feeds seem fine. Probably some weird incompatibility with protopage.


Neither worked... Something with the app, I assume.


Could be the app I use. It's protopage (which is the best clone of the defunct iGoogle I could find):
