notfnofn

2mo10

For the second paragraph, we're assuming this AI has never yet made a mistake in predicting human behavior across many, many trials in different scenarios; no exact probability is given. We're also assuming perfect observation, so we know that they pressed the button, that bombs are on the way, and any observable context behind the decision (such as false information).

The first paragraph contains an idea I hadn't considered, and it might be central to the whole thing. I'll ponder it more.

---

I didn't get around to providing more clarity. I'll do that now:

- Both parties would click the button if it were clear that the other party would not click the button in retaliation. That way they would not have to worry about being wiped off the map.
- The two parties would both prefer a world in which only the other party survives to a world without any humanity.

We know that the other party will click the button if and only if they predict with extremely high confidence that we will not retaliate. Our position is the same.
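To make the preference structure above concrete, here's a minimal sketch (my own encoding, not from the original discussion) that checks that pressing is the best response exactly when retaliation is ruled out:

```python
# Illustrative sketch: encode one side's stated preferences over outcomes
# and verify that "press" is strictly better only if the other side
# certainly will not retaliate. Outcome names and scores are my own.

# Higher number = more preferred, per the two bullet points above.
preferences = {
    "only we survive": 3,    # we press, no retaliation
    "both survive": 2,       # nobody presses
    "only they survive": 1,  # stated: preferred to total annihilation
    "nobody survives": 0,    # mutual destruction
}

def outcome(we_press: bool, they_retaliate: bool) -> str:
    """Simplified one-sided version: only our button is in play."""
    if we_press and they_retaliate:
        return "nobody survives"
    if we_press:
        return "only we survive"
    return "both survive"

# If the other party certainly will not retaliate, pressing is strictly better:
assert preferences[outcome(True, False)] > preferences[outcome(False, False)]
# If they certainly would retaliate, pressing is strictly worse:
assert preferences[outcome(True, True)] < preferences[outcome(False, True)]
```

The symmetric situation (both sides reasoning this way about each other's predictions) is what makes the original question interesting; this sketch only pins down the payoff ordering.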

---

It's extremely beautiful, and seems like it would serve as a nice introduction to the website that isn't subject to the same random noise as the front page.

I really like 'leastwrong' in the URL and top banner (header?), but I could see how making 'The LeastWrong' the actual title could strike some as pretentious.

---

Thanks for your answer; this explains why I was not able to find any related discussion on this. I read this article recently: https://www.lesswrong.com/posts/c3wWnvgzdbRhNnNbQ/timeless-decision-theory-problems-i-can-t-solve and misremembered it as a defense of evidential decision theory, instead of a different decision theory altogether.

So from a timeless decision theory perspective, is it correct to say that one would press the button? And from both EDT/CDT perspectives, one would *not* press the button (assuming they value the other country staying alive over no country staying alive)?

---

I'm not sure whether these kinds of comments are acceptable on this site, but I just wanted to say thank you for this sequence. I doubt I will significantly change my life after reading it, but I hope to change it at least a little in this direction.

Viewing myself as a reinforcement learning agent that balances policy improvement (taking my present model and thinking about how to tweak my actions to optimize rewards assuming my model is correct) and exploration (observing how the world actually responds to certain actions to update the model), I have historically spent far too much time on policy improvement.

This sequence provides a nice set of guidelines and methods to shift gears and really think about what it even means to improve one's model of the world, in a way that seems... fun? fulfilling? I hope to report back in a few months on how it's gone; there is a high probability that I fall back into old habits, but I hope I do not.

---

In case it hasn't crossed your mind, I personally think it's helpful to start in the setting of estimating the true mean $\mu$ of a data stream $x_1, x_2, \ldots, x_n$. A *very* natural choice of estimator for $\mu$ is the sample mean of the $x_i$, which I'll denote $\bar{x}$. This can equivalently be formulated as the minimizer of $c \mapsto \sum_{i=1}^n (x_i - c)^2$.

Others have mentioned the normal distribution, but this feels secondary to me. Here's why: let's say $x_i = \mu + \sigma \varepsilon_i$, where each $\varepsilon_i \sim F$ for a known continuous probability distribution $F$ with mean 0 and variance 1, and $\mu, \sigma$ are unknown. So the distribution of each $x_i$ has mean $\mu$ and variance $\sigma^2$ (and assume independence).

What must $F$ be for the sample mean to be the maximum likelihood estimator of $\mu$? Gauss proved that it must be the standard normal distribution, and intuitively it's not hard to see why its density would have to be of the form $a e^{-b t^2}$.

So from this perspective, MSE is a generalization of taking the sample mean, and asking the linear model to have Gaussian errors is necessary to formally justify MSE through MLE.

Replace sample mean with sample median and you get the mean absolute error.
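A quick numerical check of this correspondence (an illustrative sketch; the data and grid here are arbitrary choices of mine): the sample mean should minimize the summed squared deviations, and the sample median the summed absolute deviations.

```python
# Brute-force check: over a fine grid of candidate values c, find the
# minimizer of sum((x_i - c)^2) and of sum(|x_i - c|), and compare them
# to the sample mean and sample median respectively.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=501)  # odd size -> unique median

grid = np.linspace(x.min() - 1, x.max() + 1, 8001)
sse = ((x[:, None] - grid[None, :]) ** 2).sum(axis=0)  # squared error per candidate
sae = np.abs(x[:, None] - grid[None, :]).sum(axis=0)   # absolute error per candidate

# The grid minimizers land on the mean and median (up to grid spacing):
assert abs(grid[sse.argmin()] - x.mean()) < 2e-3
assert abs(grid[sae.argmin()] - np.median(x)) < 2e-3
```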

---

Is there no way to salvage it via a Nash bargaining argument if the odds are different? Or at least, deal with scenarios where you have x:1 and 0:1 odds (i.e. you can only bet on heads)?

Suppose we don't have any prior information about the dataset, only our observations. Is any metric more accurate than assuming our dataset is the exact distribution and calculating mutual information? Kind of like bootstrapping.
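For concreteness, here is a minimal sketch of that plug-in idea (the function name and toy data are my own): treat the empirical joint distribution of the observed pairs as the exact one, and compute mutual information directly from it.

```python
# Plug-in mutual information: build the empirical joint and marginal
# distributions from observed (x, y) pairs and apply the MI formula
# I = sum p(x,y) * log(p(x,y) / (p(x) p(y))) to them as-is.
import math
from collections import Counter

def plugin_mutual_information(pairs):
    """Mutual information (in nats) of the empirical distribution of pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x,y) / (p(x) p(y)) = (c/n) * n^2 / (count_x * count_y)
        mi += p_xy * math.log(p_xy * n * n / (px[x] * py[y]))
    return mi

# Uniform independent-looking data gives 0; a deterministic relation gives H(X):
assert abs(plugin_mutual_information([(0, 0), (0, 1), (1, 0), (1, 1)])) < 1e-12
assert abs(plugin_mutual_information([(0, 0), (1, 1)] * 10) - math.log(2)) < 1e-12
```

One caveat worth noting: this plug-in estimate is known to be biased upward for small samples, which is part of why resampling-style corrections come up in this setting.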