NothingnessAbove

Confidence levels inside and outside an argument

The person running the qrng server decided to screw with you.

Thoughts on the 5-10 Problem

As far as I can tell, this problem is an exercise in logical uncertainty. Consider an example agent A which makes a decision between, say, options a and b, with possible outcomes U=0, U=5, and U=10. In general, the agent uses its logical uncertainty estimator to compare the expected utilities 0*P(U=0|a)+5*P(U=5|a)+10*P(U=10|a) to 0*P(U=0|b)+5*P(U=5|b)+10*P(U=10|b). Of course, this causes a divide by zero error if A is certain of which action it will take. To avoid this, if A ever proves that it will take an action, it will immediately take a different action, regardless of the expected utility assigned to that action. So, if A ever proves what action it takes in advance of taking it, it will be wrong, and unsound. Thus if A is sound it cannot prove in advance what action it will take. In the $5 and $10 game, A will correctly assess the expected values, and choose the $10 because it is higher. It will be able to correctly assess the expected value of the $5 because it will hold nonzero probability of taking the $5 before it makes its decision. Why will it hold this nonzero probability, when $10 is the obvious choice? Because, since by Lob's theorem A can't prove itself sound, and if it is unsound, it might prove in advance that it will take the $10, and therefore take the $5 instead.

I don't know if this was clear but this is not a full answer, because logical uncertainty is hard and I'm just assuming agent A is somehow good at it.

Edit: How does A calculate the expected utility of another agent being in its position, when it is nontrivially embedded in its environment? Of course, if the agent is not embedded, the 5-10 problem ceases to be an issue(AIXI is not bothered by it), for precisely this reason: it's easy to see what the counterfactual world where the agent decided to take another action looks like, rather than in this case where that counterfactual world might be logically impossible or ill-defined.

Give me a deterministic algorithm that performs worse than random on that problem and and I will show you how.