I recently realized that the formalism of incomplete models provides a rather natural solution to all decision theory problems involving "Omega" (something that predicts the agent's decisions). An incomplete hypothesis may be thought of as a zero-sum game between the agent and an imaginary opponent (we will call the opponent "Murphy" as in Murphy's law). If we assume that the agent cannot randomize against Omega, we need to use the deterministic version of the formalism. That is, an agent that learns an incomplete hypothesis converges to the corresponding max

2 · Chris_Leong · 2mo
"The key point is, 'applying the counterfactual belief that the predictor is
always right' is not really well-defined" - What do you mean here?
I'm curious whether you're referring to the same issue, or a similar one, as the
one I raised in Counterfactuals for Perfect Predictors
[https://www.lesswrong.com/posts/AKkFh3zKGzcYBiPo7/counterfactuals-for-perfect-predictors]
. The TLDR is that I was worried that it would be inconsistent for an agent that
never pays in Parfit's Hitchhiker to end up in town if the predictor is
perfect, so that it wouldn't actually be well-defined what the predictor was
predicting. I ended up resolving this by imagining the agent as something
that takes input, and asking what it would output if given that
inconsistent input. But I'm not sure whether you were referencing this kind of
concern or something else.

5 · Vanessa Kosoy · 2mo
It is not a mere "concern", it's really the crux of the problem. What people in the
AI alignment community have been trying to do is start with some factual and
"objective" description of the universe (such as a program or a mathematical
formula) and derive counterfactuals from it. The way it's supposed to work is, the
agent needs to locate all copies of itself or things "logically correlated" with
itself (whatever that means) in the program, and imagine it is controlling this
part. But a rigorous definition of this that solves all standard decision
theoretic scenarios was never found.
Instead of doing that, I suggest a solution of a different nature. In
quasi-Bayesian RL, the agent never arrives at a factual and objective
description of the universe. Instead, it arrives at a subjective description
which already includes counterfactuals. I then proceed to show that, in
Newcomb-like scenarios, such agents receive optimal expected utility (i.e. the
same expected utility promised by UDT).
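
A minimal sketch of the maximin idea on Newcomb's problem, under assumptions
that are not in the thread: the payoffs are the usual 1,000,000 / 1,000 values,
the hypothesis "Omega predicts correctly" is encoded as a filter on which
predictions Murphy may pick, and the names `payoff`, `consistent` and
`maximin_policy` are hypothetical. This is a toy illustration of the
agent-versus-Murphy game, not the quasi-Bayesian RL construction itself.

```python
# Toy Newcomb's problem as a game between a deterministic agent and "Murphy".
# Assumption: Murphy picks Omega's prediction, but only among predictions the
# hypothesis allows, and tries to minimize the agent's payoff.

ACTIONS = ("one-box", "two-box")


def payoff(action, prediction):
    """Standard Newcomb payoffs (assumed values, not taken from the thread)."""
    opaque_box = 1_000_000 if prediction == "one-box" else 0
    transparent_box = 1_000 if action == "two-box" else 0
    return opaque_box + transparent_box


def consistent(action, prediction):
    """Hypothesis: Omega's prediction always matches the agent's action."""
    return prediction == action


def maximin_policy():
    """Return the action with the best worst-case payoff, and that payoff."""
    best_action, best_value = None, float("-inf")
    for action in ACTIONS:
        # Murphy may only choose predictions the hypothesis does not rule out.
        allowed = [p for p in ACTIONS if consistent(action, p)]
        worst_case = min(payoff(action, p) for p in allowed)
        if worst_case > best_value:
            best_action, best_value = action, worst_case
    return best_action, best_value


if __name__ == "__main__":
    print(maximin_policy())
```

Because the counterfactual "what happens if I two-box" is already part of the
hypothesis (Murphy must then predict two-boxing), the maximin choice in this
toy version is to one-box, without deriving counterfactuals from an objective
description of the universe.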

Yeah, I agree that the objective descriptions can leave out vital information, such as how the information you know was acquired, which seems important for determining the counterfactuals.
