Do decision theories underspecify policies?

6th Aug 2019


(thinking within the general RL framework):

If I know what I'm optimizing over, does a decision theory tell me what my policy should do on trajectories that the decision theory itself implies are counterfactual, i.e. that will never actually occur?

e.g. if my decision theory says "always take action1", then I will never see (partial) trajectories with action0 in them. So on the face of it, I should be able to choose the policy freely for those (partial) trajectories.
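To make that intuition concrete, here is a minimal sketch. It assumes a toy deterministic setting where a (partial) trajectory is just the tuple of actions taken so far. The names `always_action1`, `quirky_off_path`, `rollout` and `disagreements` are made up for illustration. The two policies differ only on histories that contain action0, and following either one from the start produces the same trajectory.

```python
from itertools import product

ACTIONS = (0, 1)   # action0 and action1
HORIZON = 3        # assumed episode length for this toy example


def always_action1(history):
    """Take action1 on every partial trajectory."""
    return 1


def quirky_off_path(history):
    """Agree with always_action1 unless action0 already appears in the history."""
    return 0 if 0 in history else 1


def rollout(policy, horizon=HORIZON):
    """Follow the policy from the empty history; the history is just past actions."""
    history = ()
    for _ in range(horizon):
        history += (policy(history),)
    return history


def disagreements(p, q, horizon=HORIZON):
    """List every partial trajectory (shorter than horizon) where p and q differ."""
    return [
        h
        for t in range(horizon)
        for h in product(ACTIONS, repeat=t)
        if p(h) != q(h)
    ]


on_path_a = rollout(always_action1)
on_path_b = rollout(quirky_off_path)
off_path = disagreements(always_action1, quirky_off_path)
```

Here `on_path_a` and `on_path_b` are identical, while `off_path` lists only histories containing action0. So nothing observed while following "always take action1" distinguishes the two policies.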

But I'm pretty sure that's not true, because (I think?) decision theories need to get the counterfactuals right (e.g. for Newcomb's problem).

So then the question is: (when) does a decision theory specify actions on ALL possible (partial) trajectories (including all counterfactuals)?

(and is it important or desirable to do so? etc.)

One sufficient condition for always defining actions is that the decision theory gives decisions as a function of the state of the world. For example, CDT evaluates outcomes in a way that depends purely on the world's state, so it yields a recommendation in any state, reachable or not. A more complicated version of this is a decision theory that takes in a model of the world and outputs a policy, which tells you what to do in each state of the world.
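As a rough sketch of the "model in, policy out" picture, the snippet below runs plain value iteration on a hypothetical three-state deterministic world. Value iteration is only a stand-in here, not any particular decision theory. The state names, rewards, discount factor and the `solve` function are all assumptions made up for the example. The point is that the output assigns an action to every state in the model, including one that the resulting policy never visits from `"start"`.

```python
# Hypothetical world model: MODEL[state][action] = (next_state, reward).
MODEL = {
    "start": {"action0": ("bad", 0.0), "action1": ("good", 1.0)},
    "good": {"action0": ("good", 0.0), "action1": ("good", 1.0)},
    "bad": {"action0": ("start", 0.0), "action1": ("bad", -1.0)},
}


def solve(model, gamma=0.9, sweeps=200):
    """Value iteration: return a dict mapping every state to a greedy action."""
    values = {s: 0.0 for s in model}

    def q(state, action, vals):
        next_state, reward = model[state][action]
        return reward + gamma * vals[next_state]

    for _ in range(sweeps):
        # Synchronous update: every state is backed up from the old values.
        values = {s: max(q(s, a, values) for a in model[s]) for s in model}

    # The policy is defined on all states, whether or not they are reachable.
    return {s: max(model[s], key=lambda a: q(s, a, values)) for s in model}


policy = solve(MODEL)
```

In this toy model the policy takes action1 in `"start"`, so `"bad"` is never reached by following it. Even so, `policy["bad"]` is defined, because the procedure is a function of the state rather than of the trajectories actually experienced.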