(thinking within the general RL framework):
If I know what I'm optimizing over, does a decision theory tell me what my policy should do on trajectories which are known to be counterfactual according to the decision theory?
e.g. if my decision theory says "always take action1", then I will never see (partial) trajectories with action0 in them. So on the face of it, I should be able to choose the policy freely for those (partial) trajectories.
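A minimal sketch of that intuition, assuming a toy two-step deterministic environment of my own invention (`HORIZON`, `reward`, and `rollout` are all hypothetical names, not part of any particular RL library). Two policies that differ only on a history containing action0 produce the same trajectory and the same return, so a plain return-maximizing criterion cannot tell them apart:

```python
from itertools import product

# Hypothetical toy setup: two time steps, actions 0 and 1.
HORIZON = 2

def reward(history):
    # Assumed reward: the number of times action 1 was taken.
    return sum(history)

def all_histories():
    # Every partial action history shorter than the horizon: (), (0,), (1,)
    for t in range(HORIZON):
        yield from product((0, 1), repeat=t)

def rollout(policy):
    """Follow a policy (dict: partial history -> action); return (trajectory, return)."""
    history = ()
    for _ in range(HORIZON):
        history += (policy[history],)
    return history, reward(history)

# "Always take action1", on every partial history.
policy_a = {h: 1 for h in all_histories()}

# Identical on reachable histories, but takes action 0 after an (unreachable) action 0.
policy_b = dict(policy_a)
policy_b[(0,)] = 0

traj_a, ret_a = rollout(policy_a)
traj_b, ret_b = rollout(policy_b)
print(traj_a, ret_a)
print(traj_b, ret_b)
```

Here the environment only ever sees the realized trajectory, which is exactly the assumption that fails in the Newcomb-style cases.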
But I'm pretty sure that's not true, because (I think?) decision theories need to get the counterfactuals right (e.g. for Newcomb's problem).
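A rough sketch of why the counterfactuals matter in Newcomb's problem, assuming a perfect predictor and the usual payoffs (1,000,000 in the opaque box, 1,000 in the transparent one). The function names and the two labelled ways of building the counterfactual are my own illustration, not a formal definition of any particular decision theory:

```python
ONE_BOX, TWO_BOX = "one_box", "two_box"

def newcomb_payoff(policy, action):
    """Payoff when the predictor fills the opaque box by simulating `policy`,
    and `action` is what actually gets taken."""
    predicted = policy()  # assumption: the predictor is perfect
    opaque = 1_000_000 if predicted == ONE_BOX else 0
    transparent = 1_000 if action == TWO_BOX else 0
    return opaque + transparent

def one_boxer():
    return ONE_BOX

def two_boxer():
    return TWO_BOX

# Counterfactual A: hold the prediction fixed (agent is "really" a one-boxer),
# and vary only the action taken.
fixed_prediction = {a: newcomb_payoff(one_boxer, a) for a in (ONE_BOX, TWO_BOX)}

# Counterfactual B: vary the whole policy, so the prediction changes with it.
policy_level = {
    ONE_BOX: newcomb_payoff(one_boxer, ONE_BOX),
    TWO_BOX: newcomb_payoff(two_boxer, TWO_BOX),
}

print(fixed_prediction)
print(policy_level)
```

Under A the never-taken action two_box looks better by 1,000; under B it looks worse by 999,000. So the value assigned to the off-trajectory action depends entirely on how the counterfactual is constructed, even though that action is never observed.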
So then the question is: (when) does a decision theory specify actions on ALL possible (partial) trajectories (including all counterfactuals)?
(and is it important or desirable to do so? etc.)