A Possible Resolution To Spurious Counterfactuals

Charlie Steiner (4y)

Just leaving a sanity check, even though I'm not sure about what the people who were more involved at the time are thinking about the 5 and 10 problem these days:

Yes, I agree this works here. But it's basically CDT: this counterfactual is essentially a causal do() operator. So it might not work for other problems that proof-based UDT was intended to solve in the first place, like the absent-minded driver, the non-anthropic problem, or simulated boxing.

JoshuaOSHickman (4y)

It seems like you could use these counterfactuals to do whatever decision theory you'd like? My goal wasn't to solve actually hard decisions -- the 5 and 10 problem is perhaps the easiest decision I can imagine -- but merely to construct a formalism such that even extremely simple decisions involving self-proofs can be solved at all.

I think the reason this seems to imply a decision theory is that it's such a simple model that there are some ways of making decisions that are impossible in the model -- a fair portion of that was inherited from the pseudocode in the Embedded Agency paper. I have an extension of the formalism in mind that I suspect allows an expression of UDT as well, or something very close to it (I haven't paid enough attention to the paper yet to know for sure). I would love to hear your thoughts once I get that post written up. :)

Charlie Steiner (4y)

Sure, I can promise you a comment or you can message me about it directly.

Chris_Leong (4y)

I don't suppose you could clarify:

Agent :: Agent -> Situation -> Choice

It seems strange for an agent to take another agent and a situation and return a choice.

I also think this approach matches our intuition about how counterfactuals work. We imagine ourselves as the same except we're choosing this particular behavior. Surely, in the formal reasoning, there might also be a distinction between the initial agent and the agent within that counterfactual, considering it's present in our own imaginations?

Yeah, this is essentially my position as well. My most recent attempt at articulating this is "Why 1-boxing doesn't imply backwards causation".

JoshuaOSHickman (4y)

The Agent needs access to a self pointer, and it is parameterized so it doesn't have to be a static pointer, as it was in the original paper -- this approach in particular needs it to be dynamic in this way.

There are also use cases where a bit of code receives a pointer not to its exact self -- when it is called as a subagent, it will get the parent's pointer.
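One way to read the point about a dynamic self pointer, as a minimal Python sketch. The `counterfactually_doing` wrapper mirrors the post's CounterfactuallyDoing pseudocode; the situations "A" and "B" and the agents `base_agent` and `static_agent` are hypothetical, made up only for illustration. When the wrapper defers to the prior agent, it passes itself as the self pointer, so the prior agent's code receives a pointer that is not its exact self.

```python
def counterfactually_doing(prior_agent, cf_situation, choice):
    """Agent that picks `choice` in `cf_situation` and otherwise defers to `prior_agent`."""
    def new_agent(_self, situation):
        if situation == cf_situation:
            return choice
        # Defer, but hand the prior agent the *wrapper* as its self pointer.
        return prior_agent(new_agent, situation)
    return new_agent


def base_agent(self_ptr, situation):
    """Hypothetical agent: in "B" it repeats whatever its self pointer does in "A"."""
    if situation == "A":
        return 10
    return self_ptr(self_ptr, "A")  # dynamic: asks the pointer it was given


def static_agent(_self_ptr, situation):
    """Same behaviour, but with a hard-coded (static) reference to itself."""
    if situation == "A":
        return 10
    return static_agent(static_agent, "A")  # ignores the pointer it was given


wrapped = counterfactually_doing(base_agent, "A", 5)
wrapped_static = counterfactually_doing(static_agent, "A", 5)

print(base_agent(base_agent, "B"))         # 10
print(wrapped(wrapped, "B"))               # 5: the counterfactual choice in "A" is visible in "B"
print(wrapped_static(wrapped_static, "B")) # 10: the static pointer never sees the counterfactual
```

With the static pointer, the counterfactual override in "A" never reaches the code running in "B", which seems to be why this approach needs the pointer to be a parameter.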

A :=
  Spend some time searching for proofs of sentences of the form
  “[A() = 5 → U() = x] & [A() = 10 → U() = y]” for x, y ∈ {0, 5, 10}.
  if a proof is found with x > y:
    return 5
  else:
    return 10

U :=
  if A() = 10:
    return 10
  if A() = 5:
    return 5

Situation :: Nonce
Agent :: Agent -> Situation -> Choice
Universe :: Agent -> Reward

CounterfactuallyDoing(PriorAgent, CounterfactualSituation, Choice) :=
  NewAgent := \Agent, Situation ->
    if Situation == CounterfactualSituation:
      return Choice
    else:
      return PriorAgent(NewAgent, Situation)
  return NewAgent

A(Self, Situation == OnlySituation) :=
  Spend some time searching for proofs of sentences of the form
  “[U(CounterfactuallyDoing(Self, OnlySituation, 5)) = x] & [U(CounterfactuallyDoing(Self, OnlySituation, 10)) = y]”
  for x, y ∈ {0, 5, 10}.
  if a proof is found with x > y:
    return 5
  else:
    return 10

U(Actor) :=
  if Actor(Actor, OnlySituation) = 10:
    return 10
  if Actor(Actor, OnlySituation) = 5:
    return 5
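The definitions above can be approximated in runnable form. This is only a sketch under a loud assumption: the bounded proof search is replaced by directly evaluating U on each counterfactual agent, which is a different technique from the one in the pseudocode and is used here only so the structure can be executed. The Python names (`universe`, `agent`, `ONLY_SITUATION`) are illustrative renderings of U, A, and OnlySituation.

```python
from typing import Any, Callable

ONLY_SITUATION = "OnlySituation"

# Agent :: Agent -> Situation -> Choice; the first argument is the self pointer.
Agent = Callable[[Any, str], int]


def counterfactually_doing(prior_agent: Agent, cf_situation: str, choice: int) -> Agent:
    """Agent that picks `choice` in `cf_situation` and otherwise defers to `prior_agent`."""
    def new_agent(_self: Any, situation: str) -> int:
        if situation == cf_situation:
            return choice
        return prior_agent(new_agent, situation)
    return new_agent


def universe(actor: Agent) -> int:
    """5-and-10 universe: taking 10 pays 10, taking 5 pays 5."""
    choice = actor(actor, ONLY_SITUATION)
    return {5: 5, 10: 10}[choice]


def agent(self_ptr: Agent, situation: str) -> int:
    # ASSUMPTION: direct evaluation stands in for "search for proofs that
    # U(CounterfactuallyDoing(Self, situation, 5)) = x and ... = y".
    x = universe(counterfactually_doing(self_ptr, situation, 5))
    y = universe(counterfactually_doing(self_ptr, situation, 10))
    return 5 if x > y else 10


print(universe(agent))  # 10
```

Evaluation terminates because each counterfactual agent answers OnlySituation immediately, without calling back into `agent`. The statements being compared are about U applied to a different agent, not conditionals of the form "A() = 5 → ...".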

A(Self, Situation == ChooseDesire) :=
  Spend some time searching for proofs of sentences of the form
  “[U(CounterfactuallyDoing(Self, ChooseDesire, 5)) = x] & [U(CounterfactuallyDoing(Self, ChooseDesire, 10)) = y]”
  for x, y ∈ {0, 5, 10}.
  if a proof is found with x > y:
    return 5
  else:
    return 10

A(Self, Situation == GetReward) :=
  Spend some time searching for proofs of sentences of the form
  “[U(CounterfactuallyDoing(Self, GetReward, 5)) = x] & [U(CounterfactuallyDoing(Self, GetReward, 10)) = y]”
  for x, y ∈ {0, 5, 10}.
  if a proof is found with x > y:
    return 5
  else:
    return 10
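A rough sketch of how the two-situation agent might run. Two things here are assumptions and not from the thread: the `universe` below is a made-up stand-in (it pays the amount taken at GetReward only if it matches the amount named at ChooseDesire), and direct evaluation of U again replaces the proof search. The point is only to show that a counterfactual in one situation leaves the agent's ordinary reasoning in the other situation intact.

```python
CHOOSE_DESIRE = "ChooseDesire"
GET_REWARD = "GetReward"


def counterfactually_doing(prior_agent, cf_situation, choice):
    """Agent that picks `choice` in `cf_situation` and otherwise defers to `prior_agent`."""
    def new_agent(_self, situation):
        if situation == cf_situation:
            return choice
        return prior_agent(new_agent, situation)
    return new_agent


def universe(actor):
    # HYPOTHETICAL universe, not from the post: the amount taken is paid out
    # only when it matches the desire the actor named earlier.
    desire = actor(actor, CHOOSE_DESIRE)
    taken = actor(actor, GET_REWARD)
    return taken if taken == desire else 0


def agent(self_ptr, situation):
    # The same body serves both situations, as in the two definitions above.
    # ASSUMPTION: direct evaluation stands in for the bounded proof search.
    x = universe(counterfactually_doing(self_ptr, situation, 5))
    y = universe(counterfactually_doing(self_ptr, situation, 10))
    return 5 if x > y else 10


print(universe(agent))                                      # 10
print(universe(counterfactually_doing(agent, GET_REWARD, 5)))  # 5
```

In the second call, forcing 5 at GetReward makes the unmodified code at ChooseDesire pick 5 as well, because it runs with the counterfactual agent as its self pointer.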
