
JoshBurroughs
Timeless Decision Theory: Problems I Can't Solve
JoshBurroughs · 15y

A simpler way to say all this is "Pick a depth where you will stop recursing (due to growing uncertainty or computational limits) and at that depth assume your opponent acts randomly." Is my first attempt needlessly verbose?
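The depth cutoff described above can be sketched as a small recursion. This is only an illustration under stated assumptions: the payoff numbers are hypothetical, the game is assumed symmetric so both players share one table, and the recursion is a plain best-response model rather than an implementation of TDT.

```python
# Hypothetical payoffs, keyed by (own move, opponent's move).
# The ordering u(D,C) > u(C,C) > u(D,D) > u(C,D) is the prisoner's dilemma.
PAYOFF = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 5.0, ("D", "D"): 1.0}


def choose(depth, payoff=PAYOFF):
    """Return "C" or "D" for a player who models the opponent `depth` levels deep."""
    if depth == 0:
        # Cutoff reached: assume the opponent acts randomly.
        p_coop = 0.5
    else:
        # Otherwise model the opponent as running the same procedure
        # with one level less of recursion.
        p_coop = 1.0 if choose(depth - 1, payoff) == "C" else 0.0
    u_c = p_coop * payoff[("C", "C")] + (1.0 - p_coop) * payoff[("C", "D")]
    u_d = p_coop * payoff[("D", "C")] + (1.0 - p_coop) * payoff[("D", "D")]
    return "C" if u_c > u_d else "D"


for depth in range(4):
    print(depth, choose(depth))
```

With these assumed payoffs the cutoff level picks defection, and that choice propagates back up through every shallower level.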

Timeless Decision Theory: Problems I Can't Solve
JoshBurroughs · 15y

Agents A & B are two TDT agents playing some prisoner's dilemma scenario. A can reason:

u(c(A)) = P(c(B))u(C,C) + P(d(B))u(C,D)

u(d(A)) = P(c(B))u(D,C) + P(d(B))u(D,D)

(u(X) is the utility of X, u(X,Y) is A's utility when A plays X and B plays Y, P() is probability, and c() and d() are the cooperate and defect predicates.)
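As a minimal sketch of the two expected-utility formulas above, the following computes u(c(A)) and u(d(A)) from a given P(c(B)). The payoff numbers and the names `PAYOFF` and `expected_utilities` are hypothetical, chosen only for illustration.

```python
# Hypothetical payoffs for A, keyed by (A's move, B's move).
PAYOFF = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 5.0, ("D", "D"): 1.0}


def expected_utilities(p_coop_b, payoff=PAYOFF):
    """Return (u(c(A)), u(d(A))) given A's probability that B cooperates."""
    p_def_b = 1.0 - p_coop_b
    u_c = p_coop_b * payoff[("C", "C")] + p_def_b * payoff[("C", "D")]
    u_d = p_coop_b * payoff[("D", "C")] + p_def_b * payoff[("D", "D")]
    return u_c, u_d


print(expected_utilities(0.5))
```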

A will always pick the option with higher utility, so it reasons B will do the same:

u'(c(B)) > u'(d(B)) --> c(B)

(u'() is A's estimate of B's utility function)

But A can't perfectly predict B (even though it may be quite good at it), so A can represent this uncertainty as a random variable e:

u'(c(B)) + e > u'(d(B)) - e --> c(B)

In fact, we can give e a parameter, N, which is given by the depth of recursion, like a game of telephone:

u'(c(B)) + e(N) > u'(d(B)) - e(N) --> c(B)

Intuitively, it seems e(N) will tend to overwhelm u'() for high enough N (since utilities don't increase as you recurse). At that recursion depth:

P(c(B)) = P(d(B)) = 1/2

so:

u(c(A)) = (1/2) [u(C,C) + u(C,D)]

u(d(A)) = (1/2) [u(D,C) + u(D,D)]

u(D,C) > u(C,C) > u(D,D) > u(C,D)

so u(d(A)) > u(c(A)), because u(D,C) > u(C,C) and u(D,D) > u(C,D). A therefore defects at the recursion depth where uncertainty overwhelms other considerations.
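To make the "uncertainty overwhelms utility" step concrete, here is a rough sketch under one possible noise model. It assumes e(N) is zero-mean Gaussian with standard deviation proportional to N, which the argument above does not specify; `p_cooperate` and `sigma_per_level` are hypothetical names. The condition u'(c(B)) + e(N) > u'(d(B)) - e(N) is equivalent to e(N) > (u'(d(B)) - u'(c(B))) / 2, so P(c(B)) is a Gaussian tail probability.

```python
import math


def p_cooperate(u_c_est, u_d_est, depth, sigma_per_level=1.0):
    """P(c(B)) when e(N) ~ Normal(0, sigma_per_level * depth) (an assumed model)."""
    # B is predicted to cooperate when e(N) exceeds this threshold.
    threshold = (u_d_est - u_c_est) / 2.0
    sigma = sigma_per_level * depth
    if sigma == 0:
        # No noise: the prediction is deterministic.
        return 1.0 if threshold < 0 else 0.0
    # Upper tail of the normal distribution: P(e > threshold).
    return 0.5 * math.erfc(threshold / (sigma * math.sqrt(2.0)))


# A's estimate says B prefers cooperation by 2 units of utility (hypothetical).
for depth in (1, 10, 100, 1000):
    print(depth, p_cooperate(3.0, 1.0, depth))
```

Under this assumed model, P(c(B)) drifts toward 1/2 as the depth grows, whatever fixed gap A estimates between u'(c(B)) and u'(d(B)).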

Does this mean a TDT agent must revert to CDT if it is not smart enough (or does not believe its opponent is smart enough) to transform the recursion to a closed-form solution?
