LESSWRONG
LW

Oracle AI
Personal Blog

1

Finding reflective oracle distributions using a Kakutani map

by jessicata
2nd May 2017
AI Alignment Forum
2 min read
0

1

Ω 1

Oracle AI
Personal Blog

1

Ω 1

New Comment
Moderation Log
Curated and popular this week
0Comments

(follow-up to: A correlated analogue to reflective oracles)

Motivation

Suppose players are playing in a correlated equilibrium using a reflective oracle distribution. How does the equilibrium they play in vary as a function of the parameters of the game, or of the players' policies? It turns out that the set of equilibria is a Kakutani map of the parameters to the game. This is a lot like it being a continuous map.

This might make it possible for players to reason about the effects of their policy on the equilibrium that they play (since the equilibrium is now a Kakutani map of the players' policies).

Definitions

Let k be some natural number. We will consider reflective oracle distributions whose queries are parameterized on some vector in θ∈Rk.

To do this, let the Turing machines M, instead of outputting a raw query, output a continuous function from θ∈Rk to the query. (The details of representing continuous functions don't seem that important). The reflectivity condition on oracle distributions is now relative to θ (since the queries depend on θ).

Define the map

ParamsToDistrs(θ):={D∈D|D is reflective relative to θ }

which maps the parameters θ to the set of reflective oracle distributions for θ.

Theorem: ParamsToDistrs is a Kakutani map.

Proof:

From the previous post, we have:

  • For each θ, ParamsToDistrs(θ) is nonempty (Theorem 1)
  • For each θ, ParamsToDistrs(θ) is convex (Theorem 2)

So it is sufficient to show that ParamsToDistrs has a closed graph.

Let θ1,θ2,... be an infinite sequence of Rk values converging to θ∞. Let D1,D2,... be an infinite sequence of oracle distributions in D converging to D∞. Assume that for each natural n, Dn is reflective relative to θn. We will show that D∞ is reflective relative to θ∞; this is sufficient to show that ParamsToDistrs has a closed graph.

Let M∈M. Let a be such that D∞(O(M)=a)>0. Then D∞(O(M)=a)>ϵ for some ϵ>0. By convergence, there is some N such that for all n≥N, Dn(O(M)=a)>ϵ. By reflectivity of each Dn relative to θi, we have, for each n≥N,

a∈Eval(M)(θn)(Condition(Dn,M,a))

Let qθ:=Eval(M)(θ). Rewriting the above statement:

a∈argmaxi∈{1,...,l(q)}EDn[qθni(O)]

Consider the set {(θ,D)|a∈argmaxi∈{1,...,l(q)}ED[qθi(O)]}. This is the intersection of a finite number of sets of the form {(θ,D)|ED[qθa(O)−qθi(O)]≥0}. Each of these sets is closed because (θ,D)↦ED[qθa(O)−qθi(O)] is continuous. Therefore the set {(θ,D)|a∈argmaxi∈{1,...,l(q)}ED[qθi(O)]} is closed.

In total, this is sufficient to show a∈argmaxi∈{1,...,l(q)}ED∞[qθ∞i(O)], as desired.

□

An immediate consequence of this theorem is that the set of correlated equilibria of a normal-form game is a Kakutani map of the parameters of the game.