Games for factoring out variables

by Stuart_Armstrong

28th Apr 2016

AI Alignment Forum


jessicata (10y)

You might be interested in a way of ensuring that 2 players always have the same mixed strategy in all Nash equilibria of some game:

Assume we have a player $A$ and a player $B$. Player $A$ has some already-specified utility function; we would like player $B$ to play the same mixed strategy as $A$. Introduce a new player $C$ who observes either $A$'s or $B$'s action (each with 50% probability, without being told which) and tries to determine who took it, getting a utility of 1 for guessing correctly and 0 otherwise. $B$'s utility is 1 if $C$ guesses incorrectly and 0 if $C$ guesses correctly. $B$ will then use the same mixed strategy as $A$ in all Nash equilibria.

A similar method is used in Appendix A of the reflective oracles paper.
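The incentive in the construction above can be checked numerically. The following is a minimal sketch, not part of the original comment: the function names `guesser_success` and `b_utility` and the example strategies are hypothetical, and it assumes $C$ best-responds by naming whichever player is more likely to have produced the observed action. It only illustrates that $B$'s payoff drops when its strategy differs from $A$'s; it is not a proof of the equilibrium claim.

```python
from fractions import Fraction


def guesser_success(p_a, p_b):
    """Probability that C guesses correctly when best-responding.

    p_a, p_b map each action to its probability under A's and B's mixed
    strategies (illustrative inputs). C sees an action drawn from A or B,
    each with probability 1/2, and names the more likely source.
    """
    actions = set(p_a) | set(p_b)
    half = Fraction(1, 2)
    return sum(half * max(p_a.get(x, 0), p_b.get(x, 0)) for x in actions)


def b_utility(p_a, p_b):
    """B's expected utility: the chance that C guesses incorrectly."""
    return 1 - guesser_success(p_a, p_b)


# Hypothetical mixed strategy for A over two actions.
p_a = {"l": Fraction(1, 4), "r": Fraction(3, 4)}
matching = dict(p_a)                                  # B copies A
deviating = {"l": Fraction(1, 2), "r": Fraction(1, 2)}  # B differs from A

print(b_utility(p_a, matching))   # 1/2
print(b_utility(p_a, deviating))  # 3/8
```

Under these assumptions $C$'s success probability is $\tfrac12 + \tfrac12 \, d_{TV}(p_A, p_B)$, so $B$ does best exactly when the two distributions coincide and $C$ can do no better than a coin flip.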


orthonormal (10y)

Typo in the "Single Variable Maximalisation" section: you meant to write $R(a) R'(a')$ rather than $R(a) R(a')$.


Stuart_Armstrong (10y)

Thanks, corrected!


jessicata (10y)

I'm not sure what the "arbitrarily bad decisions" example is meant to illustrate? If the 2 agents randomize uniformly between r and l, they each get an expected utility of 1/2, which is greater than -1.


Stuart_Armstrong (10y)

But there aren't two players; that's just the model. What I mean is that all these ways of factoring out B can lead to arbitrarily bad real expected utility, compared with an agent that doesn't factor.


jessicata (10y)

I still don't understand why the expected utility is $-W$ rather than $1/2$.


Stuart_Armstrong (10y)

In the real world, the utility is given by the diagonal (since $a$ and $a'$ being different in $Q(a, a')$ is the fiction that allows $B$ to be factored out). Therefore the genuine expected utilities lie only on the diagonal, and anything other than $c$ will give $-W$.


orthonormal (10y)

There's nothing in the setup preventing the players from having access to independent random bits, though it's fair to say that these approaches assume this to be the case even when it's not.

But then the fault is with that assumption of access to randomness, not with any of the constraints on $Q$. So I don't think this is a strike against these methods.


Stuart_Armstrong (10y)

I'm not following. This "game" isn't a real game. There are not multiple players. There is one agent, where we have taken its real, one-valued probability, and changed it into a two-valued $Q$, for the purposes of factoring out the impact of the variable. The real probability is the original probability, which is the diagonal of $Q$.

