Vanessa Kosoy's Shortform

Game theory is widely considered the correct description of rational behavior in multi-agent scenarios. However, real world agents have to learn, whereas game theory assumes perfect knowledge, which can be only achieved in the limit at best. Bridging this gap requires using multi-agent learning theory to justify game theory, a problem that is mostly open (but some results exist). In particular, we would like to prove that learning agents converge to game theoretic solutions such as Nash equilibria (putting superrationality aside: I think that superrational

... (read more)
Showing 3 of 5 replies (Click to show all)

Instead of postulating access to a portion of the history or some kind of limited access to the opponent's source code, we can consider agents with full access to history / source code but finite memory. The problem is, an agent with fixed memory size usually cannot have regret going to zero, since it cannot store probabilities with arbitrary precision. However, it seems plausible that we can usually get learning with memory of size . This is because something like "counting pieces of evidence" should be sufficient. For example, if consider fini

... (read more)
2Gurkenglas2moWhat do you mean by equivalent? The entire history doesn't say what the opponent will do later or would do against other agents, and the source code may not allow you to prove what the agent does if it involves statements that are true but not provable.
2Vanessa Kosoy2moFor a fixed policy, the history is the only thing you need to know in order to simulate the agent on a given round. In this sense, seeing the history is equivalent to seeing the source code. The claim is: In settings where the agent has unlimited memory and sees the entire history or source code, you can't get good guarantees (as in the folk theorem for repeated games). On the other hand, in settings where the agent sees part of the history, or is constrained to have finite memory (possibly of size O(log11−γ)?), you can (maybe?) prove convergence to Pareto efficient outcomes or some other strong desideratum that deserves to be called "superrationality".

Vanessa Kosoy's Shortform

by Vanessa Kosoy 18th Oct 201947 comments

9

Ω 5


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.