What are good models of collusion in AI?

22nd Sep 2021



I'm working on a paper and accompanying blog post that examine economic theories of collusion among oligopolistic firms, to see what those models would say about AI safety scenarios (e.g. values handshakes and acausal negotiation). I'm very familiar with the econ literature, but I want to make sure I'm drawing on the state of the art in AI theory as well. Any advice on which sources I should look at?

Huh, good question. I don't really know, but I'll try and help anyway :)

Of all the prisoner's dilemma tournaments we've run, I think 2014 was probably the most interesting. But the lesson was pretty commonsensical: simulate your opponent, and if cooperating does at least as well for you as defecting, cooperate. There's another interesting result about the prisoner's dilemma that I found while googling and hadn't seen before (for background on that post, see here).
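To make the "simulate your opponent" idea a bit more concrete, here's a minimal Python sketch of one possible reading of it. This is not the tournament's actual format or anyone's actual entry: the payoff numbers are the usual textbook ones, every name (`simulate_and_compare`, `mirror_bot`, etc.) is made up for illustration, and "what happens if I cooperate vs. defect" is approximated by running the opponent against two constant stand-in bots.

```python
C, D = "C", "D"

# Assumed textbook payoffs: PAYOFF[(my_move, their_move)] -> my score.
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}


def always_cooperate(opponent):
    """Hypothetical baseline bot: cooperates no matter who it faces."""
    return C


def always_defect(opponent):
    """Hypothetical baseline bot: defects no matter who it faces."""
    return D


def mirror_bot(opponent):
    """Hypothetical bot: plays whatever the opponent plays against it."""
    return opponent(mirror_bot)


def simulate_and_compare(opponent):
    """Simulate the opponent, then cooperate only if that is no worse.

    Assumption: the opponent's reaction to "me cooperating" and
    "me defecting" is estimated by running it against the constant bots.
    """
    their_move_if_i_cooperate = opponent(always_cooperate)
    their_move_if_i_defect = opponent(always_defect)

    payoff_if_i_cooperate = PAYOFF[(C, their_move_if_i_cooperate)]
    payoff_if_i_defect = PAYOFF[(D, their_move_if_i_defect)]

    return C if payoff_if_i_cooperate >= payoff_if_i_defect else D


if __name__ == "__main__":
    for bot in (always_cooperate, always_defect, mirror_bot):
        print(bot.__name__, "->", simulate_and_compare(bot))
```

Under these assumptions the bot cooperates with opponents whose move actually depends on its own (like `mirror_bot`) and defects against the two constant bots, since their move doesn't change and defecting pays more. Real entries would also need some recursion or time budget, because two simulators that simulate each other directly (e.g. `mirror_bot` against itself) never terminate.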