Before finding this site, I didn't normally associate the Prisoner's Dilemma or other questions of Decision Theory with the field of Artificial Intelligence. After all, when writing software I don't have to negotiate to get the program to cooperate with me; rather, it obeys simply by the nature of its function.

So then my next thought was the possibility of autonomous agents needing to negotiate with each other. These could be multiple components of a superintelligent construct, or multiple different superintelligent AIs sharing the planet's resources between them. In this scenario, it would be important for the AIs to cooperate whenever they have conflicting tasks that use the same resources.

At that point, it occurred to me that the problem of programs concurrently requiring the same resource has been known to Computer Science for decades as the Dining Philosophers Problem. (The image of philosophers "only able to eat or think" reflects the historically low opinion computer scientists have of philosophers.)

Of course, the philosophers in this analogy merely represent non-autonomous programs or execution threads, not agents, but I think an analogous issue could still appear in the real world. For example, if two agents simultaneously arrived at a resource they both require, CDT would conclude that each should attempt to obtain it immediately, causing a deadlock.
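A minimal sketch of that failure mode with ordinary threads and locks; the two-agent, two-resource setup and all names are my own illustration, not taken from any particular system:

```python
import threading

# Two shared resources that two autonomous agents both require.
resource_a = threading.Lock()
resource_b = threading.Lock()
both_holding = threading.Barrier(2)

def greedy_agent(name, first, second):
    """Acquire whichever resource we reach first, then demand the other.
    This is the 'just try to obtain it now' policy described above."""
    with first:
        print(f"{name} holds its first resource")
        both_holding.wait()  # make sure the other agent has also grabbed its first resource
        # A timeout lets the demo report the deadlock instead of hanging forever.
        if second.acquire(timeout=1.0):
            print(f"{name} got both resources and can run its task")
            second.release()
        else:
            print(f"{name} is stuck waiting for the other resource: deadlock")
        both_holding.wait()  # neither agent backs off until both have tried

# Each agent reaches the two resources in the opposite order, so each ends up
# holding one lock and waiting on the other.
t1 = threading.Thread(target=greedy_agent, args=("agent-1", resource_a, resource_b))
t2 = threading.Thread(target=greedy_agent, args=("agent-2", resource_b, resource_a))
t1.start(); t2.start(); t1.join(); t2.join()
```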

Officially, the Dining Philosophers Problem has "no general solution", but operating systems mitigate it in practice, either with a global hierarchy of resources or with a "waiter" implemented as a mutex.
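Both mitigations are easy to sketch. The helper names (acquire_in_order, waiter, and so on) are mine, and this only illustrates the two ideas rather than how any particular operating system implements them:

```python
import threading

resource_a = threading.Lock()
resource_b = threading.Lock()

# Mitigation 1: a global hierarchy of resources.
# Every agent acquires its locks in one agreed-upon global order (here, simply
# the locks' object ids), so a circular "I hold one and wait for yours" can never form.
def acquire_in_order(*locks):
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release_all(locks):
    for lock in locks:
        lock.release()

# Mitigation 2: a "waiter" implemented as a mutex.
# An agent may only start picking up resources while it holds the waiter, so two
# agents can never end up each holding half of what they both need.
waiter = threading.Lock()

def run_with_waiter(name, *locks):
    with waiter:                 # ask the waiter for permission to pick things up
        for lock in locks:
            lock.acquire()
    print(f"{name} has everything it needs and can work")
    release_all(locks)

run_with_waiter("agent-1", resource_a, resource_b)
held = acquire_in_order(resource_a, resource_b)
print("agent-2 acquired both resources in the global order")
release_all(held)
```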

On the surface, it would seem that a supervising operating system across all AGIs would be a trivial solution for their mutual cooperation. As in the Dining Philosophers Problem, a kind of mutex would regulate autonomous agents across the network, somewhat like what is currently called the Internet of Things. In fact, this kind of operating system could be a cheeky way to solve almost any Newcomblike problem.

Of course, this doesn't always work in practice, because the AI also has to consider humans as agents in the same simulation, and humans are both unable and unwilling to be governed by the same operating system. So instead, the AI has to model other agents' decisions using something like FDT or UDT.

So in the situation where the Dining Philosophers are autonomous agents, how would FDT resolve the problem? I'm not very experienced in the field, but from what I've read, I would guess that each philosopher would understand the tasks the others need to do, because they can all assume they are running copies of the same function. The philosophers would then defer to whichever task is more important, has a more pressing deadline, takes less time, and so on.
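Here is a guess at what that shared function might look like; the Task fields (priority, deadline, duration) and the tie-breaking order are my own assumptions, just one way to make the deference rule deterministic:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    agent: str        # which philosopher/agent owns the task
    priority: int     # higher = more important (illustrative field)
    deadline: float   # seconds until the task must finish (illustrative field)
    duration: float   # estimated time the resource will be held

def who_goes_first(a: Task, b: Task) -> str:
    """The shared decision function. Every agent runs this same code on the
    same inputs, so every agent reaches the same verdict without negotiating."""
    # Deterministic ordering: importance, then urgency, then shortness,
    # then agent name as a final arbitrary-but-shared tie-break.
    def key(t: Task):
        return (-t.priority, t.deadline, t.duration, t.agent)
    return min((a, b), key=key).agent

# Each agent evaluates the same function and yields if it is not the winner.
mine = Task("philosopher-1", priority=2, deadline=5.0, duration=1.0)
theirs = Task("philosopher-2", priority=2, deadline=3.0, duration=1.0)
if who_goes_first(mine, theirs) == mine.agent:
    print("take the resource now")
else:
    print("yield and retry after the other finishes")
```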

On a final note, I start to wonder what it would be like if a computer had no operating system at all, but instead relied entirely on Decision Theory. An OS is basically a monarch of the computer, dictating an organization for programs that can't decide for themselves. But if programs were autonomous, then a computer could essentially run on "anarchy", with each execution thread carrying its own Bible of FDT.

2 comments

The processes need to agree on a shared solution algorithm. If the algorithm does consequentialist decision making, it needs to be able to anticipate how a possible joint policy it might suggest to all processes works out (so unless it's some flavor of FDT, it's going to be very confused by what's going on). But in general it could be any algorithm that does anything. The toy example of cooperation in PD can be scaled up to running an algorithm instead of following (C, C), as long as that algorithm (but not its output) is as obvious to both processes as (C, C), as an option for what they might want to do.

So if there is some well-known algorithm, say harsanyi(X, Y), whose adjudication the processes X and Y would predictably abide by, they can each separately verify that fact about the other with reasoning about programs, run the algorithm, and then blindly follow its instructions, secure in the knowledge that the other did the same.
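Read literally, that recipe might look something like the sketch below; the body of harsanyi is purely illustrative, and the source-hashing check is a crude stand-in for the "reasoning about programs" the comment describes:

```python
import hashlib
import inspect

def harsanyi(x_task, y_task):
    """Stand-in for the well-known adjudicator named in the comment above:
    any deterministic rule both processes can verify and will follow."""
    # For illustration: give the resource to the task with the earlier deadline.
    return "X" if x_task["deadline"] <= y_task["deadline"] else "Y"

def fingerprint(fn):
    """A crude way for each process to check that the other is committed to
    the *same* adjudicator: hash its source code."""
    return hashlib.sha256(inspect.getsource(fn).encode()).hexdigest()

# Each process separately verifies the shared commitment...
assert fingerprint(harsanyi) == fingerprint(harsanyi)  # stands in for checking the peer's copy
# ...then runs the adjudicator and blindly follows its instruction.
verdict = harsanyi({"deadline": 3.0}, {"deadline": 5.0})
print(f"the resource goes to process {verdict}")
```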

What a funny and fascinating question. I wish I could answer it and I hope somebody else who understands the decision theory better is able to formalize the problem and do so, if it's as cool as I'm imagining it is and not somehow a trivial or incoherent thing to ask.