(This post was originally published on Oct 6th 2017, and has been temporarily brought forwarded as part of the AI Alignment Forum launch sequence on fixed points.)
This post explains how to view Kakutani's fixed point theorem as a special case of Brouwer's fixed point theorem with hyperreal numbers. This post is just math intuitions, but I found them useful in thinking about Kakutani's fixed point theorem and many things in agent foundations. This came out of conversations with Sam Eisenstat.
Brouwer's fixed theorem says that a continuous function from a compact convex subset of to itself has fixed point. Kakutani's fixed point is similar, but instead of continuous functions, it uses Kakutani functions, which are set valued functions with closed graph which are point wise nonempty and convex.
When I think about Kakutani functions, I usually think about them as limits of continuous functions. For example, consider the kakutani function from to itself which sends negative inputs to , positive inputs to , and sends to the entire interval. You can view as a the limit of a sequence of functions sends to . This is not a point wise limit, since if it was would be sent to 0, rather than the entire interval. Instead, it is a limit in the Hausdorff metric between the graphs of the functions.
Given two compact nonempty subsets and of , the Hausdorff distance between and is the maximum over all points in or of the Euclidean distance between that point and the closest point in the other set. Since and are compact, this maximum is achieved.
Given compact convex subsets and of , we say that a sequence of continuous functions from to converges in graph Hausdorff distance to the closed graph set valued function if the graphs of viewed as subsets of converges to the graph of in Hausdorff distance.
We say that a closed graph function from to is a continuous graph limit of if there exists a sequence of continuous functions which converges to in graph Hausdorff distance.
Theorem 1: Every continuous graph limit from a compact convex subset of to itself has a fixed point (a point contained in its image).
Proof: is a limit of continuous functions each of which has a fixed point by Brouwer. Choose one fixed point from each to get a sequence of points which has a convergent subsequence by Bolzano–Weierstrass. Let be the limit of this convergent subsequence. If did not contain , then would not be in the graph of . Since is closed graph, a ball around would not be in the graph of , which contradicts the fact that must contain functions with graphs arbitrarily close to the graph of with fixed points arbitrarily close to and thus points in their graphs arbitrarily close to .
This theorem is not equivalent to Kakutani's fixed point theorem. There exist continuous graph limits which are not point wise convex (but only in more than one dimension). For example the function from to which sends every point to the circle of points distance 1 from is not Kakutani, but is a continuous graph limit. It is the limit of functions given by .
However, this theorem is strictly stronger that Kakutani's fixed point theorem (although sometimes harder to use, since showing a function is Kakutani might be easier than showing it is a continuous graph limit)
Theorem 2: Given compact convex subsets and of , every Kakutani function from to is a continuous graph limit.
Proof: We define a function as follows. Take a finite set of radius open balls in such that each ball intersects the graph of , the coordinates of the centers of all the balls are distinct, and the balls cover the entire graph of . This induces a covering of by radius open balls by taking balls centered at the coordinates of the centers of . We continuously map each point in to a weighted average of balls in as follows. If a point is the center of some ball, it is sent to 100% that ball. Otherwise, it is sent to a combination of all the balls in which it is contained with weight proportional to (the reciprocal of the distance to the center of that ball) minus . This gives a function from to by mapping each point to the weighted average of the coordinates of the centers of the balls in with weights equal to the weights of the corresponding balls in above. One can verify that is continuous.
Observe that the graph of contains the centers of all balls in . Thus, contains points within of every point is the graph of . Thus, if did not converge to , it must be because infinitely many contain points a distance from the graph of bounded away from . Consider a convergent subsequence of these points. This gives a point not in the graph of and a subsequence of the with points in their graphs converging to .
Let be half the distance between and , and consider the set of all points in at most from the nearest point in . Note that is convex. Note that all in the graph of with sufficiently close to must have in , since otherwise there would be with converging to which must have a convergent subsequence with converging to a point not in , contradicting the fact that has closed graph.
However must send all points within distance of to a point in the convex hull of the images under of points within of . But, we showed that for sufficiently small and sufficiently large all of these points must be in . Therefore, for all sufficiently large , must send all points within of to points in , which are bounded away from , contradicting the assumption that points in the converge to .
Corollary: Kakutani's fixed point theorem
We have proven (a strengthening of) Kakutani's Fixed Point Theorem from Brouwer's fixed point theorem, and given a way to think about Kakutani functions as just limits of graphs of continuous functions, and thus have better intuitions about what (a superset of) Kakutani functions look like. We will now take this further and think about Kakutani as a consequence of an analogue of Brouwer using Hyperreal infinitesimal numbers. (I am going to be informal now. I am not going to use standard notation here. I am not going to make sense to people who don't already know something about non-standard analysis. Sorry.)
Given a compact convex subset of , we can define to be the set of all equivalence classes of infinite sequences of elements of , where two sequences are equivalent if they agree on a set that matters according to some ultrafilter on . A function from to itself is defined by a sequence of functions from to itself , where you apply the functions pointwise. I will call a function hyper-continuous if each of the component functions is continuous. (I am not sure what this is actually called.) Each point in has a standard part, which is a point in , which is the unique point such that a subset of components that matter converges to .
Claim: Every hyper continuous function from to itself has a fixed point.
Proof: Just take the sequence of the fixed points of the individual component functions.
Claim: Continuous graph limits from to are exactly those closed graph set-valued functions such that there exists a hyper-continuous function such that if and only if there exist points and with standard parts and respectively such that
Proof: Use the same sequence of functions with graphs converging to that of as the sequence of functions defining .
Thus, we can view continuous graph limits (and thus Kakutani functions) as something you get when looking at just the standard part of a hyper-continuous function from the hyper version of to itself. The fixed point will fix everything, including the infinitesimal parts, and we do not have to deal with any set-valued functions.
For example, consider our original function from to itself which sends negative inputs to , positive inputs to , and sends to the entire interval. We can view this as a function involving infinitesimals where everything with positive real part is sent to something with real part , everything with negative real part is sent to something with real part , and the infinitesimal numbers very close to 0 are sent to something in-between. If we use the sequence of functions from above and let the infinitesimal be , then zooming in on the inputs between and , will just be a steep linear function with slope .
Now to be even more vague and connect things back up with agent foundations, perhaps this can give some good intuitions about what is happening with reflective oracles and probabilistic truth predicates. The oracle/truth predicate is effectively "zooming in" on the area around a specific probability, and when you stack oracle calls or truth predicates within each other, you can zoom in further. The fact that the probabilistic truth predicate does not know that it is reflectively consistent, can be viewed as it not believing a sentence akin to "If I assign probability less that to , then I also assign probability less that to ," which seems very reasonable. It also makes reflective oracles and the probabilistic truth predicates look more similar to other approaches to the same problem that are more hierarchy forming solutions to the same problem like normal halting oracles. Here the hierarchy comes from zooming in further and further on the infinitesimal in the Kakutani function.
This post was originally published on Oct 6th 2017, and has been brought forwarded as part of the AI Alignment Forum launch sequences.
Tomorrow's AIAF sequences post will be 'Iterated Amplification and Distillation' by Ajeya Cotra, in the sequence on iterated amplification.