I think your title "The Axiom of Choice is Not Controversial" is literally false, because "controversial" is a social property referring to how much disagreement there is, not to whether something is true. Rejecting the AoC is a minority view, but it is not that small a minority. It's also a respected alternative view, and texts often go out of their way to mention when they are accepting the AoC. It may, if anything, be the most controversial thing in math.
I think among working mathematicians it is that much of a minority. E.g. it is much less controversial than something like many-worlds among physicists, or something like heritability of intelligence among geneticists. It is broadly incredibly accepted.
EDIT: But I do agree it's a respected alternative view, and I do think it should stay that way because it is interesting to investigate. I just think people get the wrong idea about its degree of controversy among working mathematicians.
I tried to search for surveys of mathematicians on the axiom of choice, but couldn't find any. I did find one survey of philosophers, but that's a very different population, asked whether they believed AC/The Continuum Hypothesis has an answer rather than what the answer is: https://thephilosophyforum.com/discussion/13670/the-2020-philpapers-survey
My subjective impression is that my Mathematician friends would mostly say that asking whether AC is true or not is not really an interesting question, while asking what statements depend on it is.
Yes sorry to be clear I’m not talking about whether it is true, I am talking about whether they would use it or not in proving ‘standard’ results.
My understanding is that yes, axiom of choice (or more generally non-constructive methods) is convenient and it "works", and if you naively take definitions and concepts from that realm and see what results / properties hold when removing the axiom of choice (or only using constructive methods), many of the important results / properties no longer hold (as you mentioned: Tychonoff, existence of basis, ... ).
But it is often the case that you can redevelop these concepts in a choice-free / constructive context in such a way that it captures the spirit of what those definitions and concepts originally intended to capture, and yes it is harder this way, but 1) doing so often lets one recover the "morally correct" equivalent of those results / properties that do in fact hold in this context, and more importantly, 2) doing so has a lot of conceptual value.
For example, equivalent definitions become non-equivalent (such as finiteness; trying to do computable analysis and make sense of the intermediate value theorem in this context leads to new ideas like locale theory, abstract stone duality, overtness (dual of compactness, which is trivial classically), etc) where each has different and new interpretation, and the role of computability and approximation is made explicit which requires bringing in new / additional mathematical structures, etc. Also, many classical theorems have their choice-free / constructive equivalent, eg Tychonoff's theorem for locales (arbitrary coproduct of compact frames is compact - no axiom of choice required to prove!) - and all of this gives us new and sometimes deep insight about the concept that would have been overlooked in the classical realm[1].
To put it differently: Choice turns structure into property. Without choice, we can instead treat those structures as additional data. This lets various theorems re-emerge, often in many non-equivalent forms - and this is good.
See also: Five Stages of accepting Constructive Mathematics, Expanding the domain of discourse reveals structure already there but hidden.
I have only heard of these examples (I am not at all familiar with them) in the context of constructive / computable analysis, but I expect this to be a lesson that holds broadly throughout mathematics (and more narrowly: that it is possible to come up with a "morally correct" choice-free equivalent of the theory that in its current form crucially depends on choice, and that this gives new conceptual insights), and this not having been done already for some subject X is more of an issue of lack-of-mathematician-years put into it.
Oh yeah, I totally agree that it is interesting to investigate what happens in non-choice worlds as an exercise in mathematical foundations, which sounds kind of like what you’re saying? But correct me if I’m misinterpreting.
I know someone doing a PhD in this, and I think it’s pretty cool stuff.
I think you misinterpreted me - my claim is that working without choice often reveals genuine hidden mathematical structures that AC collapses into one. This isn't just an exercise in foundations, in the same way that relaxing the parallel postulate to study the resulting inequivalent geometries (which were equivalent, or rather, not allowed under the postulate) isn't just an "exercise in foundations."
Insofar as [the activity of capturing natural concepts of reality into formal structures, and investigating their properties] is a core part of mathematics, the choice of working choice-free is just business as usual.
Oh, yeah sure. I mean I think as a matter of pragmatics it mostly is an exercise in foundations these days. But I agree that splitting up concepts of finitude -- for example -- is a super interesting investigation. Just, like, the majority of algebraic geometers, functional analysts, algebraic topologists, analytic number theorists, algebraic number theorists, Galois theorists, representation theorists, differential geometers, etc etc etc, would not be all that interested in such an investigation these days.
Agree with the last sentence. I think in a majority of the fields, lines of investigation with higher insight-per-effort, in the current margin, are those done with choice (or with even more controversial things like the large cardinal axioms).
Edit: this comment by Terence Tao also expresses a similar perspective:
In general, it seems that infinitary methods are good for “long-range” mathematics, as by ignoring all quantitative issues one can move more rapidly to uncover qualitatively new kinds of results, whereas finitary methods are good for “short-range” mathematics, in which existing “soft” results are refined and understood much better via the process of making them increasingly sharp, precise, and quantitative. I feel therefore that these two methods are complementary, and are both important to deepening our understanding of mathematics as a whole.
I tend to agree with the idea that AC is rather unfairly described as ‘to be rejected’, especially on account of Banach-Tarski.
We are no strangers to strange things in mathematics, especially with infinity, so I don't really understand the argumentative structure: it's strange, then I don't know what to conclude from that.
It's no stranger than many other things, it seems to me, such as the fact that there are as many even integers as there are integers, or that there are as many real numbers between 0 and 1 as there are real numbers.
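To make those two "strange" facts concrete, here is a sketch of the standard bijections behind them. The names f and g are just labels chosen for this illustration:

```latex
% Even integers vs. all integers: doubling is a bijection.
f : \mathbb{Z} \to 2\mathbb{Z}, \qquad f(n) = 2n, \qquad f^{-1}(m) = \tfrac{m}{2}.

% The interval (0,1) vs. all reals: a rescaled tangent is a bijection.
g : (0,1) \to \mathbb{R}, \qquad g(x) = \tan\!\bigl(\pi\,(x - \tfrac{1}{2})\bigr), \qquad
g^{-1}(y) = \tfrac{1}{2} + \tfrac{1}{\pi}\arctan(y).
```

Neither map uses any form of choice: both are given by explicit formulas.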
There are many strange things.
Also, with "the argument from strange things", we would end up rejecting the axiom of infinity instead, I would say?
But for me, that's not the heart of the problem:
An axiom is often seen, wrongly (in my opinion), as something that we posit, or don't posit, or posit the opposite of, and then we work within that framework.
This is not entirely wrong, but as a point of view it seems to me to be mistaken:
We do not state that ‘groups are now Abelian’; no, we are going to study Abelian groups if we want to, but we do not forget that non-Abelian groups exist nonetheless; we have not ‘stated’ Abelianity, we are simply going to look at this specific type of object.
Similarly, we do not ‘assume’ that the ring is commutative; at best, we say that we will only consider commutative rings in the rest of the text (book, course, etc.).
In short, an axiom is not there to be assumed or not; it is there to describe a type of object, in my opinion.
So here, it is not a question of ‘postulating’ whether or not the ZF-universe satisfies AC, but rather of seeing whether in our work it is more practical to work with ZF-universes that satisfy AC, or whether we do not need to restrict ourselves to those that satisfy AC in order to work, and can thus produce a more general result on ZF-universes, though one that is potentially more costly to prove because it requires more generality.
Or even to state results specifically about ZF-universes that do not satisfy AC, even if they are potentially less frequent.
So really seeing ZF-universes as objects, and not as somewhat transcendent entities, etc.
I think I basically agree that this is how one should consider this.
But I think there is a reasonable defence of the "ZF-universe as somewhat transcendent entities" view, namely that we do virtually all of our actual maths in ZF-universes, by saying that ultimately we will be able to appeal to ZF-axioms. This makes ZF-objects pretty different from groups. E.g. I think there's a pretty tight analogy between forcing in ZF and Galois extensions (just forcing is much more complicated), but the consequences of forcing for how we do the rest of maths can be somewhat deep (e.g. CH is doomed in ZFC, consequences about Turing computability, etc). So the mystical reputation is somewhat deserved. Woodin would defend some much more complicated and involved version of this, as I discussed in my post about the constructible universe.
But I agree ultimately, with our current understanding of ZF-universes that they are just another mathematical object, they just happen to be an object that we use to do other maths with, and we can step outside those objects these days with large cardinal axioms, if we'd like, and analyze other consequences of them. It's pretty similar to how we started viewing logic and logics after Gödel's results (i.e. clearly first-order logic is very useful because of compactness/completeness, but that doesn't mean "Second order logic is wrong.")
Now, where did the weirdness come from here. Well, to me it seems clear that really it came from the fact that the reals can be built out of a bunch of shifted rational numbers, right? But everyone agrees about that.
I do not think everyone agrees about that! I think that people who reject AoC would say "sure, any real number can be a shifted rational, but not all of them, there's just no reasonable procedure which does this for the entire set at once."
Yeah that’s probably right. But then there’s introducing this weird distinction between “I can do it for any x” and “I can do it for all x.”
It quickly becomes pretty philosophical at that point, about whether you think there’s a distinction there or not. I guess my claim in this post is more like "working mathematicians in fields outside of foundations have collectively agreed on an answer to this philosophical puzzle, and that answer is actually quite defensible."
If you write me a function of type , I can point out the place in its source code where you included a value of type , but I can't write a function of type .
I think I must've not paid enough attention in type theory class to get this? Is this an excluded middle thing? (if it's a joke that I'm ruining by asking this feel free to let me know)
foo ::
foo f = f 4
Look, there's an integer! It's right there, "4". Apparently is inhabited.
bar ::
bar fo = ???
There's nothing in particular to be done with fo... if we had something of type to give fo, we would be open for business, but we don't know enough about to make this any easier than coming up with a value of type , which is a non-starter.
Sorry, I think explaining without using type theory what you are trying to say may help me understand better?
EDIT: like, in particular, insofar as its relevant to the axiom of choice.
I am giving an example of something I can do "for any x" but not "for all x". In the first case, the x is given in a fully constructed, reified form, and I can look at its internals to build a bespoke response. In the second case, I would have to give a general procedure that can work with all x while interacting with the x only by means of its external interface.
Ah okay, I think I understand, if I'm remembering my type theory correctly. I think this is downstream of "standard type theory", i.e. the type theory created by Martin-Löf, not accepting the excluded middle? Which does also mean rejecting choice, for sure.
EDIT: But fwiw, I think the excluded middle is much less controversial than Choice (it should technically be strictly less controversial). I think that may be a less interesting post, but I'm sure philosophers have already written that. Though I think a post defending rejecting the excluded middle from a type theory perspective actually could be quite good, because lots of people don't seem to understand the arguments from the other side here, and think they're just being ridiculous.
I can come-up-with-math-to-model any problem, but I can't come-up-with-math-to-model all problems, by diagonalization.
Well put! I guess if I can define a function from problems to math-to-model-it, then for every problem I can pick out the right math-to-model-it?
Or, indeed, perhaps not? ;)
From a computationalist semantics perspective, the axiom of choice asserts that every nondeterministic function can nondeterministically be turned into a deterministic function. When the input space has decidable equality (as in countable choice), that is easy to imagine, as one can gradually generate the function, using an associative list as a cache to ensure that the answers will be consistent. However, when the input space doesn't have decidable equality, this approach doesn't work, and the axiom of choice cannot be interpreted computationally.
Many of the "pathologies" when rejecting choice have similar computational problems as choice does, so bringing them up doesn't really disrupt this argument.
Example: Let P be some proposition. Theorem: If every vector space has a basis, then . Proof: Define a vector space where . Let be a basis over . Express in this basis. If , we have a basis element , and thus . Otherwise, we know and thus .
Ah yeah, this is a Hamel basis version of Diaconescu’s theorem (a very cool theorem)! Lovely proof!
I expect you already know some of these, but for anyone interested:
Asaf Karagila’s Anti-anti Banach–Tarski arguments. A short blog post whose main point is that “The axiom of choice is not at fault here. The axiom of infinity is.” As an illustration, he shows that if, instead of the axiom of choice, one assumes dependent choice and that all sets of reals are Lebesgue measurable, then there is a partition of the real line into strictly more parts than elements.
By the same author: Zornian Functional Analysis, or: How I Learned to Stop Worrying and Love the Axiom of Choice. A 30-page article discussing some of the (often counterintuitive) consequences of rejecting the axiom of choice.
ZF(C) is not the only way to axiomatize set theory as a first-order theory. Lawvere’s Elementary Theory of the Category of Sets (ETCS) is a reasonable alternative. I find it interesting that in ETCS the axiom of choice is built in without much hesitation, whereas the axiom of replacement is not part of the core theory (though it can be added to recover equivalence with ZFC). While the set theorists I have spoken to tend to regard replacement as a major axiom, I do not often see arguments that it should be rejected in order to avoid paradoxes. (For a short introduction to ETCS, I personally recommend Tom Leinster’s Rethinking set theory, an 8-page article navigating between intuitions and formalism.)
On the Banach–Tarski paradox, Vsauce’s Michael Stevens has made a video that gives a clear explanation and helpful visualisation: link.
Now, where did the weirdness come from here. Well, to me it seems clear that really it came from the fact that the reals can be built out of a bunch of shifted rational numbers, right?
I think the weirdness comes from trying to assign a real number measure, instead of allowing infinitesimals. I've never understood why infinite sets are readily accepted, but infinitesimal/infinite measures are not.
EDIT: To explain my reasoning more, suppose you were Pythagoras and your student came to you and drew a geometric diagram with lengths not in a ratio of whole numbers. You have two options here:
Finding the right extension is not an easy problem. Should we extend the numbers to allow square roots (including nesting), but nothing else? This suffices for geometry. But it's actually more useful to use something like a Cauchy sequence completion: Let any sequence of rational numbers that gets closer and closer together "converge" to a real number. Historically, extending your system of numbers has been what has worked.
When we come across an "immeasurable" set, this to me feels like the same kind of problem. Perhaps we don't yet have a general consensus on what the "right" extension is to infinitesimals/infinities. However, there clearly are some sets with infinitesimal measure, like the set you constructed. We should figure out a way to give that set infinitesimal measure, not just call it immeasurable.
If you're talking about surreals or hyperreals, the issue is basically that there's not one canonical model of infinitesimals, you can create them in many different ways. I'll hopefully end up writing more about the surreals and hyperreals at some point, but they don't solve as many issues as you'd hope unfortunately, and actually introduce some other problems.
As a motivating idea here, note that you need the Boolean Prime Ideal Theorem (which requires a weak form of choice to prove) to show that the hyperreals even exist in the first place, if you're starting from the natural numbers as "mathematically/ontologically basic." (maybe there's another way to define them but none immediately come to mind, there is another way to define the surreals, but there are other issues there).
I sometimes speak to people who reject the axiom of choice, or who say they would rather only accept weaker versions of the axiom of choice, like the axiom of dependent choice, or most commonly the axiom of countable choice. I think such people should stop being silly, and realize that obviously we need the axiom of choice for modern mathematics, and it’s not that weird anyway! In fact, it’s pretty natural.
So what is the axiom of choice? The axiom of choice simply says that given any indexed collection of (non-empty) sets, you can make one arbitrary choice from each set. It doesn’t matter how many sets you have — you could have an uncountable number of sets. You can still make a choice from all of the sets, saying “I’ll take this element from the first set, that element from the second set, this element from the third set…” and so on.
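For readers who prefer symbols, one common way of writing this down is sketched below. The index set I, the family (X_i) and the choice function f are notation introduced here for illustration, not taken from the text above:

```latex
% Axiom of choice: every indexed family of non-empty sets has a choice function.
\text{For every family } (X_i)_{i \in I} \text{ with } X_i \neq \emptyset \text{ for all } i \in I,
\quad \text{there exists } f : I \to \bigcup_{i \in I} X_i
\quad \text{such that } f(i) \in X_i \text{ for all } i \in I.

% Equivalently: a product of non-empty sets is non-empty.
\prod_{i \in I} X_i \neq \emptyset .
```

The function f is exactly the list of choices "this element from the first set, that element from the second set…", packaged as a single object.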
The axiom of choice is the only explicitly non-constructive axiom of set theory[1], and for that reason, in the 1920s and 1930s, it was contentious. No longer, however. Almost all modern working mathematicians will accept the axiom of choice without much thought, as they should[2].
Why do people reject the Axiom of Choice?
When people reject the Axiom of Choice, there are usually two main examples that they’ll give: the Banach-Tarski paradox and Vitali sets.
Let’s go through each of these and show you why they’re really not that bad:
Why paradoxes are not that bad
Essentially, what happens with the axiom of choice is that people have heard that there were arguments about non-constructivism in the history of maths[3] and then learn that the axiom of choice allows a paradox to occur. However, when you really dig into what causes the paradox to occur, it’s to do with the weird nature of other objects, not the axiom of choice. The axiom of choice just gets the blame for unjustified historical reasons. It often allows us to realise that something weird is going on, but it’s not the cause of the weirdness! You’ll see this pattern recur as we go through the examples.
Banach-Tarski Paradox
Banach-Tarski — which I won’t explain in too much detail, for a nice visualisation, see here — is by far the most commonly given reason why people reject the axiom of choice. The Banach-Tarski paradox uses the axiom of choice to prove that if you have a ball (i.e. a non-hollow 3-dimensional sphere), then you can rearrange certain subsets of the ball to create two identical copies of the ball, without adding in any new pieces.
But the axiom of choice is not what causes the weirdness in the Banach-Tarski paradox! This is pretty obvious when you actually read the proof of Banach-Tarski. The axiom of choice simply lets you take the weirdness and construct something out of it geometrically.
The thing that causes something weird to happen — which Banach and Tarski use to build two copies of the unit ball — is that the free group on 2 elements is something called a non-amenable group. That is, there is no consistent way to define a “measure” on the group[4]. This is where the true weirdness lies, and you can prove the free group on 2 elements is non-amenable within ZF alone. Choice is not necessary for something weird to happen here. It is weird that there are non-amenable groups, but they exist already, without the axiom of choice (in fact, more groups are non-amenable if you reject the axiom of choice than if you accept it[5]).
Once we have this non-amenable group, we then find a “paradoxical decomposition” of the group in two different ways. See your Honour, the “paradox” has entered the room before choice even got here, so it can’t be guilty! If the group’s not legit, you must acquit!
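As a sketch of what such a decomposition looks like (this is the standard textbook one; the generator names a, b and the notation S(x) for "reduced words beginning with x" are assumptions of this illustration):

```latex
% Split the free group F_2 = <a, b> by the first letter of each reduced word.
F_2 = \{e\} \,\sqcup\, S(a) \,\sqcup\, S(a^{-1}) \,\sqcup\, S(b) \,\sqcup\, S(b^{-1})

% Multiplying S(a^{-1}) by a cancels the leading a^{-1}, leaving every reduced
% word that does not begin with a. Hence two of the pieces already rebuild F_2:
F_2 = S(a) \,\sqcup\, a\,S(a^{-1})

% The same works for b, using the two pieces not used above:
F_2 = S(b) \,\sqcup\, b\,S(b^{-1})
```

So four of the five pieces, after shifting two of them, reassemble into two full copies of the group. Every step here is an explicit manipulation of words, so no choice is involved.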
Banach-Tarski then shows that there’s a copy of this non-amenable group, the free group on 2 elements, within rotations of the 3-dimensional ball[6], and uses this group to divide the ball into different classes.
Essentially, the classes are defined so that a ~ b iff there is some series of rotations, using this group, that moves a to b. You can then choose one representative from each of these classes (this is where choice gets used), and then by carefully using the paradoxical decompositions of the free group, two copies of the original ball can be constructed.
All choice is doing here is letting you take the paradoxical behaviour of the free group on 2 elements, and apply it to the sphere. Sure, without choice you would be barred from applying it to the sphere, but that doesn’t really make the paradoxical behaviour go away. You’ve just stopped it from being realised in physical space. Choice is not to blame here! On to count 2!
Vitali sets
Before I can jump into Vitali sets, I’ll first explain the very basics of measure theory (I promise I’ll be quick). A measure is something that basically assigns a “size” to a subset of the real numbers. This “size” should satisfy some nice properties and intuitions that we have about how “size” should work for the real numbers. I’ll quickly define the classic measure, μ, on the real numbers. It works as follows:
This measure also satisfies some other nice properties. The key one for our purposes is that if we take two sets A and B with empty intersection[8], then μ(A∪B) = μ(A) + μ(B).
In fact, this applies for countable unions of sets, so that if we have countably many sets: A, B, C, …, where no element appears in any two sets, then μ(A∪B∪C∪…) = μ(A) + μ(B) + μ(C) + …
This seems fine, right? Wrong! This is where the Vitali sets come in.
Vitali sets are pathological subsets of the reals. A Vitali set is a subset of the reals to which this measure cannot assign a size (it’s not that the size of a Vitali set is 0; there’s no size we can give it at all). To build a Vitali set, we first need to start with the rational numbers. Based on the measure we built above, the set of rational numbers must have size 0. It’s not too hard to see why: the set of rationals is built from countably many individual numbers, and each individual number has measure 0. If you sum up countably many sets of size 0, you end up with a set of… size 0.
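Written out as a short computation, the argument looks roughly like this. The enumeration q_1, q_2, … of the rationals is notation introduced for this sketch:

```latex
% The rationals are countable, so list them as q_1, q_2, q_3, ...
\mathbb{Q} = \bigsqcup_{n=1}^{\infty} \{q_n\}

% Countable additivity, together with \mu(\{q_n\}) = 0 for single points, gives
\mu(\mathbb{Q}) = \sum_{n=1}^{\infty} \mu(\{q_n\}) = \sum_{n=1}^{\infty} 0 = 0 .
```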
This makes sense — we’d naively expect the rational numbers to have size 0. There’s a really tiny amount of rational numbers compared to the reals. It might feel weird because on a number line, you can’t find anywhere a rational number isn’t. But this is the wrong intuition. Everyone agrees the rational numbers should have measure 0.
Now consider the rest of the real numbers. The whole thing. We can start counting all of the real numbers using the rationals to help us! Let’s begin:
There are all the real numbers that are a rational number plus Liouville’s constant.
… and so on.
We can break up all the real numbers this way into different “shifts” of the rational numbers. Now, for each “shift,” there are many other shifts that would give us the same collection of real numbers, for example instead of shifting the rationals by π in step 3, I could’ve shifted them by π+1, π+2, π+0.5, etc.
So, let’s make a choice out of all of these shifts that I could’ve written — and that’s where the axiom of choice comes in. For each set, let’s choose a representative between 0 and 1 (so for the π set, we’d choose e.g. π-3). This creates a set which we’ll call the Set Of Crazy Reals.
Now for this Set Of Crazy Reals: the representatives we’ve selected for each of our “shifts;” what is the measure of it? Well, let’s try to use this Set O.C.R. to cover all the reals between 0 and 1.
There’s actually a really easy way to do it! Which is to add all the rational numbers between 0 and 1 to our Set O.C.R. (and subtract 1 if they go over 1). So the numbers between 0 and 1 can be split up by taking:
and so on. Eventually all the numbers between 0 and 1 will appear in one of these sets, and no two of these sets contain any of the same elements[9]. Then, if this Set Of Crazy Reals is measurable, we can analyze the results using our measure rule from earlier.
So the set [0,1) is a countable union of shifted copies of our Set Of Crazy Reals. If this Set Of Crazy Reals had measure 0, then [0,1) would also have to have measure 0 (since it comes from adding a countable number of sets of measure 0 together). If this Set Of Crazy Reals had measure more than 0, let’s say 0.1, then [0,1) would have infinite measure, since we’re adding infinitely many sets[10] of measure 0.1 together to get all of [0,1). Either way there’s a contradiction, so it can’t be measurable!
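The whole argument can be sketched in a few lines. Here V stands for the Set Of Crazy Reals and V ⊕ q for "shift by q, wrapping around at 1"; both are notation chosen for this illustration, and the sketch assumes, as is standard for this measure, that μ([0,1)) = 1 and that shifting a set does not change its measure:

```latex
% Shifted copies of V, wrapped back into [0,1):
V \oplus q = \{\, (v + q) \bmod 1 \;:\; v \in V \,\}, \qquad q \in \mathbb{Q} \cap [0,1)

% These countably many copies are pairwise disjoint and cover [0,1):
[0,1) = \bigsqcup_{q \in \mathbb{Q} \cap [0,1)} (V \oplus q)

% If V were measurable, each copy would have the same measure as V, so
1 = \mu\bigl([0,1)\bigr) = \sum_{q \in \mathbb{Q} \cap [0,1)} \mu(V)
  = \begin{cases} 0 & \text{if } \mu(V) = 0, \\ \infty & \text{if } \mu(V) > 0. \end{cases}
```

Neither case equals 1, so no value of μ(V) is consistent.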
Now, where did the weirdness come from here. Well, to me it seems clear that really it came from the fact that the reals can be built out of a bunch of shifted rational numbers, right? But everyone agrees about that. The part where we chose a representative of each of those shifts didn’t seem to add any weirdness, it just realized it. In fact, if we didn’t have the axiom of choice, I think there’d be a weirder consequence. We could still build the reals out of shifted rationals, but there’d be no set to witness how it’s done. If there was no choice, that set would just be banned from existing at all.
You can’t run from your fears forever, choice-haters, you have to overcome them!
Why should we keep the axiom of choice?
Okay, if you’ve read this far, perhaps you’re wondering “Sure, maybe the axiom of choice just lets us realise these monstrosities, but why not banish them? After all, what has the axiom of choice done for me recently?” Let’s go through the list:
And:
If you reject choice, in favour of some sort of abomination like Solovay’s model, then you’ll encounter other, much worse, pathologies. You can’t even state the continuum hypothesis in the traditional sense without the axiom of choice! You can take a product of two copies of the integers, forming a 2D grid, then three copies for a 3D grid, then four for a 4D grid, until you take a product of infinitely many copies of the integers and then… you can’t say anything about that structure[11]. In choice-world, that’s just an infinite-dimensional “grid”; without choice, you can say nothing!
Modern mathematicians all accept the axiom of choice for a reason — it’s extremely useful! It works, and therefore is no longer controversial. Stop blaming the axiom of choice for the pathologies of other sets; what did it ever do to you?
[1] Although this is debatable, the axioms of power-set and infinity aren’t that constructive.
[2] N.B. ZF is consistent iff ZFC is consistent, so there are no grounds to reject choice in order to reduce consistency-strength requirements.
[3] Or the history of the philosophy of maths, take your pick.
[4] Which means, very simply, that there is no finitely additive function μ on the subsets of G such that μ(G) = 1 and, for any subset A of the group and any element g in the group, μ(A) = μ(gA). For example, finite groups are easily seen to be amenable: for each subset A, you just take μ(A) = |A|/|G| (the size of A divided by the size of G).
[5] Not in the sense that more groups are provably non-amenable, but in the sense that fewer groups are provably amenable.
[6] Note that you need at least 3 dimensions here, since in 2 dimensions there’s “not enough space” for the free group to be realised.
[7] A+4 here is just the set of every element in A plus 4. So if our original set was [0,1] our new set would be [4,5].
[8] That is, there is no number that is in A and in B. A and B are entirely “disjoint.”
[9] Exercise left to the reader, but you can take it on faith if you prefer!
[10] Note for pedants: disjoint sets.
[11] Note, this is not quite right. It does not occur in Solovay's model for countable infinities, but does occur for uncountable infinities and for certain sets (not the integers, since they have an ordering which allows us to make a choice; the linked post is incorrect about that claim).