Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

The Promise and Peril of Finite Sets

8AlexMennen

3Alexander Gietelink Oldenziel

1davidad

2Eigil Rischel

New Comment

the still-confusing revised slogan that all

computablefunctions are continuous

For anyone who still finds this confusing, I think I can give a pretty quick explanation of this.

The reason I'd imagine it might sound confusing is that you can think of what seem like simple counterexamples. E.g. you can write a short function in your favorite programming language that takes a floating-point real as input, and returns 1 if the input is 0, and returns 0 otherwise. This appears to be a computation of the indicator function for {0}, which is discontinuous. But it doesn't *accurately* implement any function on at all, because its input is a floating-point real, and floating-point arithmetic has rounding errors, so you might apply your function to some expression which equals something very small but nonzero, but gets evaluated to 0 on your computer, or which equals 0, but gets evaluated to something small but nonzero on your computer. This problem will arise for any attempt to implement a discontinuous function; rounding errors in the input can move it across the discontinuity.

The conventional definition of a computation of a real function is one for which the output can be made accurate to any desired degree of precision by making the input sufficiently precise. This is essentially a computational version of the epsilon-delta definition of continuity. And most continuous functions you can think of can in fact be implemented computationally, if you use an arbitrary-precision data type for the input (fixed-precision reals are discrete, and cannot be sufficiently precise approximations to arbitrary reals).

Yes, I highly recommend that second link, and Andrej Bauer's work in general.

I'm not sure about the claim "there are infinite many different notions of finiteness"; on that particular page I count 10 notions of "finite set" (although I'm not going to claim the collection of all such notions is a finite set), most of which are rarely useful. In the OP I assume finiteness means Bishop-finiteness, which that page calls the "standard definition". I've also found Kuratowski-finiteness useful, and the categorical generalization to finitely generated and finitely presented objects.

**By far the most useful generalization of finiteness, in my view, is ****compactness**. From a realizability perspective, a compact space is one where universal quantification (i.e. testing whether a semidecidable predicate holds throughout all of ) is itself semidecidable. This is discussed in various places by Andrej Bauer (MO answer), Paul Taylor (monograph), and Martín Escardó (slideshow). A typical geometric or intuitive way to establish that a metric space is compact is to see it as isometric to a bounded and closed subset of and use the Heine–Borel theorem (although not *all* compact metric spaces, and certainly not all compact Hausdorff spaces, can be established as compact in such a way).

I strongly recommend Escardó's Seemingly impossible functional programs, which constructs a function `search : ((Nat -> Bool) -> Bool) -> (Nat -> Bool)`

which, given a predicate on infinite bitstrings, finds an infinite bitstring satisfying the predicate if one exists. (In other words, if `p : (Nat -> Bool) -> Bool`

and any bitstring at all satisfies `p`

, then `p (search p) == True`

(Here I'm intending `Nat`

to be the type of natural numbers, of course) .

Has this ever happened to you?

The underlying cause of such dilemmas is that somebody, somewhere in the setup of your ontology and the framing of your problem statement, explicitly or implicitly, made the following move:

## The Promise

This is an

incrediblypopular simplifying assumption, easily competitive with "we assume this distribution is Gaussian" and "we neglect higher-order terms". It's plausibly the single most popular mathematical modeling assumption in the world, because one needs so little background knowledge to make use of it.It's also extremely powerful. Here's just some of the mathematical structure you get for free if you accept the assumption:

Computable representation. Saying that X is a finite set can be formalized as ∃n:N.X≅{i:N|i<n}, which means that there's some number k=⌈log2n⌉ such that any element x:X can be losslessly and efficiently represented with k bits.Metric. Given two elements a,b:X, we say that d(a,b)=0 if a=b, and d(a,b)=1 otherwise. This is well-defined and makes X into a well-behaved metric space.Compact Hausdorff topology. That metric induces a topology, which is also very well-behaved (in particular, it's compact and Hausdorff, which together make it a compactum).Symmetry group. The automorphisms of {i:N|i<n} can be transported along the isomorphism to X, so now we have a permutation group, which is useful for formalizing things along the lines of "all elements of X will be treated equally".Canonical probability measure. By the principle of transformation groups (which I discussed in the middle of this Shortform post about maximum entropy), this permutation group justifies normalized counting measure as an uninformative prior for an uncertain element of X (assuming all elements really should be considered interchangeable).Ordering. If we have choice as a background assumption, a well-ordering isn't anything special, but from any specific witness of X's finiteness we get an eminently realizable total order, which is not merely a well-ordering but acompletely distributive lattice. This is useful in tie-breaking mechanisms, for example.Coproduct decomposition. If X≅{i:N|i<n}, then X≅∐i<n1. This means that applying any coproduct-preserving operation to X is quite easy, even if it preserves only finite coproducts.I could go on (we can define addition and multiplication operators modulo n, forming a ring structure...), but I think these are the most commonly leaned-upon consequences of the finite-set assumption.

Definitely not about finite sets, but similar in spirit, are assumptions that X≅N or X≅Z. Here the rig or ring structures are more significant, but we lose compactness and the canonical probability measure. These assumptions are more common when X is taken to refer to a temporal or spatial dimension, rather than a type of objects.

## The Peril

Sometimes, one simply cannot construct a lossless way of matching all the things of type X up with some {i:N|i<n}. It's not uncommon for this to become a serious bottleneck in adjusting your model to match reality. The opening conundrum is how this usually manifests itself. The specific examples I briefly alluded to there are:

Even thornier problems arise when one tries to define absolute utility by adding up the utility of all individuals. Although counting humans is pretty reliable to date, with the thorniest edge cases still being relatively uncontroversial (twins who are conjoined at the brain and share a thalamus; they're 2 people), but if there are ever any morally relevant digital individuals (which I hope there will be!), the edge cases will get much worse.

Typically, as soon as these issues start to creep in, they can in principle grow to cause unbounded discrepancies, which can be exploited by adversaries—either actual adversaries in the problem domain (like in Sybil attacks) or rival modelers (as happens with biodiversity measurements).

## The Remedy

Here's how to fix it:

## 1. Name the

typeof things that you'd been struggling to squeeze into the shape of interchangeable, disconnected parts.Yes, this is a little bit about type theory versus set theory. When you assume that a type is a set, one thing you assume in particular is that there's a well-defined equality predicate—that ∀a,b:X,(a=b)∨(a≠b). And so, yes, this is a little bit about constructivism or intuitionistic logic versus classical logic. Practically speaking, sometimes boundaries between a and b are fuzzy, and the middle is not cleanly excluded. Geometrically, excluded middle is a kind of atomization of the space of possibilities; even if this is metaphysically justified, you can't always actually pick apart the atoms. Reducing the relationship of a and b to always exactly one of a=b or a≠b is, when you're running into these difficulties, either fundamentally impossible, pragmatically infeasible, or possible but only by grossly neglecting subtleties that matter.

So, instead of talking about the

setof species, we adjust the frame to talk about thetypeof species, or species-space. (I think these frames are both good here and very closely related; see also "In what sense does Homotopy Type Theory 'model types as spaces'?")## 2. Choose a more general kind of space than 'finite set' to model this type of thing.

Note, I don't say "more

structuredthan 'finite set'". Finite sets are very structured; as listed above, they come equipped with all kinds of structures!Sometimes you will end up with something "more structured" in a certain way. For example, you might find that you were neglecting the possibility of arbitrary convex combinations of what had seemed like atomic elements, and change to modeling X as a convex space. Finite sets don't have convex structure, so in this way you are moving to a more structured kind of space.

More commonly, you will end up with a less structured kind of space, like a topological space, or a metric space, or a measurable space. Again, you might have a default classical view of a topological space as a set equipped with certain additional structure, but if you define a topological space in type theory (as a type equipped with additional structure), you never have to assume that there's set structure (a realizable equality predicate) on the underlying type.

## 3. Retrace the path to setting up your problem statement, and figure out how you can get the structure you need.

For example:

quantifythe similarity between specimens genetically. For a lot more breadth and depth on this, see Leinster's book Entropy and Diversity, which partly inspired this post.instantiatingsuch a base measure is still just as open of a problem as actually counting up all the moral theories, but I think it opens the door to a more promising research avenue.)There may also be some theorems out there that can help you construct some structures from others in a canonical way. For example, if you have a compact Hausdorff structure and a group structure, then you can construct Haar measure, which is uniquely invariant to the group operation.

## Aside: Computability

There’s a certain cluster of mental patterns that strongly hesitate to go this route, because non-finite types (especially uncountable types)oftenaren’t possible to faithfully and fully represent on a computer, nor on paper, which brings pragmatic concerns and sometimes also metaphysical concerns. I was once of this persuasion myself; speaking from experience, what helped me to get more flexibility of view here was Weihrauch’s Type-2 Theory of Computability (often spelled Type-II or Type Two; a good recent survey is Schröder, 2020), and gaining an understanding of Brouwer’s famous claim that all functions are continuous, and the still-confusing revised slogan that all

computablefunctions are continuous (for great depth on this, see Steinberg et al., 2021). At a more elementary level, I think it also helped to learn about corecursion and coalgebraic data, as an alternative to the usual framing of computation with recursion and (algebraic) data.Once you believe that you can really compute with mathematical objects that aren’t discrete, it’s easier to see such objects as meaningful, and once you see such objects as meaningful, it’s easier to see how discreteness is actually quite a strong assumption.

## Conclusion

Moving away from an initial finite-set assumption is not easy, but can be feasible and fruitful. It also tends to illuminate a deeper level of insight into the structure of the problem. I'm not saying that one shouldn't use finite-set assumptions; much like certain programming languages, it's a great place to start "rapid prototyping" and is quite often a fine place to finish as well, but a modeler should be aware that it carries some risk of making your model inapplicable to realistically hard problem instances.