Ω 13

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Has this ever happened to you?

But how can we decide whether this {species/moral theory/consensus node/quantum world} is actually 2 different things, or just 1 thing? Or does it even count at all?

The underlying cause of such dilemmas is that somebody, somewhere in the setup of your ontology and the framing of your problem statement, explicitly or implicitly, made the following move:

For simplicity of analysis, we assume that  is a finite set.

The Promise

This is an incredibly popular simplifying assumption, easily competitive with "we assume this distribution is Gaussian" and "we neglect higher-order terms". It's plausibly the single most popular mathematical modeling assumption in the world, because one needs so little background knowledge to make use of it.

It's also extremely powerful. Here's just some of the mathematical structure you get for free if you accept the assumption:

1. Computable representation. Saying that  is a finite set can be formalized as , which means that there's some number  such that any element  can be losslessly and efficiently represented with  bits.
2. Metric. Given two elements , we say that  if , and  otherwise. This is well-defined and makes  into a well-behaved metric space.
3. Compact Hausdorff topology. That metric induces a topology, which is also very well-behaved (in particular, it's compact and Hausdorff, which together make it a compactum).
4. Symmetry group. The automorphisms of  can be transported along the isomorphism to , so now we have a permutation group, which is useful for formalizing things along the lines of "all elements of  will be treated equally".
5. Canonical probability measure. By the principle of transformation groups (which I discussed in the middle of this Shortform post about maximum entropy), this permutation group justifies normalized counting measure as an uninformative prior for an uncertain element of  (assuming all elements really should be considered interchangeable).
6. Ordering. If we have choice as a background assumption, a well-ordering isn't anything special, but from any specific witness of 's finiteness we get an eminently realizable total order, which is not merely a well-ordering but a completely distributive lattice. This is useful in tie-breaking mechanisms, for example.
7. Coproduct decomposition. If , then . This means that applying any coproduct-preserving operation to  is quite easy, even if it preserves only finite coproducts.

I could go on (we can define addition and multiplication operators modulo , forming a ring structure...), but I think these are the most commonly leaned-upon consequences of the finite-set assumption.

Definitely not about finite sets, but similar in spirit, are assumptions that  or . Here the rig or ring structures are more significant, but we lose compactness and the canonical probability measure. These assumptions are more common when  is taken to refer to a temporal or spatial dimension, rather than a type of objects.

The Peril

Sometimes, one simply cannot construct a lossless way of matching all the things of type  up with some . It's not uncommon for this to become a serious bottleneck in adjusting your model to match reality. The opening conundrum is how this usually manifests itself. The specific examples I briefly alluded to there are:

1. Trying to measure biodiversity by counting up species runs into difficulties when there's no clear answer to whether two specimens belong to different species.
2. Trying to resolve moral uncertainty by giving equal voice to a bunch of theories runs into difficulties when some theories seem to be disadvantaged by being too simple to admit many different variations.
3. Trying to reach global consensus by giving equal voice to each of a finite set of participants runs into difficulties called Sybil attacks. If you like anthropomorphizing your objects of study (I know I do sometimes), you can often view any failure of the finite-set assumption as analogous to being Sybil-attacked by the elements you're trying to model.
4. Trying to elaborate the Everett model as literally postulating "many worlds" runs into problems when you try to count how many.

Even thornier problems arise when one tries to define absolute utility by adding up the utility of all individuals. Although counting humans is pretty reliable to date, with the thorniest edge cases still being relatively uncontroversial (twins who are conjoined at the brain and share a thalamus; they're 2 people), but if there are ever any morally relevant digital individuals (which I hope there will be!), the edge cases will get much worse.

Typically, as soon as these issues start to creep in, they can in principle grow to cause unbounded discrepancies, which can be exploited by adversaries—either actual adversaries in the problem domain (like in Sybil attacks) or rival modelers (as happens with biodiversity measurements).

The Remedy

Here's how to fix it:

1. Name the type of things that you'd been struggling to squeeze into the shape of interchangeable, disconnected parts.

Yes, this is a little bit about type theory versus set theory. When you assume that a type is a set, one thing you assume in particular is that there's a well-defined equality predicate—that . And so, yes, this is a little bit about constructivism or intuitionistic logic versus classical logic. Practically speaking, sometimes boundaries between  and  are fuzzy, and the middle is not cleanly excluded. Geometrically, excluded middle is a kind of atomization of the space of possibilities; even if this is metaphysically justified, you can't always actually pick apart the atoms. Reducing the relationship of  and  to always exactly one of  or  is, when you're running into these difficulties, either fundamentally impossible, pragmatically infeasible, or possible but only by grossly neglecting subtleties that matter.

So, instead of talking about the set of species, we adjust the frame to talk about the type of species, or species-space. (I think these frames are both good here and very closely related; see also "In what sense does Homotopy Type Theory 'model types as spaces'?")

2. Choose a more general kind of space than 'finite set' to model this type of thing.

Note, I don't say "more structured than 'finite set'". Finite sets are very structured; as listed above, they come equipped with all kinds of structures!

Sometimes you will end up with something "more structured" in a certain way. For example, you might find that you were neglecting the possibility of arbitrary convex combinations of what had seemed like atomic elements, and change to modeling  as a convex space. Finite sets don't have convex structure, so in this way you are moving to a more structured kind of space.

More commonly, you will end up with a less structured kind of space, like a topological space, or a metric space, or a measurable space. Again, you might have a default classical view of a topological space as a set equipped with certain additional structure, but if you define a topological space in type theory (as a type equipped with additional structure), you never have to assume that there's set structure (a realizable equality predicate) on the underlying type.

3. Retrace the path to setting up your problem statement, and figure out how you can get the structure you need.

For example:

1. If we're measuring biodiversity, we can quantify the similarity between specimens genetically. For a lot more breadth and depth on this, see Leinster's book Entropy and Diversity, which partly inspired this post.
2. If we're trying to give equal weight to different moral theories, we can explicitly postulate a base measure over the space of moral theories. (Actually instantiating such a base measure is still just as open of a problem as actually counting up all the moral theories, but I think it opens the door to a more promising research avenue.)
3. If we're trying to give equal voice to participants in a permissionless system, we can postulate a base measure over participant-space. In this case, there are many reasonable instantiations, such as measuring hashes per second (Bitcoin), quantifying stake in the protocol (Ethereum 2.0), or the generalized approach of Power Fault-Tolerance.
4. If we're trying to count Everett worlds, we can just use the probability measure computed by the Born rule.

There may also be some theorems out there that can help you construct some structures from others in a canonical way. For example, if you have a compact Hausdorff structure and a group structure, then you can construct Haar measure, which is uniquely invariant to the group operation.

Aside: Computability

There’s a certain cluster of mental patterns that strongly hesitate to go this route, because non-finite types (especially uncountable types) often aren’t possible to faithfully and fully represent on a computer, nor on paper, which brings pragmatic concerns and sometimes also metaphysical concerns. I was once of this persuasion myself; speaking from experience, what helped me to get more flexibility of view here was Weihrauch’s Type-2 Theory of Computability (often spelled Type-II or Type Two; a good recent survey is Schröder, 2020), and gaining an understanding of Brouwer’s famous claim that all functions are continuous, and the still-confusing revised slogan that all computable functions are continuous (for great depth on this, see Steinberg et al., 2021). At a more elementary level, I think it also helped to learn about corecursion and coalgebraic data, as an alternative to the usual framing of computation with recursion and (algebraic) data.

Once you believe that you can really compute with mathematical objects that aren’t discrete, it’s easier to see such objects as meaningful, and once you see such objects as meaningful, it’s easier to see how discreteness is actually quite a strong assumption.

Conclusion

Moving away from an initial finite-set assumption is not easy, but can be feasible and fruitful. It also tends to illuminate a deeper level of insight into the structure of the problem. I'm not saying that one shouldn't use finite-set assumptions; much like certain programming languages, it's a great place to start "rapid prototyping" and is quite often a fine place to finish as well, but a modeler should be aware that it carries some risk of making your model inapplicable to realistically hard problem instances.

Ω 13

New Comment

the still-confusing revised slogan that all computable functions are continuous

For anyone who still finds this confusing, I think I can give a pretty quick explanation of this.

The reason I'd imagine it might sound confusing is that you can think of what seem like simple counterexamples. E.g. you can write a short function in your favorite programming language that takes a floating-point real as input, and returns 1 if the input is 0, and returns 0 otherwise. This appears to be a computation of the indicator function for {0}, which is discontinuous. But it doesn't accurately implement any function on  at all, because its input is a floating-point real, and floating-point arithmetic has rounding errors, so you might apply your function to some expression which equals something very small but nonzero, but gets evaluated to 0 on your computer, or which equals 0, but gets evaluated to something small but nonzero on your computer. This problem will arise for any attempt to implement a discontinuous function; rounding errors in the input can move it across the discontinuity.

The conventional definition of a computation of a real function is one for which the output can be made accurate to any desired degree of precision by making the input sufficiently precise. This is essentially a computational version of the epsilon-delta definition of continuity. And most continuous functions you can think of can in fact be implemented computationally, if you use an arbitrary-precision data type for the input (fixed-precision reals are discrete, and cannot be sufficiently precise approximations to arbitrary reals).

Yes, I highly recommend that second link, and Andrej Bauer's work in general.

I'm not sure about the claim "there are infinite many different notions of finiteness"; on that particular page I count 10 notions of "finite set" (although I'm not going to claim the collection of all such notions is a finite set), most of which are rarely useful. In the OP I assume finiteness means Bishop-finiteness, which that page calls the "standard definition". I've also found Kuratowski-finiteness useful, and the categorical generalization to finitely generated and finitely presented objects.

By far the most useful generalization of finiteness, in my view, is compactness. From a realizability perspective, a compact space  is one where universal quantification (i.e. testing whether a semidecidable predicate holds throughout all of ) is itself semidecidable. This is discussed in various places by Andrej Bauer (MO answer), Paul Taylor (monograph), and Martín Escardó (slideshow). A typical geometric or intuitive way to establish that a metric space is compact is to see it as isometric to a bounded and closed subset of  and use the Heine–Borel theorem (although not all compact metric spaces, and certainly not all compact Hausdorff spaces, can be established as compact in such a way).

I strongly recommend Escardó's Seemingly impossible functional programs, which constructs a function search : ((Nat -> Bool) -> Bool) -> (Nat -> Bool) which, given a predicate on infinite bitstrings, finds an infinite bitstring satisfying the predicate if one exists. (In other words, if p : (Nat -> Bool) -> Bool and any bitstring at all satisfies p, then p (search p) == True

(Here I'm intending Nat to be the type of natural numbers, of course) .