"I always remember, [Hamming] would come into my office and try to solve a problem [...] I had a very big blackboard, and he’d start on one side, write down some integral, say, ‘I ain’t afraid of nothin’, and start working on it. So, now, when I start a big problem, I say, ‘I ain’t afraid of nothin’, and dive into it."
—Bruce MacLennan
Update to my last shortform on "Why Homology?"
My current favorite frame for thinking about homology is as fixing Poincare's initial conception of "counting submanifolds up to cobordism". (I've learned this perspective from this excellent blog post, and I summarize my understanding below.)
In Analysis Situs, Poincare sought to count $m$-submanifolds of a given $n$-manifold up to some equivalence relation - namely, being a boundary of some $(m+1)$-submanifold, i.e. cobordism. I personally buy cobordism as a concept that is as natural as homotopy for one to have come up with (unlike the initially-seemingly-unmotivated definition of "singular homology"), so I am sold on this as a starting point.
Formally, given an $n$-manifold $M$ and disjoint $m$-submanifolds $N_1, \dots, N_k$, being cobordant means there's an $(m+1)$-submanifold $W \subseteq M$ such that $\partial W = N_1 \sqcup \dots \sqcup N_k$. These may have an orientation, so we can write this relation as a formal sum $\varepsilon_1 N_1 + \dots + \varepsilon_k N_k = 0$ where $\varepsilon_i \in \{+1, -1\}$. Now, if there are many such $(m+1)$-submanifolds for which the $N_i$ form a disjoint boundary, we can sum all of these formal sums together to get $c_1 N_1 + \dots + c_k N_k = 0$ where $c_i \in \mathbb{Z}$.
Now, this already looks a lot like homology! For example, $\partial W = N_1 \sqcup \dots \sqcup N_k$ above already implies the $N_i$ themselves have empty boundary (because the manifold boundary of a manifold boundary is empty, and the $N_i$ are disjoint). So if we consider two formal sums $\sum_i a_i N_i$ and $\sum_i b_i N_i$ to be the same if $\sum_i (a_i - b_i) N_i = 0$, then 1) we are considering formal sums of $m$-submanifolds with empty boundary, 2) up to being a boundary of an $(m+1)$-dimensional manifold. This sounds a lot like $H_m(M)$ - though note that Poincare apparently put none of this in group theory language.
So Poincare's "collection of $m$-submanifolds of $M$ up to cobordism" is the analogue of $H_m(M)$!
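A concrete sanity check of the analogy (my own example, not from the blog post): take $M = T^2$. A small contractible loop $c$ bounds an embedded disk $D \subseteq T^2$, so $c$ is cobordant to the empty $1$-manifold, while the two standard generating circles $a, b$ (and any nontrivial integer combination of them) bound no surface in $T^2$. This recovers the familiar answer:

$$\partial D = c \;\Rightarrow\; [c] = 0, \qquad H_1(T^2) \cong \mathbb{Z}[a] \oplus \mathbb{Z}[b] \cong \mathbb{Z}^2.$$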
But it turns out this construction doesn't really work, due to some subtle issues (pointed out by Heegaard). This led Poincare to a more combinatorial alternative to the cobordism idea that didn't face these issues, which became the birth of the more modern notion of simplicial homology.
(The blog post then describes how Poincare's initial vision of "counting submanifolds up to cobordism" can still be salvaged (which I plan to read more about in the future), but for my purpose of understanding the motivation behind homology, this is already very insightful!)
Perhaps relevant: An Informational Parsimony Perspective on Probabilistic Symmetries (Charvin et al 2024), on applying information bottleneck approaches to group symmetries:
... the projection on orbits of a symmetry group’s action can be seen as an information-preserving compression, as it preserves the information about anything invariant under the group action. This suggests that projections on orbits might be solutions to well-chosen rate-distortion problems, hence opening the way to the integration of group symmetries into an information-theoretic framework. If successful, such an integration could formalise the link between symmetry and information parsimony, but also (i) yield natural ways to “soften” group symmetries into flexible concepts more relevant to real-world data — which often lacks exact symmetries despite exhibiting a strong “structure” — and (ii) enable symmetry discovery through the optimisation of information-theoretic quantities.
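To make the quoted idea concrete, here is a toy illustration (my own, not from the paper): take the cyclic group acting on length-6 binary strings by rotation. Projecting each string onto a canonical orbit representative is a lossy compression, yet it preserves every rotation-invariant feature exactly:

```python
from itertools import product

def rotations(x):
    """All cyclic rotations of a tuple x, i.e. the orbit of x under the cyclic group."""
    return [x[i:] + x[:i] for i in range(len(x))]

def orbit_rep(x):
    """Project x onto its orbit by picking a canonical (lexicographically smallest) rotation."""
    return min(rotations(x))

def num_ones(x):
    """Rotation-invariant feature: number of ones."""
    return sum(x)

def num_cyclic_01_steps(x):
    """Rotation-invariant feature: number of cyclically adjacent (0, 1) pairs."""
    return sum(a == 0 and b == 1 for a, b in zip(x, x[1:] + x[:1]))

strings = list(product((0, 1), repeat=6))
for x in strings:
    # The compressed representation determines every invariant feature of x:
    assert num_ones(x) == num_ones(orbit_rep(x))
    assert num_cyclic_01_steps(x) == num_cyclic_01_steps(orbit_rep(x))

print(len(strings), "strings compress to",
      len({orbit_rep(x) for x in strings}), "orbit representatives")
```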
Have you heard of Rene Thom's work on Structural Stability and Morphogenesis? I haven't been able to read this book yet[1], but my understanding[2] of its thesis is that the "development of form" (i.e. morphogenesis, broadly construed, e.g. biological or otherwise) depends on information from the structurally stable "catastrophe sets" of the potential driving (or derived from) the dynamics - structurally stable ones, precisely because what is stable under infinitesimal perturbation is the only kind of information observable in nature.
Rene Thom puts all of this in a formal model - and, using tools of algebraic topology, shows that these discrete catastrophes (under some conditions, like the number of variables) have a finite classification, and thus (in the context of this morphological model) form a sort of finitary "sufficient statistic" of the developmental process.
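For concreteness (my gloss from standard catastrophe-theory references, not from reading the book): for generic families of potentials with at most four control parameters, the classification yields exactly seven local normal forms (the "elementary catastrophes"), the two simplest being the fold and the cusp,

$$V_{\text{fold}}(x; a) = x^3 + ax, \qquad V_{\text{cusp}}(x; a, b) = x^4 + ax^2 + bx,$$

and the "catastrophe set" is the locus of control parameters where the potential acquires a degenerate critical point (for the cusp normal form above, the curve $8a^3 + 27b^2 = 0$).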
This seems quite similar to the point you're making: [insensitive / stable / robust] things are rare, but they organize the natural ontology of things because they're the only information that survives.
... and there seems to be a more speculative thesis of Thom's (presumably; again, I don't know this stuff), where geometric information about these catastrophes directly corresponds to functional / internal-structure information about the system (in Thom's context, the Organism whose morphogenetic process we're modeling) - this is presumably one of the intellectual predecessors of Structural Bayesianism, the thesis that there is a correspondence between the internal structures of Programs or the Learning Machine and the local geometry of some potential.
I don't think I have enough algebraic topology background yet to productively read this book. Everything in this comment should come with Epistemic Status: Low Confidence.
From discussions and reading distillations of Thom's work.
Thank you for the suggestion! That sounds like a good idea, this thread seems to have some good recommendations, will check them out.
While learning algebraic topology, homotopy always felt like a very intuitive and natural sort of invariant to attach to a space, whereas for homology I don't have anywhere near as good an intuitive handle on the concept, or sense of its naturality, as I do for homotopy. So I tried to collect some frames / results for homology I've learned, to see if they help convince my intuition that this concept is indeed something natural in mathspace. I'd be very curious to know if there are any other frames or Deeper Answers to "Why homology?" that I'm missing:
Not exactly about adversarial error correction, but: there is a construction (Çapuni & Gács 2021) of a (class of) universal 1-tape (!!) Turing machine that can perform arbitrarily long computation subject to random noise in the per-step action. Despite the non-adversarial noise model, naive majority error correction (or at least their construction of it) only fixes bounded & local error bursts - meaning it doesn't work in the general case, because even though majority vote reduces the error probability, the effective error rate is still positive, so something almost surely goes wrong (e.g. an error burst of size greater than what majority vote can handle) as the computation time goes to infinity.
Their construction, in fact, looks like a hierarchy of simulated Turing machines where each higher-level TM is simulated by the level below it at a larger tape scale, so that it can resist larger error bursts - and the overall construction looks like "moving" the "live simulation" of the actual program that we want to execute up the hierarchy over time, to coarser and more reliable levels.
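A back-of-the-envelope illustration of why a single, fixed level of majority voting cannot suffice (this uses a simplified independent-flip noise model of my own, not the paper's actual model): $k$-fold repetition with majority decoding leaves a tiny but positive per-step failure probability, so over enough steps an uncorrectable failure becomes almost certain - which is exactly the failure mode the hierarchy of coarser, more reliable levels is there to absorb.

```python
from math import comb

def majority_failure_prob(p: float, k: int) -> float:
    """P(majority vote over k independent copies is wrong) when each copy is corrupted with prob. p."""
    return sum(comb(k, j) * p**j * (1 - p)**(k - j) for j in range(k // 2 + 1, k + 1))

p, k = 1e-3, 5                        # hypothetical raw error rate and repetition factor
q = majority_failure_prob(p, k)       # effective per-step error rate: ~1e-8, but still positive
for steps in (10**6, 10**9, 10**12):
    print(steps, 1 - (1 - q)**steps)  # P(at least one uncorrectable failure) -> 1 as steps grow
```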
Notes and reflections on the things I've learned while Doing Scholarship over the last three weeks (i.e. studying math).
(EDIT (Nov 18): I will post these less frequently, maybe once every month or two, and also make them more self-contained in context, since journal-like content like this probably isn't all that useful for most people. I will perhaps make a blog for more regular learning updates.)
The past three weeks were busier than usual, so I made slower progress this time, but here it is:
Chapter 6 continued: Sard's theorem
Chapter 8: Vector Fields
Chapter 10: Vector Bundles
Chapter 11: Cotangent Bundle
Chapter 13: Riemannian Manifold
Then I read some Bredon for Algebraic Topology.
There's a couple easy ones, like low rank structure, but I never really managed to get a good argument for why generic symmetries in the data would often be emulatable in real life.
Right, I expect emulability to be a specific condition enabled by a particular class of algorithms that a NN might implement, rather than a generic one that is satisfied by almost all weights of a given NN architecture[1]. Glad to hear that you've thought about this before; I've also been trying to find a more general setting to formalize this argument beyond the toy exponential model.
Other related thoughts[2]:
This is more about how I conceptually think they should be (since my motivation is to use their non-genericity to argue why certain algorithms should be favored over others), and there are probably interesting exceptions - symmetries that are generically emulatable due to properties of the NN architecture (e.g. depth).
Some of these ideas were motivated following a conversation with Fernando Rosas.
Another solution: Practice dynamic visual acuity and predict the opponent's move via their hand shape.
The extreme version of this strategy looks like this robot.
The human version of this strategy (source) is to realize that rock is the weakest throw (since it is the easiest to recognize: there is no change in hand shape over time, given that the default hand-state is usually rock), and so conclude that the best strategy is to play paper if you recognize no change in hand shape, and play scissors if you recognize any movement (because movement means it's either paper or scissors, and scissors gives a win or a draw)[1].
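A minimal sketch of this decision rule in code (the function and input encoding are mine, just to make the branching explicit; obviously not a real vision system):

```python
def reaction_strategy(hand_shape_changed: bool) -> str:
    """Pick a throw from whether the opponent's hand has visibly moved away from the default fist."""
    if not hand_shape_changed:
        return "paper"      # no movement: opponent is most likely throwing rock
    return "scissors"       # movement: opponent is throwing paper or scissors; scissors wins or draws
```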
This is of course vulnerable to exploitation once the opponent knows you're using this and they also have good dynamic visual acuity (e.g. the opponent can randomize the default hand-state, or diagonalize against your heuristic by inserting certain twitches into their hand movement, etc.).