Feature Geometry, Topology, and Holonomy in Wide Data
One of the recurring frustrations in interpretability is that identifying features is not the same thing as understanding their structure. Sparse dictionary learning and sparse autoencoders can often recover useful features, but they do not by themselves explain how those features relate to one another geometrically. This gap matters a great deal in wide datasets, where there are many more features than samples, and where meaningful behavior usually comes not from isolated variables but from coordinated groups of features acting together.
This issue shows up across modern data science. In genomics, datasets may involve thousands of genes. In microbiome studies, one may measure thousands of bacterial subpopulations. In neural networks and language models, activation spaces and sparse autoencoder dictionaries can contain hundreds of thousands of learned features. In all of these settings, the core challenge is not simply to find “important” features, but to understand how features organize into meaningful families, neighborhoods, and transitions.
A natural first step is to place geometry on the feature set itself. If a dataset is represented as a matrix, with rows corresponding to samples and columns corresponding to features, then we can turn the usual viewpoint around and treat the columns as a dataset in their own right. Once we do that, we can build topological models on the feature space. Mapper-style constructions are especially useful here: one builds a cover of the feature set, forms a graph whose nodes represent overlapping local groups of features, and uses that graph as a coarse geometric model of the space of features.
This already gives something valuable. Instead of thinking about features one by one, we can study them in local groups. Each sample then induces a function on the nodes of the graph by averaging the values of the features in each group. Visualizing these functions as graph heat maps often reveals organized patterns that are much harder to see in the raw data matrix. In biology, for example, this can expose gene modules or microbial communities whose collective activity differs sharply across disease states. In language models, it can reveal coherent regions of sparse autoencoder feature space associated with syntax, semantic themes, or gradations from abstract concepts to specific named entities.
But topology alone is not the whole story. A topological model tells us which feature groups lie near one another, which overlap, and which cluster together. It gives a notion of shape. What it does not tell us is whether local feature meanings can be aligned consistently across the whole dataset. That is where holonomy enters.
Holonomy is a way of measuring what happens when local information is transported around a space. The intuitive picture is simple: imagine that each local neighborhood of the feature graph comes with its own small coordinate system or semantic frame. Moving from one neighborhood to another requires translating between those local frames. If, after transporting that frame around a loop, you return to where you started and recover exactly the same frame, then the local descriptions are globally consistent. But if you come back with a twist, rotation, reflection, or some other mismatch, then the space has nontrivial holonomy.
This idea is especially relevant for mechanistic interpretability. A feature in a sparse autoencoder may have a stable local meaning in one cluster of contexts and a closely related local meaning in a neighboring cluster, yet there may be no single global identification that works everywhere. In other words, feature identity may be path-dependent. Two routes through context space may lead to slightly different interpretations of what looked like “the same” feature. Holonomy measures that failure of global consistency.
That shift changes the interpretability question in an important way. Without holonomy, we ask: which features cluster together, and what high-level concept do they appear to share? With holonomy, we ask a stronger question: can these local meanings actually be glued together into one global semantic dictionary, or is there an intrinsic obstruction? This turns interpretability from a problem of clustering into a problem of local-to-global coherence.
Seen this way, topology and holonomy play complementary roles. Topology gives the underlying organization of feature space: the neighborhoods, overlaps, flares, loops, and branches that reveal how features are distributed. Holonomy tells us whether those local structures support a globally consistent notion of feature meaning. Topology draws the map; holonomy tells us whether the map’s local coordinate systems can be stitched together without distortion.
For wide datasets, this is a powerful perspective. In genomics or microbiome data, it suggests that the same apparent module of features may not play exactly the same role across all patient groups. In sparse autoencoder analysis, it suggests that some feature families may admit stable global interpretations while others are fundamentally context-dependent. In both cases, the goal is no longer just to group features that co-activate, but to determine whether their meanings remain coherent as we move through the geometry of the data.
That is why adding geometry to feature sets matters, and why holonomy is such a promising next step. Topological models help us see structure where raw matrices hide it. Holonomy helps us ask whether that structure supports a single global interpretation or whether the system is telling us that meaning itself is only local. Together, they offer a richer and more realistic foundation for interpreting complex feature spaces.