Abstraction As Symmetry and Other Thoughts

1st Feb 2023

Commenters: DragonGod, Numendil, johnswentworth, Zach Furman, the gears to ascension, Dalcy


9 comments

I found this brainstorming interesting and nothing you suggested jumped out to me as obviously wrong.

As far as formalisations of natural abstractions go, the one I'm most sympathetic to/find most natural (pun acknowledged) is the redundant information concept.

I have a separate impression that good abstraction should allow you to compress the world better (more efficient world models). And the "redundant information" idea seems to gesture in the direction of high compressibility.

The notion of intelligence as compression is an old one; I believe Marcus Hutter was the first to formalize it back in the early 2000s (this is also where AIXI comes from). The problem with Hutter's formalism is that his definition of compressibility (Kolmogorov complexity) is uncomputable; "find the shortest Turing machine that outputs X" requires unbounded resources even if you have a halting oracle.
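A quick illustration of compression as a computable proxy (my own toy example, not Hutter's formalism): an off-the-shelf compressor like Python's `zlib` squeezes structured data far below its original size, while unstructured noise barely compresses at all.

```python
import random
import zlib

random.seed(0)

# Highly structured data: one short pattern repeated many times.
structured = b"abcd" * 2500              # 10,000 bytes
# Unstructured data: uniform random bytes of the same length.
noise = bytes(random.getrandbits(8) for _ in range(10_000))

def ratio(data: bytes) -> float:
    """Compressed size / original size under zlib at max effort."""
    return len(zlib.compress(data, 9)) / len(data)

print(f"structured: {ratio(structured):.3f}")  # far below 1.0
print(f"noise:      {ratio(noise):.3f}")       # roughly 1.0 or slightly above
```

Kolmogorov complexity is the idealized, uncomputable limit of this kind of measurement; practical compressors are the bounded-resource stand-ins.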

I believe that, in this paradigm, the NAH is fundamentally saying: well, for "natural" data, compressibility is computable; there's some minimal representation to which any sufficiently powerful (yet still finite) model will converge. The problem is, therefore, to figure out what a sufficiently powerful model looks like.

- By Noether's theorem, conserved quantities correspond to symmetries. [...]
- Therefore, sufficient statistics = conserved quantities = symmetries.

This was one of the ideas which eventually led to the resampling-based approach to natural abstraction. You can see the relevant generalization of Noether's Theorem in this Baez & Fong paper - I read the paper about a month or two before the resampling approach clicked, and it was one of the things on my mind at the time. Basically, for a Markov process, the conserved quantities correspond to eigenvectors with eigenvalue 1, which we can probe by looking for operators which commute with the transition matrix (the commutation is the "symmetry"). The main tricky part, at the time, was to figure out the right kind of "dynamics" such that the conserved quantities would be the natural abstractions; it wasn't obvious ahead of time that "just use a typical MCMC sampler" was the right answer.
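A minimal numerical sketch of the eigenvalue-1 / commuting-operator picture (my own toy example, not the actual resampling setup): for a Markov chain with two disconnected components, the function "which component am I in?" is a right eigenvector of the transition matrix with eigenvalue 1, and the operator swapping the two (identical) components commutes with the transition matrix.

```python
import numpy as np

# Row-stochastic transition matrix for a chain with two disconnected
# components: states {0, 1} and states {2, 3}.
block = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
P = np.block([[block, np.zeros((2, 2))],
              [np.zeros((2, 2)), block]])

# f(x) = indicator of the second component. It is conserved by the
# dynamics: P f = f, i.e. f is a right eigenvector with eigenvalue 1.
f = np.array([0.0, 0.0, 1.0, 1.0])
assert np.allclose(P @ f, f)

# The symmetry view of the same fact: the operator S that swaps the
# two identical components commutes with P.
S = np.zeros((4, 4))
S[0, 2] = S[1, 3] = S[2, 0] = S[3, 1] = 1.0
assert np.allclose(S @ P, P @ S)

print("component label conserved; swap operator commutes with P")
```

In a realistic sampler the components are not fully disconnected, which is where the "temporary" conserved quantities discussed below come in.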

Sounds like you may have already come across it and been inspired by it, but if not, you may be interested to hear that symmetry has been investigated in the context of using neural networks to model physical systems; see, e.g., a previous linkpost to a nice introductory talk on the topic, a more recent talk by the same person at IPAM, or one of the several other recent talks at IPAM which might connect strongly to natural abstraction. There are also interesting results for symmetry in various learning-related contexts on arxivxplorer.

The actual theorem is specific to classical mechanics, but a similar principle seems to hold generally.

Interesting, would you mind elaborating on this further?

This is something I've been thinking about recently. In particular, you can generalize this by examining *temporary* conserved quantities, such as phases of matter (typically produced by spontaneous symmetry-breaking). This supports a far richer theory of information-accessible-at-a-distance than permanently conserved quantities like energy alone can provide, and allows this information to have *dynamics* like a stochastic process. In fact, if you know a bit of solid-state physics you probably realize just how many of our observed macroscopic properties (e.g. object color) are determined by things like spontaneous symmetry-breaking. You can make all of this more rigorous and systematic by connecting to ergodic theory, but that probably deserves a full paper, if I can get around to it. Happy to discuss more with anyone else.
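One toy way to see a "temporary" conserved quantity (my own illustration, not the commenter's formalism): connect two metastable states by a tiny hop probability. The transition matrix then has a second eigenvalue just below 1, so the corresponding quantity decays only on a timescale of order one over the gap — effectively conserved on any shorter timescale.

```python
import numpy as np

# Two "phases" (wells) connected by a tiny per-step hop probability eps.
eps = 1e-3
P = np.array([[1 - eps, eps],
              [eps,     1 - eps]])

eigvals = np.sort(np.linalg.eigvals(P).real)[::-1]

# eigvals[0] is exactly 1: total probability, permanently conserved.
# eigvals[1] is 1 - 2*eps, just below 1: the well label decays on a
# timescale of ~1/(2*eps) steps, so it acts as conserved on any
# shorter timescale -- a temporary conserved quantity.
print(f"second eigenvalue: {eigvals[1]:.3f}")  # 0.998
```

Spontaneous symmetry-breaking is the many-body version of this: the broken-symmetry order parameter relaxes so slowly that it carries information accessible at a distance.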

Epistemic status: slightly pruned brainstorming.

I've been reading John Wentworth's posts on the natural abstraction hypothesis recently, and wanted to add some of my own thoughts. These are very much "directions to look in" rather than completed research, but I wanted to share some of the insights central to my model of the NAH, which might be useful for others trying to expand on or simply grok Wentworth's work.

- "every sufficient statistic for a distribution corresponds to a set of conserved quantities in the system that gives rise to that distribution." &c. &c.
- Humans can do this (recognize a pattern they've never encountered before given only one example, even if it's "simple").
- …coordinate space. It seems like the model's implicit biases might have to include the fact that this might be a thing; it would be non-obvious to a model used to viewing images as flattened vectors or even convolutions.
- …samples conditionally independent, but ML is looking for latents that render features (e.g. image pixels) conditionally independent.