Epistemic status: slightly pruned brainstorming.

I've been reading John Wentworth's posts on the natural abstraction hypothesis recently, and wanted to add some of my own thoughts. These are very much "directions to look in" rather than completed research, but I wanted to share some of the insights central to my model of the NAH, which might be useful for others trying to expand on or simply grok Wentworth's work.

  • The Telephone Theorem says that over large causal distances, information not conserved perfectly by each interaction is eventually destroyed. Wentworth provides two arguments, plus empirical evidence, that the limiting distribution should have maximum entropy. (I find this intuitively obvious, even if proving it rigorously and in the most general case is difficult.)
  • If we know in advance what information will be conserved (e.g. total energy of a system) and we know how this relates to the quantity we want to know the distribution of, then we can just find the maxent distribution subject to that constraint.
  • Since maxent distributions under expectation constraints are exponential families, the constrained quantities should correspond to the sufficient statistics of that exponential family.
  • We can probably say that every sufficient statistic for a distribution corresponds to a set of conserved quantities in the system that gives rise to that distribution.
  • By Noether's theorem, conserved quantities correspond to symmetries. (The actual theorem is specific to classical mechanics, but a similar principle seems to hold generally.)
  • Therefore, sufficient statistics = conserved quantities = symmetries.
  • The symmetry model is more natural for dealing with many kinds of abstractions. If we're pointing to a flower, it's not obvious how to label information as "conserved" or "not conserved". We can, however, say what we want to be able to do to the flower and have our algorithm still be able to recognize it: move it, view it from a different angle, let it bloom or wilt, replace all the molecules, &c &c.
  • This seems isomorphic to the view of abstraction as redundant information, but focused more on different views of the same object rather than different objects from the same class.
  • Machine learning already makes use of this idea. Many unsupervised representation learning algorithms follow a process like: take an input, distort it in some way that leaves it still recognizable (such as rotating or cropping an image), then train the model to output similar embeddings for the original and distorted version.
    • If the embeddings have high mutual information exactly when humans would perceive the data points as having high mutual information, then the embedding should contain approximately the information humans find relevant? That hypothesis might be worth formalizing.
  • However, if the NAH is true, maybe an AGI wouldn't have to see a bunch of examples of (e.g.) rotated images and be told that they're the same to grasp the concept that things might be rotated? I'm not sure of this, but it seems like to a truly general algorithm, two inputs that have a low-dimensional invertible transformation between them would just be obviously similar.
    • I'm not entirely sure humans can do this (recognize a pattern they've never encountered before given only one example, even if it's "simple").
      • Possibly adult humans have just seen too many different kinds of patterns for there to even be a "simple" one we've never seen before.
    • "Low-dimensional invertible transformation" might be too general to even be computable; in the case of a rotated image, it's a simple linear transformation, but it's a linear transformation in coordinate space. It seems like the model's implicit biases might have to include the possibility of transformations acting on coordinates rather than on pixel values; that would be non-obvious to a model used to viewing images as flattened vectors or even through convolutions.
    • "What the heck is a 'low-dimensional' transformation anyway?" seems like a good question for further research.
  • Wentworth's research mostly deals in sample space, that is, with the probability distribution over entire data points. But finding the true probability distribution of the data is most of the work in machine learning. He talks about latents that render samples conditionally independent, whereas ML is looking for latents that render features (e.g. image pixels) conditionally independent.
    • There might be a simple correspondence between these two. Another interesting direction for research.
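
To make the "conserved quantity, then maxent, then exponential family" step from the first few bullets concrete, here is a minimal sketch in Python. It assumes a finite outcome set and a single constraint on the mean; the function name `maxent_given_mean` and the die example are illustrative choices of mine, not anything from Wentworth's posts. The point is only that the solution has the form p(x) ∝ exp(λx), so the constrained quantity x plays the role of the sufficient statistic.

```python
import math

def maxent_given_mean(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum-entropy distribution on `values` with a prescribed mean.

    Hypothetical helper for illustration. The solution has the
    exponential-family form p(x) proportional to exp(lam * x).
    """
    def dist(lam):
        exps = [lam * v for v in values]
        m = max(exps)  # subtract the max exponent for numerical stability
        weights = [math.exp(e - m) for e in exps]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(p * v for p, v in zip(dist(lam), values))

    # The mean is increasing in lam, so bisection finds the multiplier.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    return lam, dist(lam)

# Toy example (assumption): a six-sided die whose long-run mean is known to be 4.5.
values = [1, 2, 3, 4, 5, 6]
lam, probs = maxent_given_mean(values, target_mean=4.5)

# log p(x+1) - log p(x) is the same for every x: the log-probability is
# linear in the constrained quantity, which is what "exponential family" means here.
log_ratios = [math.log(probs[i + 1] / probs[i]) for i in range(len(values) - 1)]
```

With several conserved quantities the same construction would need one multiplier per constraint, and bisection would have to be replaced by a multidimensional solver.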

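The representation-learning recipe described above (distort an input, then train for similar embeddings) can be sketched with an InfoNCE-style contrastive loss. The post does not name a specific loss, so this is one common instance that I am swapping in, not the method the post has in mind. The embeddings below are hand-written toy vectors standing in for a model's outputs, and `info_nce` and `cosine` are hypothetical helper names.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss.

    positives[i] is the embedding of the distorted view of input i;
    the other entries of `positives` act as negatives for anchor i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]  # negative log-softmax of the true pair
    return total / len(anchors)

# Toy embeddings (assumption): two inputs and their distorted views.
anchors = [[1.0, 0.0], [0.0, 1.0]]
views = [[0.9, 0.1], [0.1, 0.9]]

loss_matched = info_nce(anchors, views)           # each view paired with its original
loss_mismatched = info_nce(anchors, views[::-1])  # views paired with the wrong original
```

The loss is low when each distorted view embeds closest to its own original, which is the sense in which training on it pushes the embedding to be invariant to the chosen distortions.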

I found this brainstorming interesting and nothing you suggested jumped out to me as obviously wrong.

As far as formalisations of natural abstractions go, the one I'm most sympathetic to/find most natural (pun acknowledged) is the redundant information concept.

I have a separate impression that good abstraction should allow you to compress the world better (more efficient world models). And the "redundant information" idea seems to gesture in the direction of high compressibility.

The notion of intelligence as compression is an old one; I believe Marcus Hutter was the first to formalize it back in the early 2000s (this is also where AIXI comes from). The problem with Hutter's formalism is that his definition of compressibility (Kolmogorov complexity) is uncomputable; "find the shortest Turing machine that outputs X" requires unbounded resources even if you have a halting oracle.

I believe that, in this paradigm, the NAH is fundamentally saying: well, for "natural" data, compressibility is computable; there's some minimal representation to which any sufficiently powerful (yet still finite) model will converge. The problem is, therefore, to figure out what a sufficiently powerful model looks like.

  • By Noether's theorem, conserved quantities correspond to symmetries. [...]
  • Therefore, sufficient statistics = conserved quantities = symmetries.

This was one of the ideas which eventually led to the resampling-based approach to natural abstraction. You can see the relevant generalization of Noether's Theorem in this Baez & Fong paper - I read the paper about a month or two before the resampling approach clicked, and it was one of the things on my mind at the time. Basically, for a Markov process, the conserved quantities correspond to eigenvectors with eigenvalue 1, which we can probe by looking for operators which commute with the transition matrix (the commutation is the "symmetry"). The main tricky part, at the time, was to figure out the right kind of "dynamics" such that the conserved quantities would be the natural abstractions; it wasn't obvious ahead of time that "just use a typical MCMC sampler" was the right answer.
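
As a toy illustration of the "eigenvalue 1, commutes with the transition matrix" statement, here is a minimal discrete-time sketch in Python. It is my own simplification under stated assumptions (a small row-stochastic matrix with two closed blocks, observables represented as diagonal matrices), not the construction from the Baez & Fong paper, and all names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def diag(f):
    """Diagonal matrix representing the observable f."""
    n = len(f)
    return [[f[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def commutes(a, b, tol=1e-12):
    """True if a @ b and b @ a agree entrywise up to tol."""
    ab, ba = matmul(a, b), matmul(b, a)
    return all(abs(x - y) < tol for ra, rb in zip(ab, ba) for x, y in zip(ra, rb))

# Row-stochastic transition matrix with two closed blocks, {0, 1} and {2, 3}.
P = [
    [0.7, 0.3, 0.0, 0.0],
    [0.4, 0.6, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.2, 0.8],
]

# "Which block am I in?" never changes along a trajectory.
block = [1.0, 1.0, 0.0, 0.0]
# "Am I in state 0?" does change, so it is not conserved.
state0 = [1.0, 0.0, 0.0, 0.0]

# P applied to the block indicator returns it unchanged: an eigenvector with eigenvalue 1.
P_block = [sum(P[i][j] * block[j] for j in range(4)) for i in range(4)]

block_conserved = commutes(diag(block), P)    # True
state0_conserved = commutes(diag(state0), P)  # False
```

The all-ones vector is also fixed by any row-stochastic matrix, so the interesting conserved quantities are the extra eigenvalue-1 directions beyond that trivial one.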

Sounds like you may have already come across it and been inspired by it, but if not, you may be interested to hear that symmetry has been investigated for using neural networks to model physical systems; see, e.g., previous linkpost to a nice talk intro to the topic, or a more recent talk by the same person at IPAM, or one of the several other recent talks at IPAM which might connect strongly to natural abstraction. There are also interesting results for symmetry in various contexts related to learning on arxivxplorer.

The actual theorem is specific to classical mechanics, but a similar principle seems to hold generally.

Interesting, would you mind elaborating on this further?

This is something I've been thinking about recently. In particular, you can generalize this by examining temporarily conserved quantities, such as phases of matter (typically produced by spontaneous symmetry-breaking). This supports a far richer theory of information-accessible-at-a-distance than permanently conserved quantities like energy alone can provide, and allows for this information to have dynamics like a stochastic process. In fact, if you know a bit of solid-state physics you probably realize exactly how many of the macroscopic properties we observe (e.g. object color) are determined by things like spontaneous symmetry-breaking. You can make all of this more rigorous and systematic by connecting to ergodic theory, but this is probably deserving of a full paper, if I can get around to it. Happy to discuss more with anyone else.