Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Let's start with a few examples (borrowed from here) to illustrate what we're talking about:

  • We have a gas consisting of some huge number of particles. We throw away information about the particles themselves, instead keeping just a few summary statistics: average energy, number of particles, etc. We can then make highly precise predictions about things like e.g. pressure just based on the reduced information we've kept, without having to think about each individual particle. That reduced information is the "abstract layer" - the gas and its properties.
  • We have a bunch of transistors and wires on a chip. We arrange them to perform some logical operation, like maybe a NAND gate. Then, we throw away information about the underlying details, and just treat it as an abstract logical NAND gate. Using just the abstract layer, we can make predictions about what outputs will result from what inputs. Note that there’s some fuzziness - 0.01 V and 0.02 V are both treated as logical zero, and in rare cases there will be enough noise in the wires to get an incorrect output.
  • I tell my friend that I'm going to play tennis. I have ignored a huge amount of information about the details of the activity - where, when, what racket, what ball, with whom, all the distributions of every microscopic particle involved - yet my friend can still make some reliable predictions based on the abstract information I've provided.
  • When we abstract formulas like "1+1=2*1" and "2+2=2*2" into "n+n=2*n", we're obviously throwing out information about the value of n, while still making whatever predictions we can given the information we kept. This is what abstraction is all about in math and programming: throw out as much information as you can, while still maintaining the core "prediction" - i.e. the theorem or algorithm.
  • I have a street map of New York City. The map throws out lots of info about the physical streets: street width, potholes, power lines and water mains, building facades, signs and stoplights, etc. But for many questions about distance or reachability on the physical city streets, I can translate the question into a query on the map. My query on the map will return reliable predictions about the physical streets, even though the map has thrown out lots of info.

The general pattern: there’s some ground-level “concrete” model (or territory), and an abstract model (or map). The abstract model throws away or ignores information from the concrete model, but in such a way that we can still make reliable predictions about some aspects of the underlying system.

Notice that the predictions of the abstract models, in most of these examples, are not perfectly accurate. We're not dealing with the sort of "abstraction" we see in e.g. programming or algebra, where everything is exact. There are going to be probabilities involved.

In the language of embedded world-models, we're talking about multi-level models: models which contain both a notion of "table", and of all the pieces from which the table is built, and of all the atoms from which the pieces are built. We want to be able to use predictions from one level at other levels (e.g. predict bulk material properties from microscopic structure, or predict from material properties whether it's safe to sit on the table), and we want to move between levels consistently.

Formalization: Starting Point

To repeat the intuitive idea: an abstract model throws away or ignores information from the concrete model, but in such a way that we can still make reliable predictions about some aspects of the underlying system.

So to formalize abstraction, we first need some way to specify which "aspects of the underlying system" we wish to predict, and what form the predictions take. The obvious starting point for predictions is probability distributions. Given that our predictions are probability distributions, the natural way to specify which aspects of the system we care about is via a set of events or logic statements for which we calculate probabilities. We'll be agnostic about the exact types for now, and just call these "queries".

That leads to a rough construction. We start with some concrete model and a set of queries . From these, we construct a minimal abstract model by keeping exactly the information relevant to the queries, and throwing away all other information. By the minimal map theorems, we can represent directly by the full set of probabilities ; and contain exactly the same information. Of course, in practical examples, the probabilities will usually have some more compact representation, and will usually contain some extraneous information as well.

To illustrate a bit, let's identify the concrete model, class of queries, and abstract model for a few of the examples from earlier.

  • Ideal Gas:
    • Concrete model is the full set of molecules, their interaction forces, and a distribution representing our knowledge about their initial configuration.
    • Class of queries consists of combinations of macroscopic measurements, e.g. one query might be "pressure = 12 torr & volume = 1 m^3 & temperature = 110 K".
    • For an ideal gas, the abstract model can be represented by e.g. temperature, number of particles (of each type if the gas is mixed), and container volume. Given these values and assuming a near-equilibrium initial configuration distribution, we can predict the other macroscopic measurables in the queries (e.g. pressure).
  • Tennis:
    • Concrete model is the full microscopic configuration of me and the physical world around me as I play tennis (or whatever else I do).
    • Class of queries is hard to sharply define at this point, but includes things like "John will answer his cell phone in the next hour", "John will hold a racket and hit a fuzzy ball in the next hour", "John will play Civ for the next hour", etc - all the things whose probabilities change on hearing that I'm going to play tennis.
    • Abstract model is just the sentence "I am going to play tennis".
  • Street Map:
    • Concrete model is the physical city streets
    • Class of queries includes things like "shortest path from Times Square to Central Park starts by following Broadway", "distance between the Met and the Hudson is less than 1 mile", etc - all the things we can deduce from a street map.
    • Abstract model is the map. Note that the physical map also includes some extraneous information, e.g. the positions of all the individual atoms in the piece of paper/smartphone.

Already with the second two examples there seems to be some "cheating" going on in the model definition: we just define the query class as all the events/logic statements whose probabilities change based on the information in the map. But if we can do that, then anything can be an "abstract map" of any "concrete territory", with the queries taken to be the events/statements about the territory which the map actually has some information about - not a very useful definition!

Natural Abstractions

Intuitively, It seems like there exist "natural abstractions" - large sets of queries on a given territory which all require roughly the same information. Statistical mechanics is a good source of examples - from some macroscopic initial conditions, we can compute whatever queries we want about any macroscopic measurements later on. Note that such natural abstractions are a property of the territory - it's the concrete-level model which determines what large classes of queries can be answered with relatively little information.

For now, I'm interested primarily in abstraction of causal dags - i.e. cases in which both the concrete and abstract models are causal dags, and there is some reasonable correspondence between counterfactuals in the two. In this case, the set of queries should include counterfactuals, i.e. do() operations in Pearl's language. (This does require updating definitions/notation a bit, since our queries are no longer purely events, but it's a straightforward-if-tedious patch.) That's the main subject I'm researching in the short term: what are the abstractions which support large classes of causal counterfactuals? Expect more posts on the topic soon.


New to LessWrong?

New Comment
8 comments, sorted by Click to highlight new comments since: Today at 8:30 AM
To repeat the intuitive idea: an abstract model throws away or ignores information from the concrete model, but in such a way that we can still make reliable predictions about some aspects of the underlying system.

That's not quite my understanding what an abstraction is. What you have described is basic modeling. An abstraction is a model that works well in a large class of disparate domains, like polymorphism in programming. The idea of addition is such an abstraction, it works equally well for numbers, strings, sheep, etc. What you call a natural abstraction is closer to my intuitive understanding of the concept. I do not subscribe to your assertion that this is a "property of the territory". Sheep are not like bit strings. Anyhow, ideas are in the mind, and different sets of ideas can be useful for predicting different sets of observations of seemingly unrelated parts of the territory. Also, good luck with your research, whatever definition of abstraction you use, as long as it is useful to you.


Worth noting that 'abstraction' has different meanings in different disciplines. For example, Wikipedia has separate articles for abstraction in computer science, abstraction in mathematics, abstraction in linguistics, and abstraction in sociology.

Does abstraction also need to make answering your queries computationally easier? 

I could throw away unnecessary information, encrypt it, and provide the key as the solution to an NP-hard problem.

Is this still an abstraction?

Good question. In these posts, I generally ignored computational constraints - i.e. effectively assumed infinite compute. I expect that one could substitute a computationally limited version of probability theory (like logical induction, for instance) and get a qualitatively-similar notion of abstraction which sufficiently-computationally-"scrambled" info is no longer an abstraction, even if the scrambling is reversible in principle.

I think that there are some abstractions that aren't predictively useful, but are still useful in deciding your actions.

Suppose I and my friend both have the goal of maximising the number of DNA strings whose MD5 hash is prime.

I call sequences with this property "ana" and those without this property "kata". Saying that "the DNA over there is ana" does tell me something about the world, there is an experiment that I can do to determine if this is true or false, namely sequencing it and taking the hash. The concept of "ana" isn't useful in a world where no agents care about it and no detectors have been built. If your utility function cares about the difference, it is a useful concept. If someone has connected an ana detector to the trigger of something important, then its a useful concept. If your a crime scene investigator, and all you know about the perpetrators DNA is that its ana, then finding out if Joe Blogs has ana DNA could be important. The concept of ana is useful. If you know the perpitrators entire genome, the concept stops being useful.

A general abstraction is consistent with several, but not all universe states. There are many different universe states in which the gas has a pressure of 37Pa, but also many where it isn't. So all abstractions are subsets of possible universe states. Usually, we use subsets that are suitable for reasoning about in some way.

Suppose you were literally omniscient, knowing every detail of the universe, but you had to give humans a 1Tb summary. Unable to include all the info you might want, you can only include a summery of the important points, you are now engaged in lossy compression.

Sensor data is also an abstraction, for instance you might have temperature and pressure sensors. Cameras record roughly how many photons hit them without tracking every one. So real world agents are translating one lossy approximation of the world into another without ever being able to express the whole thing explicitly.

How you do lossy compression depends on what you want. Music is compressed in a way that is specific to defects in human ears. Abstractions are much the same.

How you do lossy compression depends on what you want.

I think this is technically true, but less important than it seems at first glance. Natural abstractions are a thing, which means there's instrumental convergence in abstractions - some compressed information is relevant to a far wider variety of objectives than other compressed information. Representing DNA sequences as strings of four different symbols is a natural abstraction, and it's useful for a very wide variety of goals; MD5 hashes of those strings are useful only for a relatively narrow set of goals.

Somewhat more formally... any given territory has some Kolmogorov complexity, a maximally-compressed lossless map. That's a property of the territory alone, independent of any goal. But it's still relevant to goal-specific lossy compression - it will very often be useful for lossy models to re-use the compression methods relevant to lossless compression.

For instance, maybe we have an ASCII text file which contains only alphanumeric and punctuation characters. We can losslessly compress that file using e.g. Huffman coding, which uses fewer bits for the characters which appear more often. Now we decide to move on to lossy encoding - but we can still use the compressed character representation found by Huffman, assuming the lossy method doesn't change the distribution of characters too much.

An abstraction like "object permanence" would be useful for a very wide variety of goals, maybe even for any real-world goal. An abstraction like "golgi apparatus" is useful for some goals but not others. "Lossless" is not an option in practice: our world is too rich, you can just keep digging deeper into any phenomenon until you run out of time and memory ... I'm sure that a 50,000 page book could theoretically be written about earwax, and it would still leave out details which for some goals would be critical. :-)

Abstraction means assigning a symbol to reference a set of other symbols. It saves time and memory: time by allowing retrieval of data based on a set of rules, memory by shrinking the size of the reference. 

For example: the words 'natural' and 'artificial': we sort things into one of these labels based on whether or not they were made by a human. A 'natural' thing could be 'physical' or 'biological'. An 'artificial' thing could be 'theory' or 'implementation'. If I don't need to distinguish between physical and biological things, instead of referring to them directly, I can use the more abstract reference of 'natural' things, saving space and time in my statement.

The challenge with natural language abstraction is agreeing on definitions. Many would define the terms in the above example differently. The more we can agree on definitions of terms, the better we can reason about their subsets.

In a logically valid system of abstraction, any symbol can be related to every other symbol: either by a common parent reference, or by using one to refer to the other.