Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Examples of Causal Abstraction


There's some recent work in the statistics literature exploring similar ideas. I don't know if you're aware of this, or if it's really relevant to what you're doing (I haven't thought a lot about the comparisons yet), but here are some papers.

It is indeed relevant. I'll probably have a review of the Beckers & Halpern paper at some point (as well as their more recent extension). I'm working on essentially the same problem as them. Also, thanks for the link to the Chalupka-Perona-Eberhardt paper; I hadn't seen that one yet.

A tangent:

It sounds like there are some close ties to logical inductors here, both in terms of the flavor of the problem and in some difficulties I expect in translating theory into practice.

A logical inductor is kinda like an approximation. But it's more accurate to call it lots and lots of approximations - it tries to keep track of every single approximation within some large class, which is essential to the proof that it only does finitely worse than any approximation within that class.
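The "only finitely worse than any approximation in the class" property can be illustrated with a much simpler stand-in. The sketch below is not a logical inductor; it is a plain Bayesian mixture over a small, hypothetical class of fixed-probability predictors (`EXPERT_PROBS` and the simulated bit sequence are assumptions made up for the illustration). With a uniform prior over N predictors, the mixture's cumulative log loss exceeds the best predictor's by at most ln(N), no matter how long the sequence runs.

```python
import math
import random

def log_loss(p_one, bit):
    """Negative log-probability assigned to the observed bit."""
    return -math.log(p_one if bit == 1 else 1.0 - p_one)

def run_mixture(expert_probs, bits):
    """Track a uniform-prior Bayesian mixture over fixed-probability experts.

    Returns (mixture_loss, expert_losses), all cumulative log losses.
    """
    n = len(expert_probs)
    weights = [1.0 / n] * n          # posterior weight of each expert
    expert_losses = [0.0] * n
    mixture_loss = 0.0
    for bit in bits:
        # Mixture prediction: weighted average of the experts' predictions.
        p_mix = sum(w * p for w, p in zip(weights, expert_probs))
        mixture_loss += log_loss(p_mix, bit)
        # Probability each expert assigned to the bit that actually occurred.
        likelihoods = [p if bit == 1 else 1.0 - p for p in expert_probs]
        for i, lik in enumerate(likelihoods):
            expert_losses[i] += -math.log(lik)
        # Bayesian update: reweight experts by those likelihoods, then normalize.
        total = sum(w * lik for w, lik in zip(weights, likelihoods))
        weights = [w * lik / total for w, lik in zip(weights, likelihoods)]
    return mixture_loss, expert_losses

rng = random.Random(0)
EXPERT_PROBS = [0.1, 0.3, 0.5, 0.7, 0.9]   # hypothetical class of approximations
BITS = [1 if rng.random() < 0.7 else 0 for _ in range(500)]

mixture_loss, expert_losses = run_mixture(EXPERT_PROBS, BITS)
best_loss = min(expert_losses)
regret = mixture_loss - best_loss
print(f"mixture {mixture_loss:.2f}, best expert {best_loss:.2f}, regret {regret:.2f}")
```

The mixture keeps every predictor around rather than committing to one, which is what buys the bounded gap. It is also why nothing hierarchical or human-legible falls out of it automatically.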

A hierarchical model doesn't naturally fall out of such a mixture, it seems. If you pose a general problem, you might just get a general solution. You could try to encourage specialized solutions by somehow ensuring that the problem has several different scales of interest, and sharply limit storage space so that the approximation can't afford special cases that are too similar. But even then I think there's a high probability that the best solution (according to something that is as theoretically convenient as logical inductors) would be alien - something humans wouldn't pick out as the laws of physics in a million tries.

Somewhat related to the electrical circuits example, there might be something similar in software engineering, with levels being something like (depending on the programming paradigm):

- CPU instructions
- byte code or op code or assembly
- AST
- programming language instructions
- statements
- functions
- modules and classes
- patterns and DSLs
- processes
- applications/products

Yes definitely. I've omitted examples from software and math because there's no "fuzziness" to it; that kind of abstraction is already better-understood than the more probabilistically-flavored use-cases I'm aiming for. But the theory should still apply to those cases, as the limiting case where probabilities are 0 or 1, so they're useful as a sanity check.

I do want to note that probabilities 0 and 1 only correspond to no fuzziness if we assume a finite set. If we don't assume a finite set, then it's easy to cook up examples where probabilities are 0 or 1, but they aren't equivalent to either nothing or everything, and thus probabilities 0 or 1 can still introduce fuzziness.
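A minimal example of this, assuming a uniform distribution on the unit interval with sample space $\Omega = [0,1]$:

```latex
X \sim \mathrm{Uniform}[0,1], \qquad
P\left(X = \tfrac{1}{2}\right) = 0 \ \text{ but } \ \left\{X = \tfrac{1}{2}\right\} \neq \emptyset, \qquad
P\left(X \neq \tfrac{1}{2}\right) = 1 \ \text{ but } \ \left\{X \neq \tfrac{1}{2}\right\} \neq \Omega .
```

The first event has probability 0 without being impossible, and the second has probability 1 without covering every outcome.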

I’m working on a theory of abstraction suitable as a foundation for embedded agency and specifically multi-level world models. I want to use real-world examples to build a fast feedback loop for theory development, so a natural first step is to build a starting list of examples which capture various relevant aspects of the problem.

These are mainly focused on causal abstraction, in which both the concrete and abstract model are causal DAGs with some natural correspondence between counterfactuals on the two. (There are some exceptions, though.) The list isn’t very long; I’ve chosen a handful of representative examples which cover qualitatively different aspects of the general problem.

I’ve grouped the examples by symmetry class:

- Finite DAGs
- Plate symmetry (as in “plate notation”), in which there are a number of conditionally IID components
- Time symmetry

Note that many of the abstractions below abstract from one symmetry class to another - for example, MCMC abstracts a concrete time-symmetric model into an abstract plate-symmetric model.

I’m interested to hear more examples, especially examples which emphasize qualitative features which are absent from any of the examples here. Examples in which other symmetry classes play an important role are of particular interest, as well as examples with agenty behavior which we know how to formalize without too much mess.

## Finite DAGs: Examples from Electrical Circuits

Electrical engineers rely heavily on nested layers of abstraction, of exactly the sort I’m interested in (i.e. multi-level models of the physical world). Additionally, causal models are a natural fit for digital circuits. These properties make electrical circuits ideal starting points. They’re a great conceptually-simple use case.

A few of the major abstraction layers, from lowest to highest:

Note that real circuits usually do contain some repeated sub-components, but the symmetries in these DAGs aren’t particularly relevant to our purposes, so we’ll mostly ignore them.

Parallel to all this, somewhere along the way we usually abstract out the low-level continuous time-dependence, and adopt an abstract model of instantaneous input-output circuits coupled to clocked storage units (i.e. flip-flops/registers). We’ll include that abstraction separately in the time symmetry section; the levels from lumped circuit through floating point/modular arithmetic can all be specialized to memoryless input-output circuits for simplicity.
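As a toy illustration of two of these levels, here is a minimal sketch that compares a gate-level ripple-carry adder (the concrete model) with modular integer addition (the abstract model). The word size `N_BITS`, the NAND-only construction, and all function names are assumptions chosen for the example. The check is that translating bits to integers commutes with both models, including under a counterfactual that forces one input wire high.

```python
from itertools import product

N_BITS = 4  # hypothetical word size for the illustration

def nand(x, y):
    return 1 - (x & y)

def xor(x, y):
    # XOR built from four NAND gates.
    t = nand(x, y)
    return nand(nand(x, t), nand(y, t))

def full_adder(a, b, carry_in):
    """Gate-level full adder: returns (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = nand(nand(a, b), nand(partial, carry_in))
    return sum_bit, carry_out

def concrete_add(a_bits, b_bits):
    """Ripple-carry adder over little-endian bit lists; the final carry is dropped."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out

def to_bits(value):
    return [(value >> i) & 1 for i in range(N_BITS)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def abstract_add(a, b):
    """Abstract model: modular arithmetic on integers."""
    return (a + b) % (2 ** N_BITS)

def abstraction_holds():
    """Check that the bits -> integer map commutes with both models on every input."""
    return all(
        from_bits(concrete_add(to_bits(a), to_bits(b))) == abstract_add(a, b)
        for a, b in product(range(2 ** N_BITS), repeat=2)
    )

def intervention_commutes(a, b, k):
    """Forcing concrete wire a[k] to 1 should match the abstract change a -> a | (1 << k)."""
    a_bits = to_bits(a)
    a_bits[k] = 1  # concrete counterfactual: force one input wire high
    concrete_result = from_bits(concrete_add(a_bits, to_bits(b)))
    abstract_result = abstract_add(a | (1 << k), b)
    return concrete_result == abstract_result

print(abstraction_holds())
print(intervention_commutes(5, 9, 1))
```

This only covers interventions on input wires. Forcing an internal wire, such as a carry, generally has no counterpart in the integer-level model, which is one way to see that the abstract model supports only some of the concrete model's counterfactuals.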

## Plate Symmetry: Statistical Toy Models

This is the simplest nontrivial symmetry class. The main new qualitative phenomena I see in this class are:

The use of sufficient statistics is a particularly simple example in this class, and adding the calculation of sufficient statistics as an explicit node in the DAG gives us the simplest embedded map. This is the easiest model I’ve used to ask questions like “when can we use the map in place of the territory?” - i.e. questions about abstractions embedded in the DAG itself.
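A minimal sketch of the sufficient-statistic case, assuming IID coin flips with unknown bias `theta` and a uniform prior (the data and function names are made up for the illustration). The summary (number of flips, number of heads) plays the role of the map: any likelihood or posterior query about `theta` gives the same answer whether it is computed from the full data or from the summary alone.

```python
import math

def likelihood_from_data(theta, flips):
    """Likelihood of the full low-level data: one factor per IID coin flip."""
    result = 1.0
    for flip in flips:
        result *= theta if flip == 1 else 1.0 - theta
    return result

def sufficient_statistic(flips):
    """The 'map': number of flips and number of heads."""
    return len(flips), sum(flips)

def likelihood_from_statistic(theta, stat):
    """Likelihood computed from the summary alone."""
    n, heads = stat
    return theta ** heads * (1.0 - theta) ** (n - heads)

def posterior_mean(stat):
    """Posterior mean of theta under a uniform Beta(1, 1) prior."""
    n, heads = stat
    return (heads + 1) / (n + 2)

flips_a = [1, 0, 1, 1, 0, 1, 0, 1]
flips_b = [0, 0, 0, 1, 1, 1, 1, 1]  # same counts, different order

for theta in (0.2, 0.5, 0.8):
    full = likelihood_from_data(theta, flips_a)
    summary = likelihood_from_statistic(theta, sufficient_statistic(flips_a))
    print(theta, math.isclose(full, summary))

print(sufficient_statistic(flips_a) == sufficient_statistic(flips_b))
print(posterior_mean(sufficient_statistic(flips_a)))
```

Queries that depend on the order of the flips are exactly the ones the summary cannot answer, so they mark where the map stops being a substitute for the territory.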

Another example of interest in this class is an embedded reasoner which attempts to deduce model structure by leveraging symmetry. In particular, this introduces the possibility that a node in the DAG could detect (some) counterfactual modifications of the DAG - i.e. notice when it is in a counterfactual query.

## Time Symmetry: Equilibrium -> Causality

This is the main symmetry class of interest at the level of physics for most systems, so there are a lot of examples. Most of them involve some kind of equilibrium abstraction: the concrete model is a DAG over time, while the abstract model captures long-run behavior with time removed.
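A minimal sketch of an equilibrium abstraction, assuming a toy chain of two idealized inverters with first-order relaxation dynamics (the `GAIN` and `ALPHA` constants, the sigmoid inverter, and the 0.5 threshold are all hypothetical choices for the illustration). The concrete model updates voltages step by step, so its causal structure is a DAG over time. The abstract model is a two-node DAG of logical NOT gates with time removed. The comparison covers both ordinary inputs and a counterfactual that clamps the middle node.

```python
import math

GAIN = 10.0   # hypothetical inverter steepness
ALPHA = 0.1   # hypothetical relaxation rate per time step (dt / tau)

def inverter(v):
    """Idealized smooth inverter: output near 1 for low input, near 0 for high input."""
    return 1.0 / (1.0 + math.exp(GAIN * (v - 0.5)))

def simulate(u, v1=0.5, v2=0.5, steps=500, clamp_v1=None):
    """Concrete model: each node at step t+1 depends only on nodes at step t.

    clamp_v1, if given, is a counterfactual intervention holding v1 fixed.
    """
    if clamp_v1 is not None:
        v1 = clamp_v1
    for _ in range(steps):
        if clamp_v1 is None:
            new_v1 = v1 + ALPHA * (inverter(u) - v1)
        else:
            new_v1 = clamp_v1
        new_v2 = v2 + ALPHA * (inverter(v1) - v2)
        v1, v2 = new_v1, new_v2
    return v1, v2

def abstract_model(u_bit, do_n1=None):
    """Abstract model: the DAG u -> n1 -> n2 of logical NOT gates, with time removed."""
    n1 = (1 - u_bit) if do_n1 is None else do_n1
    n2 = 1 - n1
    return n1, n2

def to_bit(v):
    """Map an equilibrium voltage to a logic level."""
    return 1 if v > 0.5 else 0

for u_bit in (0, 1):
    v1, v2 = simulate(float(u_bit))
    print(u_bit, (to_bit(v1), to_bit(v2)), abstract_model(u_bit))
```

The equilibrium does not depend on the initial voltages, which is what allows the abstract model to drop time entirely.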

The simplest example is circuit equilibrium, which we mentioned earlier. At the physical level, the behavior of electrical circuits is DAG-shaped only when viewed over time. Yet, in many applications, there are “inputs” and “outputs” and the equilibrium state of the electrical circuit implements a DAG of some sort. Where does the abstract causal structure come from? This problem is also very similar to causality arising in equilibrium in other areas, e.g. biochemical signalling circuits in cells, or markets/supply chains in which certain goods have very high/very low price elasticity.

The next simplest example is timescale separation, in which a part of the system equilibrates much faster than the rest. A couple examples in this class:

MCMC is a particularly interesting example. The baby version of this example is the independence of widely-time-separated samples from a Markov chain; that’s a simple prototypical example of abstracting time-symmetry into plate-symmetry. But MCMC adds DAG structure within the plate, in a way which does not directly mirror the DAG structure of the concrete model (although it does mirror the undirected structure). It also involves probability calculations in each (concrete) node, which is a hint that an embedded map is present in the system.

Of course, looking at abstractions of time-symmetric systems, we can’t omit feedback control. Despite loopy behavior on the concrete level, at the abstract level we can view the controller target point as causing system limiting behavior - and this abstract view will correctly handle many counterfactuals. In this case, the structure of the abstract equilibrium model might not match the concrete-level structure at all. Based on the good regulator theorem, this is another case where embedded maps are likely to be involved.

Finally, one particularly difficult example: the derivation of the Navier-Stokes equations from molecular dynamics. The main qualitative difference from the earlier examples (at least that I know of) is the importance of an ontology shift: a move from particles to fields of delta functions, from Hamiltonian particle dynamics to Vlasov/Boltzmann equations. Without that shift, our DAG structure shifts over time - because interactions are spatially organized, particles interact with different particles depending on where they are. (Note that deriving Navier-Stokes from particle dynamics is arguably an open problem, depending on what exactly we count as a “derivation”, so there may be other interesting aspects to this example as well. Or possibly not - calculation difficulties, rather than fundamental/conceptual difficulties, seem to be considered the main blockade to a derivation.)
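Returning to the baby version of the MCMC example above, here is a minimal sketch of why widely-time-separated samples from a Markov chain behave like IID draws. It assumes a hypothetical two-state chain with transition matrix `P`. The rows of `P` raised to the power `gap` give the distribution of the state `gap` steps later, conditional on each possible current state. As the gap grows, the rows converge to the same stationary distribution, so the later sample carries almost no information about the earlier one.

```python
P = [[0.9, 0.1],
     [0.2, 0.8]]   # hypothetical transition matrix; each row sums to 1

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by repeated multiplication."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result

def dependence_on_start(gap):
    """Largest difference between the rows of P^gap.

    This measures how much the distribution of the state after `gap` steps
    still depends on the starting state.
    """
    pk = mat_pow(P, gap)
    return max(abs(pk[0][j] - pk[1][j]) for j in range(len(P)))

for gap in (1, 5, 20, 50):
    print(gap, dependence_on_start(gap))

print(mat_pow(P, 50))
```

In the concrete model every state depends on its predecessor. In the abstract model, samples taken far enough apart are treated as independent draws from the stationary distribution, which is the plate-symmetric picture.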