Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Definitions of Causal Abstraction: Reviewing Beckers & Halpern

13Generallyer

6Mark Xu

3johnswentworth

1Tomás Orozco


Update from almost 3 years in the future: this stream of work has continued developing in a few different directions, both on the conceptual foundations and in some initial attempts to apply these tools to AI. Two recent works I was especially excited by (and their bibliographies): 'Towards a Grounded Theory of Causation for Embodied AI' (https://arxiv.org/abs/2206.13973, and here's an excellent talk by the author, https://youtu.be/5mZhcXhbciE), and 'Faithful, Interpretable Model Explanations via Causal Abstraction' (https://ai.stanford.edu/blog/causal-abstraction/).

I don't know if you've seen this, but https://arxiv.org/abs/1906.11583 is a follow-up that generalizes the Beckers and Halpern paper to a notion of approximate abstraction by measuring the non-commutativity of the diagram by using some distance function and taking expectations. I think the most useful notion that the paper introduces is the idea of a probability distribution over the set of allowed interventions. Intuitively, you don't need your abstraction of temperature to behave nicely w.r.t freezing half the room and burning the other half such that the average kinetic energy balances out. Thus you can determine the "approximate commutativeness" of the diagram by fixing a high-level intervention and taking an expectation over the low-level interventions that were likely to map to that high-level intervention.
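To make the "expectation over low-level interventions" idea concrete, here is a toy sketch. It is not the paper's formal definition: the two-halves-of-a-room model, the mechanisms `low_outcome` and `high_outcome`, the absolute-difference distance, and the weights are all made up for illustration.

```python
def low_outcome(a1, a2):
    """Hypothetical low-level mechanism: the outcome depends on each half's energy."""
    return a1 ** 2 + a2 ** 2

def tau(a1, a2):
    """High-level 'temperature': the average of the two energies."""
    return (a1 + a2) / 2

def high_outcome(t):
    """Hypothetical high-level mechanism; exact when both halves have equal energy."""
    return 2 * t ** 2

def expected_error(weighted_interventions):
    """Weighted average of |low-level outcome - high-level prediction|.

    Each entry is ((a1, a2), weight): a low-level intervention setting both
    energies, plus how likely that intervention is assumed to be.
    """
    total_weight = sum(w for _, w in weighted_interventions)
    error = sum(w * abs(low_outcome(a1, a2) - high_outcome(tau(a1, a2)))
                for (a1, a2), w in weighted_interventions)
    return error / total_weight

# All three low-level interventions map to the same high-level intervention T = 2.
typical = [((2.0, 2.0), 0.98), ((1.5, 2.5), 0.02), ((0.0, 4.0), 0.0)]
adversarial = [((2.0, 2.0), 0.0), ((1.5, 2.5), 0.0), ((0.0, 4.0), 1.0)]
```

Under the `typical` weights the "freeze half, burn half" intervention contributes nothing, so the expected error is small; under the `adversarial` weights the same abstraction looks bad.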

Also, if you are willing to write up your counterexample to the conjecture that Beckers and Halpern make, I am currently researching under Eberhardt and he (and I) would be extremely interested in seeing it. I also initially thought that the conjecture was obviously false, but when I tried to actually construct counterexamples, all of them ended up as either not strong abstractions or not recursive (acyclic) causal models.

Turns out the particles -> fluid example doesn't work; it's not a τ-abstraction (which makes me think the range of applicability of τ-abstraction is considerably narrower than I first thought).

That said, here's a counterexample which I think works. Variables of the low-level model:

- follow an arbitrary structural model
- is a random permutation
- given by

... where U are iid noise terms. So we have some arbitrary structural model, we scramble the variables, and then we compute a function of each. For the high-level model:

- follow the same model as in the low-level model
- given by

... so it's the same as the low-level model, but with the variables unscrambled. The mapping between the two is what you'd expect: maps directly, and uses to unscramble : . Then the interventions are similarly simple:

Note that we can pick any we please for the last intervention, but we do need to pick one - we can't just leave it alone.

I'm pretty sure this checks all the boxes for strong τ-abstraction. But it isn't a constructive τ-abstraction, since all of the 's depend on the same low-level variable . In principle, there could still be some other τ which makes the high-level model a constructive abstraction (B&H's definition only requires that *some* τ exist between the two models), but I doubt it.

Let me know if you guys spot a hole in this setup, or see an elegant way to confirm that there isn't some other τ that magically makes it constructive.

Dear John Wentworth:

I have a doubt regarding the implications of Beckers' paper on abstractions. I am a lawyer by profession so I'm venturing pretty far afield here, and I hope my question will not be too trivial.

Given that every constructive abstraction is also a τ-abstraction, there must be some surjective function that is compatible with . Hence, for constructive abstractions, must there also be mappings such that, , where is the projection of onto the variables in ? In other words, must there also be a partition of the low-level exogenous variables where each partition is mapped to a distinct high-level variable? I missed in the definition of constructive abstraction.

Thank you!

[This comment is no longer endorsed by its author]

Author's Notes: This post is fairly technical, with little background and minimal examples; it is not recommended for general consumption. A general understanding of causal models is assumed. This post is probably most useful when read alongside the paper. If your last name is "Beckers" or "Halpern", you might want to skip to the last section.

There’s been a handful of papers in the last few years on abstracting causal models. Beckers and Halpern (B&H) wrote an entire paper on definitions of abstraction on causal models. This post will outline the general framework in which these definitions live, discuss the main two definitions which B&H favor, and wrap up with some discussion of a conjecture from the paper. I'll generally use notation and explanations which I find intuitive, rather than matching the paper on everything.

In general, we’ll follow B&H in progressing from more general to more specific definitions.

## General Framework

We have two causal models: one “low-level”, and one “high-level”. There’s a few choices about what sort of “causal model” to use here; the main options are:

B&H use the first, presumably because it is the most general. That means that everything here will also apply to the latter two options.

Notation for the causal models:

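Next, we need some connection between the high-level and low-level model, to capture the intuitive notion of “abstraction”. At its most general, this connection has two pieces:

- a function τ mapping low-level variables to high-level variables
- a function ω mapping low-level interventions to high-level interventions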

Note that, for true maximum generality, both τ and ω could be nondeterministic. However, we’ll generally ignore that possibility within the context of this post.

Finally, the key piece: the high-level and low-level models should yield the same predictions (in cases where they both make a prediction). Formally:

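$$P\left[X^H \mid do\left(\omega(X^L_S \leftarrow X^{L*}_S)\right)\right] = P\left[\tau(X^L) \mid do\left(X^L_S \leftarrow X^{L*}_S\right)\right]$$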

For the category theorists: this means that we get the same distribution by either (a) performing an intervention on the low-level model and then applying τ to $X^L$, or (b) first applying τ to $X^L$, then applying the high-level intervention (found by transforming the low-level intervention via ω).
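As a minimal sketch of what this consistency condition asks for, the following brute-force check compares the two routes on a toy pair of models. Everything here is hypothetical and chosen only for illustration: the three-variable low-level model, the two-variable high-level model, the particular `tau` and `omega`, and the list of supported interventions. Interventions are represented as dictionaries from variable names to forced values.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def low_level(u, do):
    """Toy low-level model: X1 = U1, X2 = U2, X3 = X1 + X2, unless intervened on."""
    x = {}
    x["X1"] = do.get("X1", u[0])
    x["X2"] = do.get("X2", u[1])
    x["X3"] = do.get("X3", x["X1"] + x["X2"])
    return x

def high_level(u, do):
    """Toy high-level model: H1 = U1 + U2, H2 = H1, unless intervened on."""
    h = {}
    h["H1"] = do.get("H1", u[0] + u[1])
    h["H2"] = do.get("H2", h["H1"])
    return h

def tau(x):
    """Map low-level variable values to high-level variable values."""
    return {"H1": x["X1"] + x["X2"], "H2": x["X3"]}

def omega(do_low):
    """Map a supported low-level intervention to a high-level intervention."""
    do_high = {}
    if "X1" in do_low and "X2" in do_low:
        do_high["H1"] = do_low["X1"] + do_low["X2"]
    if "X3" in do_low:
        do_high["H2"] = do_low["X3"]
    return do_high

NOISE = list(product([0, 1], repeat=2))  # (U1, U2), uniform over four settings

def distribution(sample):
    """Exact distribution of sample(u) when u is drawn uniformly from NOISE."""
    counts = Counter(sample(u) for u in NOISE)
    return {k: Fraction(v, len(NOISE)) for k, v in counts.items()}

def commutes(do_low):
    """Intervene low then apply tau, versus intervene high via omega."""
    via_low = distribution(lambda u: tuple(sorted(tau(low_level(u, do_low)).items())))
    via_high = distribution(lambda u: tuple(sorted(high_level(u, omega(do_low)).items())))
    return via_low == via_high

SUPPORTED = [{}, {"X1": 1, "X2": 0}, {"X3": 2}, {"X1": 1, "X2": 1, "X3": 0}]
results = [commutes(d) for d in SUPPORTED]
```

In this toy setup every intervention in `SUPPORTED` passes the check, while `commutes({"X1": 1})` fails: setting X1 alone has no sensible high-level counterpart under this `omega`, so the two routes give different distributions.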

The first definition of “abstraction” examined by B&H is basically just this, plus a little wiggle room: they don’t require all possible interventions to be supported, and instead include in the definition a set of supported interventions. This definition isn’t specific to B&H - it’s an obvious starting point for defining abstraction on causal models as broadly as possible. B&H adopt this maximally-general definition from Rubenstein et al, and dub it “exact transformation”.

B&H then go on to argue that this definition is too general for most purposes. I won’t re-hash their arguments and examples here; the examples in the paper are pretty readable if you’re interested. They also introduce one slightly stronger definition which I will skip altogether; it seems to just be cleaning up a few weird cases, without any major conceptual additions.

## τ-Abstraction

The main attraction in B&H is their definition of “τ-abstraction”. The main idea in jumping from the maximally-general framework above to τ-abstraction is that the function τ mapping low-level variables to high-level variables induces a choice of mapping between interventions; there’s no need to leave the choice of ω completely open-ended.

In particular, since $X^H = \tau(X^L)$ by definition, it seems like τ should also somehow relate $X^{H*}$ to $X^{L*}$ in the interventions $X^L_{S_L} \leftarrow X^{L*}_{S_L}$ and $X^H_{S_H} \leftarrow X^{H*}_{S_H}$. The obvious condition is $X^{H*} = \tau(X^{L*})$. However, the interventions themselves only constrain $X^{H*}$ and $X^{L*}$ at the indices $S_H$ and $S_L$ respectively, whereas τ may depend on (and determine) the variables at other indices.

One natural condition to impose: each value of $X^{H*}$ consistent with the high-level intervention should correspond to at least one possible value of $X^{L*}$ consistent with the corresponding low-level intervention, and each possible value of $X^{L*}$ consistent with the low-level intervention should produce a value of $X^{H*}$ consistent with the high-level intervention. More formally: if our intervention values are $X^{H*}_{S_H} = x^{H*}$ and $X^{L*}_{S_L} = x^{L*}$, then we want equality between sets:

$$\{X^{H*} \mid X^{H*}_{S_H} = x^{H*}\} = \{\tau(X^{L*}) \mid X^{L*}_{S_L} = x^{L*}\}$$

This is the main criterion B&H use to define the “natural” mapping between interventions $\omega_\tau$. (The exact definition given by B&H is a bit dense, so I won’t walk through the whole thing here.)
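For finite domains, the set-equality criterion can be checked by brute force. The sketch below is an illustration under stated assumptions, not B&H's exact definition: the variable names, domains, and `tau` are hypothetical, and interventions are dictionaries that fix some variables and leave the rest free.

```python
from itertools import product

LOW_DOMAINS = {"X1": [0, 1], "X2": [0, 1], "X3": [0, 1, 2]}
HIGH_DOMAINS = {"H1": [0, 1, 2], "H2": [0, 1, 2]}

def tau(x):
    """Hypothetical map from low-level values to high-level values."""
    return {"H1": x["X1"] + x["X2"], "H2": x["X3"]}

def consistent(domains, partial):
    """All full assignments over `domains` that agree with the intervention `partial`."""
    names = list(domains)
    choices = [[partial[n]] if n in partial else domains[n] for n in names]
    return [dict(zip(names, values)) for values in product(*choices)]

def freeze(assignment):
    """Hashable form of an assignment, so assignments can be collected in sets."""
    return tuple(sorted(assignment.items()))

def partial_assignments(domains):
    """Every intervention on `domains`: each variable is either left free or fixed."""
    names = list(domains)
    choices = [[None] + list(domains[n]) for n in names]
    for values in product(*choices):
        yield {n: v for n, v in zip(names, values) if v is not None}

def omega_tau(do_low):
    """High-level intervention whose consistent set equals the tau-image, or None."""
    image = {freeze(tau(x)) for x in consistent(LOW_DOMAINS, do_low)}
    matches = [do_high for do_high in partial_assignments(HIGH_DOMAINS)
               if {freeze(h) for h in consistent(HIGH_DOMAINS, do_high)} == image]
    return matches[0] if matches else None
```

With these toy domains, `omega_tau({"X1": 1, "X2": 0})` recovers the intervention fixing H1 to 1, while `omega_tau({"X1": 1})` returns `None`: the image of that low-level intervention is not the set of values consistent with any single high-level intervention.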

Armed with a natural transformation $\omega_\tau$ between low-level and high-level interventions, the next step is of course to define a notion of abstraction: modulo some relatively minor technical conditions, a τ-abstraction is an abstraction consistent with our general framework, and for which $\omega = \omega_\tau$.

One more natural step: A “strong” τ-abstraction is one for which all interventions on the high-level model are allowed.

## Constructive τ-Abstraction

In practical examples of abstraction, the high-level variables $X^H$ usually don’t all depend on all the low-level variables $X^L$. Usually, the individual high-level variables $X^H_i$ can each be calculated from non-overlapping subsets of the variables $X^L$. In other words: we can choose a partition σ of the low-level variables and break up τ such that

$$X^H_i = \tau_i(X^L_{\sigma_i})$$

Also including all the conditions required for a strong τ-abstraction, B&H call this a “constructive” τ-abstraction.
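The partition condition alone (not the other requirements of a strong τ-abstraction) can be tested by brute force on finite domains: find which low-level variables each component of τ actually depends on, and check that those sets do not overlap. The sketch below uses hypothetical variables and component functions, and it only tests a given τ; it says nothing about whether some other τ between the same two models would work.

```python
from itertools import product

def depends_on(tau_i, domains, name):
    """True if changing `name` alone can change the value of tau_i."""
    others = [n for n in domains if n != name]
    for values in product(*(domains[n] for n in others)):
        base = dict(zip(others, values))
        outputs = {tau_i({**base, name: v}) for v in domains[name]}
        if len(outputs) > 1:
            return True
    return False

def dependency_sets(tau_components, domains):
    """For each high-level variable, the low-level variables its component uses."""
    return {h: {n for n in domains if depends_on(f, domains, n)}
            for h, f in tau_components.items()}

def has_partition(tau_components, domains):
    """Partition condition only: the dependency sets must be pairwise disjoint."""
    seen = set()
    for deps in dependency_sets(tau_components, domains).values():
        if seen & deps:
            return False
        seen |= deps
    return True

DOMAINS = {"X1": [0, 1], "X2": [0, 1], "X3": [0, 1, 2]}
# Hypothetical tau whose components read disjoint blocks of low-level variables.
separable = {"H1": lambda x: x["X1"] + x["X2"], "H2": lambda x: x["X3"]}
# Hypothetical tau where both components read X2, so no partition exists.
overlapping = {"H1": lambda x: x["X1"] + x["X2"], "H2": lambda x: x["X2"] * x["X3"]}
```

Low-level variables that no component depends on can be placed in a leftover cell, so pairwise disjointness of the dependency sets is enough for a suitable partition to exist for this τ.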

The interesting part: B&H conjecture that, modulo some as-yet-unknown minor technical conditions, any strong τ-abstraction is constructive.

I think this conjecture is probably wrong. Main problem: constructive τ-abstraction doesn’t handle ontology shifts.

My go-to example of causal abstraction with an ontology shift is a fluid model (e.g. Navier Stokes) as an abstraction of a particle model with only local interactions (e.g. lots of billiard balls). In this case, we have two representations of the low-level system:

The two are completely equivalent; each contains the same information. Yet they have very different structure:

both space and time.

In this case, the high-level fluid model is a constructive abstraction of the Eulerian representation, but not of the Lagrangian representation: the high-level model only contains interactions which are local in both time and space.

Conceptually, the problem here is that our graph can have dynamic structure: the values of the variables themselves can determine which other variables they interact with. When that happens, an ontology shift can sometimes make the dynamic structure static, as in the Lagrangian -> Eulerian transformation. But that means that a constructive τ-abstraction on the static structure will not be a constructive τ-abstraction on the dynamic structure (since the partition would depend on the variables themselves), even though the two models are equivalent (and therefore presumably both are τ-abstractions).

This does leave open the possibility of weakening the definition of a constructive τ-abstraction to allow the partition σ to depend on $X^L$. Off the top of my head, I don’t know of a counterexample to the conjecture with that modification made.