Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Causal Abstraction Toy Model: Medical Sensor


I enjoyed this! I had to read through the middle part twice; is the idea basically "it depends on what the distributions are, but there is another simple stat you can compute from the measurements, which combined with their average, gives you all the info you need"?

I liked that this was a simple example of how choices in the way you abstract do or don't lose different information.

"it depends on what the distributions are, but there is another simple stat you can compute from the measurements, which combined with their average, gives you all the info you need"

Yes, assuming it's a maximum entropy distribution (e.g. normal, dirichlet, beta, exponential, geometric, hypergeometric, ... basically all the distributions we typically use as fundamental building blocks). If it's not a maximum entropy distribution, then the relevant information can't be summarized by a simple statistic; we need to keep around the whole distribution P[X=x | M] for every possible value of x. In the maxent case, the summary statistics are sufficient to compute that distribution, which is why we don't need to keep around anything else.

It looks like I have to read the whole post to see whether it is of interest to me, because there is no summary. Instead you seem to just wade in to the detail.

I tried reading the first sentences of each paragraph but that was useless because they are almost all opaque references to the previous material.

I suggest you add a summary and start paragraphs with a sentence encapsulating the key idea of the paragraph.

Thanks, I had considered adding something at the top but didn't actually do that. Will add it now.

When you talk about counterfactuals do you mean interventions? Although I'm guessing the "everything still works" conclusion holds for both interventions and counterfactuals.

Yeah, I have a habit of not distinguishing between the two. At least for most of the problems I think about, as long as we're working with a structural model the difference doesn't really matter.

Author's Note: This post is a bunch of mathy research stuff with very little explanation of context. Other posts in this sequence will provide more context, but you might want to skip this one unless you're looking for mathy details.

Suppose we have a medical sensor measuring some physiological parameter. The parameter has a constant true value X, and the sensor takes measurements M1…Mn over a short period of time. Each measurement has IID error (so the measurements are conditionally independent given X). In the end, the measurements are averaged together, and there’s a little bit of extra error as the device is started/stopped, resulting in the final estimate Y - the only part displayed to the end user. We can represent all this with a causal DAG:

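Note that, conceptually, there are two main sources of error in the final estimate Y: the IID error in the individual measurements Mi, and the extra error introduced as the device is started/stopped. Because of the second, the node Y is not fully deterministic. The joint distribution for the whole system is given by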

$$P[X, M_1, \ldots, M_n, Y] = P[X] \left( \prod_i P[M_i \mid X] \right) P\left[ Y \,\middle|\, \frac{1}{n} \sum_i M_i \right]$$
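To make the factorization concrete, here is a minimal sketch that samples from the concrete model one factor at a time. The Gaussian choices for all three factors, the parameter values, and the name `sample_concrete` are assumptions made purely for illustration; the post does not commit to any particular distributions at this point.

```python
import random
import statistics


def sample_concrete(n, rng, prior_sd=1.0, measurement_sd=0.5, readout_sd=0.1):
    """Draw one (X, [M_1..M_n], Y) by ancestral sampling down the DAG.

    Every distribution here is Gaussian only as an illustrative assumption.
    """
    x = rng.gauss(0.0, prior_sd)                            # X ~ P[X]
    ms = [rng.gauss(x, measurement_sd) for _ in range(n)]   # M_i ~ P[M_i | X], IID given X
    y = rng.gauss(statistics.fmean(ms), readout_sd)         # Y ~ P[Y | (1/n) sum_i M_i]
    return x, ms, y


rng = random.Random(0)
samples = [sample_concrete(10, rng) for _ in range(20000)]

# Y - X combines both error sources: averaged measurement error plus readout error.
errors = [y - x for x, _, y in samples]
error_variance = statistics.pvariance(errors)
```

Under these assumptions the variance of Y - X should come out close to `measurement_sd**2 / n + readout_sd**2`, which is 0.035 for the values above: one term for each of the two error sources.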

Since all the measurements are to be averaged together anyway, it would be nice if we could just glom them all together and treat them as a single abstract measurement, like this:

Formally, we can do this in two steps:

The second step is the interesting one, since it changes the substance of the model.

Main question: under the abstract model, what counterfactual queries remain valid (i.e. match the corresponding concrete queries), and how do they correspond to counterfactuals on the concrete model? What about probabilistic queries, like P[X|Y]?

The concrete model supports three basic counterfactual queries:

… as well as counterfactuals built by combining multiple basic counterfactuals and possibly adding additional computation. In the abstract model:

… so counterfactuals on X and Y have a straightforward correspondence, whereas the correspondence between counterfactuals on M and {Mi} is more complicated and potentially underdetermined. But the important point is that any allowable counterfactual setting of M will correspond to at least one possible counterfactual setting of {Mi} - so any counterfactual queries on the abstract model are workable.

(Definitional note: I’m using “correspond” somewhat informally; I generally mean that there’s a mapping from abstract nodes to concrete node sets such that queries on the abstract model produce the same answers as queries on the concrete model by replacing each node according to the map.)

Probabilistic queries, i.e. P[X|Y], run into a more severe issue: P[X|M]≠P[X|M1,...,Mn]. In the abstract model, node M retained all information relevant to Y, but not necessarily all information relevant to X. So there’s not a clean correspondence between probabilistic queries in the two models. Also, of course, the abstract model has no notion at all of the individual measurements Mi, so it certainly can’t handle queries like P[X|M1].
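A small numerical sketch of why the average alone can lose information about X. The setup is an assumption chosen to make the effect obvious, not something from the post: X is the noise scale of zero-mean Gaussian measurements and takes one of two values. Two measurement sets with the same average then give very different posteriors over X.

```python
import math


def normal_pdf(m, mean, sd):
    """Density of Normal(mean, sd) at m."""
    return math.exp(-0.5 * ((m - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def posterior_over_x(measurements, prior):
    """P[X | M_1..M_n] by Bayes' rule, assuming M_i ~ Normal(0, sd=X)."""
    unnormalized = {
        x: p * math.prod(normal_pdf(m, 0.0, x) for m in measurements)
        for x, p in prior.items()
    }
    z = sum(unnormalized.values())
    return {x: w / z for x, w in unnormalized.items()}


prior = {1.0: 0.5, 3.0: 0.5}   # hypothetical two-point prior on the noise scale X
tight = [-0.5, 0.5]            # average 0, small spread
spread = [-4.0, 4.0]           # average 0, large spread

post_tight = posterior_over_x(tight, prior)
post_spread = posterior_over_x(spread, prior)
```

Both measurement sets collapse to the same abstract value M = 0 when M is just the average, yet `post_tight` favors X = 1 while `post_spread` puts nearly all its mass on X = 3. No function of M alone can reproduce both posteriors.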

Now, in our medical device example, the individual measurements Mi are not directly observed by the end user - they just see Y - so none of this is really a problem. The query P[X|M1,...,Mn] will never need to be run anyway. That said, a small adjustment to the abstract model does allow us to handle that query.

## Natural Abstraction for the Medical Sensor

Let’s modify our abstract model from the previous section so that P[X|M]=P[X|M1,...,Mn]. Rather than just keeping the information relevant to Y, our M node will also need to keep information relevant to X. (The next three paragraphs briefly explain how to do this, but can be skipped if you're not interested in the details.)

By the minimal map theorems, all the information in $\{M_i\}$ which is relevant to $X$ is contained in the distribution $P[X \mid \{M_i\}]$. So we could just declare that node $M$ is the tuple $\left( \frac{1}{n} \sum_i M_i,\ (x \to P[X = x \mid \{M_i\}]) \right)$, where the second item is the full distribution of $X$ given $\{M_i\}$ (expressed as a function). But notation gets confusing when we carry around distributions as random variables in their own right, so instead we’ll simplify things a bit by assuming the measurements follow a maximum entropy distribution - just remember that this simplification is a convenience, not a necessity.

We still need to keep all the information in $\{M_i\}$ which is relevant to $X$, which means we need to keep all the information needed to compute $P[X \mid \{M_i\}]$. From the DAG structure, we know that $P[X \mid \{M_i\}] = \frac{1}{Z} P[X] \prod_i P[M_i \mid X]$, where $Z$ is a normalizer. $P[X]$ is part of the model, so the only information we need from $\{M_i\}$ to compute $P[X \mid \{M_i\}]$ is the product $\prod_i P[M_i \mid X]$. If we assume the measurements follow a maxentropic distribution (for simplicity), then $\prod_i P[M_i \mid X] \propto e^{\lambda^T \sum_i f(M_i)}$, for some vector $\lambda$ and vector-valued function $f$ (both specified by the model). Thus, all we need to keep around to compute $P[X \mid \{M_i\}]$ is $\sum_i f(M_i)$ - the sufficient statistic.

Main point: the node $M$ consists of the pair $\left( \frac{1}{n} \sum_i M_i,\ \sum_i f(M_i) \right)$. If we want to simplify even further, we can just declare that $f_0$ is the identity function (possibly with $\lambda_0 = 0$), and then node $M$ is just $\sum_i f(M_i)$, assuming the number $n$ of measurements is fixed.
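As a rough check of the sufficient-statistic claim, the sketch below computes a posterior over X twice: once from the full list of measurements, and once from the summed statistic alone. It assumes Gaussian measurements whose mean and standard deviation are both set by X, so that f(m) = (m, m²). The hypothesis grid, the uniform prior, and the function names are hypothetical choices made for the example.

```python
import math


def log_lik_full(measurements, mean, sd):
    """log prod_i P[M_i | X] computed from every individual measurement."""
    return sum(
        -math.log(sd * math.sqrt(2 * math.pi)) - 0.5 * ((m - mean) / sd) ** 2
        for m in measurements
    )


def log_lik_from_stat(n, sum_m, sum_m2, mean, sd):
    """The same quantity computed only from n and sum_i f(M_i), with f(m) = (m, m^2)."""
    return (-n * math.log(sd * math.sqrt(2 * math.pi))
            - (sum_m2 - 2 * mean * sum_m + n * mean ** 2) / (2 * sd ** 2))


def normalize(log_weights):
    """Turn log-weights into a distribution (uniform prior assumed)."""
    top = max(log_weights.values())
    weights = {k: math.exp(v - top) for k, v in log_weights.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}


# Hypothetical grid: each value of X is a (mean, sd) pair for the measurements.
hypotheses = [(mean, sd) for mean in (-1.0, 0.0, 1.0) for sd in (0.5, 1.0, 2.0)]
measurements = [0.3, 1.1, -0.4, 0.9, 0.6]

n = len(measurements)
sum_m = sum(measurements)                   # first component of sum_i f(M_i)
sum_m2 = sum(m * m for m in measurements)   # second component of sum_i f(M_i)

post_full = normalize({h: log_lik_full(measurements, *h) for h in hypotheses})
post_stat = normalize({h: log_lik_from_stat(n, sum_m, sum_m2, *h) for h in hypotheses})
```

The two posteriors agree up to floating-point error. Since n is fixed, `sum_m` also determines the average, so in this example the single pair `(sum_m, sum_m2)` carries everything the abstract node M needs for both X and Y.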

What does this buy us?

First and foremost, our abstract model now supports all probabilistic queries: P[X|Y], P[X|M], P[M|Y], P[Y|X], etc, will all return the same values as the corresponding queries on the concrete model (with M corresponding to {Mi}). The same counterfactuals remain valid with the same correspondences as before, and the counterfactually-modified abstract models will also support the additional probabilistic queries.

We can even add in one extra feature:

Huh? What’s going on here?

Remember, M contains all of the information from {Mi} which is relevant to X or Y. That means {Mi} is conditionally independent of both X and Y, given M (this is a standard result in information theory). So we can add {Mi} into the DAG as a child of M, resulting in the overall distribution

$$P[X, Y, M, \{M_i\}] = P[X] \, P[M \mid X] \, P[Y \mid M] \, P[\{M_i\} \mid M]$$

Since {Mi} is just a child node dangling off the side, any probabilistic queries not involving any Mi will just automatically ignore it. Any probabilistic queries which do involve any Mi will incorporate relevant information from X and Y via M.
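The conditional independence of {Mi} and X given M can be checked exactly in a tiny discrete case. The sketch below assumes Bernoulli measurements with success probability X, for which the sum of the Mi is a sufficient statistic (and fixes the average when n is fixed). This setup and the function name are illustrative assumptions, not the post's model.

```python
import itertools


def conditional_given_stat(x, n):
    """P[(M_1..M_n) | M = sum_i M_i, X = x], assuming M_i ~ Bernoulli(x)."""
    joint = {}
    for ms in itertools.product((0, 1), repeat=n):
        k = sum(ms)
        joint[ms] = x ** k * (1 - x) ** (n - k)   # P[M_1..M_n | X = x]

    # P[M = k | X = x], obtained by summing over outcomes with the same total.
    totals = {}
    for ms, p in joint.items():
        totals[sum(ms)] = totals.get(sum(ms), 0.0) + p

    return {ms: p / totals[sum(ms)] for ms, p in joint.items()}


n = 4
cond_low = conditional_given_stat(0.2, n)    # two different values of X ...
cond_high = conditional_given_stat(0.7, n)   # ... give the same P[{M_i} | M]
```

The two tables match: given the total, every arrangement with that total is equally likely regardless of X. That is what lets P[{Mi}|M] hang off M as a child node without any arrow from X.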

What about counterfactuals?

Counterfactual settings of X, Y, and M still work just like before, and we can generally run probabilistic queries involving the Mi on the counterfactually-modified DAGs. Cutting the X→M arrow still corresponds to cutting all the X→Mi arrows in the concrete model. The addition of {Mi} to the model even lets us calculate which {Mi} are compatible with a particular counterfactual setting of M, although I don’t (yet) know of any useful interpretation to attribute to the distribution P[{Mi}|M] in that case.

We still can’t directly translate counterfactuals from the concrete model to the abstract model - e.g. a counterfactual setting of M1 in the concrete model does not easily correspond to anything in the abstract model. We also can’t directly run counterfactuals on {Mi} in the abstract model; we have to run them on M instead. But if a counterfactual modification is made elsewhere in the DAG, the probabilistic queries of {Mi} within the counterfactual model will work.

That brings us to the most important property of this abstraction, and the real reason I call it “natural”: what if this is all just a sub-component of a larger model?

Here’s the beauty of it: everything still works. All probabilistic queries are still supported, all of the new counterfactuals are supported. And all we had to account for was the local effects of our abstraction - i.e. M had to contain all the information relevant to X and Y. (In general, an abstracted node needs to keep information relevant to its Markov blanket.) Any information relevant to anything else in the DAG is mediated by X and/or Y, so all of our transformations from earlier still maintain invariance of the relevant queries, and we’re good.

By contrast, our original abstraction - in which we kept the information relevant to Y but didn’t worry about X - would mess up any queries involving the information contained in {Mi} relevant to X. That includes P[A|{Mi}], P[B|{Mi}], etc. To compute those correctly, we would have had to fall back on the concrete model, and wouldn’t be able to leverage the abstract model at all. But in the natural abstraction, where M contains all information relevant to X or Y, we can just compute all those queries directly in the abstract model - while still gaining the efficiency benefits of abstraction when possible.