Formulating Reductive Agency in Causal Models

by johnswentworth, 1 min read, 23rd Jan 2020


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

The previous post talked about what agenty systems look like, in the context of causal models. The reductive agency problem asks: how are agenty systems built out of non-agenty pieces?

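In the context of causal models, we know that non-agenty models look like this:

[Figure: a causal diagram built only from nodes and arrows]

… and agenty models look like this (see previous post for what the clouds mean):

[Figure: a causal diagram that also contains clouds]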

So the reductive agency problem on causal models would be: how can we build something which looks like the second diagram, from pieces which look like the first?

Obvious first answer: we can’t. No amount of arrows will add a cloud to our diagram; it’s a qualitatively different type of thing.

Less obvious second answer: perhaps a non-agenty model can abstract into an agenty model. I’ve been going on and on about abstraction of causal models, after all.

Let’s review what that would mean, based on our earlier discussions of abstraction.

Abstraction of causal models means:

  • we take some low-level/concrete/territory causal model…
  • transform it into a high-level/abstract/map causal model…
  • in such a way that we can answer (some) queries on the low-level model by transforming them into queries on the high-level model.

The queries in question include both ordinary probabilistic queries (e.g. $P[X \mid Y = y]$) and interventions/counterfactuals (e.g. $P[X \mid do(Y = y)]$).
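To make the two kinds of query concrete, here is a minimal sketch on a toy two-node model X → Y. Everything in it is an assumption for illustration: the binary variables, the 50/50 prior on X, the 10% flip probability, and the helper names `joint` and `prob_x1_given_y1`. It computes the same quantity once by conditioning on Y = 1 and once under do(Y = 1).

```python
from fractions import Fraction

# Toy model (hypothetical): X -> Y, both binary.
P_X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
FLIP = Fraction(1, 10)  # probability that Y differs from X


def p_y_given_x(y, x):
    """Mechanism for Y: copy X, flipped with probability FLIP."""
    return 1 - FLIP if y == x else FLIP


def joint(do_y=None):
    """Joint distribution over (x, y); do_y replaces Y's mechanism with a constant."""
    table = {}
    for x in (0, 1):
        for y in (0, 1):
            if do_y is None:
                table[(x, y)] = P_X[x] * p_y_given_x(y, x)
            else:
                # Intervention: the arrow X -> Y is cut, Y is pinned to do_y.
                table[(x, y)] = P_X[x] * (1 if y == do_y else 0)
    return table


def prob_x1_given_y1(table):
    """P[X = 1 | Y = 1] computed from a joint table."""
    return table[(1, 1)] / (table[(0, 1)] + table[(1, 1)])


observational = prob_x1_given_y1(joint())          # ordinary conditional query
interventional = prob_x1_given_y1(joint(do_y=1))   # query under do(Y = 1)
print(observational, interventional)
```

Observing Y = 1 is evidence about X, so the conditional query moves away from the prior. Setting Y = 1 by intervention tells us nothing about X, so the do() query returns the prior unchanged.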

We want our abstract model to include agenty things - i.e. clouds and specifically strange loops (clouds with arrows pointing inside themselves). As discussed in the previous post, the distinguishing feature of the clouds is that, if we change the model within the cloud (e.g. via a do() operation), then that changes the cloud, and anything downstream of the cloud will update accordingly. So, to get an abstract agenty model, there need to be queries on our low-level non-agenty model which produce the same answers (maybe modulo some processing) as model-changing queries in the agenty model.
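One hedged way to write that requirement down, using notation that is not from the post ($M_L$, $M_H$, $\tau$ and $g$ are all names assumed here for illustration):

```latex
% Hypothetical notation:
%   M_L     : the low-level, non-agenty causal model
%   M_H     : the high-level, agenty causal model
%   q       : a supported query on M_L
%   \tau(q) : its translation into a (possibly model-changing) query on M_H
%   g       : the "some processing" applied to the high-level answer
\[
  \mathrm{answer}_{M_L}(q) \;=\; g\bigl(\mathrm{answer}_{M_H}(\tau(q))\bigr)
  \quad \text{for every supported query } q .
\]
```

This is only a restatement of the paragraph above in symbols. It says nothing about which queries are supported or what $\tau$ and $g$ look like.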

Here be monsters already gave an example where something like this happens. There’s some hidden variable $X$ (possibly with complicated internal structure of its own), and a bunch of conditionally IID measurements $Y_1, \ldots, Y_n$. A “detector” node simply looks for outliers among the $Y_i$’s: it’s 1 if it detects an outlier, 0 if not.

Assuming a narrow error distribution on the $Y_i$’s, the detector node will never actually light up. But if we perform an intervention - i.e. set one of the $Y_i$’s to some value - then the detector (usually) will light up. So our system is equivalent to this:

… where the detector looks at the cloud-model and lights up if some of the arrows are missing. This still isn’t a full agenty model - we don’t have an arrow from a cloud pointing back inside the cloud itself - but it does show that ordinary cloud-less models can abstract into models with clouds.
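The outlier-detector example can be simulated directly. The sketch below is a toy version under loud assumptions, none of which come from the post: the hidden variable is Gaussian, measurement error is bounded uniform noise (so "never lights up" holds exactly), the detector compares each measurement to the median, and the names `sample_measurements`, `detector` and `firing_rate` are made up for illustration.

```python
import random
import statistics

NOISE = 0.01      # half-width of the measurement error (hypothetical value)
THRESHOLD = 1.0   # distance from the median that counts as an outlier (hypothetical)


def sample_measurements(rng, n=5, do=None):
    """Sample n noisy measurements of a hidden variable x.

    do=(index, value) models an intervention: measurement `index` is set to
    `value`, cutting its dependence on x.
    """
    x = rng.gauss(0.0, 10.0)  # hidden variable
    ys = [x + rng.uniform(-NOISE, NOISE) for _ in range(n)]
    if do is not None:
        index, value = do
        ys[index] = value
    return ys


def detector(ys):
    """Return 1 if any measurement is far from the median, else 0."""
    center = statistics.median(ys)
    return int(any(abs(y - center) > THRESHOLD for y in ys))


def firing_rate(do=None, trials=2000, seed=0):
    """Fraction of sampled worlds in which the detector lights up."""
    rng = random.Random(seed)
    hits = sum(detector(sample_measurements(rng, do=do)) for _ in range(trials))
    return hits / trials


rate_observed = firing_rate()                 # no intervention
rate_intervened = firing_rate(do=(0, 5.0))    # do(first measurement = 5.0)
print(rate_observed, rate_intervened)
```

Without an intervention all measurements sit within 2 × `NOISE` of each other, so the detector cannot fire. Under the intervention it fires unless the hidden variable happens to land within `THRESHOLD` of the forced value, which matches the "(usually)" in the text. In that sense the detector's output answers a question about whether an arrow into a measurement has been cut.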

More generally, we’d like a theory saying what low-level non-agenty models abstract into what agenty high-level models, and what queries are/aren’t supported.


