Causal Abstraction Intro

by johnswentworth, 19th Dec 2019

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I haven't been terribly satisfied by the first few posts in this sequence; they don't do a very good job introducing things. I think part of the problem is the format, so I decided to invest in a high-end studio and try making a video instead.

It's about 10 minutes long and covers similar material to the first few posts, but IMO it does a better job of communicating what's going on.

[Video: Causal Abstraction Intro]

Feedback on the format is appreciated (including "didn't watch the video but would have read a normal post" or vice versa). So far I think the video provides better explanation per unit of effort, and I lean toward doing more of them. Obviously both the setup and the postprocessing were pretty low-investment on this one; I'll probably put a bit more effort into production if I'm going to do this regularly.


Didn't watch the video but would have read the post. Might watch the video only because previous posts have been appetising enough.

Two points.

First, I don't mind the new format as long as there is some equivalent written reference I can go to, the same way the Embedded Agency sequence has both the full written document and the fun diagrams. This makes it easier to reference individual components of the material in later discussion. On Reddit, I find it's far more difficult to discuss specific points in video content, because I have to transcribe the section I want to talk about in order to quote it properly.

Second, I might have missed this, but is there a reason we're limiting ourselves to abstract causal models? I get that they're useful for answering queries with the do() operator, but there are many situations where it doesn't make sense to model the system as a DAG.

is there a reason we're limiting ourselves to abstract causal models?

Great question. I considered addressing that in the intro video, but decided to keep the "why this topic?" question separate.

I talk about this a fair bit in Embedded Agency via Abstraction. Major reasons for the choice:

  • Causal models are a well-characterized, self-contained model class. We know what all the relevant queries are. At the same time, they apply to a huge variety of real-world systems, at multiple levels of abstraction, and (with symmetry) even provide a Turing-equivalent model of computation.
  • Built-in counterfactuals mean we don't need a bunch of extra infrastructure to apply results to decision theory. It's hard to imagine a theory of agency without some kind of counterfactuals in it (since off-equilibrium behavior matters for game theory), and causal models are the simplest model class with built-in support for counterfactuals.
  • Combining the previous two bullets: I expect that causal models are a relatively well-characterized model class which is nonetheless likely to exhibit most of the key qualitative properties which we need to figure out for embedded agency.
  • Finally, my intuition is that causal models (with simple function-nodes) tend to naturally encourage avoiding black boxes, in a way that e.g. logic or Turing machines do not. They make it natural to think about computations rather than functions. That, in turn, will hopefully provide a built-in line of defense against various diagonalization problems.
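
To make "built-in counterfactuals" concrete, here is a minimal sketch in Python, assuming a toy circuit-style model. The node names, the `evaluate` helper, and all numbers are hypothetical illustrations, not taken from the video or the earlier posts. Each node is a simple function of its parents, and a do()-style intervention just overrides a node's value while leaving everything upstream untouched.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each node maps to (parent names, function of the parents' values).
Node = Tuple[List[str], Callable[..., float]]


def evaluate(model: Dict[str, Node],
             do: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Evaluate every node of a DAG, overriding any node listed in `do`."""
    do = do or {}
    values: Dict[str, float] = {}

    def value_of(name: str) -> float:
        if name not in values:
            if name in do:
                # Intervention: fix the value and ignore the node's parents.
                values[name] = do[name]
            else:
                parents, fn = model[name]
                values[name] = fn(*(value_of(p) for p in parents))
        return values[name]

    for name in model:
        value_of(name)
    return values


# Hypothetical toy model: voltage and resistance determine current,
# and voltage and current determine dissipated power.
model: Dict[str, Node] = {
    "voltage": ([], lambda: 6.0),
    "resistance": ([], lambda: 3.0),
    "current": (["voltage", "resistance"], lambda v, r: v / r),
    "power": (["voltage", "current"], lambda v, i: v * i),
}

observed = evaluate(model)                         # no intervention
intervened = evaluate(model, do={"current": 1.0})  # do(current = 1.0)
```

Under the intervention, `power` is recomputed from the forced current, while `voltage` keeps its original value: the override propagates downstream only. That asymmetry is what separates a do() query from ordinary conditioning in this sketch.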

I don't mind the new format as long as there is some equivalent written reference I can go to.

I'm still undecided on how to handle this. The problem with e.g. a transcription is that I'm largely talking about the diagrams, pointing at them, drawing on them, etc.; that's a big part of why it feels easier to communicate this stuff via video in the first place. Maybe labeling the visuals would help? Not sure. I'm definitely open to suggestions on that front.


decided to invest in a high-end studio

I didn't catch that this was a lie until I clicked the link. The linked post is hard to understand - it seems to rely on the reader being similar enough to the author to guess at context. Rest assured that you are confusing someone.

Great video! It was easier to understand than the previous posts, and it got your point across well. I've been dwelling on similar ideas recently, and will be pointing to this video as a reference.

Strongly agree that causal models need lots of visuals. I liked the video, but I also realize I understood it because I already know what counterfactuals and causal inference are. I think that is actually a fair assumption given your audience and the goals of this sequence. Nonetheless, I think you should provide some links to the required background information.

I am not familiar with circuits or fluid dynamics so those examples weren't especially elucidating to me. But I think as long as a reader understands one or two of your examples it is fine. Part of making this judgment depends upon your own personal intuition about how labor should be divided between author and reader. I am fine with high labor, and making a video is, IMO, already quite difficult.

I think you should keep experimenting with the medium.