Is Causality in the Map or the Territory?

by johnswentworth, 17th Dec 2019



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

steve2152 brought up a great example:

Consider a 1kΩ resistor, in two circuits. The first circuit is the resistor attached to a 1V [voltage] supply. Here an engineer would say: "The supply creates a 1V drop across the resistor; and that voltage drop causes a 1mA current to flow through the resistor." The second circuit is the resistor attached to a 1mA current source. Here an engineer would say: "The current source pushes a 1mA current through the resistor; and that current causes a 1V drop across the resistor." Well, it's the same resistor ... does a voltage across a resistor cause a current, or does a current through a resistor cause a voltage, or both, or neither? [...] my conclusion was that people think about causality in a way that is not rooted in physics, and indeed if you forced someone to exclusively use physics-based causal models, you would be handicapping them.

First things first: we're talking about causality, which means we're mainly talking about counterfactuals - questions of the form "what would the system do if we did X?". (See Pearl's book for lots of detail on how and why causality and counterfactuals go together.)

In the resistor example, both scenarios yield exactly the same actual behavior (assuming we've set the parameters appropriately), but the counterfactual behavior differs - and that's exactly what defines a causal model. In this case, the counterfactuals are things like "what if we inserted a different resistor?" and "what if we adjusted the knob on the supply?". If it's a voltage supply, then a voltage -> current model ("voltage causes current") correctly answers the counterfactuals:

  • Inserting a different resistor changes the current but not the voltage. In the voltage -> current model, we cut the arrow going into "current" and set that node to a new value.
  • Adjusting the knob on the supply changes the voltage, and the current adjusts to match. In the voltage -> current model, we set the "voltage" node to a new value, and the model tells us how to update the "current" node.

Conversely, if it's a current supply, then a current -> voltage model ("current causes voltage") correctly answers the counterfactuals. It is a mistake here to think of "the territory" as just the resistor by itself; the supply is a critical determinant of the counterfactual behavior, so it needs to be included in order to talk about causality.
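As a rough illustration of the two causal models, here is a minimal Python sketch. It assumes idealized supplies and a purely ohmic resistor. The class and function names (`VoltageSupplyCircuit`, `CurrentSupplyCircuit`, `swap_resistor`) are made up for this example, and modelling the resistor swap as a change to the resistance parameter is just one way to render that intervention.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VoltageSupplyCircuit:
    """voltage -> current: the supply pins V; I follows from Ohm's law."""
    supply_volts: float
    resistance_ohms: float

    @property
    def voltage(self) -> float:
        return self.supply_volts

    @property
    def current(self) -> float:
        return self.supply_volts / self.resistance_ohms


@dataclass(frozen=True)
class CurrentSupplyCircuit:
    """current -> voltage: the supply pins I; V follows from Ohm's law."""
    supply_amps: float
    resistance_ohms: float

    @property
    def current(self) -> float:
        return self.supply_amps

    @property
    def voltage(self) -> float:
        return self.supply_amps * self.resistance_ohms


def swap_resistor(circuit, new_ohms):
    """Counterfactual query: what if we inserted a different resistor?"""
    return replace(circuit, resistance_ohms=new_ohms)


# Same actual behavior: 1 V across and 1 mA through a 1 kOhm resistor.
v_circuit = VoltageSupplyCircuit(supply_volts=1.0, resistance_ohms=1000.0)
i_circuit = CurrentSupplyCircuit(supply_amps=1e-3, resistance_ohms=1000.0)

# Different counterfactual behavior once the resistor becomes 2 kOhm.
v_swapped = swap_resistor(v_circuit, 2000.0)  # voltage held, current halves
i_swapped = swap_resistor(i_circuit, 2000.0)  # current held, voltage doubles
```

The two objects agree on every actual measurement and disagree only after `swap_resistor`, which is the sense in which the supply, not the resistor alone, fixes the causal model.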

Note that all the counterfactual queries in this example are physically grounded - they are properties of the territory, not the map. We can actually go swap the resistor in a circuit and see what happens.

But Which Counterfactuals?

Of course, there's still the question of how we decide which counterfactuals to support. That is mainly a property of the map, so far as I can tell, but there's a big catch: some sets of counterfactual queries will require keeping around far less information than others. A given territory supports "natural" classes of counterfactual queries, which require relatively little information to yield accurate predictions for the whole query class. In this context, the lumped circuit abstraction is one such example: we keep around just high-level summaries of the electrical properties of each component, and we can answer a whole class of queries about voltage or current measurements. Conversely, if we wanted to support a few queries about the readings from a voltage probe, a few queries about the mass of various circuit components, and a few queries about the number of protons in a wire mod 3... these all require completely different information to answer. It's not a natural class of queries.
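To make the "little information, many queries" point concrete, here is a small hedged sketch of a lumped model. The circuit (an ideal voltage supply driving resistors in series) and the function name `lumped_series_answers` are assumptions chosen for illustration; they are not part of the original example.

```python
def lumped_series_answers(supply_volts, resistances_ohms):
    """Answer a class of voltage/current queries from lumped summaries only.

    The only stored information is one number per component (its resistance)
    plus the supply setting; nothing about geometry, mass, or proton counts.
    """
    total_ohms = sum(resistances_ohms)
    # In a series circuit the same current flows through every component.
    current_amps = supply_volts / total_ohms
    # Ohm's law gives the drop a voltage probe would read across each resistor.
    drops_volts = [current_amps * r for r in resistances_ohms]
    return {"current_amps": current_amps, "drops_volts": drops_volts}


# Hypothetical circuit: 3 V supply across 1 kOhm and 2 kOhm in series.
answers = lumped_series_answers(3.0, [1000.0, 2000.0])
```

Queries about component mass or protons mod 3 could not be answered from these inputs at all; each would need its own unrelated data, which is what makes that mixed set an unnatural class.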

So natural classes of queries imply natural choices of abstract model, possibly including natural choices of causal model. There will still be some choice in which queries we care about, and what information is actually available will play a role in that choice (e.g. even if we cared about the number of protons mod 3, we have no way to get that information).

In our example above, the voltage -> current model is such a natural abstraction when we have a voltage supply. Using just the basic parameters of the system (i.e. resistance and the present system state), it allows us to accurately answer questions about both the system's present state and counterfactual changes. The same holds for the current -> voltage model when using a current supply.

But... although a voltage -> current model is a natural abstraction when we're using a voltage supply, it's not clear that the current -> voltage model is an unnatural one in that same setting. It won't correctly answer counterfactuals about swapping out a resistor or adjusting the knob on the supply, but perhaps there is some other class of counterfactual queries which would be correctly answered by the current -> voltage model? (One class of counterfactual queries which it would correctly answer is "swap the voltage supply for a current supply, and then...". But that class of queries just reinforces the idea that voltage -> current is a natural abstraction for a voltage supply, and current -> voltage is not.)
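The mismatch can be sketched numerically. The snippet below assumes an idealized voltage supply as ground truth and asks both models the resistor-swap query; all function names are hypothetical, and each function returns a `(volts, amps)` pair.

```python
def voltage_to_current_model(volts, new_ohms):
    """Treats voltage as the cause: hold V fixed, recompute I."""
    return volts, volts / new_ohms


def current_to_voltage_model(amps, new_ohms):
    """Treats current as the cause: hold I fixed, recompute V."""
    return amps * new_ohms, amps


def voltage_supply_ground_truth(supply_volts, new_ohms):
    """What an idealized voltage supply does when the resistor is swapped."""
    return supply_volts, supply_volts / new_ohms


# Present state of the circuit: 1 V, 1 mA, with a 1 kOhm resistor installed.
observed_volts, observed_amps = 1.0, 1e-3
new_ohms = 2000.0  # counterfactual: insert a 2 kOhm resistor instead

truth = voltage_supply_ground_truth(observed_volts, new_ohms)
right = voltage_to_current_model(observed_volts, new_ohms)
wrong = current_to_voltage_model(observed_amps, new_ohms)
```

Here `right` matches `truth`, while `wrong` predicts that the voltage doubles and the current stays put. That second prediction is what we would get only after first swapping the voltage supply for a current supply.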


