
steve2152 brought up a great example:

Consider a 1kΩ resistor, in two circuits. The first circuit is the resistor attached to a 1V [voltage] supply. Here an engineer would say: "The supply creates a 1V drop across the resistor; and that voltage drop causes a 1mA current to flow through the resistor." The second circuit is the resistor attached to a 1mA current source. Here an engineer would say: "The current source pushes a 1mA current through the resistor; and that current causes a 1V drop across the resistor." Well, it's the same resistor ... does a voltage across a resistor cause a current, or does a current through a resistor cause a voltage, or both, or neither? [...] my conclusion was that people think about causality in a way that is not rooted in physics, and indeed if you forced someone to exclusively use physics-based causal models, you would be handicapping them.

First things first: we're talking about causality, which means we're mainly talking about counterfactuals - questions of the form "what would the system do if we did X?". (See Pearl's book for lots of detail on how and why causality and counterfactuals go together.)

In the resistor example, both scenarios yield exactly the same actual behavior (assuming we've set the parameters appropriately), but the counterfactual behavior differs - and that's exactly what defines a causal model. In this case, the counterfactuals are things like "what if we inserted a different resistor?" and "what if we adjusted the knob on the supply?". If it's a voltage supply, then a voltage -> current model ("voltage causes current") correctly answers the counterfactuals:

  • Inserting a different resistor changes the current but not the voltage. In the voltage -> current model, we cut the arrow going into "current" and set that node to a new value.
  • Adjusting the knob on the supply changes the voltage, and the current adjusts to match. In the voltage -> current model, we set the "voltage" node to a new value, and the model tells us how to update the "current" node.

Conversely, if it's a current supply, then a current -> voltage model ("current causes voltage") correctly answers the counterfactuals. It is a mistake here to think of "the territory" as just the resistor by itself; the supply is a critical determinant of the counterfactual behavior, so it needs to be included in order to talk about causality.
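To make the difference concrete, here is a minimal Python sketch (my own illustration, not part of the original example) of the two models as structural equations; the function names are invented for this sketch:

```python
# Two causal models of the same resistor, written as structural equations.
# Hypothetical names; a sketch, not a general structural-causal-model library.

def voltage_causes_current(V_supply, R):
    """Voltage supply: the knob fixes V; Ohm's law determines I downstream."""
    V = V_supply   # exogenous: set by the knob on the supply
    I = V / R      # structural equation: current is downstream of voltage
    return V, I

def current_causes_voltage(I_supply, R):
    """Current supply: the knob fixes I; Ohm's law determines V downstream."""
    I = I_supply   # exogenous: set by the knob on the supply
    V = I * R      # structural equation: voltage is downstream of current
    return V, I

# Same actual behavior once the parameters are matched:
print(voltage_causes_current(1.0, 1000.0))    # (1.0 V, 0.001 A)
print(current_causes_voltage(0.001, 1000.0))  # (1.0 V, 0.001 A)

# Counterfactual: swap in a 2 kOhm resistor.
print(voltage_causes_current(1.0, 2000.0))    # voltage unchanged, current halves
print(current_causes_voltage(0.001, 2000.0))  # current unchanged, voltage doubles
```

The two models agree exactly on the factual behavior and only come apart on the counterfactuals - which is the point.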

Note that all the counterfactual queries in this example are physically grounded - they are properties of the territory, not the map. We can actually go swap the resistor in a circuit and see what happens.

But Which Counterfactuals?

Of course, there's still the question of how we decide which counterfactuals to support. That is mainly a property of the map, so far as I can tell, but there's a big catch: some sets of counterfactual queries will require keeping around far less information than others. A given territory supports "natural" classes of counterfactual queries, which require relatively little information to yield accurate predictions for the whole query class. In this context, the lumped circuit abstraction is one such example: we keep around just high-level summaries of the electrical properties of each component, and we can answer a whole class of queries about voltage or current measurements. Conversely, if we wanted to support a few queries about the readings from a voltage probe, a few queries about the mass of various circuit components, and a few queries about the number of protons in a wire mod 3... these all require completely different information to answer. It's not a natural class of queries.

So natural classes of queries imply natural choices of abstract model, possibly including natural choices of causal model. There will still be some choice in which queries we care about, and what information is actually available will play a role in that choice (i.e. even if we cared about number of protons mod 3, we have no way to get that information).

In our example above, the voltage -> current model is such a natural abstraction when we have a voltage supply. Using just the basic parameters of the system (i.e. resistance and the present system state), it allows us to accurately answer questions about both the system's current state and counterfactual changes. Same for the current -> voltage model when using a current supply.

But... although a voltage -> current model is a natural abstraction when we're using a voltage supply, it's not clear that the current -> voltage model is not. It won't correctly answer counterfactuals about swapping out a resistor or adjusting the knob on the supply, but perhaps there is some other class of counterfactual queries which would be correctly answered by the current -> voltage model? (One class of counterfactual queries which it would correctly answer is "swap the voltage supply for a current supply, and then...". But that class of queries just reinforces the idea that voltage -> current is a natural abstraction for a voltage supply, and current -> voltage is not.)

Comments

I don't fully trust my knowledge in this domain, but this particular example seems questionable to me just because "current sources" are kind of weird. I'll throw out a few ideas from my undergrad EE (I didn't focus on the electromagnetism side, so I'm a bit weak here).

  • Mentally, I use the abstraction that voltage (differences in electrical potential) causes current flows.
  • This probably isn't quite right, but "current sources" are in some sense a bit fictitious. The defining feature is that they maintain constant current regardless of the load placed across the terminals, but in practice, you can set up a device that behaves like that by supplying whatever voltage is necessary to maintain a fixed current. So you can model a "current-source" as "a device which adapts its voltage difference to produce constant current", which is compatible with a "voltage causes current" paradigm.
    • All real current sources have a limited range they can operate over dependent on how much voltage they can supply. If you had a truly ideal current source, you'd have an infinite energy machine.

See the Wiki entry on current sources, particularly the implementations. I didn't read through these, but a glance at several shows that, on the inside, they involve configurations of voltage sources. Figure 3 is a pretty clear demonstration of how a "current source" is made from an adaptive voltage source.

![](https://upload.wikimedia.org/wikipedia/en/thumb/0/00/V-to-i_op-amp_current_source_1000.jpg/700px-V-to-i_op-amp_current_source_1000.jpg)

Caption: In an op-amp voltage-controlled current source the op-amp compensates the voltage drop across the load by adding the same voltage to the exciting input voltage.
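To make the "adaptive voltage source" picture above concrete, here is a toy feedback loop in Python (my own sketch, not the op-amp circuit in the figure; the gain and iteration count are arbitrary). Inside the loop the physics is still "voltage causes current"; the feedback just keeps adjusting the supplied voltage until the target current flows, whatever the load:

```python
# Toy model of a "current source" built from an adaptive voltage source.
# Hypothetical parameters; a real device would clip at its compliance voltage.

def emulated_current_source(I_target, R_load, gain=0.5, steps=200):
    V = 0.0
    for _ in range(steps):
        I = V / R_load                        # Ohm's law: V -> I, as always
        V += gain * (I_target - I) * R_load   # feedback: nudge V to close the error
    return V, V / R_load

# The same 1 mA flows through a 1k or a 2k load; only the voltage adapts.
print(emulated_current_source(1e-3, 1000.0))  # ~(1.0 V, 1 mA)
print(emulated_current_source(1e-3, 2000.0))  # ~(2.0 V, 1 mA)
```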

Now, notwithstanding that, there are still interesting questions about causality. (Again, proceeding with pretty entry-level knowledge of the physics here - I'm hoping someone will show up and add certainty and clarity.) There might be some clarity from thinking about charge instead of voltage and current. We observe that if you have electric potential differences (more charge concentrated in a place than elsewhere) and a conductive path between them, then you get current flows. Of course, you get differences in charge concentrations by moving charges around in the first place, i.e. also current flows. [The "charge movement" picture gets more complicated by how things like moving magnetic fields create voltage differences/current flows; I'm not really sure how to unify the two.]

Instructively, electric potential energy is similar in some ways to gravitational potential energy. At least, both arise from conservative forces obeying inverse-square laws. I can get gravitational potential energy by moving two bits of mass apart. If I release them and there's a path, the potential energy gets turned into kinetic energy and they move together. Of course, to separate them I had to move mass around. The motion of rolling a boulder up a hill and the motion of letting it roll back down are the same kind of motion.

Electric potentials seem the same (at least when thinking about electrostatics). Separating charge (current flow) creates potential differences which can be released and translate into motion (current flow).

In terms of the causality though, there seems to be something asymmetric. In some cases I'm putting energy into the system, causing motion, and building up potential energy (be it electric or gravitational). In other cases, I'm extracting energy from the system by letting it be used up to create motion.

In cases where you have a current source that's giving you energy, it is probably the case that the potential difference can be described as the cause of the flow (even if the potential difference produced by the device somehow adapts to get a fixed rate of motion/current). No one thinks that the motion of the car causes the combustion (use of potential chemical energy) rather than the other way round, even if I built my engine to produce a fixed speed no matter the mass of the vehicle it's in.

I would venture that any competent electrical engineer has a picture at least this detailed, and definitely does not think of voltage sources and current sources as black boxes, but rather as high-level descriptions of underlying physics which lead to very concrete and different predictions.

One problem I've been chewing on is how to think about causality and abstraction in the presence of a feedback controller. An un-blackboxed current supply is one example - my understanding is that they're typically implemented as a voltage supply with a feedback controller. Diving down into the low-level implementation details (charge, fields, etc.) is certainly one way to get a valid causal picture. But I also think that abstract causal models can be "correct" in some substantive but not-as-yet-well-understood sense, even when they differ from the underlying physical causality.

An example with the same issues as a current supply, but which is (hopefully) conceptually a bit simpler, is a thermostat. At the physical level, there's a feedback loop: the thermostat measures temperature, compares it to the target temperature, and adjusts the fuel burn rate up/down accordingly. But at the abstract level, I turn a knob on the thermostat, and that causes the temperature to change. I think there is a meaningful sense in which that abstract model is correct. By contrast, an abstract model which says "the change in room temperature a few minutes from now causes me to turn the knob on the thermostat" would be incorrect, as would a causal model in which the two are unconnected.
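As a toy illustration (my own sketch, with made-up first-order room dynamics and a simple proportional controller; all parameter values are arbitrary):

```python
# Physical level: temperature -> measurement -> burn rate -> temperature (a loop).
# Abstract level: the setpoint knob causes the room temperature.

def simulate_room(setpoint, T_ambient=15.0, k_loss=0.1, k_heat=0.5,
                  steps=500, dt=0.1):
    T = T_ambient
    for _ in range(steps):
        burn = max(0.0, setpoint - T)                         # burn more when too cold
        T += dt * (k_heat * burn - k_loss * (T - T_ambient))  # room heat balance
    return T

# Turning the knob (the only external intervention) moves the temperature:
print(simulate_room(setpoint=18.0))  # ~17.5 (proportional control leaves an offset)
print(simulate_room(setpoint=22.0))  # ~20.8
```

Despite the back-and-forth arrows inside the loop, the only thing we intervene on from outside is the knob, and the steady-state temperature tracks it.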

So... yes, the example given clearly does not match the underlying physical causality for a current supply. On the other hand, the same can be said of the voltage supply; the macroscopic measured behavior results from back-and-forth causal arrows between the EM fields and the charges. And that's all before we get down to quantum mechanics, at which point physical causality gets even more complicated. Point is: all of these models are operating at a pretty high level of abstraction, compared to the underlying physical reality. But it still seems like some abstract causal models are "right" and others are "wrong".

The OP is about what might underlie that intuition - what "right" and "wrong" mean for abstract causal models.

Yeah, that all seems fair/right/good and I see what you're getting at. I got nerdsniped by the current source example because it was familiar, and I felt that, as phrased, it got in the way of the core idea you were going for.

The person who properly introduced me to Pearl's causality stuff had an example which seems good here and definitely erodes the notion of causality being uni-directional in time. It seems equivalent to the thermostat one, I think. 

 Suppose I'm a politician seeking election:

  • At time t0, I campaign on a platform which causes people to vote for me at time t1.
  • On one hand, my choice of campaign is seemingly the cause of people voting for me afterwards.
  • On another hand, I chose the platform I did because of an action which would occur afterwards, i.e. the voting. If I didn't have a model that people would vote for a given platform, I wouldn't have chosen that platform. My model/prediction is of a real-world thing. So it kinda seems a bit like the causality flows backwards in time. The voting causes the campaign choice, same as the temperature changing in response to knob-turning causes the knob-turning.

I like the framing that the questions can be posed both for voltage supply and current supply, that seems more on track to me.

Positive reinforcement for noticing getting nerdsniped and mentioning it!

This and the parent comment were quite helpful for getting a more nuanced sense of what you're up to.

Point is: all of these models are operating at a pretty high level of abstraction, compared to the underlying physical reality. But it still seems like some abstract causal models are "right" and others are "wrong".

Good summary.

There would be an analogous example in hydraulics where positive displacement pumps are constant flow (~current) sources and centrifugal pumps are constant(ish*) pressure (~voltage) sources. The resistor would be a throttle.

In this case it is the underlying physical nature of the types of pumps which causes the effect rather than a feedback loop.

*At least at lower flow rates.

Really nice example, thanks!

  • This probably isn’t quite right, but “current sources” are in some sense a bit fictitious. The defining feature is that they maintain constant current regardless of the load placed across the terminals, but in practice, you can set up a device that behaves like that by supplying whatever voltage is necessary to maintain a fixed current. So you can model a “current-source” as “a device which adapts its voltage difference to produce constant current”, which is compatible with a “voltage causes current” paradigm.
    • All real current sources have a limited range they can operate over dependent on how much voltage they can supply. If you had a truly ideal current source, you’d have an infinite energy machine.

This is all true, however, voltage sources are equally fictitious, and a truly ideal voltage source would also be an infinite energy machine. As you increase the load, a real-life voltage source will start to behave more like a current source (and eventually like a smoke generating machine).

Yes, I suppose that's right too. A voltage source can't supply infinite current, i.e. it can't maintain that voltage if the load's resistance is too low, e.g. a perfectly conductive path.

The idea of using a causal model to model the behaviour of a steady-state circuit strikes me as unnatural. Doubly so for a simple linear circuit. Their behaviour can be well predicted by solving a set of equations. If you forced me to create a causal model, it would be:

Circuit description (topology, elements) -> circuit state (voltage at nodes, current between nodes)

IIRC, this is basically how SPICE does its DC and linear AC analysis. The circuit defines a set of equations. Their solution gives the voltages and currents that describe the steady state of the system. That way, it's easy to look at both mentioned counterfactuals, since each is just a change in the circuit description. The process of solving those equations is best abstracted as acausal, even if the underlying process is not.
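As a toy version of that (my own sketch in the spirit of SPICE's DC analysis, not its actual algorithm or code), here is a voltage divider solved by nodal analysis; each counterfactual is just an edit to the circuit description, followed by a re-solve:

```python
import numpy as np

def dc_solve(G, i):
    """Solve the node-voltage equations G v = i (Kirchhoff's current law)."""
    return np.linalg.solve(G, i)

# Voltage divider: a 1 V source driving R1, then R2 to ground.
# One unknown node, between R1 and R2.
R1, R2, Vs = 1000.0, 1000.0, 1.0
G = np.array([[1/R1 + 1/R2]])  # conductances seen at the middle node
i = np.array([Vs / R1])        # current injected through R1 by the source
print(dc_solve(G, i))          # [0.5] -- the middle node sits at 0.5 V

# Counterfactual "swap R2 for 2k": change the description and re-solve.
R2 = 2000.0
G = np.array([[1/R1 + 1/R2]])
print(dc_solve(G, i))          # [0.666...]
```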

This changes when you start doing transient analysis of the circuit. In that case, it starts to make sense to model the circuit using the state variables and to model how those state variables evolve. Then you can describe the system using differential equations and boundary conditions. They can be thought of as a continuous limit of causal DAGs. But even then, the state variables and the equations that describe their evolution are not unique. It's just a matter of what makes the math easier.
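A minimal transient sketch along those lines (again my own, with arbitrary component values): take the capacitor voltage of an RC circuit as the state variable and step its differential equation forward from the boundary condition:

```python
# Charging an RC circuit: dv_c/dt = (V_supply - v_c) / (R * C),
# integrated with forward Euler from v_c(0) = 0.

def rc_transient(V_supply=1.0, R=1000.0, C=1e-6, dt=1e-5, steps=500):
    v_c = 0.0                     # boundary condition: capacitor starts discharged
    for _ in range(steps):
        i = (V_supply - v_c) / R  # current through the resistor
        v_c += dt * i / C         # state update: each step's state causes the next
    return v_c

print(rc_transient())  # ~0.993 V after five time constants
```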

Not that this takes away from your point that you need different abstractions to answer different queries. For instance, the mass and physical space occupied by the circuit is relevant to most mechanical design queries, but not to most electrical design queries.

Note that all the counterfactual queries in this example are physically grounded - they are properties of the territory, not the map. We can actually go swap the resistor in a circuit and see what happens.

Objection: unless we actually do go swap the resistor, it seems that you are grounding counterfactuals in more counterfactuals (you used the word "can"!). Unless you mean to ground them in possibles, like shminux advocates.

I'm kind of amazed it took this long for someone to bring that up; I figured it would be the very first comment!

I think this is basically the right argument to make, and the conclusion to draw is that counterfactuals need to be in the map, not the territory, at the end of the day.

With that out of the way, I think the main takeaway of the OP/discussion is: yes, counterfactuals are ultimately in the map, but that does not imply anywhere near as much subjectivity as it might seem at first glance. A map has to match the territory in order to be useful; a map which matches the territory is an instrumentally convergent tool for a wide variety of objectives. Just as that puts some major constraints on probabilities, it also puts some major constraints on counterfactuals.

The knob on the current supply is a part of the territory, but the fact that "being able to adjust the knob" is an affordance of the territory, while "setting the knob so that it outputs a desired voltage" isn't (or at least is a less central example), is part of our map.

The other thing this reminds me of is the reductionist point (Sean Carroll video for laypeople here) that the laws of physics seem to be simplest when thought of not in terms of causes, but in terms of differential equations that enforce a pattern that holds between past and future.

The knob on the current supply is a part of the territory, but the fact that "being able to adjust the knob" is an affordance of the territory, while "setting the knob so that it outputs a desired voltage" isn't (or at least is a less central example), is part of our map.

On the one hand, yeah, that seems like the obvious objection, and it's a solid one. On the other hand... it does seem like "setting the knob so that it outputs a desired voltage" requires either a very unusual coincidence or a very artificial setup, and I'm not entirely convinced that the difference is purely a matter of perspective. It feels - at least to me - like there's something unusual going on there, in a sense which is not specific to human aesthetics.

I'm gonna think out loud here for a moment...

The big red flag is that "setting the knob so that it outputs a desired voltage" implies either one hell of a coincidence, or some kind of control system. And control systems are one of the things which I know, from other examples, can make a system with one causal structure behave-as-though it follows some other causal structure. (See e.g. the thermostat example in my response to Ruby's comment.) In fact, so far every example I've seen where the abstract causal structure doesn't match the causal structure of the underlying system involves either feedback control or some kind of embedded model of the system (and note that, by the good regulator theorem, feedback control implies some kind of embedded model).

This is obviously very speculative, but... I suspect that we could say something like "any set of counterfactual queries which flip the direction of this arrow need to involve an embedded model of the system (probably embedded in a controller)".

I'm not totally sure I'm objecting to anything. For something that thinks about and interacts with the world more or less like a human, I agree that turning a knob is probably an objectively better affordance than e.g. selecting the location of each atom individually.

You could even phrase this as an objective fact: "for agents in some class that includes humans, there are certain guidelines for constructing causal models that, if obeyed, lead to them being better predictors than if not." This would be a fact about the territory. And it would tell you that if you were like a human, and wanted to predict the effect of your actions, there would be some rules your map would follow.

And then if your map did follow those rules, that would be a fact about your map.


I think there's a way to drop the "for humans/agents like humans" part. Like, we could drop our circuit in the middle of the woods, and sometimes random animals would accidentally turn the knob on the supply, or temperature changes would adjust the resistance in the resistor. "Which counterfactuals actually happen sometimes" doesn't really seem like the right criterion to use as fundamental here, but it does suggest that there's something more universal in play.

I think another related qualitative intuition is constructive vs. nonconstructive. "Just turn the knob" is simple and obvious enough to you to be regarded as constructive, not leaving any parts unspecified for a planner to compute. "Just set the voltage to 10V" seems nonconstructive - like it would require further abstract thought to make a plan to make the voltage be 10V. But as we've learned, turning knobs is a fairly tricky robotics task, requiring plenty of thought - just thought that's unconscious in humans.

This point might be useless, but it feels like we are substituting sub-maps for the territory here. This example looks to me like:

Circuits -> Map

Physics -> Sub-map

Reality -> Territory

I intuitively feel like a causal signature should show up in the sub-map of whichever level you are currently examining. I am tempted to go as far as saying the degree to which the sub-map allows causal inference is effectively a measure of how close the layers are on the ladder of abstraction. In my head this sounds something like "perfect causal inference implies the minimum coherent abstraction distance."

I do agree with the sub-maps point, and think it is relevant, although I also don't think we currently understand abstraction well enough to figure it out.

I intuitively feel like a causal signature should show up in the sub-map of whichever level you are currently examining...

Counterexample: feedback control. In day-to-day activity, I use a model in which turning the dial on a thermostat causes a room to heat up. The underlying reality is much more complicated, with a bunch of back-and-forth causal arrows. One way to say it: the purpose of a feedback controller is to make a system behave, at the abstract level, as if it had a different causal structure.

I have a hard time thinking of that example as a different causal structure. Rather I think of it as keeping the same causal structure, but abstracting most of it away until we reach the level of the knob; then we make the knob concrete. This creates an affordance.

Of course when I am in my house I am approaching it from the knob-end, so mostly I just assume some layers of hidden detail behind it.

Another way to say this is that I tend to view it as compressing causal structure.

Here is a steam engine. If I connect the steam supply, the piston will reciprocate and make the flywheel spin. If I disconnect the steam, I can spin the flywheel and make the piston reciprocate.

My knowledge of these linkages is in my map, but the map works by being similar in structure to the territory. The linkages are also present in the territory.

Possibly helpful resource for people on this topic (and the source of my knowledge here): Academian's slides on What Causality Is, covering Pearl's stuff.

Underdetermination by the territory, the basic physics, is one thing. But the flipside is that we often want to identify causes in order to solve human-level problems, and that can help us to focus on "the" (in context) cause.

Everything has multiple causes. All fires are caused, among other things, by oxygen, but we rarely consider oxygen to be "the" cause of a fire. When we pick out something as "the" cause, we are usually looking for something that varies and something that we can control. All fires require oxygen, but oxygen is not a variable enough factor compared to dropped matches and inflammable materials.

Context matters as well. A coroner could find that the cause of Mr Smith's death was ingestion of arsenic, while the judge finds that it was Mrs Smith. It would be inappropriate to put the arsenic on trial and punish it, because it is not a moral agent... but it is a causal factor nonetheless.

Although there is definitely a lot to causality that is relevant to human interests and therefore on the map, it should not be concluded that there is not also a form of causality in the territory. That would be a version of the fallacy that says that since probability and counterfactuals can be found in low-resolution maps, they are therefore not in the territory.

Don't. Use. Counterfactuals. Period. They are a tempting but misguided idea. They are about changing the unchangeable past, and you want to affect the future.

Reformulate your question in terms of possible decisions, not impossible ones.

Instead of asking "what would the system do if we did X?" ask "what will the system do in a similar setup if I do X?" Since you don't yet know the future, the question is well posed, as you performing a similar experiment is in the realm of possibility. So there are no counterfactuals, only possible factuals.

I dislike the model of "causality" to begin with, except as used by Hawking and Ellis in their book on general relativity, "The Large Scale Structure of Space-Time", where they describe causal patches of spacetime as those where one can construct at least a finite Cauchy development of an initial hypersurface (known as global hyperbolicity).

But if you feel like it, you can use those causal graphs beloved by the FDT people and say something like "if I take this setup and change the resistor, I will measure the following current and voltage." There you claim causality about your own actions, not about the circuit, avoiding the confusion you end up in otherwise. To make this prediction you can use a convenient model, such as a current source or voltage source, without claiming any cause and effect except as internal to the model itself. You can talk about "natural abstractions" all day long; for some people they are natural and for others they are not. Reality has no joints, it just is; the "reality joints" are in the map.


Not using counterfactuals is not an option. The ultimate aim of this work is to build foundations for a reductive version of game theory, and in game theory, expected off-equilibrium behavior (a.k.a. counterfactual behavior) matters. Even if we're going to avoid asking counterfactual questions, the agents in our models cannot avoid asking counterfactual questions, so we need a theory which lets them do so.

Typical example: kidnapper takes a child, and threatens to shoot them if a ransom is not paid. The parents are going to pay the ransom, but in order to make that decision, they need to think about the counterfactual in which they don't pay - they need to have beliefs about that counterfactual. Likewise, when deciding whether to kidnap the child in the first place, the kidnapper has to consider the counterfactual in which he does/doesn't do so.
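As a minimal sketch of that structure (the payoff numbers are my own arbitrary assumptions, purely to illustrate):

```python
# The parents' decision requires evaluating a branch that will never happen.

def parents_utility(pay, child_shot):
    # Made-up payoffs: the child's safety dominates; keeping the money is minor.
    return (0 if child_shot else 10) + (1 if not pay else 0)

u_pay = parents_utility(pay=True, child_shot=False)     # belief: paying saves the child
u_refuse = parents_utility(pay=False, child_shot=True)  # belief: refusing gets the child shot
print("pay" if u_pay > u_refuse else "refuse")          # "pay" -- chosen only by comparing
                                                        # against the un-taken branch
```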

Typical example: kidnapper takes a child, and threatens to shoot them if a ransom is not paid. The parents are going to pay the ransom, but in order to make that decision, they need to think about the counterfactual in which they don't pay - they need to have beliefs about that counterfactual.

Uh. A counterfactual is "what would have been if...", not about what may or may not come to pass. Your example is about possible futures (to kidnap or not to kidnap, or to pay ransom or not to pay ransom), so there is no issue there. From SEP:

counterfactual modality [...] concerns what is not, but could or would have been.

Maybe you have a better example?

Remember, the topic is *reductive* agency. The parents and the kidnapper are all made of atoms. For our purposes, they're deterministic. They will, in fact, only take one of the "possible" actions. The other actions are counterfactual.

I don't think you are using the standard definition of a counterfactual. Future possibilities are never counterfactual. Unless MIRI has its own non-standard definition.

At a guess, those in favor of counterfactuals hold that the sense in which multiple things can happen (in the future) is also the sense in which counterfactuals could have happened.

Would a quantum random number generator that is as likely to output "0" as "1" help? (It seems like there is a meaningful sense in which, for such things, "this setup has a prior probability distribution which exists" - as opposed to a (deterministic) coin flip.)

At a guess, those in favor of counterfactuals hold that the sense in which multiple things can happen (in the future) is also the sense in which counterfactuals could have happened.

I understand that this is a tempting thought, but it is ultimately counterproductive. The future, whether set or not, is as yet unknown. You can also evaluate probabilities of events that are unknown to you but have already happened. What I find useless is the reasoning of the type "I know that X happened, but what if it didn't, all else being equal?"

Let's think of a use for it. For instance, if an outcome depends on a decision you made, considering what would have happened if you made a different decision can help refine your decision making processes.

considering what would have happened if you made a different decision can help refine your decision making processes.

"Considering what may happen in a similar setup in the future if you make a different decision can help refine your decision making processes. " FTFY

Considering what may happen in a similar setup in the future

... Prompted by what did or didn't work in the past.

Yep, definitely based on what worked and what didn't. But future-oriented, not past-oriented.

Instead of asking “what would the system do if we did X?” ask “what will the system do in a similar setup

Methodologically, these are identical. The model or equation you are using to explore counterfactual or future states does not know or care whether the input conditions have occurred or not. So your actual objection is about the metaphysics. On the assumption that the world is deterministic, the counterfactual could not have occurred. (The assumption that the past is fixed is not enough.) You feel uncomfortable about dealing with falsehoods, even though it is possible, and sometimes useful.

It can be identical. Or it can be of the type of "What if Martin Luther King had died when he was stabbed in 1958" (from SEP), which is of little value except for alt history novels.