This really seems like the sort of thing where we should be able to get a mathematical theorem, from first principles, rather than assuming.
Doing my usual exercise of "stop reading here and try to derive an independent model"...
Here's the way my anthropics-poisoned brain would frame this:
Assume that your window of observations is limited to some small part of the universe, or perhaps that it's coarse (you only observe some high-level information, not full low-level details). Suppose you either narrow this window further (stop observing some part of the system), or try to predict/control low-level dynamics from the high-level state.
In the former case, the moment you stop observing a part of the system, your observations become consistent with those from a set of timelines differing by what happened in that part of the system (under the constraint of the laws of physics). E. g., if you started with a perfect "snapshot" of the gas molecules' positions and velocities (we're assuming they're billiard balls), as time goes on, there's more and more information leaking from the unobserved environment as they bounce off walls composed of other molecules (whose exact positions and velocities you didn't observe, so you couldn't predict their effects on the gas particles).[1] You might be able to correctly time when to open/close the door to sort the molecules that started very close to it, but this ability would quickly degrade. Your uncertainty increases; entropy rises.
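A minimal sketch of that degradation (the 1D box, the kick size, and the door-timing tolerance below are all made-up illustration parameters): a single molecule bounces back and forth, and every wall collision delivers a small unobserved kick standing in for the wall molecules whose state was never snapshotted. Predicting its position from the initial snapshot alone works for a while, then fails:

```python
import random

DOOR_TOLERANCE = 0.05  # hypothetical position error beyond which door timing fails

def bounce(x, v, steps, dt=0.01, kick=0.0, rng=None):
    """Evolve one molecule in a 1D box [0, 1]. `kick` is the size of the velocity
    perturbation picked up at each wall collision -- a stand-in for the
    unobserved wall molecules."""
    for _ in range(steps):
        x += v * dt
        while x < 0.0 or x > 1.0:          # elastic reflection off the walls
            x = -x if x < 0.0 else 2.0 - x
            v = -v
            if kick and rng is not None:
                v += rng.gauss(0.0, kick)  # information leaking in from the environment
    return x

x0, v0 = 0.3, 1.7  # the initial "snapshot"

for steps in (100, 1000, 5000, 20000, 80000):
    # Same noise seed each time, so "actual" is one fixed trajectory viewed at longer horizons.
    actual = bounce(x0, v0, steps, kick=0.01, rng=random.Random(1))
    predicted = bounce(x0, v0, steps)      # prediction from the snapshot alone
    err = abs(actual - predicted)
    print(f"steps={steps:6d}  error={err:.3f}  door timing still works: {err < DOOR_TOLERANCE}")
```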
In the latter case, you already started out unsure regarding which timeline consistent with the given high-level information (e. g., pressure and temperature in the box) you're in. Your uncertainty over the low-level state is already maximized under the high-level constraints.
And Maxwell's demon basically maintains perfect observation of the state of the system you'd otherwise fail to (perfectly) observe.
The exact amount of uncertainty generated (aka the amount of data the demon's observational array should be gathering for the demon to work) depends on the physics of the system you're in (how quickly the set of possible timelines grows; how many low-level states are consistent with the high-level state). But the overall idea should work in all universes and on all levels of abstraction.
Moving from anthropics to embedded agency: the above assumes that you have the correct model of the laws of your abstract environment (its state-transition rules, potentially stochastic ones), and enough computational resources to simulate that environment (at the given level of abstraction).
If you're operating at the lowest level of abstraction of our universe, your uncertainty is "in the territory": this setup should reduce to the quantum-mechanical uncertainty principle, with the unobserved part of the system becoming Schrödinger's cat's box. The anthropics reasoning then reduces to the many-worlds reasoning.[2]
At higher levels, it's "pseudo-territorial": if we model a coarse agent as genuinely discarding the low-level information it can't integrate into its abstract models, the setup should be structurally equivalent to the lowest-level one.[3]
In a way, this is trivial. As long as the underlying state-transition rules are stochastic, or are modeled as stochastic – either because they're "genuinely" uncertain as in QM, or because you have imperfect information (up to the $n$-th significant digit), or because you're using an abstract model that bakes in imperfect information – as long as that's the case, any mechanism that tries to control which state you end up in would need to "oppose"/"de-stochasticify" the stochastic dynamics.
So: with $n$ molecules, the KL divergence between the two distributions is $n \log 2$, i.e. $n$ bits, as one might intuitively guess.
Right, if you're not making any new observations, your uncertainty would be maximized over the set of the allowed timelines consistent with your past observations and the laws of physics (or the laws of your abstract environment). Since molecules can't disappear/leave the box/etc., there are only so many such timelines.
Alternatively, you can model it as having snapshotted the state only up to some $n$-th significant digit of the molecules' positions/velocities, and as it evolves, the knowledge of the digits past the $n$-th one becomes ever more necessary for predicting the dynamics.
This can probably be exactly formalized via this approach,[4] but these margins etc. etc. (Note to John: this is related to the potential (dubious) QM-based line of reasoning I'd mentioned in the write-up I sent you.)
Except with the stochastic state-transition mechanism being Markovian instead of non-Markovian, because that information is not actually discarded (we can't use it, but we still observe it, so it still differentiates our exact observations between different timelines) – so there wouldn't be macro-scale quantum effects; see the linked paper.
Bullshit check: Scott Aaronson thinks this paper is valid (but not enlightening).
Now for the key idea: we’re going to compare the distribution of states achieved by the demon with policy $\pi$, to the distribution of states which would be achieved by the demon if it took the same distribution of actions completely independent of its observations - i.e. if it just blindly tried to sort the molecules without looking at them.
Interesting! I've previously looked at this method as a solid definition of "optimization" (and Utility functions and whatnot) but I never thought of applying it to Maxwell's Demon.
Why can't the demon just continuously look at a tiny area around the gate and decide just based on that? A tiny area seems intuitively sufficient both for recognizing that a molecule would go from left to right if the door were opened, and that no molecule would go from right to left. This would mean that it doesn't need to know a distribution over molecules at all.
Basically: Why can't the demon just solve a localised control task?
Let’s start with the classic Maxwell’s Demon setup.
We have a container of gas, i.e. a bunch of molecules bouncing around. Down the middle of the container is a wall with a tiny door in it, which can be opened or closed by a little demon who likes to mess with thermodynamics researchers. Maxwell[1] imagined that the little demon could, in principle, open the door whenever a molecule flew toward it from the left, and close the door whenever a molecule flew toward it from the right, so that eventually all the molecules would be gathered on the right side. That would compress the gas, and someone could then extract energy by allowing the gas to re-expand into its original state. Energy would be conserved by this whole process, since the gas would end up cooler in proportion to the energy extracted, but it would violate the Second Law of Thermodynamics - i.e. entropy would go down.
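To make the setup concrete, here's a deliberately crude toy version (the discrete one-molecule-per-door-event dynamics and the numbers are simplifications assumed purely for illustration): a demon that sees which side each approaching molecule comes from, and opens the door only for molecules coming from the left, ends up with everything on the right.

```python
import random

rng = random.Random(0)
N = 20  # number of molecules (illustrative)

# Coarse state: which side of the partition each molecule is currently on.
side = ["L" if rng.random() < 0.5 else "R" for _ in range(N)]

def demon_policy(observed_side):
    """Open the door iff the approaching molecule is coming from the left."""
    return observed_side == "L"

door_events = 0
while side.count("R") < N:
    door_events += 1
    i = rng.randrange(N)        # some molecule reaches the door
    obs = side[i]               # the demon observes which side it came from
    if demon_policy(obs):       # door open: the molecule crosses left -> right
        side[i] = "R"
    # door closed: a molecule arriving from the right just bounces back

print(f"all {N} molecules on the right after {door_events} door events")
```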
Landauer famously proposed to “fix” this apparent loophole in the Second Law by accounting for the information which the demon would need to store, in order to know when to open and close the door. Each bit of information has a minimal entropic “cost”, in Landauer’s formulation. This sure seems to be correct in practice, but it’s unsatisfying: as has been pointed out before[2], Landauer derived his bit-cost by assuming that the Second Law holds and then asking what bit cost was needed to make it work. This really seems like the sort of thing where we should be able to get a mathematical theorem, from first principles, rather than assuming.
Also, Landauer’s approach is a bit weird for embedded agency purposes. It feels like almost the right tool, but not quite; it’s not really framed in terms of canonical parts-of-an-agent, like e.g. observations and actions and policy. And it’s too dependent on physical entropy, which itself grounds out in the reversibility of low-level physics. Ideally, we’d like something more agnostic to the underlying physics, so that e.g. we can apply it directly even to high-level systems with irreversible dynamics.
So we’d like a theorem, and we’d like it to be more directly oriented toward embedded agency rather than stat mech. To that end, we present the Do-Divergence Theorem.
Rather than focus on “memory”, we’ll take a more agentic frame, and talk about the demon’s observations ($O$) and actions ($A$). The observations are the inputs to the demon’s decisions (presumably measurements of the initial state of the molecules); the actions are whether the door is open or closed at each time. Further using agentic language, the demon’s policy $\pi$ specifies how actions are chosen as a function of observation: $\pi[A|O]$ is a distribution from which the action $A$ is sampled. Downstream, the actions and observations together cause some outcome $X$, the final state of the molecules.
Now for the key idea: we’re going to compare the distribution of states achieved by the demon with policy $\pi$, to the distribution of states which would be achieved by the demon if it took the same distribution of actions completely independent of its observations - i.e. if it just blindly tried to sort the molecules without looking at them.
We express the “blind sorting” model as a do-operation on the causal diagram above: $do(A)$, below, indicates that the demon samples an action $A$ from the marginal distribution $P[A]$ independent of its observations $O$. So, under the model $do(A)$, we have

$$P[X, A, O \mid do(A)] \;=\; P[X \mid A, O]\, P[A]\, P[O]$$
… in contrast to the original model, under which

$$P[X, A, O] \;=\; P[X \mid A, O]\, \pi[A \mid O]\, P[O]$$
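To see concretely how the two distributions over $X$ differ, consider a (hypothetical) minimal case: one molecule, one chance to open the door. The observation $O$ is which side the molecule approaches from (left or right, probability 1/2 each), the action $A$ is open vs. closed, and the outcome $X$ is which side the molecule ends up on. Under the policy “open iff it approaches from the left”, every molecule ends up on the right, so $P[X = \text{right}] = 1$. Under $do(A)$ with the same action marginal (door open half the time, independent of $O$), a molecule from the left ends up on the right only if the door happens to be open, and a molecule from the right ends up on the right only if the door happens to be closed, so

$$P[X = \text{right} \mid do(A)] \;=\; \tfrac12 \cdot \tfrac12 \;+\; \tfrac12 \cdot \tfrac12 \;=\; \tfrac12.$$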
To compare the distribution achieved by the demon to the “blind sorting” distribution, we’ll use KL-divergence; more on what that looks like after the theorem.
Now for the theorem itself:

$$D_{KL}\big(P[X]\;\big\|\;P[X \mid do(A)]\big) \;\le\; I(A; O)$$

where $I(A; O)$ is the mutual information between actions and observations. Proof:
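One way to see it (a sketch from the definitions above; the original presentation may differ): since $P[X]$ and $P[X \mid do(A)]$ are marginals of the two joint distributions above, and KL-divergence can only shrink under marginalization,

$$D_{KL}\big(P[X]\,\big\|\,P[X \mid do(A)]\big) \;\le\; D_{KL}\big(P[X, A, O]\,\big\|\,P[X, A, O \mid do(A)]\big).$$

The two joint distributions share the factors $P[X \mid A, O]$ and $P[O]$ and differ only in $\pi[A \mid O]$ versus $P[A]$, so the right-hand side collapses to

$$\mathbb{E}_O\Big[D_{KL}\big(\pi[A \mid O]\,\big\|\,P[A]\big)\Big] \;=\; I(A; O).$$

Chaining the two gives the theorem.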
Now let’s unpack what the theorem means, when applied to Maxwell’s Demon.
When the demon takes actions independent of observations (i.e. independent of the state of the molecules), molecules are just as likely to move from left container to right container as from right to left. So, the distribution $P[X \mid do(A)]$ should end up roughly uniform across both sides, as is normal for a single connected container of gas.
On the other hand, if the demon perfectly sorts the molecules onto the right side, then under $P[X]$ the molecules end up roughly uniform on only one side of the container.
The KL-divergence between these distributions is then roughly

$$D_{KL}\big(P[X]\,\big\|\,P[X \mid do(A)]\big) \;\approx\; \log\!\big(2^n\big) \;=\; n \log 2$$
So: with $n$ molecules, the KL divergence between the two distributions is $n \log 2$, i.e. $n$ bits, as one might intuitively guess. In a case like this, the KL divergence is just the entropy change.
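As a quick numeric sanity check of that count, here is a sketch over the coarse left/right state space (which side each molecule ends up on); the value $n = 10$ is arbitrary:

```python
from math import log2
from itertools import product

n = 10  # number of molecules (illustrative)

# Coarse-grained state: which side each molecule is on.
states = list(product("LR", repeat=n))

# Blind sorting: each molecule independently ends up on either side.
p_blind = {s: 0.5 ** n for s in states}
# Demon sorting: all molecules end up on the right.
p_sorted = {s: 1.0 if s == ("R",) * n else 0.0 for s in states}

kl = sum(p * log2(p / p_blind[s]) for s, p in p_sorted.items() if p > 0)
print(f"D_KL(sorted || blind) = {kl} bits  (n = {n})")  # -> 10.0 bits
```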
The do-divergence theorem therefore says that the demon’s actions must have at least $n$ bits of mutual information with its observations of the molecule states, in order to sort the molecules. All the change in entropy of the system must be balanced by mutual information between the demon’s actions and observations.
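And a small end-to-end check of the bound on the hypothetical one-molecule example from earlier (one door event, policy “open iff the molecule comes from the left”); both sides come out to exactly 1 bit here:

```python
from math import log2

# One molecule, one door event.  O: side it approaches from; A: open/close the
# door; X: side it ends up on.  Policy: open iff the molecule comes from the left.
P_O = {"L": 0.5, "R": 0.5}
policy = {"L": "open", "R": "close"}

def outcome(o, a):
    """If the door is open the molecule crosses to the other side; otherwise it stays."""
    if a == "open":
        return "R" if o == "L" else "L"
    return o

# Distribution over the final side X under the policy.
P_X = {"L": 0.0, "R": 0.0}
for o, p_o in P_O.items():
    P_X[outcome(o, policy[o])] += p_o

# Marginal distribution over actions, and the distribution over X under do(A).
P_A = {"open": 0.0, "close": 0.0}
for o, p_o in P_O.items():
    P_A[policy[o]] += p_o

P_X_do = {"L": 0.0, "R": 0.0}
for o, p_o in P_O.items():
    for a, p_a in P_A.items():
        P_X_do[outcome(o, a)] += p_o * p_a

# KL divergence D_KL(P[X] || P[X|do(A)]).
kl = sum(p * log2(p / P_X_do[x]) for x, p in P_X.items() if p > 0)

# Mutual information I(A;O); A is a deterministic function of O here,
# so P[A|O] = 1 for the action the policy picks.
mi = sum(p_o * log2(1.0 / P_A[policy[o]]) for o, p_o in P_O.items())

print(f"D_KL(P[X] || P[X|do(A)]) = {kl} bits")  # 1.0
print(f"I(A;O)                   = {mi} bits")  # 1.0 -- the bound holds with equality
```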
yes, same guy as the electromagnetic laws
notably by Earman and Norton in a 1999 paper