Thermodynamic entropy = Kolmogorov complexity

by Aram Ebtekar
17th Feb 2025
1 min read

This is a linkpost for https://doi.org/10.1103/PhysRevE.111.014118

14 comments, sorted by top scoring
Stephen Fowler · 7mo

A while ago, I watched recordings of the lectures given by Wolpert and Kardes at the Santa Fe Institute*, and I am extremely excited to see you and Marcus Hutter working in this area.

Could you speculate on whether you see this work having any direct implications for AI Safety?

 

Edit:

I was incorrect. The lectures from Wolpert and Kardes were not the ones given at the Santa Fe Institute.

Aram Ebtekar · 7mo

Were those recorded!?

For direct implications, I'd like to speak with the alignment researchers who use ideas from thermodynamics. While Shannon's probabilistic information theory is suited to settings where the law of large numbers holds, algorithmic information theory should bring more clarity in messier settings that are relevant for AGI.

Less directly, I used physics as a testing ground to develop some intuitions on how to apply algorithmic information theory. The follow-up agenda is to develop a theory of generalization (i.e., inductive biases) using algorithmic information theory. A lot of AI safety concerns depend on the specific ways that AIs (mis)generalize beliefs and objectives, so I'd like us to have more precise ideas about which generalizations are likely to occur.

Matt Dellago · 7mo

I would be interested in seeing those talks; could you maybe share links to these recordings?

Stephen Fowler · 7mo

The recordings I watched were actually from 2022 and weren't the Santa Fe ones.

Yoav Ravid · 7mo

How does this relate to cross-entropy? 

Aram Ebtekar · 7mo

They are equal! As discussed in the comments to that post, the difference is at most a constant; it's possible to make that constant vanish by an appropriate choice of reference universal Turing machine.
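To make that concrete (stated roughly, with K the prefix Kolmogorov complexity, P any computable distribution, and m Levin's universal semimeasure), the standard bounds are

K(x) ≤ -log2 P(x) + K(P) + O(1)
-log2 m(x) = K(x) + O(1)   (the coding theorem)

so the cross-entropy under the universal code and the Kolmogorov complexity differ only by an additive constant, which depends on the choice of reference machine.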

Noosphere89 · 7mo

Notably, the difference can become non-constant if we allow for semi-measures on infinite strings:

https://www.lesswrong.com/posts/KcvJXhKqx4itFNWty/k-complexity-is-silly-use-cross-entropy-instead#LNgzYraTeGcEyry9a

Yoav Ravid · 7mo

Isn't that significant?

Aram Ebtekar · 7mo

It is, when dealing with sequences that go on to infinity. In that case you get the "KM complexity", from Definition 4.5.8 of Li & Vitanyi (2019). For online sequence prediction, Solomonoff's prior needs to sum the weights from every program.

No such complications appear in the entropy of a bounded system at a fixed precision. And ultimately, it appears that for entropy to increase, you need some kind of coarse-graining, leading us to finite strings. I discuss this in the Background section and around Corollary 1.
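For reference, roughly following Li & Vitányi's conventions (with U a universal monotone machine, and the sum ranging over all programs p whose output begins with x), the quantities involved are

M(x) = Σ_{p : U(p) starts with x} 2^(-|p|)
KM(x) = -log2 M(x)

so KM credits x with the weight of every program consistent with it, which is what Solomonoff's prior needs for online sequence prediction.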

Matt Dellago · 7mo

Very good work, thank you for sharing!

Intuitively speaking, the connection between physics and computability arises because the coarse-grained dynamics of our Universe are believed to have computational capabilities equivalent to a universal Turing machine [19–22].

I can see how this is a reasonable and useful assumption, but the universe seems to be finite in both space and time and therefore not a UTM. What convinced you otherwise?

Alex_Altair · 2mo

Note that being finite in spatial extent is different from being finite in number of possible states. A Turing machine needs an unbounded number of possible states, but not unbounded spatial extent. Finite spatial extent isn't a blocker if your states can have arbitrarily fine precision. As an intuition pump, you could imagine a Turing machine where the nth cell of the tape has a physical width of 1/2^n. Then the whole tape has length 1 (meter?) but can fit arbitrarily long binary strings on it.
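To spell out the geometric-series arithmetic behind that intuition pump (numbering cells from n = 1, each of width 1/2^n meters):

Σ_{n=1}^∞ 1/2^n = 1, and the first N cells occupy 1 - 1/2^N < 1 meters,

so every finite prefix of the tape fits within one meter, even though there is no bound on how many cells may be used.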

Separately, when I've looked into it, there seemed to be no consensus on whether the universe is spatially infinite or not: measurements of global curvature are still consistent with flatness (= infinite space). Although, I think the "accessible" universe has finite diameter.

Lastly, I claim that whether or not something is best modeled as a Turing machine is a local fact, not a global one. A Turing machine is something that will compute the function if you give it enough tape. If the universe is finite (in the state space way) then it's a Turing machine that just didn't get fed enough tape. In contrast, something is better modeled as a DFA if you can locally observe that it will only ever try to access finitely many states.

Matt Dellago · 2mo

Good point! My intuition was that the Bekenstein bound (https://en.wikipedia.org/wiki/Bekenstein_bound) limits the amount of information in a volume. (Or more precisely, the information enclosed by a given surface area.) Therefore the number of states in a finite volume is also finite.
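As usually stated (for a system of total energy E enclosed in a sphere of radius R), the bound reads

S ≤ 2π k_B R E / (ħ c),

or equivalently at most 2π R E / (ħ c ln 2) bits, so a finite region containing bounded energy can only be in finitely many distinguishable states.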

I must add: since writing this comment, a man called george pointed out to me that, when modeling the universe as a computation, one must take care not to accidentally derive ontological claims from it.

So today I would have a more "whatever works" attitude; UTMs and DFAs are both just models, and neither is likely to be ontologically true.

Aram Ebtekar · 7mo

Thanks. Obviously this claim needs some interpretation, but a UTM still seems a better model of the Universe than, say, any lower automaton in the Chomsky hierarchy. For the purposes of defining entropy, it's important that we can use a small base machine, plus a memory tape that we may think of as expanding in an online fashion.

UnderTruth · 7mo

I am reminded of this paper, on the equivalence of information-theoretic formalisms and those of physics. I am linking the paper here not as an endorsement, but because it may provide some unusual, but useful, lines of thought.

Direct PDF link for non-subscribers

Information theory must precede probability theory, and not be based on it. By the very essence of this discipline, the foundations of information theory have a finite combinatorial character.

- Andrey Kolmogorov

Many alignment researchers borrow intuitions from thermodynamics: entropy relates to information, which relates to learning and epistemology. These connections were first revealed by Szilárd's resolution of Maxwell's famous thought experiment. However, the classical tools of equilibrium thermodynamics are not ideally suited to studying information processing far from equilibrium.

This new work reframes thermodynamics in terms of the algorithmic entropy. It takes an information-first approach, delaying the introduction of physical concepts such as energy and temperature until after the foundations are set. I find this approach more conceptually principled and elegant than the traditional alternatives.

It's based on a 30-year-old workshop paper whose approach was, until now, largely abandoned. Roughly speaking, the algorithmic entropy of a physical state is its Kolmogorov complexity; that is, the length of the shortest program that outputs a precise description of its microscopic configuration (to some bounded precision). This definition does away with probability distributions and macrovariables, and satisfies very general laws!
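In symbols (glossing over the choice of reference universal machine U and the coarse-graining precision), the quantity in question is roughly

K_U(x) = min { |p| : U(p) = x },

the length in bits of the shortest program p that prints the discretized microstate description x; multiplying by k_B ln 2 converts it to the same units as the thermodynamic entropy.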

The paper is long, in part because I tried to make it self-contained. If you find yourself using entropy in a setting that is not described by a large number of identically distributed variables, then consider reframing your intuitions in terms of the algorithmic entropy!
