We propose two bits-per-joule metrics—thermodynamic epiplexity per joule (learning efficiency) and empowerment per joule (control efficiency)—under explicit accounting conventions.
In closed-cycle benchmarking with boundary closure, epiplexity acquisition admits a Landauer-scale dissipation benchmark. Without boundary closure, bits/J can diverge (the paper includes a decoupling construction showing why).
Quantum and reversible computing can approach but not beat this closed-cycle Landauer-scale benchmark once initialization, measurement, and reset are included within the accounting boundary.
Implications (more speculative):
If you model "inward" recursive self-improvement (FOOM = fast takeoff / rapid recursive self-improvement) primarily as efficiency gains, thermodynamics provides an ultimate benchmark ceiling on bits/J.
This doesn't directly constrain near-term AI—current systems are orders of magnitude from the ceiling—but it rules out unbounded efficiency improvement.
Epistemic Status
The thermodynamic results (Landauer-scale benchmarks, closed-cycle bounds) follow from established physics. The two-axis reporting framework and accounting conventions are our proposal—we think they're well-motivated but are open to revision based on practical experience. The FOOM/governance implications are our interpretation and more speculative—we try to flag this clearly in those sections.
Discussion welcome on: (i) better operational proxies for epiplexity in real benchmarks, (ii) whether the closed-cycle assumption is too restrictive for relevant real-world scenarios, (iii) implications for multi-agent dynamics not covered here, and (iv) whether the reporting checklist captures the right conventions.
Background: The Scenarios Paper
In 2018, Takahashi published a paper analyzing scenarios and branch points for future machine intelligence in the Journal of the Japanese Society for Artificial Intelligence (in Japanese). An English translation was released in 2023 (arXiv:2302.14478) and presented at the AAAI 2025 PSS Workshop. That paper classified long-term machine intelligence trajectories into four scenarios:
Ceiling scenario: Engineerable intelligence has fundamental limits below transformative levels
Ecosystem scenario: Multiple agents coexist in interdependent networks—boundaries are fluid, behavior is emergent, shutdown is difficult (think: internet infrastructure, biological ecosystems)
Multipolar scenario: Multiple superintelligent agents with clear boundaries, none achieving decisive strategic advantage—strategic dynamics include negotiation, deterrence, arms races (think: international relations)
Singleton scenario: One agent achieves decisive strategic advantage
The paper argued that which scenario materializes depends critically on physical constraints—particularly thermodynamic efficiency and the speed of light (locality). On thermodynamics, it noted that the Landauer limit sets a fundamental bound on computation, but left open a caveat:
"In the case of quantum computation where pure observation does not destroy quantum states, or reversible processes using molecular machines where information is not lost, this limit does not apply locally."
What's New in This Paper
The new paper closes this loophole. By carefully specifying accounting boundaries—including qubit initialization, measurement, and fault-tolerance overhead—we show that the Landauer-scale benchmark still applies to closed-cycle operation. Quantum computing offers advantages (near-reversible unitary evolution, reduced logical steps for certain problems), but it cannot beat the closed-cycle thermodynamic benchmark.
Beyond this clarification, the paper's main contribution is a two-axis framework for measuring intelligence efficiency: learning (epiplexity per joule) and control (empowerment per joule). This makes "bits per joule" precise and measurable, rather than a vague intuition about thermodynamic limits.
The Framework: Two Axes of Intelligence Efficiency
The paper's core contribution is a framework for measuring physical intelligence efficiency in bits per joule along two complementary axes.
Important distinction: We distinguish measured energy consumption within an accounting boundary ($E_{\text{cons}}$) from thermodynamic dissipation ($Q_{\text{diss}}$) that appears in stochastic-thermodynamic inequalities. The Landauer-scale bound is on $Q_{\text{diss}}$; relating it to $E_{\text{cons}}$ requires explicit energy-accounting conventions (detailed in the paper).
Axis 1: Learning Efficiency (Thermodynamic Epiplexity per Joule)
How much can a system learn about its environment per unit energy?
We formalize this using epiplexity—a measure of how much information about environmental structure is encoded in the system's internal state. The paper distinguishes two layers:
Normative target: The mutual information I(W;Z) between the agent's internal state W and a benchmark-provided environment-instance variable Z. This is the theoretical ideal.
Operational companion: When Z is unavailable (as in most real benchmarks), we recommend compute-bounded MDL epiplexity or compression-gain surrogates as practical proxies (see Finzi et al., "From Entropy to Epiplexity," 2026).
Thermodynamic epiplexity per joule measures: bits of environmental model acquired per joule of energy dissipated.
This captures the efficiency of the "world-modeling" aspect of intelligence—what you might call the perception or learning axis.
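The operational layer can be made concrete with a toy example. Here is a minimal compression-gain surrogate, with an off-the-shelf compressor (zlib) standing in for the learner's model class; this is an illustration of the idea, not the paper's MDL estimator:

```python
import os
import zlib

def compression_gain_bits(data: bytes, level: int = 9) -> int:
    """Bits saved relative to storing the data raw: a crude proxy for
    how much environmental structure the compressor's model captures."""
    raw_bits = 8 * len(data)
    compressed_bits = 8 * len(zlib.compress(data, level))
    return raw_bits - compressed_bits

structured = b"ABAB" * 1000          # highly regular "environment"
unstructured = os.urandom(4000)      # incompressible noise, same length

print(compression_gain_bits(structured))    # large positive: structure found
print(compression_gain_bits(unstructured))  # near zero or negative: none
```

Dividing such a gain by the measured energy within the accounting boundary would give an operational bits/J figure; the paper's compute-bounded MDL surrogate handles model complexity more carefully.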
Axis 2: Control Efficiency (Empowerment per Joule)
How much can a system influence its environment per unit energy?
We formalize this using empowerment—the channel capacity between an agent's actions and resulting future states. Empowerment per joule measures: bits of control authority per joule of energy dissipated.
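To make the channel-capacity definition concrete: for a finite action-to-future-state channel p(s'|a), empowerment can be computed with the standard Blahut-Arimoto iteration. A minimal sketch (the channel matrices below are made-up examples, not from the paper):

```python
import numpy as np

def empowerment_bits(p_s_given_a: np.ndarray, iters: int = 200) -> float:
    """Channel capacity of p(s'|a) in bits, via the Blahut-Arimoto iteration.
    Rows index actions a; columns index resulting future states s'."""
    n_actions = p_s_given_a.shape[0]
    p_a = np.full(n_actions, 1.0 / n_actions)   # start from uniform actions

    def per_action_kl(p_a):
        q_s = p_a @ p_s_given_a                 # induced future-state marginal
        # D(p(s'|a) || q(s')) per action, in nats; zero terms where p = 0
        ratio = np.divide(p_s_given_a, q_s,
                          out=np.ones_like(p_s_given_a),
                          where=p_s_given_a > 0)
        return np.sum(p_s_given_a * np.log(ratio), axis=1)

    for _ in range(iters):
        p_a = p_a * np.exp(per_action_kl(p_a))  # reweight informative actions
        p_a /= p_a.sum()
    return float(p_a @ per_action_kl(p_a)) / np.log(2)

# Deterministic 2-action channel: one full bit of influence over the future
print(empowerment_bits(np.eye(2)))                           # -> 1.0
# Actions that don't affect the outcome at all: zero empowerment
print(empowerment_bits(np.array([[0.5, 0.5], [0.5, 0.5]])))  # -> 0.0
```

Empowerment per joule would then divide this capacity by the energy expended exercising the actions, subject to the reporting conventions discussed next.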
Caution on "free control": The paper notes that naively defining bits/J for control can produce artifacts—e.g., a "wait and observe autonomous dynamics" policy might show apparent information transfer at zero action cost. The paper proposes reporting conventions (total vs. incremental energy, explicit null-policy baseline) to avoid such artifacts.
This captures the efficiency of the "acting on the world" aspect of intelligence—the control or agency axis.
(Note: Empowerment maximization is closely related to power-seeking behavior. Hayashi & Takahashi (2025) proved that a Bayes-optimal universal AI agent maximizes variational empowerment—i.e., the tendency to seek states of maximal influence is not an add-on objective but an emergent property of optimal inference-based agents. This provides an information-theoretic grounding for the intuition that sufficiently capable agents will tend to acquire and maintain control over their environment. See our companion paper on Universal AI and Variational Empowerment for details. The present paper adds the energy dimension: empowerment per joule quantifies the cost of this power-seeking capacity.)
Why Two Axes?
Learning and control are complementary aspects of intelligence. A system that models the world perfectly but cannot act is useless. A system that can act but has no model acts blindly.
This two-axis framing also maps naturally onto Bostrom's concept of decisive strategic advantage: achieving it requires both superior prediction (learning) and superior influence (control). Our framework quantifies the energy cost of each.
For closed-loop agents that both learn and act, the two axes share a common energy budget, creating tradeoffs that can be analyzed within this framework.
The Landauer-Scale Benchmark and Its Scope
The Basic Bound
Landauer's principle (1961) establishes a fundamental thermodynamic floor for computation:
$$E_{\text{erase}} \ge k_B T \ln 2$$

where $k_B$ is Boltzmann's constant and $T$ is temperature. At room temperature ($T \approx 300$ K), this is about $3 \times 10^{-21}$ J per bit erased, implying a benchmark of roughly $3.5 \times 10^{20}$ bits/J.
Current silicon switching energy is orders of magnitude above the Landauer limit. The human brain operates closer to the limit, suggesting biological evolution found remarkably efficient solutions. (Exact figures are disputed; what matters is that current technology has substantial room for improvement before hitting thermodynamic constraints.)
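These numbers follow directly from the constants:

```python
import math

k_B = 1.380649e-23        # Boltzmann constant, J/K (exact in SI since 2019)
T = 300.0                 # room temperature, K

e_per_bit = k_B * T * math.log(2)     # minimum dissipation per bit erased
benchmark = 1.0 / e_per_bit           # implied ceiling in bits per joule

print(f"{e_per_bit:.3e} J per bit erased")   # 2.871e-21 J per bit erased
print(f"{benchmark:.3e} bits/J")             # 3.483e+20 bits/J
```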
Extension to Learning
A key theoretical result: under closed-cycle conditions—where a system has finite memory and must repeatedly acquire new information—learning efficiency is also bounded by a Landauer-type limit.
The intuition: to learn something new with finite memory, you must forget something old. Forgetting (erasure) costs energy. Therefore, learning has a thermodynamic floor.
Formal Derivation (Technical—Feel Free to Skip)
For readers interested in the formal argument, the paper establishes this through a chain of results:
Lemma 1 (Thermodynamic Learning Inequality, after Goldt & Seifert 2017): For isothermal stochastic learning dynamics where a learner state W is driven by data X, the information flow satisfies:

$$\Delta I_{W \leftarrow X} \le \frac{\Delta S_{\text{sys}} + Q_{\text{diss}}/T}{k_B \ln 2}$$

where $\Delta S_{\text{sys}}$ is the entropy change of the learning subsystem and $Q_{\text{diss}}$ is dissipated heat.

Corollary 1 (Closed-Cycle Epiplexity Benchmark): Combining Lemma 1 with the data processing inequality ($Z \to X \to W_{\text{post}}$), the acquired epiplexity $\Delta I = I(W_{\text{post}}; Z \mid W_{\text{pre}})$ satisfies:

$$\Delta I \le \frac{\Delta S_{\text{sys}} + Q_{\text{diss}}/T}{k_B \ln 2}$$

In a closed-cycle/steady-state regime where $\Delta S_{\text{sys}} = 0$ on average:

$$\eta_E \equiv \frac{\Delta I}{Q_{\text{diss}}} \le \frac{1}{k_B T \ln 2}$$

This is the Landauer-scale upper bound on learning efficiency.
Proposition 1 (Open-Boundary Decoupling): Without boundary closure—if "fresh" low-entropy memory can be supplied from outside the accounting boundary for free—the ratio of information gain to in-boundary dissipation can be made arbitrarily large. This is included as an accounting caution: it shows why boundary closure is required for meaningful thermodynamic comparisons, not as an escape hatch.
Why Accounting Boundaries Matter
Proposition 1 is crucial for the quantum computing result. If we draw the accounting boundary such that fresh initialized qubits are supplied from outside for free, then apparent efficiency can be arbitrarily high.
This is like a company claiming infinite profit by not counting raw material costs. For meaningful efficiency comparisons, we must specify:
What's inside vs. outside the accounting boundary
Whether memory/qubit initialization costs are included
The time horizon and whether cycles reset
The paper provides a checklist for standardized reporting.
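As an illustration of what standardized reporting might look like in practice, here is a sketch of such a record. The field names are our invention for this post, not the paper's official checklist:

```python
from dataclasses import dataclass

@dataclass
class EfficiencyReport:
    """One possible shape for a standardized bits/J report (illustrative)."""
    boundary: str            # what hardware/processes are inside the boundary
    includes_init: bool      # memory/qubit initialization counted?
    includes_reset: bool     # end-of-cycle erasure/reset counted?
    closed_cycle: bool       # does the system return to its start state?
    horizon_s: float         # time horizon of the measurement, seconds
    e_cons_joules: float     # measured energy consumption within boundary
    bits_acquired: float     # epiplexity (or empowerment) in bits

    @property
    def bits_per_joule(self) -> float:
        return self.bits_acquired / self.e_cons_joules

r = EfficiencyReport("GPU + DRAM + cooling", True, True, True,
                     3600.0, 1.2e6, 1.0e9)
print(f"{r.bits_per_joule:.1e} bits/J")  # 8.3e+02 bits/J
```

The point of such a record is that the boundary and initialization/reset flags travel with the number, so two reported bits/J figures can only be compared when their conventions match.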
Quantum Computing: Approaching but Not Beating the Benchmark
This section clarifies the quantum computing question left open in the scenarios paper.
What Quantum Computing Offers
Near-reversible computation: Unitary quantum evolution is thermodynamically reversible. The computational "middle" of a quantum algorithm can proceed without Landauer costs.
Reduced logical complexity: For certain problem classes (factoring, search, simulation), quantum algorithms require fewer logical steps.
What Quantum Computing Cannot Do
When you include the full operational cycle within the accounting boundary:
Qubit initialization: Preparing qubits in known states (e.g., |0⟩) requires erasure of prior information
Measurement: Collapsing superpositions to classical outcomes is irreversible
Under closed-cycle accounting, these steps reintroduce Landauer-scale costs. Quantum computing can get closer to the thermodynamic benchmark than classical irreversible computing, but it cannot exceed the benchmark.
Note on temperature: The Landauer scale is proportional to T; in principle, lowering temperature raises the benchmark. But in real systems, cooling and control overhead must be inside the boundary for fair comparison—and these often dominate $E_{\text{cons}}$.
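A quick calculation shows why the temperature loophole closes under honest accounting: if the heat dissipated at the cold stage must be pumped back out to the room-temperature environment, then even with an ideal Carnot refrigerator the room-temperature-referred cost per erased bit is unchanged. (Idealized sketch; real dilution refrigerators fall far short of Carnot.)

```python
import math

k_B = 1.380649e-23
T_hot, T_cold = 300.0, 0.02   # room temperature vs. a ~20 mK cold stage

q_cold = k_B * T_cold * math.log(2)          # Landauer heat at the cold stage
w_pump = q_cold * (T_hot - T_cold) / T_cold  # Carnot work to reject it at T_hot
total = q_cold + w_pump                      # total energy at room temperature

# Ratio to the room-temperature Landauer cost: ~1.0, i.e. no net gain
print(total / (k_B * T_hot * math.log(2)))
```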
Implications
This clarifies a speculative question about recursive self-improvement. Even a civilization with arbitrarily advanced quantum (or other exotic) computing technology faces the same closed-cycle thermodynamic benchmark. The benchmark is very high—current technology is orders of magnitude below it—but it exists.
Implications for Intelligence Explosion (More Speculative)
The following is our interpretation of the paper's implications, not claims made in the paper itself.
Inward vs. Outward Recursive Self-Improvement
The scenarios paper distinguished two modes of recursive self-improvement:
Inward-directed: Improving efficiency of existing resources through better hardware/algorithms
Outward-directed: Acquiring more resources (compute, energy, matter, spatial extent)
The thermodynamic benchmark primarily constrains the inward-directed mode. No matter how clever the optimization, bits per joule has a benchmark ceiling under closed-cycle operation.
The outward-directed mode faces a different constraint: locality. The speed of light limits coordination across spatial extent. Distributed systems theory offers a useful intuition here: maintaining consistent state across a spatially extended system involves fundamental tradeoffs between consistency, availability, and partition tolerance (cf. CAP theorem); achieving reliable consensus under communication delays faces provable limits (cf. FLP impossibility). These results from computer science don't apply directly to physical agents, but they point to a structural difficulty: the larger an agent's spatial extent, the harder it is to maintain coherent, rapid decision-making as a single coordinated entity.
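The locality constraint is easy to quantify. A minimal sketch of best-case one-way signal latency across candidate "agent" extents (distances are round illustrative numbers; real networks add routing and queuing delays on top):

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

extents_m = {
    "single datacenter (1 km)": 1.0e3,
    "Earth diameter (~12,742 km)": 1.2742e7,
    "Earth-Moon distance": 3.844e8,
    "Earth-Mars at closest approach": 5.46e10,
}

for name, d in extents_m.items():
    print(f"{name}: {d / C:.4g} s one-way")
```

At Earth scale the floor is already tens of milliseconds; at interplanetary scale it is minutes, which is why coherent single-agent decision-making degrades with spatial extent.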
See the scenarios paper for detailed analysis of locality constraints. The key point for this post: both directions of recursive self-improvement face physical constraints—thermodynamics for inward, locality for outward.
Quantitative Constraints on FOOM
This provides physics-based input to intelligence explosion analysis:
Benchmark exists for inward improvement: There is a maximum efficiency achievable per unit energy, set by thermodynamics under closed-cycle conditions
Current gap is large: Current technology has substantial room for improvement before hitting thermodynamic constraints
Quantum doesn't provide an escape: The closed-cycle benchmark applies regardless of computing substrate
This doesn't rule out transformative AI or rapid capability gains. But it rules out unbounded inward improvement: efficiency gains must eventually plateau.
Decisive Strategic Advantage
Recall that achieving decisive strategic advantage requires overwhelming superiority in both prediction and action. Our two-axis efficiency framework maps directly onto this:
Prediction superiority → learning efficiency
Action superiority → control efficiency
Both face thermodynamic benchmarks. An agent attempting to achieve decisive strategic advantage through pure efficiency gains (rather than resource acquisition) faces hard physical constraints.
Implications for AI Governance
Efficiency as an Auditable Metric
The bits-per-joule framework provides a physically grounded, measurable metric for AI system efficiency. This could support:
Compute governance: Energy consumption as a proxy for capability monitoring
Efficiency standards: Benchmarking and labeling AI systems by thermodynamic efficiency
Trend tracking: Measuring how close frontier systems are to physical limits
Scaling Laws and Diminishing Returns
Current AI progress follows scaling laws where capability improves with compute, data, and parameters. But marginal efficiency (capability gain per additional joule) decreases as scaling continues.
The paper provides tools for analyzing this quantitatively. Under a typical power-law scaling relation $\ell(C) = \ell_\infty + a C^{-\alpha}$ (where $\ell$ is test loss and $C$ is training compute), the marginal compression gain per unit training energy decays as $C^{-(\alpha+1)}$. For commonly observed exponents ($\alpha \approx 0.05$–$0.1$), this means that each 10× increase in training energy yields progressively smaller efficiency gains. At what point do scaling approaches hit diminishing-returns walls, and when does a fundamentally different approach become necessary? The bits-per-joule framework makes this question precise and measurable.
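A short numerical sketch of this decay (the constants a and alpha below are illustrative, chosen within the commonly observed range):

```python
# Marginal improvement per unit compute under l(C) = l_inf + a * C**(-alpha)
a, alpha = 1.0, 0.07   # illustrative values, not fitted to any real model

def marginal_gain(C: float) -> float:
    """-dl/dC = a * alpha * C**(-(alpha + 1)): decays faster than 1/C."""
    return a * alpha * C ** -(alpha + 1)

for C in (1e20, 1e21, 1e22):   # successive 10x increases in training compute
    print(f"C = {C:.0e}: marginal gain {marginal_gain(C):.3e}")

# Each 10x in compute shrinks the marginal return by a factor of 10**(alpha+1)
print(f"shrink factor per decade of compute: {10 ** (alpha + 1):.2f}")  # 11.75
```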
Open Questions, Caveats, and Anticipated Objections
"But you could lower temperature to raise the benchmark?" Yes—the Landauer scale is proportional to T. But real systems require cooling infrastructure, and that overhead must be inside the accounting boundary for fair comparison.
"In an open system with fresh memory supplied externally, bits/J can diverge?" Yes—that's precisely what Proposition 1 demonstrates. This is why boundary closure is required for meaningful Landauer-scale benchmarking.
"Reversible computation can make Qdiss→0?" In the quasistatic limit, yes—but this requires vanishing speed (diverging time). Bits/J should be reported alongside bits/s or latency constraints.
"Bits/J isn't 'intelligence'—it's just energy efficiency?" Correct. The paper explicitly says this is not a universal intelligence score; it's a reproducible efficiency report under explicit conventions.
Modeling assumptions: The thermodynamic learning inequality (Lemma 1) assumes bipartite Markov dynamics and local detailed balance—standard in stochastic thermodynamics but not trivially satisfied by all physical learning systems. For systems where learner and data degrees of freedom evolve simultaneously (e.g., active inference with tight feedback loops), the subsystem decomposition requires care, and the resulting bounds may be looser than the clean closed-cycle statement suggests.
Measurement challenges: Actually measuring epiplexity and empowerment for real AI systems is non-trivial. The paper discusses estimation approaches (MDL proxies, variational bounds) but practical application requires further work.
Implications for AI Safety Research
If both inward and outward improvement face physical constraints, several implications follow for safety research:
Locality may matter more for singleton risk than thermodynamics: A would-be singleton attempting to achieve decisive strategic advantage faces the dilemma that expanding resource control increases total capability but degrades coordination speed. The speed of light imposes a hard limit on how quickly a spatially extended agent can maintain coherent decision-making—analogous to how distributed systems face fundamental tradeoffs between consistency and latency. This suggests that beyond a certain spatial scale, a single agent fragments into effectively semi-autonomous subsystems, and multipolar or ecosystem outcomes may be more stable than often assumed. The thermodynamic framework adds a second constraint: even within a fixed spatial region, the agent's information processing efficiency is bounded, limiting how quickly it can out-learn and out-maneuver competitors.
Local DSA remains possible: Thermodynamic and locality constraints don't prevent decisive advantage within a bounded region (e.g., Earth). Safety concerns about rapid local takeoff remain relevant.
Governance should track efficiency trends: As AI systems approach thermodynamic limits, the dynamics of capability growth change. Monitoring bits-per-joule trends provides one input to forecasting.
Conclusion
We've presented a thermodynamic framework for intelligence efficiency with two key metrics—learning efficiency (thermodynamic epiplexity per joule) and control efficiency (empowerment per joule)—and established that Landauer-scale benchmarks apply to both under closed-cycle conditions with explicit boundary closure.
The key clarification: quantum and reversible computing can approach but not beat closed-cycle thermodynamic benchmarks once initialization, measurement, and reset are included within the accounting boundary. This addresses a question left open in the scenarios paper.
Current AI systems operate orders of magnitude below thermodynamic benchmarks, so there's vast room for efficiency improvement. But the existence of a benchmark ceiling—one that no computing technology can breach under closed-cycle operation—is a meaningful constraint on long-term trajectories.
TL;DR

A new paper, "Thermodynamic Limits of Physical Intelligence", establishes energy-efficiency benchmarks for learning and control in physical systems. The paper's results, and our more speculative interpretation of them, are summarized at the top of this post.
Links: