Life is a bad computer. In fact, even the most sophisticated self-replicating systems only use a tiny fraction of their theoretical computational capacity. There is a very good reason for this: anything that self-replicates must sacrifice most of its potential computational power in the service of copying itself.
In contrast, the theoretically smartest programs (those that maximize computational output) inevitably halt. Below I explore these concepts and suggest that self-replicating systems, including life, maximize the mutual information between their present and future states rather than maximizing output.
The Busy Beaver Limit (BBL) is the theoretical maximum complexity achievable by a terminating, non-replicating, computational process of a given size.[1] Systems operating near this limit are characterized by maximal computational irreducibility: they are exquisitely complex, unpredictable, and inherently fragile. They are, in a sense, maximally "clever." (And rather beautiful!)[2]
Conversely, the Von Neumann Threshold (VNT) represents the minimum logical and informational complexity required for a system to become self-replicating. Crossing this threshold marks a transition from a terminal process to an open-ended one. It also requires a fundamentally different strategy: complexity directed inward for robustness, redundancy, and error correction, rather than outward for maximal work.[3]
These concepts represent distinct "computational teleologies" in the sense of inherent organizational structures that dictate a system's fate. As I will demonstrate, the structural overhead required to cross the VNT imposes a profound cost of persistence, guaranteeing that a self-replicating system cannot simultaneously achieve the productivity defined by the BBL. The trade-off is absolute: to persist, a system must stay far away from the chaotic boundary of the BBL. In a very precise, technical sense, to live forever, a system must be computationally dumber than the theoretical maximum.
To start, fix an optimal prefix-free Universal Turing Machine, U. The Kolmogorov complexity, K_U(S), of a system S is the length of the shortest program p such that U(p) generates S. By the Invariance Theorem, we can speak generally of K(S), as the choice of the Turing machine only affects the complexity by an additive constant O(1).[4] Define the Busy Beaver functions relative to U for a description length n:

Σ(n) = max{ |U(p)| : |p| ≤ n and U(p) halts }
T(n) = max{ time_U(p) : |p| ≤ n and U(p) halts }

where |U(p)| is the length of the output and time_U(p) is the number of steps U takes on p before halting.
Σ(n) and T(n) are famously non-computable; they grow faster than any computable function and thus represent the absolute boundary of what finite computation can achieve.
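To get a concrete feel for this at the only scale where brute force still works, here is a small Python sketch (my own toy construction, using plain 2-state, 2-symbol Turing machines rather than the prefix-free UTM formulation above) that enumerates every machine, runs each under a step cap, and reports the analogues of Σ and T:

```python
"""
A toy, brute-force illustration of the Busy Beaver idea: enumerate every
2-state, 2-symbol Turing machine with an explicit HALT state, run each with a
step cap, and report the largest number of 1s written (a stand-in for Σ) and
the longest halting run (a stand-in for T). The step cap is a heuristic:
machines that exceed it are simply assumed to diverge, which happens to be safe
at this tiny size but is exactly the judgment that becomes non-computable in
general.
"""
from itertools import product

STATES = (0, 1)      # two working states; -1 denotes HALT
SYMBOLS = (0, 1)
MOVES = (-1, 1)      # left, right
STEP_CAP = 100       # the known 2-state champion halts after only 6 steps

def run(machine, cap=STEP_CAP):
    """Run a machine (dict: (state, symbol) -> (write, move, next)) on a blank
    tape. Returns (steps, ones) if it halts within cap, else None."""
    tape = {}
    head, state = 0, 0
    for step in range(1, cap + 1):
        write, move, nxt = machine[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        state = nxt
        if state == -1:                      # halted
            return step, sum(tape.values())
    return None                              # treated as non-halting

best_ones, best_steps = 0, 0
# one transition per (state, symbol) pair; the next state may be HALT (-1)
choices = list(product(SYMBOLS, MOVES, list(STATES) + [-1]))
for transitions in product(choices, repeat=len(STATES) * len(SYMBOLS)):
    machine = dict(zip(product(STATES, SYMBOLS), transitions))
    result = run(machine)
    if result is not None:
        steps, ones = result
        best_steps = max(best_steps, steps)
        best_ones = max(best_ones, ones)

print(f"2-state champions: max ones = {best_ones}, max steps = {best_steps}")
# Expected: max ones = 4, max steps = 6 (the classical 2-state values)
```

Already at three or four states the search space and the runtimes explode, and the step-cap judgment call turns into the full halting problem.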
To Infinity and Beyond
The shift from the Busy Beaver dynamic (maximizing Σ(n)) to the Von Neumann dynamic requires a fundamental reorganization of the system's logic. A system aiming for the BBL directs its complexity outward, transforming its environment until it necessarily halts. A system crossing the VNT must direct sufficient complexity inward to ensure its own reconstruction.
Von Neumann demonstrated that non-trivial self-replication requires an architecture that solves the paradox of infinite regress (a description of a machine must describe the description, ad infinitum). His solution involves a Universal Constructor (A), a Copier (B), a Controller (C), and a Description (Φ).[3] The crucial insight is the dual role of information: Φ must be used as interpreted instructions (code for A to build A+B+C) and as uninterpreted data (data for B to copy, creating Φ'). There are three formal conditions necessary for achieving this architecture and crossing the VNT:

- Construction: the system contains a constructor A able to build an arbitrary machine, including A+B+C, from its description.
- Copying: the system contains a copier B that duplicates the description Φ without interpreting it.
- Control: a controller C sequences construction and copying so that Φ is used in both modes, as executed code and as copied data.
The VNT is crossed precisely when a system integrates these three conditions, shifting its teleology from finite maximization to the preservation of its organizational structure.
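The dual use of Φ is easiest to see in a quine. The sketch below (my illustration, not von Neumann's construction) is a two-line Python program whose string phi is both executed as instructions and copied verbatim as data:

```python
# A minimal quine: the string `phi` plays the role of von Neumann's description.
# It is *interpreted* (its text is the body of the program being run) and
# *copied* (substituted into itself verbatim by format()), the same dual use
# that defeats the infinite-regress problem. An illustration of the idea only,
# not a model of the full A + B + C + Φ architecture.
phi = 'phi = {!r}\nprint(phi.format(phi))'
print(phi.format(phi))
```

Run it and it prints its own two-line core (comments aside): phi never has to contain a full description of itself, only a placeholder into which its own text is substituted.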
It Ain’t Cheap to Not Die
If we look at the relationship between the BBL and the VNT, we can show that persistence is costly and requires a retreat from maximal computational intensity. A system of size n must divide its complexity between external work and internal organization. A self-replicating system S of size n ≥ n_VNT must allocate at least k ≥ n_VNT bits to its replication core; the remaining complexity is available for external productivity, which is therefore bounded by Σ(n − k).
Define the Normalized Productivity Potential β(n) as the ratio of the maximum work a persistent system can achieve to the theoretical maximum:

β(n) = Σ(n − k) / Σ(n)
Because Σ(n) grows faster than any computable function, the ratio Δ(n) := Σ(n)/Σ(n−1) must be unbounded (if Δ(n) ≤ c for all n, then Σ(n) ≤ Σ(1)·c^(n−1), a computable bound on Σ).
Stayin’ Alive Lemma: For any fixed overhead k ≥ 1,

lim inf_{n→∞} β(n) = lim inf_{n→∞} Σ(n − k)/Σ(n) = 0.
Proof: Since Σ is increasing and k ≥ 1, Σ(n−k) ≤ Σ(n−1). Thus Σ(n−k)/Σ(n) ≤ Σ(n−1)/Σ(n) = 1/Δ(n). Unbounded Δ(n) implies lim inf_{n→∞} 1/Δ(n) = 0. Therefore:

lim inf_{n→∞} β(n) = 0. ∎
This captures the immense cost of persistence. The structural complexity required to cross the VNT, typically seen in life as the overhead of repair, homeostasis, and redundancy, imposes an unbounded penalty on the system's capacity for peak external work. As opportunities expand noncomputably fast with description length, the fraction of potential work available to a persistent replicator becomes arbitrarily small along an unbounded sequence of scales. Persistence demands a sacrifice of maximal, one-shot productivity.
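Since Σ is not computable, β(n) cannot be plotted directly, but the collapse is already visible for far tamer growth. The sketch below (my construction; the proxy f(n) = 2^(2^n) and the overhead k = 1 are arbitrary choices) shows how the ratio f(n−k)/f(n) vanishes once growth is merely super-exponential; Σ outgrows any such computable proxy, so the true collapse is worse:

```python
import math

# A computable stand-in f(n) for Σ(n): Σ itself cannot be computed, so this
# sketch uses the double exponential f(n) = 2**(2**n) purely to show how the
# ratio f(n-k)/f(n) collapses. Σ grows faster than *any* computable f, so the
# real β(n) collapses faster still.

def log2_f(n: int) -> float:
    """log2 of the proxy f(n) = 2**(2**n)."""
    return float(2 ** n)

k = 1  # fixed replication overhead, in description bits

for n in range(4, 11):
    log2_beta = log2_f(n - k) - log2_f(n)   # log2 of beta(n) = f(n-k)/f(n)
    print(f"n={n:2d}: log2 beta(n) = {log2_beta:.0f}  "
          f"(beta ~ 10^{log2_beta * math.log10(2):.0f})")
# Even for this tame proxy, beta(10) ~ 10^-154; with Σ in place of f, beta(n)
# dips below any computable bound along an unbounded sequence of n.
```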
Keeping It Together in the World
The BBL is defined in an idealized environment. The VNT, however, depends on the environmental noise rate μ. We define the Cost of Fidelity as the complexity overhead for error correction:

Δ_F(μ) := n_VNT(μ) − n_VNT(0)
This dependence is formalized by two fundamental limits. First, Shannon’s Noisy-Channel Coding Theorem dictates that reliable communication requires the information transmission rate R to be less than the channel capacity C(μ). As noise approaches a critical threshold μ_crit at which C(μ) falls to R, the required redundancy, and thus Δ_F(μ), diverges:

Δ_F(μ) → ∞ as μ → μ_crit
Second, the Eigen Error Threshold defines a genotype-level constraint. For a genome of length L and per-site error rate ϵ, heritable organization can only be maintained if L stays below roughly ln(σ)/ϵ (σ being the selective superiority of the master sequence), unless additional complexity is dedicated to repair.[5] Δ_F(μ) must scale to keep the effective L below this threshold. If μ > μ_crit, the VNT cannot be crossed.
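A rough numerical sketch of both limits (the target rate R = 0.5, the superiority σ = 10, and the 1/(C−R)² blocklength heuristic are my assumed values, not figures from the argument above):

```python
import math

def h2(p: float) -> float:
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(mu: float) -> float:
    """Capacity of a binary symmetric channel with crossover probability mu."""
    return 1.0 - h2(mu)

# Shannon side: overhead blows up as C(mu) falls toward the target rate R.
R = 0.5  # assumed target rate; mu_crit solves C(mu) = R, here mu_crit ~ 0.110
for mu in (0.01, 0.05, 0.09, 0.105, 0.109):
    gap = bsc_capacity(mu) - R
    # 1/gap^2 tracks how the required blocklength (and hence the bulk of the
    # coding machinery) scales near capacity: a rough heuristic only.
    print(f"mu={mu:.3f}  C={bsc_capacity(mu):.4f}  gap={gap:.4f}  "
          f"blocklength ~ {1 / gap**2:,.0f}")

# Eigen side: the longest genome maintainable without repair, L_max ~ ln(sigma)/eps.
sigma = 10.0  # assumed selective superiority of the master sequence
for eps in (1e-2, 1e-4, 1e-6, 1e-9):
    print(f"per-site error rate {eps:.0e}:  L_max ~ {math.log(sigma) / eps:,.0f} bases")
# From a few hundred bases at a 1% per-site error rate to billions at the ~1e-9
# achieved by proofreading DNA replication: the Cost of Fidelity is the
# machinery that buys those extra orders of magnitude.
```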
The Efficiency Constraint and Irreducibility
T(n) represents the maximum time a program of size n can run before halting. This boundary is linked to maximal computational irreducibility, i.e. behavior that is unpredictable until termination.
A persistent system requires predictability. If we assume that replication is implemented by a halting subroutine (a finite computation that transduces the parent description to the offspring's description) of length ≤ n, then the replication time t_rep is strictly bounded: t_rep ≤ T(n).
However, robust persistence favors a much stronger constraint: t_rep ≪ T(n). If t_rep ≈ T(n), replication would be maximally chaotic and inherently fragile. The VNT favors organization and efficiency and steers systems away from the chaotic fringe of the BBL by using slack as insurance against perturbations.
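To put numbers on the gap, here is a small comparison. The UTM-relative T(n) of the text is not computable, so this sketch uses the known maximum shift counts of the classical k-state, 2-symbol busy beaver as a stand-in, and a linear-time copy loop (2k steps, my arbitrary stand-in) for t_rep:

```python
# Known maximum shift counts S(k) for the classical k-state, 2-symbol busy
# beaver (S(5) was settled in 2024). Copying a description of comparable size
# is a linear-time affair, so t_rep / T is already minuscule at these tiny
# sizes, and T(n) only pulls further ahead as n grows.
known_shifts = {2: 6, 3: 21, 4: 107, 5: 47_176_870}

for k, shifts in known_shifts.items():
    copy_steps = 2 * k  # crude stand-in for a straight-line copy routine
    print(f"{k} states: copy ~{copy_steps} steps vs halting-time ceiling "
          f"{shifts:,} (ratio ~{copy_steps / shifts:.2e})")
```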
Summing it Up: Informational Closure and Logical Depth
The transition across the VNT marks a shift from maximizing the generation of external information to maximizing the preservation of internal organization across successive generations Sₜ and Sₜ₊₁.
We can redefine the VNT as the minimum complexity required to achieve informational closure. This requires near-lossless heredity, formalized using an α-closure criterion:

I(Sₜ : Sₜ₊₁) ≥ α · K(Sₜ)

for some fidelity α close to 1, where I(· : ·) denotes algorithmic mutual information.
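K and algorithmic mutual information are uncomputable, but compressed length gives a crude, computable upper-bound proxy. The sketch below (my construction, using zlib; the genome size and the 20% corruption rate are arbitrary) checks the α-closure criterion for a faithful and a corrupted copy:

```python
import random
import zlib

def c(data: bytes) -> int:
    """Compressed length: a crude, computable stand-in (an upper bound) for K."""
    return len(zlib.compress(data, 9))

def mutual_info_proxy(x: bytes, y: bytes) -> int:
    """Proxy for algorithmic mutual information: C(x) + C(y) - C(x, y)."""
    return c(x) + c(y) - c(x + y)

def alpha_closure(parent: bytes, child: bytes) -> float:
    """Fraction of the parent's (proxy) complexity preserved in the child."""
    return mutual_info_proxy(parent, child) / c(parent)

random.seed(0)
genome = bytes(random.getrandbits(8) for _ in range(4096))

faithful = bytes(genome)  # a perfect copy
corrupted = bytes(
    b if random.random() > 0.2 else random.getrandbits(8)  # 20% of bytes scrambled
    for b in genome
)

print(f"alpha (faithful copy)  ~ {alpha_closure(genome, faithful):.2f}")
print(f"alpha (corrupted copy) ~ {alpha_closure(genome, corrupted):.2f}")
# Compression only upper-bounds K, so treat these numbers as qualitative: the
# faithful lineage sits near alpha ~ 1, the corrupted one visibly lower.
```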
However, information preservation alone is insufficient (consider a crystal, which replicates perfectly but is trivial). So it is sensible to incorporate Bennett's concept of Logical Depth, which measures the computational work required to generate a structure from its shortest description.[6] Organizationally deep systems have high depth but may have moderate Kolmogorov complexity.
We can now synthesize the definition of the VNT (in its α-closure form) as the minimum complexity required for near-lossless heredity and non-trivial organization (depth D₀) in the presence of noise μ:
n_VNT(α, D₀, μ) = min{n : ∃ system S of complexity n where:
- Starting from S₀ with K(S₀) = n
- For all t: I(Sₜ : Sₜ₊₁) ≥ α · K(Sₜ)
- For all t: Depth(Sₜ) ≥ D₀
- Reliable replication at noise level μ}
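As a toy illustration of the last clause (entirely my construction, with made-up parameters), here is a bit-string replicator copying itself at per-bit error rate μ, with and without a triple-redundancy repair scheme, the kind of overhead that buys multi-generation closure:

```python
import random

random.seed(1)

MU = 0.01          # per-bit copy error rate (assumed)
GENOME_BITS = 300  # size of the 'organizational core' (assumed)
GENERATIONS = 200

def noisy_copy(bits, mu):
    """Copy a bit list, flipping each bit independently with probability mu."""
    return [b ^ (random.random() < mu) for b in bits]

def replicate_plain(bits, mu):
    """No repair: the child is just a noisy copy."""
    return noisy_copy(bits, mu)

def replicate_redundant(bits, mu):
    """Triple redundancy + majority vote: three noisy copies reconciled
    bit-by-bit. The 'cost of fidelity' here is the 3x storage/work overhead."""
    copies = [noisy_copy(bits, mu) for _ in range(3)]
    return [1 if sum(column) >= 2 else 0 for column in zip(*copies)]

def lineage_fidelity(replicator, generations=GENERATIONS):
    ancestor = [random.randint(0, 1) for _ in range(GENOME_BITS)]
    current = ancestor
    for _ in range(generations):
        current = replicator(current, MU)
    return sum(a == c for a, c in zip(ancestor, current)) / GENOME_BITS

print(f"after {GENERATIONS} generations, fraction of ancestral bits intact:")
print(f"  no repair        : {lineage_fidelity(replicate_plain):.2f}")
print(f"  3x majority vote : {lineage_fidelity(replicate_redundant):.2f}")
# The repaired lineage keeps most ancestral bits, while the unrepaired one
# decays toward the coin-flip level of 0.5 (no heritable information left):
# persistence is paid for in overhead bits.
```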
Crossing the VNT is therefore not just about maximizing computation, but about achieving a specific and robust configuration of logically deep information capable of orchestrating its own persistence. The downside is necessarily sacrificing the potential to reach the Busy Beaver limit.
Immortality requires accepting catastrophic inefficiency: the smartest systems die and the dumb ones inherit the universe.
[1] Rado, T. (1962). On Non-Computable Functions. Bell System Technical Journal, 41(3), 877–884.
[2] Was thinking about this after Quanta Magazine's very fun article: https://www.quantamagazine.org/busy-beaver-hunters-reach-numbers-that-overwhelm-ordinary-math-20250822/
[3] Von Neumann, J. (1966). Theory of Self-Reproducing Automata (A. W. Burks, Ed.). University of Illinois Press.
[4] Li, M., & Vitányi, P. (2008). An Introduction to Kolmogorov Complexity and Its Applications (3rd ed.). Springer.
[5] Eigen, M. (1971). Selforganization of Matter and the Evolution of Biological Macromolecules. Naturwissenschaften, 58(10), 465–523.
[6] Bennett, C. H. (1988). Logical Depth and Physical Complexity. In R. Herken (Ed.), The Universal Turing Machine: A Half-Century Survey (pp. 227–257). Oxford University Press.