Epistemic Status: Empirical. 12 independent runs, 10,000 generations each, all seeded and logged. I am confident in the results themselves, but what do they actually mean?
Here's what I have recently discovered about failure modes: they are boring and disappointing only if you are not paying enough attention.
I have spent literal months building a special kind of evolutionary system, one that evolves in whatever direction it chooses, and I got there by doing something unexpected.
I removed its Fitness Function. Completely.
That means no score, no reward, no gradient. The agents survive by satisfying constraints: metabolic limits, resource budgets, interaction rules. The physics of their own world. Nothing else.
Out of the twelve runs I performed for 10,000 generations each, seven kept evolving even after I removed the fitness function entirely. The remaining five collapsed.
I want to talk about the collapsed five, and also about a consistent ceiling that kept appearing in every run, successful or not, that I still don't fully understand.
But let's get into the basics first.
Why Remove the Fitness Function at All?
You have probably heard of Goodhart's Law, usually discussed in the context of policy or RL:
"When a measure becomes a target, it ceases to be a good measure."
What I think gets significantly underappreciated is how deeply this same problem runs through evolutionary computation.
Every evolutionary algorithm I know of specifies a "target." The moment you do that, you constrain the search space, often in ways you didn't intend. You push the algorithm in one specific direction, toward its destination, and never let it choose a direction for itself, not even a random one. Premature convergence isn't just a technical problem. It's Goodhart's Law showing up in evolutionary dynamics, where it shouldn't be. The system finds the peak of your fitness landscape and halts. But your landscape was only ever a proxy for what you really wanted, so it was never genuine evolution. It was an algorithm reaching its destination, when what it was meant to do was evolve freely, without any specified target.
So what happens if there is no landscape? What if agents just have to stay alive?
That's the question Genesis was built to answer.
The Setup
All I had to do was design the physics of the world the agents live in and configure it so that the agents are forced to keep evolving, which in turn forces the world itself to keep changing. That's easier said than done, and no one has fully achieved it yet.
Genesis uses a codon-based variable-length genome. Agents incur a metabolic cost proportional to genome length and complexity:
Selection is Pareto-based across internal axes (survivability, resource efficiency, stability of interactions), with no scalar fitness score after the bootstrapping phase ends.
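A minimal sketch of what Pareto-based selection over those three axes could look like, assuming each agent exposes its axis values as numbers where higher is better. The `Agent` type, its field names, and the example values are hypothetical and are not taken from the Genesis code.

```python
from typing import NamedTuple


class Agent(NamedTuple):
    # Hypothetical axis values; higher is assumed to be better on every axis.
    survivability: float
    resource_efficiency: float
    interaction_stability: float


def dominates(a: Agent, b: Agent) -> bool:
    """True if a is at least as good as b on every axis and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(population: list[Agent]) -> list[Agent]:
    """Keep every agent that no other agent dominates. No scalar score is computed."""
    return [
        a for a in population
        if not any(dominates(b, a) for b in population)
    ]


population = [
    Agent(0.9, 0.2, 0.5),
    Agent(0.4, 0.8, 0.6),
    Agent(0.3, 0.7, 0.5),  # dominated by the second agent on all three axes
    Agent(0.5, 0.5, 0.9),
]
survivors = pareto_front(population)
```

The point of the sketch is that agents with very different tradeoffs survive side by side, because nothing ever collapses the three axes into one number.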
Now here are the two mechanisms that did most of the work:
CARP (Constraint-Adaptive Regulation Principle): Dynamically adjusts constraint intensity based on population viability. If the population is dying, constraints loosen. If genomes are bloating, they tighten. This is not (and I would like to stress this) optimizing for any specified outcome. It just keeps the system in a viable range.
AIS (Artificial Immune System): An archive of diverse genotypes that are reintroduced when diversity collapses. It also prunes dominant lineages before they monopolize selection. Again, this is not optimizing for novelty, just preventing irreversible collapse.
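The CARP rule described above (loosen when the population is dying, tighten when genomes bloat) can be sketched as a small controller. This is an illustration under assumptions, not the Genesis implementation: a single intensity value in [0, 1], viability measured as the surviving fraction, bloat measured as mean genome length, and every threshold below is a hypothetical placeholder.

```python
def carp_update(
    intensity: float,
    survival_fraction: float,
    mean_genome_length: float,
    *,
    min_survival: float = 0.3,        # hypothetical: below this the population is "dying"
    max_genome_length: float = 500.0,  # hypothetical: above this genomes are "bloating"
    step: float = 0.05,                # hypothetical adjustment size
) -> float:
    """Return the next constraint intensity, clamped to [0, 1].

    Loosens when too few agents survive, tightens when genomes bloat,
    and otherwise leaves the intensity alone. Nothing here scores agents.
    """
    if survival_fraction < min_survival:
        intensity -= step
    elif mean_genome_length > max_genome_length:
        intensity += step
    return min(1.0, max(0.0, intensity))


loosened = carp_update(0.5, survival_fraction=0.1, mean_genome_length=120)
tightened = carp_update(0.5, survival_fraction=0.8, mean_genome_length=900)
unchanged = carp_update(0.5, survival_fraction=0.8, mean_genome_length=120)
```

Note that the controller only reads population-level statistics. It never ranks individual agents, which is the sense in which it regulates conditions and not direction.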
Neither of these is a hidden fitness function corrupting the system. I want to be precise about this distinction because I think it matters most.
They regulate the conditions for evolution, not the direction of it. They only allow the world to exist.
Against the four baselines (random search, fixed constraints, novelty search, and MAP-Elites), Genesis significantly outperformed all of them (p < 0.01, Cohen's d = 1.47).
| Method | Final GAC | Failure Mode |
|---|---|---|
| Genesis | 245 ± 15 | Sustained (58.3% of runs) |
| Random Search | 74.2 ± 10.4 | Metabolic Overload |
| MAP-Elites | 34.0 ± 3.0 | Dominance Monopoly |
| Novelty Search | 3.1 ± 0.5 | Behavioral Saturation |
| Fixed Constraints | 0.5 ± 0.1 | Neutral Drift |
Oh, and about GAC (Genetic Activity Coefficient): it measures the fraction of genome edits that persist beyond 500 generations. It's trying to distinguish meaningful structural change from noise.
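A minimal sketch of how a persistence fraction like this could be computed, assuming each edit is logged with the generation it appeared in and the generation it was lost in. The `Edit` record and the log format are hypothetical. How the reported GAC values are scaled is not specified here, so the sketch returns a plain fraction.

```python
from typing import NamedTuple, Optional


class Edit(NamedTuple):
    born: int                # generation in which the edit appeared
    removed: Optional[int]   # generation in which it was lost, or None if still present


def gac(edits: list[Edit], final_generation: int, window: int = 500) -> float:
    """Fraction of edits that persisted for more than `window` generations.

    Edits introduced within the last `window` generations are skipped,
    because they have not yet had the chance to persist or fail.
    """
    judged = [e for e in edits if final_generation - e.born > window]
    if not judged:
        return 0.0
    persisted = sum(
        1 for e in judged
        if (e.removed if e.removed is not None else final_generation) - e.born > window
    )
    return persisted / len(judged)


log = [
    Edit(born=100, removed=150),    # lost after 50 generations: noise
    Edit(born=200, removed=2000),   # persisted 1,800 generations
    Edit(born=300, removed=None),   # still present at the end of the run
    Edit(born=9900, removed=None),  # too recent to judge, skipped
]
activity = gac(log, final_generation=10_000)
```

Skipping the too-recent edits is a design choice: counting them as failures would bias the measure downward near the end of a run.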
So far, so good: constraint-driven evolution without fitness functions is feasible.
But here's where I would like to slow down.
The Ceiling
In every successful run, every single one, Expressed Phenotype Complexity (EPC) converged to 140–155. Always. Across 12 different random seeds, different early evolutionary trajectories, different failure and success patterns. Not a single exception, which is kinda weird.
The runs kept running. GAC stayed high. But complexity stopped growing.
I definitely did not expect this. I expected either that complexity would grow without bound (interesting result) or that the system would collapse (failure). The ceiling is a third thing I hadn't considered or even thought of.
Three hypotheses about what might be causing it, none of which I can currently distinguish empirically:
H1: The genome alphabet is expressive enough to sustain activity, but not expressive enough for unbounded structural growth. Hierarchical or compositional encodings might have a different ceiling, or no ceiling.
H2: Constraints can adapt their intensity but not their structure. As already described, CARP modulates how hard constraints are enforced, but the form of the constraints never changes. Maybe for complexity to keep growing, the viability criteria themselves need to evolve: constraints that evolve alongside genomes rather than applying fixed categories.
H3: Pareto selection on stable internal axes will eventually saturate. Once the viable tradeoff space has been explored, there's no pressure for further structural innovation. Coevolutionary dynamics (i.e., agents competing against each other) might maintain the selection gradient indefinitely in a way static constraints can't.
My bet is that all three are partially true. But I don't have the ablation data to separate them yet. Working on it.
The Three Failure Modes
This is the part I really want to spend the most time on, because I think these failure modes are structurally interesting beyond just this system.
Metabolic Overload (42.9% of failures)
Genome size grows without bound. The system paradoxically looks alive, with high activity and lots of mutations, but there's no organized complexity underneath. Random Search exhibited this most clearly: GAC=74.2, but population variance=537 (Genesis: 8.2). Population coherence is completely lost.
The parallel that keeps occurring to me: this looks a lot like reward hacking. Could it be that high apparent optimization activity is actually destroying the underlying structure the activity was supposed to maintain? The system is "succeeding" by the wrong metric while failing by the one that matters.
Dominance Monopoly (28.6% of failures)
One lineage takes over. MAP-Elites showed this despite being specifically designed for diversity: EPC=55.4 (moderate complexity), but NND=0.0 (zero population diversity) and zero novelty events after space-filling. Once elite solutions saturate the behavior space, exploration stops completely.
AIS exists specifically to prevent this: it prunes dominant lineages before they monopolize selection. When AIS is ablated, failure rates jump from 42% to over 80%.
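The archive-prune-reintroduce cycle could be sketched like this. Everything specific here is an assumption for illustration: the population is a list of `(lineage_id, genotype)` pairs, the archive keeps one genotype per lineage, dominance is capped as a share of the population, and the thresholds are hypothetical.

```python
import random
from collections import Counter


def ais_step(
    population: list[tuple[str, str]],  # (lineage_id, genotype) pairs, hypothetical encoding
    archive: dict[str, str],            # lineage_id -> one archived genotype
    *,
    max_share: float = 0.5,             # hypothetical cap on any one lineage's share
    min_lineages: int = 3,              # hypothetical floor on distinct lineages
    rng: random.Random,
) -> list[tuple[str, str]]:
    """Archive lineages, prune over-represented ones, reintroduce archived ones."""
    # 1. Remember one genotype per lineage seen so far.
    for lineage, genotype in population:
        archive.setdefault(lineage, genotype)

    # 2. Prune: no lineage may keep more than max_share of the current population.
    cap = int(max_share * len(population))
    kept = []
    counts = Counter()
    for lineage, genotype in population:
        if counts[lineage] < cap:
            kept.append((lineage, genotype))
            counts[lineage] += 1

    # 3. Reintroduce: if too few lineages remain, bring back archived ones.
    missing = [lin for lin in archive if lin not in counts]
    rng.shuffle(missing)
    while len(counts) < min_lineages and missing:
        lineage = missing.pop()
        kept.append((lineage, archive[lineage]))
        counts[lineage] += 1
    return kept


archive = {"C": "cccc", "D": "dddd"}
population = [("A", "aaaa")] * 9 + [("B", "bbbb")]  # lineage A holds 90%
population = ais_step(population, archive, rng=random.Random(0))
```

In this toy run lineage A is cut back to the cap and one archived lineage returns. Which archived lineage comes back is random, so nothing in the step rewards a particular genotype.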
Neutral Drift Saturation (28.6% of failures)
Mutations keep accumulating but produce no selectable phenotypic differences. Fixed Constraints showed GAC=0.5 despite NND=0.42, so diversity is preserved, but in a region of genotype space where variation doesn't actually matter. Static constraints can't adapt to break out of this.
This one is genuinely philosophically interesting to me. The population is neither dying nor monoculturing. It's just... wandering without traction. Like an optimizer that's lost the gradient but is still running. How is that even possible?
The Failure Timeline Is the Most Actionable Finding
You might guess that most failures happen during the transition phase, while external fitness is being removed, and not after it. That is actually true.
After the transition completes, the full Genesis system stabilizes at roughly 40% cumulative failure. Ablated systems (no CARP, no AIS) exceed 90% by the time transition ends.
This means there's a diagnostic window. If you can measure which failure mode a run is drifting toward during transition (generations 10k–20k in my experiments), you can probably intervene before collapse becomes irreversible.
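As a rough illustration of what such a diagnostic could look like, here is a sketch that maps windowed run statistics to the three failure signatures described earlier. It is not something that exists in Genesis: the inputs, the function name, and every threshold are hypothetical placeholders that would need to be fitted to real run logs.

```python
from typing import Optional


def drifting_toward(
    genome_length_growth: float,  # relative growth in mean genome length over a window
    top_lineage_share: float,     # fraction of the population in the largest lineage
    gac: float,                   # fraction of edits persisting, in [0, 1]
    diversity: float,             # any diversity measure scaled to [0, 1]
    *,
    growth_limit: float = 0.5,    # all thresholds below are hypothetical placeholders
    share_limit: float = 0.8,
    gac_floor: float = 0.01,
    diversity_floor: float = 0.2,
) -> Optional[str]:
    """Return the failure mode a run most resembles, or None if no signature matches."""
    if genome_length_growth > growth_limit:
        return "metabolic overload"        # genomes bloating
    if top_lineage_share > share_limit:
        return "dominance monopoly"        # one lineage taking over
    if gac < gac_floor and diversity > diversity_floor:
        return "neutral drift saturation"  # diverse, but no edits persist
    return None


warning = drifting_toward(
    genome_length_growth=0.1, top_lineage_share=0.92, gac=0.2, diversity=0.05
)
```

A check like this would run once per window during the transition, so an intervention could be triggered on the first non-`None` result.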
I haven't built that intervention yet. It seems like the right next step.
The Novelty Paradox
While exploring mutation rates, I found something I'm still unsure what to make of.
Higher mutation rates produce fewer discrete novelty events but vastly higher continuous exploration:
| Mutation Rate | Novelty Events | Exploration Score |
|---|---|---|
| 0.1 (baseline) | 90 | 28 |
| 0.3 | 3 | 676 |
| 0.5 | 0 | 5,441 |
Novelty becomes continuous rather than punctuated at high mutation rates. The metric I was using, counting discrete novelty events, completely misrepresents what's happening in high-mutation regimes. The population exists in constant flux; there are no discrete "events" to count.
I think this has broader implications for how we measure exploration in any evolutionary system. If your metric counts jumps, you'll systematically underestimate systems that evolve through continuous variation. NND (Novelty Network Density) does better but still doesn't fully capture it. What could be the right metric for this?
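A toy illustration of why a jump-counting metric misses continuous variation. It assumes a one-dimensional behavior trajectory, defines an "event" as a single step larger than a threshold, and defines exploration as total path length. Both definitions are hypothetical stand-ins, not the metrics used in the experiments.

```python
def novelty_events(trajectory: list[float], jump_threshold: float) -> int:
    """Count single steps whose size exceeds jump_threshold (a discrete-event metric)."""
    return sum(
        1 for a, b in zip(trajectory, trajectory[1:])
        if abs(b - a) > jump_threshold
    )


def path_length(trajectory: list[float]) -> float:
    """Total distance travelled, regardless of step size (a continuous metric)."""
    return sum(abs(b - a) for a, b in zip(trajectory, trajectory[1:]))


# Punctuated: long stasis interrupted by two big jumps.
punctuated = [0.0] * 50 + [5.0] * 50 + [10.0] * 50
# Continuous: constant small steps of 0.5, never a big jump.
continuous = [0.5 * i for i in range(150)]

threshold = 1.0
punctuated_summary = (novelty_events(punctuated, threshold), path_length(punctuated))
continuous_summary = (novelty_events(continuous, threshold), path_length(continuous))
```

The continuous trajectory covers far more ground yet registers no events, because no single step crosses the threshold. Any metric built on counting jumps will have that blind spot.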
What I'm Actually Uncertain About
The paper's conclusion is modest: constraint-driven evolution is feasible, not open-ended. Sustained activity without fitness is achievable in a majority of runs with the right architecture. But complexity hits a ceiling, and that ceiling is currently unexplained.
Things I genuinely do not know:
- Whether the EPC ceiling is a fundamental limit of constraint-only selection, or an artifact of this specific genome representation.
- Whether any of the three failure modes have genuine analogs in RL / neural network training, or if the parallel is just surface-level.
- Whether evolvable constraints (as opposed to adaptive constraint intensity) would produce qualitatively different long-run dynamics.
- What causes the 42% failure rate even with full CARP and AIS. The variability across identical runs suggests something about early dynamics, but I can't pin it down.
I'm writing this partly because I think these are interesting open questions and I'd benefit from seeing how people here think about them. The alignment connection in particular (whether these failure modes map onto known failure modes in AI systems) isn't something I've seen addressed directly in the literature I've read.
What Actually Happened
After complete fitness removal, 7/12 runs (58.3%, 95% CI: [41.2%, 75.0%]) maintained non-zero evolutionary activity.
Code and Reproducibility
Everything is on GitHub: https://github.com/gearupsmile/genesis-emergence
Full parameter list, experimental seeds, analysis scripts. I tried to make it actually reproducible, not just technically reproducible.
Thanks for reading. I'm aware this is a first post and I might be missing relevant prior work. If there's something I should read before drawing the alignment parallels too confidently, please let me know!