467 nodes, same behavior: Co‑Evolution created hidden complexity without any objective

Anushka Sharma

I need to write this down somewhere because it's been living rent-free in my head and I think this community might actually help me figure out what I'm looking at.

Quick context: I'm building an open-ended evolution system on my own: no lab, no supervisor. It started from a question that honestly wasn't even about AI safety. I just wanted to know what happens if you remove the objective entirely. No reward function. No fitness score. No "maximize X." Just agents surviving inside environments with physical constraints, like a world with laws of physics but no judge.

I've been building this for eight months. Two papers got into GECCO 2026. But this specific thing I noticed recently, I don't have a clean explanation for it yet, and I'm not going to pretend I do.

The short version: In co‑evolving environments, agents grew to 467 internal nodes while keeping the same external behavior as 52‑node controls. Sham controls stay small. I don’t fully understand why.

So what happened?

I tested two conditions.

First one: fixed physics. Same environment rules across all generations.

Second one: co-evolving physics. The environment itself mutates alongside the agents: terrain, physical structure, all of it.

Everything else identical. Same initialization, same mutation settings, same seed (42), same survival constraints. I ran sham controls too, versions where the co-evolution infrastructure is technically active but environments are prevented from actually changing. I did this specifically so I could separate "does the infrastructure cause this" from "does actual environmental change cause this."
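To make the design concrete, here is a minimal sketch of the three conditions as a config. The class and field names are hypothetical (they are not taken from my repo); the only values from the actual runs are the seed and the approximate generation count. The point is that sham and co-evolving differ in exactly one switch.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class Condition:
    # All field names here are illustrative assumptions, not the repo's API.
    name: str
    coevolution_infrastructure: bool  # is the co-evolution machinery running?
    environment_can_change: bool      # are environment mutations actually applied?
    seed: int = 42
    generations: int = 2000           # approximate, per the runs described above

CONDITIONS = [
    Condition("fixed", coevolution_infrastructure=False, environment_can_change=False),
    Condition("sham", coevolution_infrastructure=True, environment_can_change=False),
    Condition("coevolving", coevolution_infrastructure=True, environment_can_change=True),
]

def differing_fields(a, b):
    """Return the set of settings (ignoring the name) on which two conditions differ."""
    da, db = asdict(a), asdict(b)
    return {key for key in da if key != "name" and da[key] != db[key]}

fixed, sham, coevolving = CONDITIONS
print(differing_fields(sham, coevolving))
print(differing_fields(fixed, sham))
```

If the sham condition differs from the co-evolving one in anything beyond that single switch, the comparison no longer isolates environmental change.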

After ~2000 generations, I looked at the external behavior of both conditions. Same strategy. Secrete, wait, move. In both. Then I looked at the networks.

Fixed environment: ~52 nodes.
Co-evolving environment: ~467 nodes.

I genuinely thought my code had a bug. I re-ran the same seed. Same result. Sham controls stayed near the smaller sizes.

So whatever is happening is specifically tied to actual co-evolution, not just the infrastructure existing. And I just... sat there. 467 felt absurd. Still does.

Here is what I can't resolve

A behavioral evaluation of these two populations would see nothing different. Same strategy, same observable outputs, nothing to flag. But internally one of them is roughly 9x larger than the other by node count (467 vs. 52).

I know what this resembles in alignment terms. I know about Hubinger et al., mesa-optimization, the model organisms program. I want to be careful here: my agents have no goals, they're not strategically hiding anything, and there's no training signal to game. The structural divergence didn't come from any of the mechanisms alignment researchers worry about in neural network training. These aren't deceptive agents. They have no intent.

But the structural signature, massive internal complexity that never surfaced in behavior, is there anyway.

Maybe the reason lies here?

I take the boring explanations seriously. Risto Miikkulainen (NEAT co-creator) saw an earlier version of this and immediately flagged bloat: neutral structure accumulating because mutation rates are too permissive. He's probably right that this is the most likely explanation. NEAT is supposed to handle bloat naturally because structural innovations only survive if they're an advantage. But something in the co-evolving condition seems to be keeping those nodes alive anyway.

Maybe the environments are functionally distinct in ways my behavioral metrics don't capture. Maybe "secrete, wait, move" is actually solving different ecological problems in different environments and I'm measuring at the wrong level of abstraction. Maybe my behavioral equivalence claim is too crude.
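One way to make "too crude" testable is to compare the distribution of short action sequences rather than the headline strategy. Below is a minimal sketch, assuming each agent's behavior can be logged as a list of discrete action labels (that logging format is an assumption, as are the function names). It computes the Jensen-Shannon divergence between action n-gram distributions: 0 means the two traces are statistically identical at that window size, 1 means they share no n-grams at all.

```python
from collections import Counter
from math import log2

def ngram_distribution(actions, n=3):
    """Relative frequency of each length-n window of consecutive actions."""
    windows = [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]
    if not windows:
        raise ValueError("trace is shorter than the n-gram length")
    counts = Counter(windows)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    keys = set(p) | set(q)
    # Mixture distribution; every key with mass in p or q has mass here too.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        return sum(a[k] * log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)

# Hypothetical traces for illustration only; these are not measured data.
trace_fixed = ["secrete", "wait", "move"] * 50
trace_coevo = ["secrete", "wait", "wait", "move"] * 50

divergence = js_divergence(ngram_distribution(trace_fixed),
                           ngram_distribution(trace_coevo))
print(f"JS divergence at n=3: {divergence:.3f}")
```

Two populations can both look like "secrete, wait, move" and still differ in timing or ordering at this level, which is exactly the kind of difference the coarse description would hide.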

Maybe this is a property of CPPN indirect encoding specifically. CPPNs can accumulate latent representational structure that direct encodings would prune, and maybe co-evolving physics creates enough landscape instability that "neutral" structure isn't actually neutral: it's being maintained by environmental dynamics rather than functional utility.

I'm genuinely uncertain which of these is true. I'm running more tests now specifically around: whether the environments are ecologically distinct, whether behavioral equivalence survives finer-grained quantitative analysis, and whether the large networks are causally necessary or just historically accumulated scaffolding.
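A cheap first pass at the "scaffolding" question is to count how many nodes are even wired into the input-to-output pathway. This is a minimal sketch, assuming a genome can be exported as a list of enabled `(source, target)` connections plus lists of input and output node ids; that export format and the function names are assumptions, not something in my repo. It only checks structural connectivity: a node that passes this filter could still have negligible causal influence, so it gives an upper bound on functional size, not a causal test.

```python
from collections import defaultdict

def reachable(starts, adjacency):
    """All nodes reachable from `starts` by following `adjacency` (iterative DFS)."""
    seen = set(starts)
    stack = list(starts)
    while stack:
        node = stack.pop()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

def functional_nodes(edges, inputs, outputs):
    """Nodes lying on at least one directed path from some input to some output."""
    forward = defaultdict(list)
    backward = defaultdict(list)
    for source, target in edges:
        forward[source].append(target)
        backward[target].append(source)
    downstream_of_inputs = reachable(inputs, forward)
    upstream_of_outputs = reachable(outputs, backward)
    return downstream_of_inputs & upstream_of_outputs

# Toy genome for illustration: h2/h3 are disconnected, h4 is a dead end.
edges = [("in", "h1"), ("h1", "out"), ("h2", "h3"), ("in", "h4")]
all_nodes = {node for edge in edges for node in edge}
live = functional_nodes(edges, inputs=["in"], outputs=["out"])
print(f"{len(live)} of {len(all_nodes)} nodes are on an input-to-output path")
```

If most of the 467 nodes fail even this filter, that points toward accumulated dead structure; if most of them pass, the next step is lesioning them and checking whether behavior changes.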

The stuff that keeps nagging at me even if it's bloat

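NEAT is supposed to keep neutral structure in check: networks start minimal, and structural innovations only stay if they're an advantage. (Speciation, as I understand it, protects new structure while it gets a chance to prove itself; it doesn't prune anything by itself.) But co-evolving physics is sustaining roughly 415 extra nodes (467 vs. 52) that produce no observable behavioral advantage.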

So either:

(a) they are producing an advantage I'm not measuring, or

(b) co-evolutionary pressure can sustain structural complexity that fixed-environment selection would prune. Because the fitness landscape keeps shifting, "neutral" isn't stable enough to actually disappear.

If it's (b), and I don't know if it's (b), then co-evolutionary pressure might be a natural mechanism for generating and maintaining internal complexity before any behavioral expression. Not by design. Not through any strategic process. Just as an emergent property of environments that keep changing alongside the systems inside them.

I don't know if that's a known result in open-ended evolution. That's partly why I'm posting this.

Specific things I'd actually like help with...

Are the sham controls sufficient to establish that actual environmental co-evolution is the causal variable here? Is there a confound I'm not seeing?

For people who work on NEAT/CPPNs specifically, have you seen this kind of structural divergence without behavioral divergence before? Is there an existing term for it?

For people on the alignment side, does the mechanism difference matter? If the structural precondition can form without any of the usual training dynamics, does that change anything about how you'd think about detecting or intervening on it?

And if this is just ordinary bloat, what's the cleanest experiment to prove that definitively, so I can close this question and move on?

Code and reproducibility package: github.com/gearupsmile/genesis-emergence

Preprints: https://www.researchgate.net/profile/Anushka-Sharma-77

I'm not claiming this is something new. I might be rediscovering something the EC community already knows. But I've looked and I can't find it, and the sham controls keep making it feel non-trivial to me, and I'd rather ask and be wrong than sit on it alone for another month.

If this is just bloat, why do the sham controls stay flat?