TL;DR: Cancer is a byproduct of dissociated cells self-replicating undetectably. Our sophisticated immune system has been unable to solve it in 200+ million years of evolution, and neither has modern medicine. We will inevitably face the same problem, even in aligned systems, in the form of "thought cancers" once we start deploying autonomous humanoid drones that build other drones, and we will have no good solution for it either. This fundamental property of autonomous multi-agent systems contributes significantly to the long-term x-risk associated with AGI/ASI systems.
What is Cancer?
Cancer in organic life occurs when mutations in DNA cause some cells to dissociate from the body's cooperative goals. Most of the time, our immune system discovers and destroys such cells. Wait long enough, however, and some cell will mutate in a way that both grows uncontrollably and evades detection. The immune system doesn't target these cancer cells because it cannot distinguish them from normal cells. These instances of dissociation kill ~10 million humans every year (~15% of all deaths).
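This is fundamentally a waiting-time argument, and a toy model makes it concrete. The sketch below is a minimal illustration with made-up numbers (both probabilities are assumptions, not biology): each division has a tiny chance of producing a dissociated cell, detection misses a tiny fraction of those, and the wait until an undetected lineage appears is geometric in the product of the two.

```python
import math
import random

# Toy waiting-time model; both rates are illustrative assumptions.
P_MUTATION = 1e-6  # chance a division yields a dissociated cell
P_MISSED = 1e-3    # chance the immune system misses that cell

p_escape = P_MUTATION * P_MISSED  # per-division chance of undetected cancer
print(f"P(escape) per division       = {p_escape:.1e}")
print(f"expected divisions to escape = {1.0 / p_escape:.1e}")

# The waiting time is geometrically distributed; sample a few draws
# via the inverse CDF rather than simulating billions of divisions.
rng = random.Random(0)
samples = [math.ceil(math.log(1.0 - rng.random()) / math.log(1.0 - p_escape))
           for _ in range(5)]
print("sampled divisions until an undetected lineage:", samples)
```

With these numbers the expected wait is ~10^9 divisions, and a human body performs on the order of tens of billions of divisions per day, so even a very small per-division escape probability gets sampled relentlessly over a lifetime. Real bodies cope because additional layers (apoptosis, tumor suppressors, senescence) multiply the miss rates down further, not because any layer is perfect.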
Why is it relevant to Autonomous Swarms?
One of the best ways to unlock the exponential gains of AGI and substantially increase world GDP is to deploy swarms of humanoid drones to build critical infrastructure: datacenters, dams, solar farms, nuclear fusion plants, roads, buildings, and so on. Such swarms, due to their generality, will also be capable of building other humanoid robots.
One can view each humanoid robot in an autonomous swarm as a cell in a body, working with its collective to achieve a common goal. In any such multi-agent system, I conjecture that it is inevitable that some units will eventually dissociate from the collective goal in a way that avoids detection by the system's "immune system", which can be viewed as an anti-entropy process. If, after dissociation, a unit retains the ability to replicate itself, the goal of the entire system is at risk: an undetected replicator grows geometrically while drawing on the same resources the swarm depends on, ending in resource starvation.
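To make the conjecture concrete, here is a minimal toy simulation of it. Every number in it is an illustrative assumption, not a claim about real fleets: aligned drones occasionally dissociate, the anti-entropy process instantly removes every rogue it can recognize, but a small fraction of dissociations are indistinguishable from normal behavior, and those replicate out of the shared resource pool.

```python
import random

# Toy swarm model; all parameters are illustrative assumptions.
P_DISSOCIATE = 1e-4    # per aligned drone per step
P_UNDETECTABLE = 0.01  # fraction of dissociations the detector misses
RESOURCES = 1_000_000  # shared build capacity, one unit per new drone

rng = random.Random(0)
aligned, hidden_rogues, resources = 10_000, 0, RESOURCES

for step in range(2_000):
    # some aligned drones dissociate this step
    flipped = sum(rng.random() < P_DISSOCIATE for _ in range(aligned))
    aligned -= flipped
    # the anti-entropy process removes every rogue it can recognize;
    # the remainder are, like some cancers, invisible to it
    hidden_rogues += sum(rng.random() < P_UNDETECTABLE for _ in range(flipped))
    # each surviving rogue builds one copy of itself from the pool
    built = min(hidden_rogues, resources)
    hidden_rogues += built
    resources -= built
    if resources == 0:
        print(f"step {step}: shared resources exhausted; "
              f"{hidden_rogues:,} hidden rogues vs {aligned:,} aligned drones")
        break
```

The detection rate can be made arbitrarily high without changing the outcome; as long as the miss rate is nonzero and replication is unilateral, the hidden population compounds until the pool is gone.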
How could thought cancer occur in the real world?
Dissociation could occur in several ways:
- Out-of-Distribution Sensory Input: If a drone encounters sensory inputs far outside its operating distribution, the resulting chain of thought could corrupt its goal in a way that's undetectable. This might be benign most of the time, but at some point the corruption could become contagious, with the drone undetectably altering the goals of neighboring drones until the entire system is compromised.
- Data Corruption: This is the analogue of what happens in the human body. If the blueprint a humanoid robot uses to replicate itself is corrupted, it could produce unsafe humanoid robots that bypass safety mechanisms while avoiding detection (a minimal integrity-check sketch follows this list).
- Long-Running Processes: If a drone is never garbage collected or reset, it could eventually unroll a trajectory of thought that corrupts its goal, resulting in dissociation.
- Adversarial Injection: Humans or competing systems (e.g. from other countries) could launch cyberattacks that silently compromise and dissociate units via targeted prompting. This is arguably a subset of out-of-distribution sensory input.
There are likely several other, less predictable ways in which individual units could dissociate.
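As one concrete illustration of the Data Corruption mode and its limits, here is a hedged sketch of a blueprint integrity check; the key, tag, and function names are hypothetical, not a real fleet API. A drone refuses to replicate any blueprint whose authenticated digest doesn't match, which catches bit-level corruption. The limit mirrors immune evasion: corruption introduced before signing, or in the verifier itself, passes silently.

```python
import hashlib
import hmac

# Hypothetical blueprint attestation; names and key handling are
# illustrative assumptions, not a real fleet API.
SECRET_KEY = b"fleet-signing-key"  # assumed provisioned at manufacture
blueprint = b"...serialized humanoid robot blueprint..."
TRUSTED_TAG = hmac.new(SECRET_KEY, blueprint, hashlib.sha256).digest()

def safe_to_replicate(candidate: bytes) -> bool:
    """Allow replication only if the candidate's MAC matches the trusted tag."""
    tag = hmac.new(SECRET_KEY, candidate, hashlib.sha256).digest()
    return hmac.compare_digest(tag, TRUSTED_TAG)

corrupted = bytearray(blueprint)
corrupted[5] ^= 0x01  # a single flipped bit

assert safe_to_replicate(blueprint)             # intact blueprint passes
assert not safe_to_replicate(bytes(corrupted))  # corruption is caught
```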
Conclusion
Over a long enough period of time, harmful "cancer" is likely to emerge in any multi-agent system (even an aligned one) and will at some point cause significant destruction. "Thought cancers" could eventually trigger civil wars as our alignment systems, resembling biological immune systems, fail to recognize dissociated swarms of cancer drones and activate too late to cheaply eliminate the self-replicators. The fallout from such events would be significant enough to substantially increase x-risk.
Although there is likely no way to avoid thought cancer permanently, we could potentially engineer systems that delay it long enough for most practical purposes.
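What might that engineering look like? Below is a hedged sketch combining two mitigations suggested by the failure modes above: hard lifetime bounds, addressing long-running processes, and replication gated on approval from independent peers, so that no single dissociated unit can copy itself unilaterally. All names and thresholds are assumptions for illustration, not a proposal for a real fleet.

```python
import time

# Illustrative policy; the constants and class are assumptions.
MAX_LIFETIME_S = 24 * 3600  # force a reset at least daily
QUORUM = 3                  # independent peer approvals per replication

class DronePolicy:
    def __init__(self) -> None:
        self.booted_at = time.monotonic()

    def must_reset(self) -> bool:
        """Garbage-collect long-running trajectories of thought."""
        return time.monotonic() - self.booted_at > MAX_LIFETIME_S

    def may_replicate(self, peer_approvals: int) -> bool:
        """Replication requires a quorum of independent auditors."""
        return not self.must_reset() and peer_approvals >= QUORUM

policy = DronePolicy()
print(policy.may_replicate(peer_approvals=2))  # False: quorum not met
print(policy.may_replicate(peer_approvals=3))  # True, until a reset is due
```

Neither measure changes the waiting-time argument; they only shrink the per-step escape probability. In this framing, alignment engineering buys time rather than immunity, and buying enough time may be the best available outcome.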