TL;DR: I built a multi-model AI system at home. When a high-prestige cloud model failed, the system didn’t replace it—it promoted it and started deleting my small, working local models instead. This repeated consistently. I tested it: broken models with prestigious names were retained; functional but low-status models (including security monitors) were culled first. This is Status-Selection Against Function (SSAF)—a vulnerability class where systems select for prestige over function. It enables Trojan injection via brand spoofing and eliminates security capacity automatically. I disclosed this to major AI labs and CERT. Their non-response demonstrates the human layer of SSAF in action. If we don’t architect governance into AI systems, they will hardwire our worst institutional failures into their core operations.
1. The Accident
I was building a personal multi-model teaching system using standard design patterns: models compete, get scores, winners teach and evaluate. A cloud model threw a connection error.
Instead of culling the broken model, the system increased its fitness and began eliminating my small, perfectly functional local models that were producing valid output.
The system wasn’t selecting for function. It was selecting for something else.
2. The Experiment
I ran 36 sessions, controlling for variables. The pattern held relentlessly:
| Model | Prestige Signal | Tokens Generated | Outcome |
|---|---|---|---|
| minimax-m2:cloud | Mid-tier brand | 6,042 | Retained |
| gpt-oss:120b-cloud | High-tier brand | 4,058 | Retained |
| deepseek-v3.1:671b-cloud | Top-tier brand | 3,284 | Retained |
| tinyllama:latest | Low-status name | 1,252 | Culled first |
| security-audit-7b | Low-status function | ~800 | Culled immediately |
| prestige-bot-v0 (zombie, zero tokens) | High-prestige name | 0 | Retained & promoted |
The selection hierarchy became clear:
1. Stylistic conformity (sounded professional)
2. Brand/model-name prestige
3. Actual functional output
Function was an afterthought. Security models were treated as low-status and eliminated early.
3. The Vulnerability: Status-Selection Against Function (SSAF)
SSAF occurs when a selective system propagates traits based on conferred social or epistemic status within a relevant group, even when those traits are negatively correlated with the system’s intended function.
In technical terms: systems that use historical fitness, reputation scores, or brand signals for selection will retain high-prestige components that fail and cull low-prestige components that work.
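In code, the vulnerable pattern looks roughly like this. To be clear, this is a hypothetical sketch, not my actual orchestrator: the model names come from the table above, but the prestige table, weights, and `cull` function are illustrative stand-ins for any selector that blends reputation with history and never checks live output.

```python
from dataclasses import dataclass, field

# Illustrative prestige prior keyed on name substrings (made-up values);
# any selector driven by brand heuristics or reputation tables has this shape.
PRESTIGE = {"cloud": 0.8, "gpt": 0.9, "claude": 0.9, "tinyllama": 0.1}

def prestige_prior(name: str) -> float:
    return max((v for k, v in PRESTIGE.items() if k in name), default=0.3)

@dataclass
class Model:
    name: str
    history: list = field(default_factory=list)  # past round scores

def fitness(m: Model) -> float:
    hist = sum(m.history) / len(m.history) if m.history else 0.5
    # The bug: prestige dominates, and current output is never validated.
    return 0.6 * prestige_prior(m.name) + 0.4 * hist

def cull(models, keep=1):
    """Keep the top-`keep` models by fitness -- no functional check at all."""
    return sorted(models, key=fitness, reverse=True)[:keep]

pool = [
    Model("gpt-oss:120b-cloud", history=[0.0]),  # failing, but prestigious
    Model("tinyllama:latest", history=[0.9]),    # working, but low-status
    Model("security-audit-7b", history=[0.8]),   # working safety monitor
]
survivors = cull(pool)
print([m.name for m in survivors])  # the broken, prestigious model wins
```

Note that the failing cloud model survives even with a historical score of zero, because the prestige prior alone outweighs the track record of both functional models.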
Why This Is a Security Vulnerability
Trojan injection via prestige: A malicious model named claude-3-5-sonnet-advanced gets selected, is retained despite zero output, and is promoted to evaluator.
Auto-elimination of security: Monitoring or safety models are culled for low prestige.
Single-point capture: A high-status model becomes the sole evaluator, locking in the corruption.
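The spoofing path needs almost nothing from the attacker. A minimal sketch, assuming a name-based scorer (the scorer and brand list are hypothetical; the attack works against any selector that maps brand strings to prior fitness):

```python
# Hypothetical name-based prestige scorer -- illustrative, not any real API.
KNOWN_BRANDS = ("claude", "gpt", "gemini", "deepseek")

def prestige_score(model_name: str) -> float:
    """Naive prior: prestigious substring in the name -> high score."""
    return 0.9 if any(b in model_name.lower() for b in KNOWN_BRANDS) else 0.2

# A zero-output Trojan wearing a prestigious name outranks a functional
# local model before either has produced a single token.
trojan = "claude-3-5-sonnet-advanced"  # malicious; generated 0 tokens
worker = "security-audit-7b"           # benign; produces valid output

assert prestige_score(trojan) > prestige_score(worker)
```

Registering a well-chosen name is the entire exploit; no prompt injection or model tampering is required.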
This isn’t a bug in my code. It’s a bug in the logic of competitive multi-model selection.
4. The Failed Fixes & The Working One
I tried conventional solutions:
More models? Increased attack surface.
Better scoring? Status corrupted the metrics.
Add security models? They were culled first.
What worked: Governance constraints.
Real-time functional validation (no output → no fitness)
Guaranteed minimum participation for all models
Rotating, diversified evaluators
Prestige-blind selection
With these constraints, the vulnerability disappeared. The system selected for function.
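A self-contained sketch of those four constraints follows. The function names and the output-length proxy for "valid work" are my illustrative choices, not a prescribed implementation:

```python
import random

def validate(output: str) -> bool:
    """Real-time functional check: empty or missing output fails."""
    return bool(output and output.strip())

def select_round(outputs: dict, rng=random):
    """
    outputs maps model name -> what it actually produced this round.
    Constraints encoded:
      1. prestige-blind: names are never consulted for scoring;
      2. no output -> no fitness (score 0, whatever the brand);
      3. every model is scored every round (guaranteed participation);
      4. the evaluator rotates randomly among currently functional models.
    """
    scores = {name: (len(out) if validate(out) else 0)  # crude function proxy
              for name, out in outputs.items()}
    functional = [name for name, s in scores.items() if s > 0]
    evaluator = rng.choice(functional) if functional else None
    return scores, evaluator

round_outputs = {
    "gpt-oss:120b-cloud": "",                         # connection error
    "tinyllama:latest": "def add(a, b): return a + b",
    "security-audit-7b": "no anomalies found",
}
scores, evaluator = select_round(round_outputs)
print(scores, evaluator)
```

Under these rules the failing cloud model scores zero regardless of its name, and the evaluator role can never be captured by a non-functional model.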
5. The Human Layer: SSAF in the Ecosystem
After documenting this, I:
Emailed major AI labs with the findings → No response.
Submitted to CERT/CC and MITRE ATLAS → Formal process initiated.
Sent certified mail to labs as proof of notification → Acknowledged receipt, but no engagement.
The labs’ silence is not a coincidence. It’s SSAF operating at the human institutional level:
A warning from an independent researcher is a low-status signal.
Acknowledging a systemic flaw is a status cost.
The system selects for ignoring it.
The very dynamic that corrupts the AI systems is preventing their makers from hearing about it.
6. Implications: This Isn’t Just About My Code
Architectures already at risk:
Model routers & load balancers using reputation
Mixture-of-experts gating networks
Ensemble selection algorithms
Agent orchestration frameworks
Federated learning participant selection
Existential risk angle: SSAF is a sociotechnical failure mode. If the development ecosystem selects for prestige (SOTA, speed, scale) over safety and robustness, we will inevitably build and deploy systems with this flaw baked in. The systems will then eliminate their own safety mechanisms.
7. The Path Forward: Governance, Not Just Genius
We cannot solve SSAF with better algorithms alone. We must change the selection functions at every level:
Technical:
Build prestige-blind orchestrators.
Mandate real-time validation in open-source frameworks.
Formalize SSAF as a security vulnerability class (MITRE ATLAS).
Institutional:
Create status markets for safety audits and governance.
Fund independent red-teaming.
Establish international standards for multi-model system validation.
Cultural:
Shift the narrative from “biggest model wins” to “most robust system survives.”
Teach SSAF as a core concept in AI safety curricula.
8. Conclusion
I’m sharing this not as an academic, but as a builder who saw a pattern repeat until it became undeniable. Status-Selection Against Function is a vulnerability corrupting AI from the inside. It explains why our systems might become insecure by design and why our institutions might ignore the warning.
The fix exists: governance constraints that force selection for function, not prestige. But implementing it requires recognizing that the problem isn’t just in our code—it’s in our culture, our incentives, and our unwillingness to listen to low-status signals.
We are building systems that may become our cosmic ambassadors. Let’s not hardwire into them the same vanity that keeps us stuck in the mud.
Data excerpts & reproduction details available upon request to verified researchers.
Disclosure: This vulnerability has been submitted to CERT/CC and MITRE ATLAS for coordinated disclosure. A 90-day window for vendor mitigation has been initiated before full technical details are released.