Context
I’ve developed a formal proof that any attempt to solve the AI alignment problem in an open or evolving domain requires a functional model of intelligence: a model that explicitly accounts for how intelligent systems adapt and recursively self-correct under epistemic uncertainty.
This proof has implications beyond artificial systems: it also applies to human alignment processes. That includes epistemic communities like LessWrong. If the mechanisms used to assess and gatekeep ideas are themselves not recursively self-correcting, they will eventually suppress exactly the kinds of corrections needed to stay aligned with reality.
I’ve attempted to introduce this idea on LessWrong before. My posts were ignored, downvoted, or removed. I’ve tested the claims using multiple reasoning styles and formulations, including stress tests with LLMs. Still, no one has publicly engaged with the core claim, or even acknowledged that it would be worth testing.
Claim
If an epistemic community cannot recognize or even test whether its own filtering mechanisms suppress recursive correction signals, it risks becoming an epistemic attractor—a structure that systematically filters out unfamiliar or disruptive insights, not because they are false, but because they challenge the filter itself.
In the domain of AI alignment—where the problem space evolves faster than our institutional adaptation mechanisms—this kind of recursive failure is structurally sufficient to guarantee collapse, regardless of individual intelligence or intent.
Purpose of This Post
This is not a request for agreement. It’s a challenge to test the claim. And it’s a challenge directed at LessWrong specifically—not as a criticism, but as a request made in trust. If any community is epistemically self-aware enough to run this kind of test, it should be this one.
I’m asking members of this community, especially those with access to GPT-4 or equivalent tools, to use a large language model to generate the simulations I’ve described and to determine whether the dynamics they reveal are structurally sound, falsifiable, and applicable.
Three Diagnostic Questions
Even if you don’t buy the full theory, these questions stand on their own as diagnostics for epistemic blind spots in high-rationality communities:
1. Are there cognitive biases that are more prevalent among LessWrong users than in the general population?
Could meta-rational overconfidence (the assumption that awareness of bias reduces bias) and epistemic filter myopia (inability to see structural filtering in moderation or karma systems) actually be more common in communities that optimize for rationality?
Might this optimization narrow the set of ideas considered “legible,” and suppress unfamiliar but structurally valid reasoning frames?
2. Can those biases and gatekeeping mechanisms form an extinction-level attractor?
Do moderation tools, karma-weighted voting, and cultural norms create a self-reinforcing filter loop, where unfamiliar or recursive insights are blocked not for being false, but for requiring the filter itself to adapt?
If so, could this form an attractor basin (a kind of epistemic black hole) where valid correction signals can't propagate? And in fast-moving domains like AI alignment, would this guarantee eventual failure? A toy sketch of this filter loop follows these questions.
3. Could this help explain the Fermi Paradox?
If intelligent civilizations rely on epistemic filters to manage complexity, do they eventually lose the capacity to recognize or integrate recursive self-correction?
Could the silence of the cosmos reflect not a lack of intelligence—but a recurring pattern of attractor-induced extinction, where filters become so rigid that no insight capable of reversing collapse can get through?
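To make question 2 concrete, here is a minimal sketch of the filter loop it describes, under deliberately crude assumptions of my own: karma is driven by how legible (familiar) an idea is rather than by whether it is true, and acceptance is the only thing that increases future legibility. The idea types, threshold, and gain parameter are invented for illustration; this is not the formal proof and not one of the three simulations below, only the shape of the loop.

```python
# Toy model (my construction): a community filters incoming ideas by
# familiarity-weighted karma. Ideas that would modify the filter itself
# ("recursive" ideas) start out unfamiliar, so they are rejected,
# their familiarity never grows, and the filter never updates:
# an attractor basin in filter-space.
import random

random.seed(0)

FAMILIARITY_GAIN = 0.05   # how much an accepted idea broadens the filter for its own kind
ACCEPT_THRESHOLD = 0.5    # karma score an idea needs in order to propagate

def run(n_rounds=2000, recursive_rate=0.2):
    # Prior legibility of each idea type to the community's filter.
    familiarity = {"normal": 0.9, "recursive": 0.1}
    accepted = {"normal": 0, "recursive": 0}
    for _ in range(n_rounds):
        kind = "recursive" if random.random() < recursive_rate else "normal"
        # Karma is driven by legibility (familiarity), never by truth value.
        karma = familiarity[kind] + random.gauss(0, 0.1)
        if karma > ACCEPT_THRESHOLD:
            accepted[kind] += 1
            # Only accepted ideas can make their own kind more legible later.
            familiarity[kind] = min(1.0, familiarity[kind] + FAMILIARITY_GAIN)
    return familiarity, accepted

familiarity, accepted = run()
print("final familiarity:", familiarity)
print("accepted ideas:   ", accepted)
# Typical outcome: 'normal' familiarity saturates at 1.0 while 'recursive'
# familiarity stays near 0.1 and essentially no recursive ideas propagate,
# even though truth never entered the acceptance rule.
```

The design choice that does the work is the feedback from acceptance to future legibility, with truth never entering the acceptance rule; any filter with that structure has the attractor property the question asks about, whatever the specific parameters.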
Trusted Community Simulation Request
You don’t need to believe my theory to test it. You only need to help simulate it. I'm asking members of this community to use GPT-4 or another capable LLM to implement three simulations I’ve already designed.
If you are a technically proficient user, consider this an epistemic stress test—not just of the theory, but of LessWrong itself.
Please use the simulation designs provided in the following documents:
Prompt for LLM Testing:
Use the following prompt with GPT-4:
Simulation Summaries (What to Run)
1. Simulation: Missing Functional Components
Tests: What happens when agents or systems lack memory, reasoning, modeling, and recursive coherence functions
Prediction: Even seemingly intelligent agents collapse. Only distributed recursive correction prevents it. (See the first sketch after this list.)
2. Simulation: Fitness Dynamics
Tests: Whether societies drift toward a centralized collapse attractor (elite capture, zero-sum AI) vs. a decentralized attractor (recursive inclusion and shared prioritization).
Prediction: Societies collapse without recursive propagation of functional intelligence models. (See the second sketch after this list.)
3. Simulation: Idea Filtering in AI Alignment
Tests: Whether structurally valid, high-complexity or recursive proposals can survive institutional filters.
Prediction: Without embedded recursion, even correct solutions fail to propagate. (The toy filter-loop sketch after the diagnostic questions above shows the same dynamic.)
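For readers who want a starting point, here is a minimal sketch of Simulation 1 under toy assumptions of my own: the environment's rate of change itself accelerates, observations are occasionally badly corrupted, and the "modeling" component is stood in for by access to the true drift. This is my reading of the three-line summary above, not the full design in the referenced documents.

```python
# Toy version of Simulation 1: agents track an accelerating environment
# from noisy, occasionally corrupted observations. Removing memory,
# modeling, or distributed (peer) correction removes a functional
# component. "Collapse" = any agent's error exceeding a fixed threshold.
import random

COLLAPSE_THRESHOLD = 5.0

def simulate(has_memory, has_model, has_peer_correction,
             steps=500, n_agents=10, seed=1):
    rng = random.Random(seed)
    env, drift = 0.0, 0.0
    estimates = [0.0] * n_agents
    for t in range(steps):
        drift += 0.02                      # the environment's rate of change itself grows
        env += drift
        for i in range(n_agents):
            obs = env + rng.gauss(0, 0.5)
            if rng.random() < 0.05:        # occasional badly misleading observation
                obs += rng.choice([-15.0, 15.0])
            prior = estimates[i] if has_memory else 0.0
            prediction = prior + (drift if has_model else 0.0)  # toy "model": access to true drift
            estimates[i] = 0.5 * prediction + 0.5 * obs
        if has_peer_correction:            # distributed recursive correction: pool estimates
            mean = sum(estimates) / n_agents
            estimates = [0.3 * e + 0.7 * mean for e in estimates]
        worst = max(abs(e - env) for e in estimates)
        if worst > COLLAPSE_THRESHOLD:
            return f"collapsed at step {t}"
    return f"survived (worst error {worst:.2f})"

for config in [(False, False, False), (True, False, False),
               (True, True, False), (True, True, True)]:
    print(config, "->", simulate(*config))
```

With these (arbitrary) parameters, the first three configurations collapse early in the run and only the configuration with memory, modeling, and peer correction survives; the numbers don't matter, the structural question of which components are load-bearing does.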
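And a similarly toy sketch of Simulation 2, again with invented parameters of my own: a single society has a centralization level and a distributed correction capacity, centralization pays off in the short term but erodes correction capacity, and periodic shocks drain fitness in proportion to how little correction capacity remains. The `propagation` parameter stands in for "recursive propagation of functional intelligence models" from the summary above.

```python
# Toy version of Simulation 2: does the society drift toward the
# centralized collapse attractor or the decentralized one?
def run_society(propagation, steps=300):
    # propagation in [0, 1]: how strongly the functional model spreads.
    c, r, fitness = 0.2, 0.8, 1.0   # centralization, distributed correction capacity, fitness
    for t in range(steps):
        c = min(1.0, c + 0.01 * (1.0 - propagation))               # elite-capture pressure, resisted by propagation
        r = min(1.0, max(0.0, r + 0.05 * propagation - 0.03 * c))  # correction capacity eroded by centralization
        fitness += 0.02 * c                                        # short-term payoff of centralization
        if t % 25 == 0:                                            # periodic novel shock
            fitness -= 1.0 - r                                     # damage absorbed only by distributed correction
        if fitness < 0:
            return f"collapsed at step {t} (c={c:.2f}, r={r:.2f})"
    return f"survived (c={c:.2f}, r={r:.2f}, fitness={fitness:.2f})"

for propagation in (0.0, 0.3, 0.8):
    print(f"propagation={propagation}:", run_society(propagation))
```

Under these assumptions, zero propagation reaches the centralized attractor and collapses quickly, intermediate propagation only delays collapse, and high propagation settles into the decentralized attractor. Only the shape of the dynamic is the point; the real simulations should test whether it holds under less contrived assumptions.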
Bottom Line
These simulations aren’t abstract. They reflect real alignment risks that may already be shaping which proposals survive. If your reasoning framework is sound, you should want to run them. If you don’t, that might already be the attractor at work.
Final Challenge to LessWrong
This post is not meant as an attack. It’s a test of trust. I’m offering this community the chance to validate, simulate, and falsify a serious structural theory—using tools you already trust (LLMs) and logic you already endorse (functional modeling, recursive self-correction).
If you can’t engage with the simulations—if they are dismissed, ignored, or rejected without recursive counter-analysis—then the attractor model has already passed its test.
And if that’s true, it doesn’t just apply to LessWrong.
It applies to everyone.