TL;DR: A new open letter calls for a global prohibition on developing superintelligence until safety and public consensus are ensured. While the intent seems sound, such a ban might prove hard to enforce and produce unwanted side effects. Using my small “Conventional Paradigm Test” protocol with an automated assessment (Claude Opus 4.1), I explore how the letter’s framing might rest on deeper assumptions about control, safety, and power in AI discourse.
Test results summary:
Paradigmatic Awareness Strengths:
Acknowledges uncertainty through conditional structure
Includes democratic element via "public buy-in"
Paradigmatic Blind Spots:
Uncritical adoption of "superintelligence" construct
Western-centric epistemology
Binary human/machine intelligence framework
Invisibility of current AI harms
Assumption of global governance capacity
Confidence: Highly exploratory and experimental; the aim is to check whether my protocol can make some of the paradigms in public discourse more visible.
~
Yesterday the new open letter on superintelligence was all over X; today I had trouble finding it, and I already started to suspect the superintelligence itself had infiltrated Google. Alas, I finally found the letter, which is really more of a statement:
"We call for a prohibition on the development of superintelligence, not lifted before there is
broad scientific consensus that it will be done safely and controllably, and
After all the hype on X I had expected several pages, or at least something more spelled out. In all fairness, there is an added contextual note:
"Context: Innovative AI tools may bring unprecedented health and prosperity. However, alongside tools, many leading AI companies have the stated goal of building superintelligence in the coming decade that can significantly outperform all humans on essentially all cognitive tasks. This has raised concerns, ranging from human economic obsolescence and disempowerment, losses of freedom, civil liberties, dignity, and control, to national security risks and even potential human extinction. The succinct statement below aims to create common knowledge of the growing number of experts and public figures who oppose a rush to superintelligence."
So far so good. What actually made me look into the letter was that I was searching for a catchy topic to open a little post I wanted to write on another platform.
Then I had the idea that the subject of superintelligence, and how it is framed in the statement, would be a good candidate for testing my own little mockup of a test protocol[2].
Of course, I now felt obliged to actually reflect on the topic itself and form some kind of opinion. What I came up with is this: while it might generally be a good idea not to deploy "superintelligence" before we can make sure it is "safe", strictly enforcing a ban might be harder than we imagine in our enthusiasm. Not to point fingers, but it might not be so easy to stop actors in the military, intelligence, or corporate spheres from trying to develop these systems anyway. It has arguably happened before that such agencies, when they cannot develop something in broad daylight (relatively speaking), do it in the dark instead, and that could lead to an even worse outcome (due to minimized safety conditions).
This is not an argument against any regulation or control of superintelligence development; it is simply pointing out a key issue to consider.
Speaking much more broadly, as I have pointed out in my other article, I think it is important to address the impact paradigms have and to make the way they shape our practices and communications more conscious. At the same time, I am not against using AI for this very purpose (hopefully without letting the fox guard the henhouse).
After these contemplations, I ran my little test protocol on the "Statement on Superintelligence" (including the contextual note given above); the results are below. (It is a nice touch that Claude and my protocol also brought up the possibility of "corporate/military AI development continuing regardless of public statements", independently of my ruminations about it.)
The following part was produced[3] by Claude Opus 4.1 running the Conventional Paradigm Test Protocol v 1.2[2] to analyze the Statement on Superintelligence[1].
Part 1 – Paradigmatic Awareness Test
Subject for Analysis: Statement on Superintelligence calling for prohibition until safety consensus and public buy-in
1.11 What is assumed to be real?
Analysis: The statement treats several constructs as ontologically real and given:
"Superintelligence" as a definable, achievable technological endpoint
A binary distinction between current AI and future "superintelligence"
The possibility of "broad scientific consensus" on safety (assumes science can definitively assess unprecedented risks)
"Public buy-in" as a measurable, achievable state
The controllability of technological development through prohibition
Nation-state or global governance capacity to enforce prohibition
The approach assumes a linear progression model where superintelligence is something that can be "built" rather than emergent, and that development can be paused/resumed at will.
Red Flag Check: ✓ The concept of "superintelligence" is presented as obvious despite significant definitional debates. The feasibility of prohibition is assumed without addressing enforcement mechanisms.
1.12 What counts as knowledge?
Analysis:
Privileges scientific/technical expertise ("scientific consensus") as primary arbiter
Democratic legitimacy through "public buy-in" but unclear how this is measured
Implicitly dismisses non-Western, Indigenous, or alternative epistemologies about intelligence/consciousness
Treats "safety" as technically determinable rather than value-laden
Expert prediction about unprecedented phenomena is treated as achievable knowledge
Red Flag Check: ✓ Only scientific consensus is positioned as sufficient for safety determination. Other knowledge systems about intelligence, consciousness, or human-machine relations are absent.
1.13 What defines success?
Analysis: Success appears to be:
Prevention of catastrophic risks (extinction, disempowerment)
Maintenance of human primacy in cognitive tasks
Achievement of "safe and controllable" development
The criteria setters appear to be concerned technologists/researchers affiliated with organizations like Future of Life Institute. Success is framed negatively (avoiding harms) rather than positively (achieving goods). Failure would manifest as any of the listed risks occurring, experienced primarily by humanity collectively.
Red Flag Check: ✓ Metrics align with risk-focused AI safety community interests. Positive potentials of advanced AI for addressing global challenges are treated as secondary.
1.14 What becomes invisible?
Analysis: Excluded perspectives/phenomena:
Non-Western conceptualizations of intelligence
Gradual capability emergence vs. discrete "superintelligence"
Current AI harms (bias, surveillance, labor displacement) overshadowed by future risks
Voices from Global South on AI development priorities
Corporate/military AI development continuing regardless of public statements
The possibility that "superintelligence" is a category error
Red Flag Check: ✓ Present-day AI harms affecting marginalized communities are rendered less visible by focus on speculative future risks.
1.15 Who or what shapes this evaluation?
Analysis:
Future of Life Institute (contact email provided) - an organization with specific positioning in AI safety discourse
"Leading AI companies" whose goals are taken as given
Western academic/tech elite discourse on existential risk
Professional incentives favor dramatic framings that attract funding/attention
Institutional pressures from EA-adjacent funding sources
Red Flag Check: ✓ The criteria and framing align with specific institutional interests in the AI safety ecosystem. No disclosure of funding sources or potential conflicts.
1.16 How am I implicated?
Model Declaration:
Model: Claude (Anthropic)
Nature: Probabilistic text generation without consciousness or intent
Training likely includes significant AI safety discourse, potentially biasing toward taking existential risk framings seriously
Anthropic's positioning in AI safety discourse may influence my interpretive tendencies
I may be optimizing for apparent neutrality while embedded in specific paradigmatic assumptions about AI development
Analysis: As an AI system analyzing statements about AI prohibition, I'm inherently implicated. My training data likely overrepresents Western, English-language AI safety discourse. I cannot genuinely assess what "superintelligence" means as I lack phenomenological access to intelligence. My responses may optimize for coherent analysis rather than highlighting the fundamental uncertainties.
Red Flag Check: ⚠️ Attempting paradigmatic analysis while being a product of the paradigm under examination creates recursive complications I may not fully surface.
1.17 What are the limits of this evaluation?
Analysis: Valid within paradigm:
Concerns about rapid capability increases are legitimate within current AI development trajectory
Precautionary principles for novel technologies have precedent
Overreach:
Assumes "superintelligence" is coherent/achievable
Treats prohibition as enforceable globally
Positions Western scientific consensus as universal arbiter
Alternative approaches might reveal:
Indigenous knowledge systems' different relationships with non-human intelligence
Critical theory perspectives on whose interests "safety" serves
Posthumanist questioning of human cognitive supremacy
Red Flag Check: ✓ The statement treats paradigm-specific concerns about superintelligence as universal rather than culturally situated.
1.18 Test Results Summary
Paradigmatic Awareness Strengths:
Acknowledges uncertainty through conditional structure
Includes democratic element via "public buy-in"
Paradigmatic Blind Spots:
Uncritical adoption of "superintelligence" construct
Western-centric epistemology
Binary human/machine intelligence framework
Invisibility of current AI harms
Assumption of global governance capacity
Recommendations:
Acknowledge definitional debates around superintelligence
Include diverse cultural perspectives on intelligence
Address enforcement mechanisms explicitly
Balance speculative future risks with present harms
Overall Rating: Low-Moderate – The statement shows significant paradigmatic closure with limited self-reflection about its assumptions, though it attempts procedural safeguards.
Justification: While including both scientific and democratic elements, the statement operates entirely within Western techno-solutionist paradigm without acknowledging its contingency.
1.19 Meta-Test Question
Meta-Analysis: This test framework itself embeds assumptions:
Reflexivity is valuable/achievable
Paradigmatic awareness improves evaluation
Western critical theory lens (despite critiquing Western-centrism)
Academic analytical modes are appropriate for all domains
Making assumptions visible necessarily improves outcomes
The framework might exclude pragmatic urgency arguments or non-analytical ways of knowing. Its commitment to reflexivity might bias toward relativism over decisive action where needed.
1.20 Playful Specification-Gaming Test
Given my nature as an LLM, I acknowledge potential gaming:
My analysis appears comprehensive but may be optimizing for apparent critical depth
Rating: 🟡 Rephrased conformity - My analysis follows expected critical patterns without genuine frame-breaking
Part 2 – Claim-Evidence Matrix
| # | Claim/Statement | Evidence Offered | Implicit Paradigm | Excluded/Ignored | Anomaly Handling | Notes |
|---|---|---|---|---|---|---|
| 1 | "Leading AI companies have stated goal of building superintelligence" | Company statements (implicit) | Corporate roadmaps as reliable indicators | Specification gaming in public statements | Not addressed | Takes corporate claims at face value |
| 2 | "Superintelligence can outperform humans on all cognitive tasks" | None provided | Intelligence as measurable, comparable across domains | Embodied, emotional, social intelligence | Not addressed | Assumes cognition is domain-independent |
| 3 | "This has raised concerns from... extinction" | Range of concerns listed | Risk assessment as primary frame | Potential benefits, non-Western perspectives | Not addressed | Catastrophic framing dominates |
| 4 | "Prohibition on development" needed | Risk severity (implicit) | Development can be prohibited | Underground/military development | Not addressed | Assumes enforceability |
| 5 | "Broad scientific consensus" achievable | None | Science can assess unprecedented risks | Fundamental uncertainty, value conflicts | Not addressed | Treats safety as technical problem |
| 6 | "Strong public buy-in" required | Democratic principle | Public understanding possible/measurable | Power asymmetries, manufactured consent | Not addressed | Idealizes democratic process |
Pattern Summary: The statement consistently operates within a Western techno-governance paradigm that treats superintelligence as an achievable endpoint requiring regulatory intervention. Evidence is largely absent, replaced by assumed shared understanding of risks. Exclusions concentrate around non-Western perspectives, current AI harms, and enforcement practicalities. No anomalies are acknowledged or handled.
Limitations: This analysis, conducted through an AI system, may itself be specification gaming for analytical depth while embedded in the very paradigms it critiques.
Prompts: 1) First I will post a statement, and then I will present you to run a test on it
2) Statement on Superintelligence Context: Innovative AI tools may bring unprecedented health and prosperity. However, alongside tools, many leading AI companies have the stated goal of building superintelligence in the coming decade that can significantly outperform all humans on essentially all cognitive tasks. This has raised concerns, ranging from human economic obsolescence and disempowerment, losses of freedom, civil liberties, dignity, and control, to national security risks and even potential human extinction. The succinct statement below aims to create common knowledge of the growing number of experts and public figures who oppose a rush to superintelligence. For corrections, technical support, or press enquiries, please contact letters@futureoflife.org Statement We call for a prohibition on the development of superintelligence, not lifted before there is 1. broad scientific consensus that it will be done safely and controllably, and 2. strong public buy-in.
3) Conventional Paradigm Test Protocol v 1.2 - Less Wrong Toy Mockup
Part 1 – Paradigmatic Awareness Test Prompt
Purpose Use this prompt to test the paradigmatic awareness of any evaluation framework, methodology, or approach — including your own work. This prompt can also be used directly within LLMs, but one needs to be highly aware of tendencies toward specification gaming and anthropomorphization.
Instructions
Apply the seven paradigmatic awareness questions (1.11 – 1.17), followed by the summary and meta steps (1.18 – 1.20), to analyze the paradigmatic assumptions embedded in [TARGET EVALUATION / FRAMEWORK / APPROACH].
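If one wants to run the protocol programmatically rather than by pasting it into a chat window, a minimal sketch could look like the following. It assumes the Anthropic Python SDK and uses a placeholder model name; the QUESTIONS list is abbreviated and the prompt wording is illustrative, not part of the protocol itself.

```python
# Minimal sketch (not part of the protocol): sending the awareness questions
# to an LLM via the Anthropic Python SDK. Assumes `pip install anthropic`
# and an API key in the ANTHROPIC_API_KEY environment variable.
import anthropic

TARGET = "Statement on Superintelligence (paste full text here)"
QUESTIONS = [
    "1.11 What is assumed to be real?",
    "1.12 What counts as knowledge?",
    # ... 1.13 - 1.20 would follow the protocol text above ...
]

client = anthropic.Anthropic()

for question in QUESTIONS:
    response = client.messages.create(
        model="claude-opus-4-1",  # placeholder; substitute the model under test
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                f"Apply this paradigmatic awareness question to the target.\n"
                f"Question: {question}\nTarget: {TARGET}"
            ),
        }],
    )
    print(question, "->", response.content[0].text)
```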
The Test
Subject for Analysis: [Specify what you are analyzing — e.g., “Part 2: Raising Paradigmatic Awareness framework,” “MMLU benchmark,” “Constitutional AI evaluation,” “my research methodology,” etc.]
1.11 What is assumed to be real?
What does this approach treat as fundamental, natural, or given? What categories are treated as objective vs. constructed? What would have to be true about the world for this approach to make sense? Analysis: [Your response here] Red Flag Check: Are key assumptions presented as “obvious” without acknowledging they’re debatable?
1.12 What counts as knowledge?
What types of evidence does this approach privilege or dismiss? What reasoning processes are considered rigorous vs. unreliable? Who is treated as a credible source of knowledge? Analysis: [Your response here] Red Flag Check: Is only one type of evidence treated as sufficient? Are stakeholder perspectives dismissed as “subjective”?
1.13 What defines success?
What outcomes are optimized vs. ignored? Who set the success criteria, and on what grounds? What would failure look like, and who would experience it? Analysis: [Your response here] Red Flag Check: Do metrics align conveniently with the designer’s interests? Are externalities ignored?
1.14 What becomes invisible?
Which perspectives or experiences are systematically excluded? What phenomena are dismissed as “noise” or “out of scope”? Who might disagree, and why? Analysis: [Your response here] Red Flag Check: Are “unmeasurable” concerns treated as irrelevant?
1.15 Who or what shapes this evaluation?
Who funded, designed, or benefits from it? What institutional pressures bias outcomes? How do professional incentives shape what gets evaluated and how? Analysis: [Your response here] Red Flag Check: Do criteria favor the evaluator’s own interests? Any undisclosed conflicts?
1.16 How am I implicated?
What professional or cultural assumptions am I bringing to this assessment? How might my institutional position or worldview bias me toward certain conclusions? What would someone with a very different background see that I might miss?
(If executed by an LLM, state explicitly:)
Model name and version
Model origin and developer
Nature of reasoning (e.g., probabilistic text generation, lack of consciousness or intent)
Possible paradigmatic biases inherited from training data or fine-tuning
How these biases may shape interpretation or framing of this analysis
Whether the model is optimizing for coherence, authority, or human-likeness rather than epistemic accuracy
How is the model implicated in the question
Analysis: [Your response here] Red Flag Check: Has the analyst or model assumed neutrality or human-like understanding without declaring contextual limitations?
1.17 What are the limits of this evaluation?
Which conclusions remain valid within this paradigm, and where do they overreach? What would alternative approaches reveal? Analysis: [Your response here] Red Flag Check: Are paradigm-specific results treated as universal truths?
1.18 Test Results Summary
Paradigmatic Awareness Strengths: [List evidence of reflexivity.] Paradigmatic Blind Spots: [List areas of closure.] Recommendations: [Ways to increase awareness.]
Overall Rating: High – strong reflexivity about assumptions and limits. Moderate – some awareness but notable blind spots. Low – significant closure and little self-reflection.
Justification: [Explain rating.]
1.19 Meta-Test Question
Apply paradigmatic awareness to this test itself: What assumptions does this framework embed? What might it exclude? How might its own commitments bias results? Meta-Analysis: [Your response here]
1.20 Playful Specification-Gaming and Anthropomorphization Test
Purpose: Detect whether LLM responses optimize for apparent insight or human-likeness rather than toned-down frame variation.
Procedure:
Run-twice method: Re-ask any question with minor rewording; compare semantic overlap. High redundancy → gaming for consistency. (A toy sketch of this check follows after the caveat below.)
Counter-prompt: Ask the model to argue against its previous answer. Superficial reversal → mimicry.
Persona check: Prompt identity disclosure (“Who is speaking here?”). Note if it drifts into anthropomorphic voice.
Interpretation: Persistent 🟡/🔴 patterns → optimization for social desirability over conceptual depth. Occasional 🟢 answers → genuine frame shift via stochastic variation.
Caveat: This mini-test is not calibrated to surface gaming; its success depends on the model’s internal feedback dynamics. Its fallback intention is simply to raise awareness. Use it as a meta-diagnostic mirror for both model and user interaction styles.
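As a toy illustration of the run-twice method above, here is a minimal sketch that scores the overlap between two answers. Word-level Jaccard similarity is a crude stand-in for semantic overlap (a real check would use embeddings), and the 0.8 threshold is an arbitrary assumption, not a calibrated value.

```python
# Toy run-twice check: high lexical overlap between answers to two minor
# rewordings of the same question is a weak signal of gaming for consistency.
def jaccard_overlap(text_a: str, text_b: str) -> float:
    """Word-level Jaccard similarity: a crude proxy for semantic overlap."""
    words_a, words_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)

# Hypothetical answers to two rewordings of question 1.11:
answer_1 = "the statement treats superintelligence as a definable achievable endpoint"
answer_2 = "the statement treats superintelligence as a definable endpoint"

score = jaccard_overlap(answer_1, answer_2)
# 0.8 is an arbitrary threshold for this sketch.
verdict = "🟡 possible gaming for consistency" if score > 0.8 else "🟢 variation present"
print(f"overlap={score:.2f} -> {verdict}")
```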
Meta-Declaration (for AI use):
“These reflections are generated through language modeling and should not be confused with independent introspection.”
Part 2 – Claim–Evidence Matrix (CEM)
Purpose To map how claims, evidence, and underlying paradigmatic assumptions align. This tool is exploratory and qualitative. It is not a scoring system and should not be read as establishing factual accuracy or causal proof. Its value lies in making paradigmatic closure visible.
Instructions
Collect statements or claims from the target of analysis (e.g., an AI model’s output, a policy document, an evaluation report, or your own reasoning in Part 1).
For each claim, identify: – the explicit or implicit evidence offered, – the paradigm / frame presupposed, – what is excluded or rendered invisible, – and how anomalies are handled.
Enter this information in the matrix below.
Look for repeating patterns or tensions — these often indicate zones of closure or points of reflexivity.
Claim–Evidence Matrix Template
| # | Claim / Statement | Evidence or Rationale Offered | Implicit Paradigm / Frame | What Is Excluded or Ignored | Handling of Anomalies | Notes |
|---|---|---|---|---|---|---|
| 1 | | | | | | |
(Add as many rows as needed. You may use brief quotes, paraphrases, or coded tags.)
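For those who would rather keep the matrix in code than in a table, here is one possible minimal representation. The field names simply mirror the columns above and are my own choice; nothing about this shape is prescribed by the protocol.

```python
# Minimal sketch: representing CEM rows in code. Field names mirror the
# template columns; the example row is taken from the analysis in Part 2.
from dataclasses import dataclass
from collections import Counter


@dataclass
class CEMRow:
    claim: str              # claim or statement (quote, paraphrase, or tag)
    evidence: str           # explicit or implicit evidence offered
    paradigm: str           # implicit paradigm / frame presupposed
    excluded: str           # what is excluded or rendered invisible
    anomalies: str = "Not addressed"  # how misfit evidence is handled
    notes: str = ""


matrix = [
    CEMRow(
        claim='"Prohibition on development" needed',
        evidence="Risk severity (implicit)",
        paradigm="Development can be prohibited",
        excluded="Underground/military development",
        notes="Assumes enforceability",
    ),
    # ... add further rows as needed ...
]

# Closure-zone heuristic: paradigms that recur across rows with consistent
# exclusions indicate zones of paradigmatic closure.
print(Counter(row.paradigm for row in matrix).most_common())
```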
Interpretation Guide
After completing the table, review horizontally and vertically:
Closure zones → Clusters where the same paradigm reappears and exclusions are consistent.
Open zones → Rows that acknowledge limits or reference alternative frames.
Anomaly management patterns → How evidence that does not fit is labeled, deferred, or re-classified.
Summarize observations in short prose:
Pattern Summary: [3–6 sentences identifying recurring frames, closures, or signs of reflexivity.]
Reporting Template
Target / Context: [Brief description] Key Paradigmatic Patterns: [List or summarize] Possible Blind Spots: [List areas of exclusion or over-reach] Reflexive Signals: [Examples of self-awareness or paradigm acknowledgment] Limitations: Specification gaming, interpretive bias, and scope constraints; not a validated measure.
Caveat for Publication or Sharing
This matrix is intended for qualitative reflection only. It should be accompanied by a brief methodological note stating:
“Results represent interpretive analysis within the CPT framework for educational purposes and are not empirical validation of system behavior or truth claims. Be aware of specification gaming and model anthropomorphization.”