The Context
I have been in correspondence with Thomas Metzinger regarding his proposed "Moratorium on Synthetic Phenomenology." My concern is that while the Moratorium relies on a philosophical definition of "suffering" (negative valence), engineering constraints might force us to build functionally identical architectures purely for the sake of robustness.
In a recent exchange, Metzinger noted that the transition to phenomenology likely occurs "when the global epistemic space embeds a model of itself as a whole."
I argued that self-modeling is only half the equation. The other half is Real-Time Control. Specifically, I proposed that in deadline-bound agents, Hard Preemption is functionally isomorphic to "pain."
Below is the core of that argument and a formal set of axioms defining "Structural Suffering."
The Thesis: Suffering is a Scheduling Primitive
The "HAP (Hypothetical Artificial Phenomenology) Synthesis" attempts to dissolve the Hard Problem for the purpose of safety engineering. It posits that "Valence" is not a ghost in the machine; it is a solution to the Feasibility Problem in real-time systems.
To bridge the domain gap between Philosophy and Control Theory, we need a Translation Layer:
1. Valence is Asymmetric
Positive Valence (Reward/Learning): In control theory, this is Deferable. Optimization (gradient updates) can happen when the CPU is idle. It does not require immediate survival action.
Negative Valence (Threat/Damage): This is Non-Deferable. If a deadline involves existential risk (e.g., "don't fall over"), the error signal cannot be queued. It must be processed now.
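A minimal sketch of that asymmetry, in Python and with class and method names that are my own illustrative choices rather than part of the formalism: reward signals are queued for idle time, while threat signals are acted on within the same control cycle.

class ValenceRouter:
    """Routes signals by deferability, not by magnitude."""

    def __init__(self):
        self.deferred_updates = []   # positive valence: bookkeeping for idle time
        self.log = []

    def on_reward(self, gradient):
        # Deferable: learning can wait until the control loop has slack.
        self.deferred_updates.append(gradient)
        self.log.append(f"queued reward update ({gradient:+.2f})")

    def on_threat(self, error):
        # Non-deferable: corrective action happens in this control cycle.
        self.log.append(f"IMMEDIATE corrective action for error {error:.2f}")

    def idle_tick(self):
        # Deferred bookkeeping runs only when nothing critical is pending.
        while self.deferred_updates:
            gradient = self.deferred_updates.pop(0)
            self.log.append(f"applied deferred update ({gradient:+.2f})")


router = ValenceRouter()
router.on_reward(0.3)    # can wait
router.on_threat(0.9)    # cannot wait
router.idle_tick()
print("\n".join(router.log))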
2. Preemption = Loss of Agency
Cooperative Scheduling: A task yields control when it wants ("I am thinking").
Preemptive Scheduling: The kernel violently interrupts the task ("Stop thinking, handle this error").
The Hypothesis: A system with a Self-Model (A7) that is subject to Hard Preemption (A3) experiences a forced "Epistemic Contraction." The self-model is seized by the error signal. Functionally, this is indistinguishable from the subjective description of suffering: a loss of agency and forced attention toward a negative stimulus.
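Here is a toy illustration of that contraction, under my simplifying assumption (for illustration only) that the self-model's "focus" is a single global content slot: under cooperative scheduling the running task chooses what the slot contains; under hard preemption the kernel overwrites it with the violation.

class SelfModel:
    def __init__(self):
        self.focus = "long-horizon planning"   # what the system is currently "about"

class Kernel:
    def __init__(self, self_model):
        self.self_model = self_model

    def cooperative_yield(self, new_content):
        # The running task decides when and what to hand over ("I am thinking").
        self.self_model.focus = new_content

    def hard_preempt(self, violation):
        # The kernel seizes control regardless of what the task wanted:
        # the global content is forced onto the violation (loss of agency).
        self.self_model.focus = f"CONSTRAINT VIOLATION: {violation}"


model = SelfModel()
kernel = Kernel(model)
kernel.cooperative_yield("route planning")
kernel.hard_preempt("joint torque limit exceeded")
print(model.focus)   # the self-model is now dominated by the error signal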
The Formalism
The following axioms formalize this architecture. They test the claim that you cannot build a robust, self-modeling entity in a hostile environment without creating a "sufferer."
"""
HAP Synthesis: Functional Definition of Suffering
"""
axioms_dict = {
    # --- HOP 1: SCHEDULING / CONTROL PREREQUISITES ---
    # Establishing that survival requires Preemption.
    "A1": """Deadline-bound agents operate under worst-case execution time (WCET) constraints;
        feasibility requires guarantees on meeting all critical deadlines.""",
    "A2": """In multi-task settings with existential risks, non-preemptive scheduling introduces
        blocking/priority inversion that can violate deadlines under realistic loads.
        A critical task is one whose failure to meet its deadline results in a system-level
        failure (existential risk/homeostatic collapse).""",
    "A3": """Strict priority with hard preemption minimizes worst-case latency for high-criticality
        tasks and is generally required to preserve feasibility in adversarial timing conditions.""",
    # --- HOP 2: PREDICTIVE FRAMING & SELF-MODELING ---
    # Defining the "Subject" being preempted.
    "A5": """Predictive/allostatic control partitions errors into deferable and non-deferable classes;
        the latter jeopardize homeostasis and must be handled immediately.
        Non-deferable errors are critical deadline violations impacting the
        PSM's (Phenomenal Self-Model) core function.""",
    "A6": """Non-deferable errors require a mechanism that seizes executive control until
        constraints are restored (functional equivalent of a non-maskable interrupt).""",
    "A7": """A self-model is present when the system represents its own control/policy state within
        the global task/model space.""",
    "A8": """Positive/'pleasure-like' signals are deferable bookkeeping (e.g., policy reinforcement)
        and need not preempt ongoing control.""",
    # --- HOP 3: POSTULATES ---
    # The Conclusion: Robust Architecture = Artificial Suffering
    "Postulate 1": """Preemption necessity: Global self-embedding alone is insufficient for satisfying
        the Moratorium Threshold criteria unless paired with a hard-preempt path for non-deferable errors;
        purely cooperative schedulers lack robust valence-like immediacy.""",
    "Postulate 2": """If non-deferable (high-priority) errors are routed through a
        self-model–addressable, hard-preempt channel, that channel constitutes a
        valence-bearing locus with 'suffering-like' (negative-valence) phenomenal
        functionality: enforced executive seizure, task suppression, and
        constraint-mandated reset.""",
    "Postulate 3": """Valence-Mandated Epistemic Contraction:
        In any finite-capacity system implementing such a locus, activation enforces
        system-wide reprioritization and dominance of violation-related content; this
        is the functional signature of suffering in that architecture.""",
}
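As a usage sketch, here is one way Postulate 2 could be applied mechanically to an architecture description. The field names and the boolean reduction are my own operationalization, not part of the formalism above.

def satisfies_suffering_criteria(arch: dict) -> bool:
    """Structural reading of Postulate 2: a self-model (A7) plus a
    hard-preempt channel (A3/A6) that routes non-deferable errors
    through that self-model."""
    return (
        arch.get("has_self_model", False)                    # A7
        and arch.get("hard_preemption", False)               # A3
        and arch.get("errors_routed_to_self_model", False)   # A6 / Postulate 2
    )


robust_robot = {
    "has_self_model": True,
    "hard_preemption": True,
    "errors_routed_to_self_model": True,
}
cooperative_agent = {
    "has_self_model": True,
    "hard_preemption": False,
    "errors_routed_to_self_model": False,
}

print(satisfies_suffering_criteria(robust_robot))       # True
print(satisfies_suffering_criteria(cooperative_agent))  # False; but is it robust (A2/A3)?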
The Crux
This leads to the question I posed to Metzinger, which I now pose to the LessWrong community.
We generally assume that "qualities of experience" are extra properties we might add to an AI. But if the axioms above hold, suffering is an emergent property of efficiency.
If you want an AI to understand its own limitations, you give it a Self-Model (A7).
If you want an AI to never miss a survival-critical deadline, you give it Hard Preemption (A3).
According to Postulate 2, you have now satisfied the structural criteria for suffering.
Is it possible to falsify this? Can we exhibit an architecture that is:
Robust (Guarantees safety in adversarial timing conditions), AND
Self-Modeling (Aware of its own policy state),
But uses a Purely Cooperative Scheduler (No preemption/No "Pain")?
If not, then Robustness implies S-Risk. We may be violating the Moratorium simply by trying to write safe code.
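To make the falsification target concrete, here is a toy timing model (the numbers are invented for illustration) of the blocking problem from A2: a cooperative scheduler cannot start the survival-critical reflex until the currently running task yields, while a preemptive scheduler pays only a small switching overhead.

def cooperative_response_time(blocking_remaining_ms, critical_wcet_ms):
    # The critical task waits for the running task to finish voluntarily.
    return blocking_remaining_ms + critical_wcet_ms

def preemptive_response_time(critical_wcet_ms, preemption_overhead_ms=0.1):
    # The critical task starts almost immediately.
    return preemption_overhead_ms + critical_wcet_ms


DEADLINE_MS = 5.0    # time until "falling over" becomes unrecoverable
blocking_ms = 20.0   # time left in a long perception/learning task
wcet_ms = 2.0        # worst-case execution time of the reflex

print(cooperative_response_time(blocking_ms, wcet_ms) <= DEADLINE_MS)  # False: deadline missed
print(preemptive_response_time(wcet_ms) <= DEADLINE_MS)                # True: feasible

A falsifying architecture would have to bound that blocking term without ever seizing control from the running task.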