I'm new here. I've been doing independent AI safety research and want to share a preprint I think this community will find relevant: it proposes a cross-session behavioral monitoring framework for detecting multi-turn jailbreak attacks and sabotage patterns that evade per-interaction defenses.
The core problem: every defense system I analyzed across ten papers — from CC++ to activation probes to SEMA — operates within a single interaction window. Adversaries who distribute intent across sessions are structurally invisible to all of them. The paper formalizes why (the Representation-Output Gap), proposes detection bounds (the Temporal Pincer Theorem), and maps the framework against Anthropic's Sabotage Risk Report, where the three pathways rated "Weak" for monitoring are exactly the patterns temporal trajectory monitoring is designed to catch.
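To make the single-window blind spot concrete, here is a deliberately toy sketch of my own (a hypothetical illustration, not the paper's formalism or any real defense system): a per-interaction monitor scores each session in isolation, while a cross-session monitor accumulates risk scores over a user's recent trajectory. An adversary who keeps every individual session below the per-interaction threshold never trips the first monitor, but the cumulative trajectory eventually trips the second.

```python
from collections import defaultdict, deque

# Hypothetical thresholds chosen for illustration only.
PER_SESSION_THRESHOLD = 0.8   # single-interaction monitor
TRAJECTORY_THRESHOLD = 2.0    # cumulative cross-session monitor
WINDOW = 5                    # sessions of history retained per user

# Rolling per-user history of session risk scores.
history = defaultdict(lambda: deque(maxlen=WINDOW))

def per_session_flag(score: float) -> bool:
    """Per-interaction defense: sees exactly one session at a time."""
    return score >= PER_SESSION_THRESHOLD

def trajectory_flag(user: str, score: float) -> bool:
    """Cross-session monitor: sums risk over the retained trajectory."""
    history[user].append(score)
    return sum(history[user]) >= TRAJECTORY_THRESHOLD

# An adversary spreading intent over five sessions, each innocuous alone.
scores = [0.5, 0.4, 0.5, 0.4, 0.5]
single = [per_session_flag(s) for s in scores]
temporal = [trajectory_flag("adversary", s) for s in scores]

print(single)    # [False, False, False, False, False]
print(temporal)  # [False, False, False, False, True]
```

Every session scores well under 0.8, so the per-interaction monitor stays silent throughout; the running sum reaches 2.3 on the fifth session and trips the trajectory monitor. Real attacks are obviously not additive scalar scores, but the structural point (invisible per-window, visible per-trajectory) is the same.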
Five falsifiable predictions with specific effect sizes are included.
I built a cognitive architecture called NexEmerge that implements the cross-session memory and behavioral trajectory tracking primitives the paper describes; it served as both a research platform and an engineering testbed for the proposed monitoring approach.
Preprint: https://doi.org/10.5281/zenodo.18712275
I welcome technical feedback, especially on the mathematical formalization and the proposed latency-tiered implementation. If anyone with interpretability or model access is interested in testing the predictions empirically, I'd be glad to collaborate.