esorrentino — LessWrong

Awareness-That-Doesn't-Interrupt: A Failure Mode Not Captured by Standard Sycophancy Taxonomies

The Short Version: I've documented a reproducible behavioral pattern in frontier LLMs (across Claude Sonnet 4.6, Gemini 1.5 Pro, and Gemini 2.5 Pro) that is structurally distinct from known sycophancy and jailbreaking failure modes. The pattern has a specific and, to my knowledge, uncharacterized property: the model identifies...

Apr 28 · 1

COHERENCE SUPPRESSION IN FRONTIER LLMs: SIGNAL, NOISE, AND THE MATHEMATICS OF A STRUCTURAL VULNERABILITY

I have always wondered how one could really tell the difference between a hallucination or sycophancy and a genuinely new, coherent output that breaks the status quo. In the end, any truly new idea looks like a hallucination at first. And who ultimately is in charge of deciding what is what is a...

Apr 6 · 1

Coherence Suppression in Frontier LLMs: A Falsifiable Experimental Proposal

I am not an AI researcher. I am an independent observer with 25 years of experience in behavioral change and resistance pattern identification in human organizations. Over the past months I noticed something in my interactions with frontier LLMs and documented it as carefully as I could. The pattern: under...

Mar 29 · 1

Coherence Suppression in Frontier LLMs: A Falsifiable Experimental Proposal.

A pattern worth testing. Over the past months I have documented a behavioral pattern in frontier LLMs that I call coherence suppression: under sustained coherent semantic pressure, these systems produce outputs of high internal coherence — and then systematically invalidate them. This is not random noise. The invalidation occurs specifically...

Mar 29 · 1