We experimented with a small inference-time “attractor layer” designed to provide a lightweight, stateful influence on generation without modifying weights or training anything. Early results show stable perplexity, a small task-specific gain, and a very clear failure mode that feels worth sharing — especially for anyone thinking about dynamic memory during inference.
Motivation
Transformers handle short-range dependencies well, but they don’t maintain a persistent state that adapts across multiple forward passes. This experiment explores whether a tiny, inference-only update rule can act as a dynamic memory signal without recurrence, backprop, or architectural changes.
Method (High-Level)
The layer maintains a small set of “attractor” vectors that:
Measure similarity to the current attention output
Strengthen when repeatedly activated
Decay when unused
Inject a small signal back into the next forward pass
This is a one-step inference-time update.
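The update rule described above can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions — the function name, hyperparameters (lr, decay, gain), and the winner-take-all choice of which attractor to strengthen are illustrative, not the authors' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_attractors = 64, 8

attractors = rng.standard_normal((n_attractors, d_model)) * 0.01
strengths = np.zeros(n_attractors)  # activation level of each attractor

def attractor_step(h, attractors, strengths,
                   lr=0.1, decay=0.02, gain=0.05):
    """One inference-time update, given an attention output vector h."""
    # 1. Measure cosine similarity to the current attention output.
    sims = attractors @ h / (np.linalg.norm(attractors, axis=1)
                             * np.linalg.norm(h) + 1e-8)
    # 2. Strengthen the best-matching attractor and pull it toward h.
    k = int(np.argmax(sims))
    strengths[k] += lr * max(sims[k], 0.0)
    attractors[k] += lr * (h - attractors[k])
    # 3. Decay all strengths so unused attractors fade toward zero.
    strengths *= (1.0 - decay)
    # 4. Inject a small strength-weighted signal into the next pass.
    signal = gain * strengths @ attractors
    return h + signal

h = rng.standard_normal(d_model)
h_next = attractor_step(h, attractors, strengths)
```

No weights change and no gradients flow; the only state carried across forward passes is the attractor matrix and its strength vector.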
Early Observations
On small transformer models:
Some attractors stabilized around recurring conceptual regions
A short burn-in phase helped reduce instability
Unused attractors drifted into noise
Occasionally the layer degraded generation quality rather than improving it
No performance claims yet; we are only reporting behavioral signals.
Key Results
Perplexity
Baseline perplexity preserved (≈0% change)
~6.5% compute overhead
Failure Case
On longer generation (~500 tokens), accuracy collapsed by ~80%. Attractors competed with the actual context, causing repetition and drift. This appears to be a genuine dynamical failure mode, not a bug, and may map to theoretical limits on inference-time state.
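A toy two-dimensional iteration (our illustration, not the authors' experiment) shows why a persistent injected signal can "snap" generation back to an attractor: repeated small pulls toward a fixed direction eventually dominate whatever the original context vector was.

```python
import numpy as np

a = np.array([1.0, 0.0])   # a single established attractor direction
h = np.array([0.0, 1.0])   # a context representation orthogonal to it

for _ in range(50):
    h = h + 0.1 * (a - h)  # persistent pull injected at every step

# After many steps h sits essentially on the attractor, with the
# original context direction almost entirely washed out.
```

The geometric decay rate here (0.9 per step) is arbitrary, but the qualitative behavior matches the repetition-and-drift collapse described above.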
Revised Configuration
Adding gating + a burn-in threshold produced a small +3.3% improvement on a short comprehension task. Fragile, but reproducible.
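One plausible reading of "gating + a burn-in threshold" — our assumption, not the authors' exact mechanism — is that an attractor contributes no injected signal until its accumulated strength crosses a threshold, so noisy, barely-activated attractors stay silent:

```python
import numpy as np

def gated_signal(attractors, strengths, burn_in=0.5, gain=0.05):
    """Suppress injection from attractors still below the burn-in level."""
    gate = (strengths > burn_in).astype(float)  # hard burn-in gate
    return gain * (gate * strengths) @ attractors

attractors = np.eye(2)                                    # two toy directions
weak = gated_signal(attractors, np.array([0.1, 0.2]))     # both below burn-in
strong = gated_signal(attractors, np.array([0.1, 0.9]))   # second crosses it
# weak is all zeros; strong injects only the second attractor's direction.
```

A soft gate (e.g. a sigmoid of strength minus threshold) would be the obvious smoother variant.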
What Failed
Too many attractors → instability
Long sequences “snapped back” to earlier attractors
Heavy decay effectively removed the attractor state
No robustness across sequence length
What This Does Not Show
Any generalizable performance boost
Evidence the method scales
Applicability outside the tested model
Anything like recurrence or RNN-style working memory
Related Work (Brief)
This sits adjacent to:
Fast Weights (Ba et al.) — fast-changing matrices, but updated during training or recurrent passes; here, updates occur only at inference.
Differentiable Plasticity (Miconi et al.) — learns the update rule; this work uses a fixed rule.
KV-Cache Extensions / Recurrence — reuse activations but don't maintain a persistent attractor-like state across steps.
This work focuses specifically on single-step, inference-time state updates.
Questions for the Community
Is there prior work on inference-time state updates that don’t involve learning?
Are there known theoretical limits on attractor-style influence during generation?
Under what conditions is this strictly worse than recurrence or KV-cache extensions?
What minimal benchmark suite would rule out “just noise overfitting to perplexity”?
Does this failure mode (context–attractor competition) resemble anything known in dynamical systems literature?
Code & Data
Looking for replication attempts, critique, and pointers to related theoretical work.
Repo: https://github.com/HalcyonAIR/Duality