Measuring Latent Operator States in Transformer Mid-Layers
Summary
I report evidence that conversational “operators” (e.g. summarize, critique, reframe) correspond to stable, decodable internal states in transformer mid-layers, distinct from surface instruction wording and generalizing across content. These states are weak but persistent, geometrically separable via simple centroid methods, and survive instruction masking. I’m sharing this as a...