The Geometry of LLM Logits (an analytical outer bound)
Symbol | Meaning |
---|---|
$d$ | width of the residual stream (e.g. 768 in GPT-2-small) |
$L$ | number of Transformer blocks |
$V$ | vocabulary size, so logits live in $\mathbb{R}^V$ |
$x_\ell \in \mathbb{R}^d$ | residual-stream vector entering block $\ell$ |
$\Delta_\ell \in \mathbb{R}^d$ | the update written by block $\ell$ |
$W_U \in \mathbb{R}^{V \times d}$, $b \in \mathbb{R}^V$ | un-embedding matrix and bias |
**Step 1 (additive residual stream).** With pre-/peri-norm residual connections,

$$x_{\ell+1} = x_\ell + \Delta_\ell, \qquad \Delta_\ell = f_\ell\bigl(\mathrm{LN}(x_\ell)\bigr).$$

Hence the final pre-logit state is the sum of $L+1$ contributions (block $0$ = token + positional embeddings):

$$x_{\text{final}} = \sum_{\ell=0}^{L} \Delta_\ell.$$
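The additivity of the residual stream is easy to verify numerically. The sketch below builds a toy stream with random stand-in blocks (the width, depth, and `tanh` nonlinearity are illustrative assumptions, not taken from the text) and checks that the final state equals the sum of all block contributions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, L = 16, 4  # toy residual width and depth (illustrative values)

def layer_norm(x):
    # Plain LayerNorm without learned affine parameters.
    return (x - x.mean()) / x.std()

# Random toy "blocks": each reads a LayerNormed input and writes an update.
weights = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(L)]

x = rng.normal(size=d)      # block 0: embedding contribution
deltas = [x.copy()]
for W in weights:
    delta = np.tanh(W @ layer_norm(x))  # stand-in Lipschitz block
    deltas.append(delta)
    x = x + delta                       # additive residual stream

# The final pre-logit state is exactly the sum of all contributions.
assert np.allclose(x, np.sum(deltas, axis=0))
```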
**Step 2 (why a bound exists).** Every sub-module (attention head or MLP) receives a LayerNormed input, which lies on the sphere of radius $\sqrt{d}$ and is therefore bounded. Because the composition of linear maps and Lipschitz functions is itself Lipschitz, there exists a constant $r_\ell < \infty$ such that

$$\|\Delta_\ell\|_2 \le r_\ell \quad \text{for every input.}$$
Define the centred ellipsoid (here a ball, the simplest ellipsoid)

$$\mathcal{E}_\ell = \{\, v \in \mathbb{R}^d : \|v\|_2 \le r_\ell \,\}.$$

Then every realisable update lies inside that ellipsoid: $\Delta_\ell \in \mathcal{E}_\ell$.
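The per-block bound can be checked empirically. For a hypothetical block of the form `tanh(W @ LN(x))` (an assumed stand-in, since the text does not fix a block architecture), LayerNorm places the input on the sphere of radius $\sqrt{d}$ and `tanh` is 1-Lipschitz with `tanh(0) = 0`, so `r = ||W||_2 * sqrt(d)` is a valid radius:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16

def layer_norm(x):
    return (x - x.mean()) / x.std()

W = rng.normal(size=(d, d)) / np.sqrt(d)

def block(x):
    # Hypothetical block: 1-Lipschitz nonlinearity after a linear map.
    return np.tanh(W @ layer_norm(x))

# ||LN(x)||_2 = sqrt(d) exactly, so the update norm is at most
# spectral_norm(W) * sqrt(d), whatever the input scale.
r = np.linalg.norm(W, 2) * np.sqrt(d)

for _ in range(1000):
    x = rng.normal(size=d) * rng.uniform(0.1, 10.0)  # arbitrary scales
    assert np.linalg.norm(block(x)) <= r + 1e-9
```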
**Step 3 (Minkowski sum).** Using additivity and Step 2,

$$x_{\text{final}} \in \mathcal{E}_0 \oplus \mathcal{E}_1 \oplus \cdots \oplus \mathcal{E}_L,$$

where $\oplus$ denotes the Minkowski sum of the individual ellipsoids.
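A quick sanity check of the Minkowski-sum step: a point drawn from each ball, one per block, sums to a point inside the ball whose radius is the sum of the radii (the radii below are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8
radii = [1.5, 0.7, 2.0]  # illustrative per-block radii r_ell

# Sample one point from each ball E_ell = {v : ||v||_2 <= r_ell} ...
points = []
for r in radii:
    v = rng.normal(size=d)
    points.append(r * rng.uniform() * v / np.linalg.norm(v))

# ... their sum lies in the Minkowski sum, which for centred balls
# is again a ball of radius sum(radii).
s = np.sum(points, axis=0)
assert np.linalg.norm(s) <= sum(radii) + 1e-9
```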
**Step 4 (affine map to logit space).** Logits are produced by the affine map $z = W_U x + b$. For any sets $A, B \subseteq \mathbb{R}^d$,

$$W_U (A \oplus B) = W_U A \oplus W_U B.$$

Hence

$$z \in b + W_U \mathcal{E}_0 \oplus W_U \mathcal{E}_1 \oplus \cdots \oplus W_U \mathcal{E}_L.$$

Because linear images of ellipsoids are ellipsoids, each $W_U \mathcal{E}_\ell$ is still an ellipsoid, now in $\mathbb{R}^V$.
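That a linear map sends a ball to an ellipsoid can be verified via the SVD: if $W_U = U \Sigma V^\top$, the image of the unit ball is the ellipsoid with semi-axes given by the singular values, i.e. every image point $z$ satisfies $\|\Sigma^{-1} U^\top z\| \le 1$. The toy sizes below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
d, V = 8, 5  # toy residual width and vocabulary size
W_U = rng.normal(size=(V, d))

U, S, _ = np.linalg.svd(W_U, full_matrices=False)

for _ in range(1000):
    v = rng.normal(size=d)
    v = rng.uniform() * v / np.linalg.norm(v)  # a point of the unit ball
    z = W_U @ v
    # z lies in the ellipsoid whose semi-axes are the singular values:
    # || diag(1/S) @ U.T @ z || <= 1.
    assert np.linalg.norm((U.T @ z) / S) <= 1 + 1e-9
```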
**Step 5 (the outer set is an ellipsotope).** An ellipsotope is an affine shift of a finite Minkowski sum of ellipsoids. The set

$$\mathcal{E} \;:=\; b + \bigoplus_{\ell=0}^{L} W_U \mathcal{E}_\ell$$

is therefore an ellipsotope.
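Ellipsotopes are convenient computationally because their support function is the sum of the summands' support functions. When every $\mathcal{E}_\ell$ is a centred ball of radius $r_\ell$, the bound in any logit direction $u$ is $h(u) = u^\top b + \bigl(\sum_\ell r_\ell\bigr)\,\|W_U^\top u\|_2$. A sketch under assumed toy dimensions and radii:

```python
import numpy as np

rng = np.random.default_rng(4)
d, V, L = 8, 5, 3
W_U = rng.normal(size=(V, d))
b = rng.normal(size=V)
radii = [1.0, 0.5, 2.0, 0.8]  # r_0 ... r_L, illustrative

def sample_ellipsotope():
    # b plus one W_U-image of a point from each ball.
    z = b.copy()
    for r in radii:
        v = rng.normal(size=d)
        z = z + W_U @ (r * rng.uniform() * v / np.linalg.norm(v))
    return z

# Support function of the ellipsotope in direction u:
# h(u) = u.b + (sum_ell r_ell) * ||W_U.T @ u||_2.
u = rng.normal(size=V)
h = u @ b + sum(radii) * np.linalg.norm(W_U.T @ u)

for _ in range(1000):
    assert u @ sample_ellipsotope() <= h + 1e-9
```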
**Theorem.** For any pre-norm or peri-norm Transformer language model whose blocks receive LayerNormed inputs, the set of all logit vectors attainable over every prompt and position satisfies

$$\{\, z(\text{prompt}, \text{position}) \,\} \;\subseteq\; \mathcal{E},$$

where $\mathcal{E}$ is the ellipsotope defined above.
**Proof.** The containments in Steps 2–4 compose to give the stated inclusion; Step 5 shows the outer set is an ellipsotope. ∎
It is an outer approximation only. Equality would require showing that every point of the ellipsotope is actually realised by some token context, which the argument does not provide.
**Geometry-aware compression and safety.** Because $\mathcal{E}$ is convex and centrally symmetric about $b$, one can fit a minimum-volume outer ellipsoid to it, yielding tight norm-based regularisers or robustness certificates against weight noise / quantisation.
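In the special case where every $\mathcal{E}_\ell$ is a centred ball, each image $W_U \mathcal{E}_\ell$ is the same ellipsoid scaled by $r_\ell$, so the Minkowski sum collapses to $\bigl(\sum_\ell r_\ell\bigr) W_U B$ and the outer ellipsoid is exact. A crude but cheap consequence is a norm certificate on the logits (toy values assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
d, V = 8, 5
W_U = rng.normal(size=(V, d))
b = rng.normal(size=V)
radii = [1.0, 0.5, 2.0]

# With centred balls, the ellipsotope is b + R * W_U(unit ball), R = sum r_ell,
# so every attainable logit vector z satisfies ||z - b||_2 <= sigma_max(W_U) * R.
R = sum(radii)
sigma_max = np.linalg.norm(W_U, 2)

v = rng.normal(size=d)
v = R * rng.uniform() * v / np.linalg.norm(v)  # a point of R * unit ball
z = b + W_U @ v
assert np.linalg.norm(z - b) <= sigma_max * R + 1e-9
```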
**Layer-wise attribution.** The individual sets $W_U \mathcal{E}_\ell$ bound how much any single layer can move the logits, complementing "logit-lens"-style analyses.
**Assumptions.** LayerNorm guarantees each block's input is bounded (it lies on the sphere of radius $\sqrt{d}$); Lipschitz (but not necessarily bounded) activations such as GELU or SiLU then give finite $r_\ell$. Architectures without such norm control would require a separate analysis.
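The boundedness claim is exact for LayerNorm without a learned affine map: whatever the input scale, the normalised output lands on the sphere of radius $\sqrt{d}$ (with a learned gain $\gamma$ the radius is rescaled but remains finite). A minimal check:

```python
import numpy as np

rng = np.random.default_rng(6)
d = 768  # GPT-2-small residual width

def layer_norm(x):
    # Population-std LayerNorm without learned affine parameters (eps = 0):
    # the output norm is exactly sqrt(d) for any non-constant input.
    c = x - x.mean()
    return c / np.sqrt((c ** 2).mean())

for _ in range(100):
    x = rng.normal(size=d) * rng.uniform(0.01, 100.0)
    assert np.isclose(np.linalg.norm(layer_norm(x)), np.sqrt(d))
```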