Superposition Without Compression: Why Entangled Representations Are the Default

by James Butterworth
26th May 2025

This is a linkpost for https://drive.google.com/file/d/1xdTRTRjKXneXxYKA7Kqa4NHe7nzV7IKK/view?usp=drive_link

I have recently heard numerous claims that the underparameterisation of a neural network can be inferred from the polysemanticity of its neurons, which is prevalent in LLMs.

Whilst I have no doubt that polysemanticity is the only solution available to an underparameterised model, I urge caution when using polysemanticity as proof of underparameterisation.

In this note I claim that even when sufficient capacity is available, superposition may be the default because of its overwhelming prevalence in the solution space: disentangled, monosemantic solutions occupy only a tiny fraction of the low-loss solutions.
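
As a minimal sketch of why monosemantic solutions are so rare (the setup below is illustrative and not taken from the linked note): in a linear autoencoder with exactly as many hidden units as features, the axis-aligned solution achieves zero loss, but so does every orthogonal rotation of it, and a generic rotation mixes all features into every neuron.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6  # six features, six hidden units: capacity is exactly sufficient

# A perfectly monosemantic linear autoencoder: each hidden unit reads one feature.
W_enc = np.eye(d)
W_dec = np.eye(d)

# Rotating the hidden space by any orthogonal Q leaves the reconstruction
# unchanged, since (W_dec @ Q.T) @ (Q @ W_enc) == W_dec @ W_enc.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
W_enc_rot = Q @ W_enc
W_dec_rot = W_dec @ Q.T

x = rng.normal(size=(1000, d))
recon_axis = x @ W_enc.T @ W_dec.T
recon_rot = x @ W_enc_rot.T @ W_dec_rot.T
print(np.abs(recon_axis - x).max(), np.abs(recon_rot - x).max())  # both ~0

# Yet the rotated encoder is maximally polysemantic: every neuron mixes all features.
print(np.abs(W_enc_rot).round(2))
```

Because the rotations form a continuous family while the axis-aligned solutions are a finite set of points, a search that only minimises loss has no particular reason to land on the monosemantic ones.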

This suggests that superposition arises not just out of necessity in underparameterised models, but also as an inevitability of the search space of neural networks.

In this note I work through a comprehensible toy example where this is the case, and hypothesise that the same holds in larger networks.
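
For readers who want to poke at this themselves, here is a minimal sketch of such a toy experiment (a hypothetical reconstruction, assuming a PyTorch environment; the architecture, sparsity level, and hyperparameters are illustrative assumptions rather than the note's actual setup): a tied-weight ReLU autoencoder with more hidden units than features, trained from a random initialisation on sparse feature vectors.

```python
import torch

torch.manual_seed(0)
d, n = 5, 8  # five sparse features, eight hidden units: more than enough capacity

W = torch.nn.Parameter(0.1 * torch.randn(n, d))  # tied encoder/decoder weights
b = torch.nn.Parameter(torch.zeros(d))
opt = torch.optim.Adam([W, b], lr=1e-3)

for step in range(5000):
    # Sparse feature vectors: each feature active with probability 0.2.
    x = (torch.rand(1024, d) < 0.2).float() * torch.rand(1024, d)
    x_hat = torch.relu(x @ W.T @ W + b)  # encode with W, decode with W^T
    loss = ((x - x_hat) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Row i of W holds neuron i's weights over the d features. A monosemantic
# neuron would put all its mass on one feature, giving purity = 1.0.
rows = W.detach().abs()
purity = rows.max(dim=1).values / rows.sum(dim=1).clamp(min=1e-9)
print(f"final loss: {loss.item():.5f}")
print("per-neuron purity:", [round(p.item(), 2) for p in purity])
```

One would expect the loss to approach zero while the purity scores remain well below 1: the network finds an entangled low-loss solution even though a fully monosemantic one is available to it.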

These were very rough Sunday musings, so I am very interested in what other people think about this claim :).
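
Comments

Lucius Bushnaq:
You seem to be equating superposition and polysemanticity here, but they're not the same thing.

James Butterworth:
Thank you for pointing this out to me! I will read further and update the blog.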