LESSWRONG
Georg Lange
Some costs of superposition
Georg Lange, 2y

Calculating l, the maximal number of simultaneously active features, yields strange results. For example, if we have 100 features and 100 neurons, l has to be < 100/(8 * ln(100)) ≈ 2.7. But I would expect that all 100 features can be active simultaneously: with 100 dimensions, the features can be orthogonal and therefore independent. Am I misunderstanding something?
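To make the arithmetic easy to check, here is a minimal sketch that evaluates the bound as it is written in the comment. The function name `max_active_features` and its parameter names are hypothetical. Putting the number of neurons in the numerator and the number of features inside the logarithm is also an assumption: the example uses 100 for both, so it does not say which is which.

```python
import math


def max_active_features(n_neurons: int, n_features: int) -> float:
    """Evaluate the bound l < n_neurons / (8 * ln(n_features)).

    Assumption: neurons go in the numerator and features inside the log.
    The comment's example uses 100 for both, so it cannot settle this.
    """
    return n_neurons / (8 * math.log(n_features))


# The case from the comment: 100 features, 100 neurons.
bound = max_active_features(100, 100)
print(f"l must be below {bound:.2f}")
```

Under this reading the formula allows fewer than three simultaneously active features, even though 100 orthogonal directions fit in 100 dimensions, which is the tension the question points at.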

15 SAEs Discover Meaningful Features in the IOI Task, 1y
77 An Interpretability Illusion for Activation Patching of Arbitrary Subspaces, 2y