Adly Templeton
I am the Golden Gate Bridge
Adly Templeton (1y)

If you scroll down on the Feature Catalog (https://transformer-circuits.pub/2024/scaling-monosemanticity/features/index.html?featureId=1M_120374), you can view 1,000 randomly selected features from each run. This is a great way to get a sense of how interpretable a typical feature is.
