Interesting post. We explored a similar direction during a MATS stream, training different MoE designs to get more interpretable experts. We started by just testing increasingly sparse MoEs (partly inspired by that Monet paper), on the logic that smaller experts = tighter specialization, then moved on to things like orthogonality constraints.
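For reference, the rough shape of the thing we were training looks something like this minimal PyTorch sketch (the expert sizes, routing details, and exact penalty are all illustrative, not our actual configs):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Toy MoE layer: many small experts, few active per token."""

    def __init__(self, d_model=256, n_experts=512, d_expert=16, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is a tiny two-layer MLP: d_model -> d_expert -> d_model.
        self.w_in = nn.Parameter(torch.randn(n_experts, d_model, d_expert) * 0.02)
        self.w_out = nn.Parameter(torch.randn(n_experts, d_expert, d_model) * 0.02)

    def forward(self, x):  # x: (batch, d_model)
        weights, idx = self.router(x).topk(self.k, dim=-1)  # (batch, k)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for j in range(self.k):  # per-slot loop for clarity, not speed
            h = F.relu(torch.einsum("bd,bde->be", x, self.w_in[idx[:, j]]))
            out += weights[:, j:j+1] * torch.einsum("be,bed->bd", h, self.w_out[idx[:, j]])
        return out

def orthogonality_penalty(moe: SparseMoE) -> torch.Tensor:
    """Auxiliary loss pushing experts' read-in directions apart."""
    v = F.normalize(moe.w_in.flatten(1), dim=-1)  # one unit vector per expert
    gram = v @ v.T                                # pairwise cosine similarities
    off_diag = gram - torch.eye(gram.shape[0], device=gram.device)
    return off_diag.pow(2).mean()
```

The penalty here is one crude way to cash out "orthogonality constraints": penalize pairwise cosine similarity between experts' flattened input weights.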
The results initially made us pretty pessimistic: individual experts didn't seem to specialize in anything you wouldn't get from just running k-means on the residual stream (i.e., no real interp benefit). This is sort...
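For concreteness, the k-means baseline being compared against here is just the obvious thing, roughly the sketch below (the activation dump and cluster count are placeholders):

```python
import numpy as np
from sklearn.cluster import KMeans

# Residual-stream activations at some layer, shape (n_tokens, d_model);
# how you harvest these depends on your model and hooks. Hypothetical dump:
acts = np.load("resid_acts.npy")

# One cluster per would-be expert.
kmeans = KMeans(n_clusters=512, n_init=10, random_state=0).fit(acts)

# Crude interpretability probe: look at the tokens nearest each centroid
# and ask whether they form a more coherent group than the tokens a
# trained expert actually fires on.
dists = kmeans.transform(acts)                    # (n_tokens, n_clusters)
top_tokens_for_cluster_0 = np.argsort(dists[:, 0])[:20]
```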