LESSWRONG

Anish Mudide

Math & CS Undergraduate at MIT
https://amudide.github.io/

Comments

Sorted by
Newest
No wikitag contributions to display.
Efficient Dictionary Learning with Switch Sparse Autoencoders
Anish Mudide, 1y

Thanks for your comment! I believe your concern was echoed by Lee and Arthur in their comments and is completely valid. This work is primarily a proof-of-concept that we can successfully scale SAEs by directly applying MoE, but I suspect that we will need to make tweaks to the architecture.

Efficient Dictionary Learning with Switch Sparse Autoencoders
Anish Mudide, 1y

Yes, you can train a Switch SAE in any scenario where you can train a standard SAE. @hugofry has a nice blog post on training SAEs for ViT.

Efficient Dictionary Learning with Switch Sparse Autoencoders
Anish Mudide, 1y

Thanks for the question -- f is calculated over an entire batch of inputs, not a single x. Figure 1 shows how the Switch SAE processes a single residual stream activation x.
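To make the batch-versus-single-input distinction concrete, here is a minimal sketch. It assumes that f denotes the per-expert fraction of inputs in a batch that the router sends to each expert (as in Switch-style load balancing); the comment itself does not define f, and the function name `expert_fractions` and the toy router probabilities are hypothetical.

```python
from collections import Counter


def expert_fractions(router_probs):
    """Return f, where f[i] is the fraction of inputs in the batch routed to expert i.

    router_probs: one row per input, one router probability per expert.
    Assumption: top-1 routing, i.e. each input goes to its most probable expert.
    """
    batch_size = len(router_probs)
    num_experts = len(router_probs[0])
    # Pick the highest-probability expert for every input in the batch.
    assignments = [
        max(range(num_experts), key=row.__getitem__) for row in router_probs
    ]
    counts = Counter(assignments)
    # Counter returns 0 for experts that received no inputs.
    return [counts[i] / batch_size for i in range(num_experts)]


# Hypothetical router outputs for a batch of four inputs and three experts.
batch = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.4, 0.1],
]
f_batch = expert_fractions(batch)       # spread across experts
f_single = expert_fractions(batch[:1])  # degenerate one-hot vector
```

Under this assumption, f computed from a single x is always one-hot, so it carries no information about load balance; only a batch-level f does.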

Efficient Dictionary Learning with Switch Sparse Autoencoders
Anish Mudide, 1y

Hi Lee and Arthur, thanks for the feedback! I agree that routing to a single expert will force redundant features and will experiment with Arthur's suggestion. I haven't taken a close look at the router/expert geometry yet but plan to do so soon. 

Efficient Dictionary Learning with Switch Sparse Autoencoders
Anish Mudide, 1y

Thanks for the comment -- I trained TopK SAEs with various widths (all fitting within a single GPU) and observed that wider SAEs take substantially longer to train, which leads me to believe that the encoder forward pass is a major bottleneck for wall-clock time. The Switch SAE also improves memory efficiency because we do not need to store all M latents.

I'm currently working on implementing expert-parallelism, which I hope will lead to substantial improvements to wall-clock time.
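The memory point in the comment above can be illustrated with a rough sketch of a routed encoder pass. It assumes N experts that each own M/N of the M total latents, and it uses a plain ReLU in place of the TopK activation, omitting biases and the decoder; all names (`switch_sae_encode`, `router_W`, `expert_encoders`) and the random weights are hypothetical, not the author's implementation.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def switch_sae_encode(x, router_W, expert_encoders):
    """Route x to one expert and compute only that expert's latents."""
    logits = matvec(router_W, x)
    # Top-1 routing: choose the expert with the largest router logit.
    expert = max(range(len(logits)), key=logits.__getitem__)
    # Only M/N pre-activations are computed and stored, not all M.
    pre_acts = matvec(expert_encoders[expert], x)
    # ReLU stands in for TopK here to keep the sketch short.
    latents = [max(0.0, a) for a in pre_acts]
    return expert, latents


random.seed(0)
d, M, N = 8, 32, 4  # input dim, total latents, number of experts (toy sizes)
router_W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
expert_encoders = [
    [[random.gauss(0, 1) for _ in range(d)] for _ in range(M // N)]
    for _ in range(N)
]
x = [random.gauss(0, 1) for _ in range(d)]

expert, latents = switch_sae_encode(x, router_W, expert_encoders)
```

In this sketch the encoder work per input scales with M/N plus the router cost rather than with M, which is consistent with the wall-clock and memory argument made in the comment.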
