Do Bilinear MLPs Actually Learn Cleaner Circuits? — LessWrong