This is great work! We've been working on very similar things at Anthropic recently, also using gradient descent on autoencoders for sparse coding of activations, but focusing more on making the sparse coding technique and loss more robust and on extending it to real datasets.
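For concreteness, here's a minimal sketch of the shared setup, i.e. an overcomplete autoencoder trained by gradient descent to reconstruct activations through an L1-penalized hidden layer. The names, sizes, and `l1_coeff` value below are illustrative, not our actual implementation:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder: reconstruct activations through a wider
    hidden layer whose codes are pushed toward sparsity by an L1 penalty."""
    def __init__(self, d_activation: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_activation, d_dict)
        self.decoder = nn.Linear(d_dict, d_activation, bias=False)

    def forward(self, x):
        codes = torch.relu(self.encoder(x))  # sparse feature coefficients
        recon = self.decoder(codes)          # reconstructed activations
        return recon, codes

def train_step(model, x, optimizer, l1_coeff=1e-3):
    """One gradient step: MSE reconstruction loss plus L1 sparsity penalty."""
    recon, codes = model(x)
    loss = ((recon - x) ** 2).mean() + l1_coeff * codes.abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

model = SparseAutoencoder(d_activation=64, d_dict=256)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randn(32, 64)  # stand-in for a batch of real model activations
train_step(model, batch, optimizer)
```

Here are some thoughts I had while reading this: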
- I like the description of your more sophisticated synthetic data generation. We've only tried synthetic data without correlations and with uniform feature frequencies. We've also tried real models for which we don't have the ground truth, but where we can easily visualize the feature directions (1-layer MNIST perceptrons).
- I like how the MMC metric has an understandable 0-1 scale (a sketch of how such a score can be computed follows below). We've ...
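For readers who haven't seen the metric: below is a minimal sketch of a mean-max-cosine-style score, assuming you have a learned dictionary and the ground-truth feature matrix. This is my paraphrase of the idea, not necessarily the post's exact definition:

```python
import torch
import torch.nn.functional as F

def mean_max_cosine(learned: torch.Tensor, ground_truth: torch.Tensor) -> float:
    """Both arguments are (num_features, d_activation). For each ground-truth
    feature, take the max cosine similarity over all learned dictionary
    elements, then average those maxima across ground-truth features."""
    sims = F.normalize(ground_truth, dim=-1) @ F.normalize(learned, dim=-1).T
    return sims.max(dim=-1).values.mean().item()

# A dictionary containing every ground-truth direction scores ~1.0;
# unrelated random directions score much lower.
truth = torch.randn(16, 64)
dictionary = torch.cat([truth, torch.randn(48, 64)])
print(mean_max_cosine(dictionary, truth))  # ~1.0
```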