This is great work! We’ve been working on very similar things at Anthropic recently, also using gradient descent on autoencoders for sparse coding activations, but focusing more on improving the sparse coding technique and loss to be more robust and on extending it to real datasets. Here’s some of the thoughts I had reading this:
I like the description of your more sophisticated synthetic data generation. We’ve only tried using synthetic data without correlations and with uniform frequency. We've also tried real models we don’t have the ground truth f
This is great work! We’ve been working on very similar things at Anthropic recently, also using gradient descent on autoencoders for sparse coding activations, but focusing more on improving the sparse coding technique and loss to be more robust and on extending it to real datasets. Here’s some of the thoughts I had reading this:
- I like the description of your more sophisticated synthetic data generation. We’ve only tried using synthetic data without correlations and with uniform frequency. We've also tried real models we don’t have the ground truth f
... (read more)