
Thanks for your reply. It's a fascinating topic and I've got lots of follow-up questions, but I'll read the paper and book first to get a better idea of which questions have already been addressed.
(edit 2 days later): Whoa. There's a lot of material in the book, in your paper, and in those from your research group. I didn't realize that one could say so much about flatness! It's very likely I have misunderstood, but are you guys talking about why a model seems to end up on a particular part of a (high-dimensional) ridge/plateau of the loss function? The relationship between parameter perturbations and data perturbations is interesting. Do you...
Hi Zach. Thanks for such a nice post. The degeneracies seem crucial to the apparent simplicity bias. Does footnote 15 imply that somehow the parameter vector works its way to a certain part of the parameter space, where it gets stuck because the loss function gradients can't steer it out? Also, does this interpretation mean that simplicity is related to (or even more accurately described as) robustness? That would make intuitive sense to me. In this case different measures of simplicity could be reframed as measures of robustness to different types of perturbation.
Hi Liam, you've made a really nice resource here. Thanks. I think you need to put the det inside a log (so you can explode it a bit later!). It's in the equation a little above "Examples of Singular Loss Landscapes".
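For concreteness, here's a sketch of the kind of expression I have in mind (the standard Laplace approximation to the free energy; I'm assuming your notation, so take the symbols as illustrative):

$$
F_n \;\approx\; n L_n(w^*) \;+\; \frac{d}{2}\log\frac{n}{2\pi} \;+\; \frac{1}{2}\log\det H(w^*),
$$

where $H(w^*)$ is the Hessian of the loss at the minimum $w^*$ and $d$ is the parameter dimension. The point of the log: in a singular model $\det H(w^*) = 0$, so the $\log\det$ term diverges to $-\infty$, which is exactly the "explosion" that motivates replacing this approximation with the singular analysis later on.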