
Thanks for your reply. It's a fascinating topic and I've got lots of follow-up questions, but I'll read the paper and book first to get a better idea of which questions have already been addressed.
(edit 2 days later): Whoa. There's a lot of material in the book, in your paper, and in those from your research group. I didn't realize that one could say so much about flatness! It's very likely I have misunderstood, but are you guys talking about why a model seems to end up on a particular part of a (high-dimensional) ridge/plateau of the loss function? The relationship between parameter perturbations and data perturbations is interesting. Do you...
Hi Zach. Thanks for such a nice post. The degeneracies seem crucial to the apparent simplicity bias. Does footnote 15 imply that somehow the parameter vector works its way to a certain part of the parameter space, where it gets stuck because the loss function gradients can't steer it out? Also, does this interpretation mean that simplicity is related to (or even more accurately described as) robustness? That would make intuitive sense to me. In this case different measures of simplicity could be reframed as measures of robustness to different types of perturbation.
Hi Liam, you've made a really nice resource here. Thanks. I think you need to put the det inside a log (so you can explode it a bit later!). It's in the equation a little above "Examples of Singular Loss Landscapes".
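For concreteness, here's a sketch of the kind of expression I have in mind (the standard Laplace approximation to the free energy; I'm assuming your notation, so take the symbols as illustrative):

$$
F_n \;\approx\; n L_n(w^*) \;+\; \frac{d}{2}\log\frac{n}{2\pi} \;+\; \frac{1}{2}\log\det H(w^*),
$$

where $H(w^*)$ is the Hessian of the loss at the minimum $w^*$ and $d$ is the parameter dimension. The point of the log: in a singular model $\det H(w^*) = 0$, so the $\log\det$ term diverges to $-\infty$, which is exactly the "explosion" that motivates replacing this approximation with the singular analysis later on.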