DavidHolmes — LessWrong

Neural networks biased towards geometrically simple functions?

Neural networks (NNs) do not output all functions with equal probability, but seem to be biased towards functions of certain types; heuristically, towards 'simple' functions. In VPCL18, MSVP+19, MVPSL20 evidence is given that functions output by NNs are inclined to have low information-theoretic complexity - nice summaries are given on...

Dec 8, 2022 · 16

Bias towards simple functions; application to alignment?

Summary: Deep neural networks (DNNs) are generally used with large numbers of parameters relative to the number of given data-points, so that the solutions they output are far from uniquely determined. How do DNNs 'choose' what solution to output? Some fairly recent papers ([1], [2]) seem to suggest that DNNs...

Aug 18, 2022 · 5

Categorial preferences and utility functions

This post is motivated by a recent post of Stuart Armstrong on going from preferences to a utility function. It was originally planned as a comment, but seems to have developed a bit of a life of its own. The ideas here came up in a discussion with Owen Biesel;...

Aug 9, 2019 · 10