Complexity Penalties in Statistical Learning — LessWrong