Grouped Loss may disfavor discontinuous capabilities — LessWrong