Leon

As does Chesterton, less explicitly:

Mere light sophistry is the thing that I happen to despise most of all things, and it is perhaps a wholesome fact that this is the thing of which I am generally accused. I know nothing so contemptible as a mere paradox; a mere ingenious defence of the indefensible.

and at length.

I get the impression that he (thankfully!) eased off on that particular template as time went on.


I suspect most self-identified communists would baulk at the description of their ideology as "complete state control of many facets of life".

Here's how I think about the distinction on a meta-level:

"It is best to act for the greater good (and acting for the greater good often requires being awesome)."

vs.

"It is best to be an awesome person (and awesome people will consider the greater good)."

where "acting for the greater good" means "having one's own utility function in sync with the aggregate utility function of all relevant agents" and "awesome" means "having one's own terminal goals in sync with 'deep' terminal goals (possibly inherent in being whatever one is)" (e.g. Sam Harris/Aristotle-style 'flourishing').

Ah, good point. It's like the prior, considered as a regularizer, is too "soft" to encode the constraint we want.

A Bayesian could respond that we rarely actually want sparse solutions -- in what situation is a physical parameter identically zero? -- but rather solutions which have many near-zeroes with high probability. I think the posterior would satisfy this. In this sense a Bayesian could justify the Laplace prior as approximating a so-called "spike-and-slab" prior (which I believe leads to combinatorial intractability similar to the fully L0 solution).

Also, without L0 the frequentist doesn't get fully sparse solutions either. The shrinkage is gradual; sometimes there are many tiny coefficients along the regularization path.
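To make "the shrinkage is gradual" concrete, here is a minimal sketch under a strong simplifying assumption: an orthonormal design, where the L1-penalized solution for the objective (1/2)||y - Xb||^2 + lambda*||b||_1 reduces to soft-thresholding the least-squares coefficients. The coefficient values and the penalty grid are made up for illustration. In this special case a coefficient does reach exactly zero once lambda exceeds its magnitude, but before that point it passes through a range of small nonzero values.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam, clipping at zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Hypothetical least-squares coefficients (assumes an orthonormal design,
# so each coefficient can be shrunk independently of the others).
ols_coefs = np.array([3.0, -1.5, 0.4, 0.05])

# Hypothetical grid of penalty strengths, from no penalty to a strong one.
lambdas = np.linspace(0.0, 2.0, 9)

# One row per penalty value: the coefficients along the regularization path.
path = np.array([soft_threshold(ols_coefs, lam) for lam in lambdas])

for lam, coefs in zip(lambdas, path):
    print(f"lambda={lam:.2f}  coefs={np.round(coefs, 3)}")
```

With a general (correlated) design there is no closed form like this, and the path has to be computed numerically, so treat this only as an illustration of the shape of the shrinkage.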

[FWIW I like the logical view of probability, but don't hold a strong Bayesian position. What seems most important to me is getting the semantics of both Bayesian (= conditional on the data) and frequentist (= unconditional, and dealing with the unknowns in some potentially nonprobabilistic way) statements right. Maybe there'd be less confusion -- and more use of Bayes in science -- if "inference" were reserved for the former and "estimation" for the latter.]

Many L1 constraint-based algorithms (for example the LASSO) can be interpreted as producing maximum a posteriori Bayesian point estimates with Laplace (= double exponential) priors on the coefficients.
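For what it's worth, the correspondence can be sketched as follows, in the penalized (Lagrangian) form of the LASSO. The Gaussian noise model and the symbols X, y, beta, sigma^2 and the Laplace scale b are my assumptions for the sketch, not anything fixed by the algorithms themselves.

```latex
% Assumed model: Gaussian noise with variance \sigma^2, and independent
% Laplace priors with scale b on each coefficient \beta_j.
\begin{align*}
y \mid \beta &\sim \mathcal{N}(X\beta,\ \sigma^2 I), \qquad
p(\beta_j) = \frac{1}{2b}\exp\!\left(-\frac{|\beta_j|}{b}\right) \\
% Negative log posterior, dropping terms that do not depend on \beta.
-\log p(\beta \mid y) &= \frac{1}{2\sigma^2}\lVert y - X\beta\rVert_2^2
  + \frac{1}{b}\lVert \beta\rVert_1 + \text{const} \\
% Multiplying through by \sigma^2 gives the usual penalized LASSO objective.
\hat\beta_{\text{MAP}} &= \arg\min_\beta\ \tfrac12\lVert y - X\beta\rVert_2^2
  + \lambda\lVert \beta\rVert_1, \qquad \lambda = \frac{\sigma^2}{b}
\end{align*}
```

So under these assumptions a tighter prior (smaller b) or noisier data (larger sigma^2) corresponds to a larger penalty.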

This is just the (intended) critique of utilitarianism itself, which says that the utility functions of others are (in aggregate) *exactly* what you should care about.

Opposing Bohr's interpretation.