Comments

As I was reading I kept waiting for gills to be mentioned, and then it was satisfying to see that it was one of the two un-dolphinlike characteristics in the dictionary definition. I figured, if someone asks me "how do fish get oxygen if they live underwater?" I'm going to say "because they have gills," not "because they either have gills or return to the surface to breathe."

But then, vaniver's comment mentioning sharks got me thinking. What if someone asks "do fish have bones?" The more I think about it, the harder it is for me to think of sharks as the same kind of thing as salmon.

I don't know what lesson to draw from this, except that I find it uncomfortable, want better words, and am suddenly motivated to learn fish taxonomy, a subject that never interested me before, just for vocabulary that leaves me more mentally comfortable.

My view of the Copernican revolution used to be that when people finally switched to the heliocentric model, something clicked. The data was suddenly predictable and understandable.
It sounds like that is indeed what happened, but the click came with Kepler, not Copernicus.
The surprising corollary is that Galileo just happened to be right, and I don't really want to imitate him. I don't want to be the kind of person who would have been a Copernican without knowing Kepler's theory.
But on the other hand, to invent Kepler's theory, Kepler had to be a Copernican.

"Statistical models with fewer assumptions" is a tricky one, because the conditions under which your inferences work are not identical to the conditions you assume when deriving them.

I mostly have in mind a historical controversy in the mathematical study of evolution. Joseph Felsenstein introduced maximum likelihood methods for inferring phylogenetic trees. He assumed a probabilistic model for how DNA sequences change over time, and from that he derived maximum likelihood estimates of phylogenetic trees of species based on their DNA sequences.

Felsenstein's maximum likelihood method was an alternative to another method, the "maximum parsimony" method. The maximum parsimony tree is the tree that requires you to assume the fewest possible sequence changes when explaining the data.
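To make "fewest possible sequence changes" concrete, here is a minimal sketch of how changes can be counted on a fixed tree. It uses Fitch's algorithm, a standard way to compute this count. The function names, the nested-tuple tree encoding, and the toy alignment are illustrative assumptions, not anything taken from Felsenstein's papers.

```python
def fitch_changes(tree, column):
    """Return (possible_states, changes) for one site on a rooted binary tree.

    tree   -- a leaf name (str) or a pair (left_subtree, right_subtree)
    column -- dict mapping each leaf name to its base at this site
    """
    if isinstance(tree, str):
        return {column[tree]}, 0
    left_set, left_cost = fitch_changes(tree[0], column)
    right_set, right_cost = fitch_changes(tree[1], column)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state: no extra change needed.
        return common, left_cost + right_cost
    # No shared state: at least one change must happen at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Total number of changes the tree needs to explain every site."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    return sum(
        fitch_changes(tree, {name: alignment[name][i] for name in names})[1]
        for i in range(n_sites)
    )


# Toy data (hypothetical): four species, three sites.
alignment = {"A": "AAC", "B": "AAC", "C": "CAC", "D": "CAG"}
topologies = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}
scores = {name: parsimony_score(t, alignment) for name, t in topologies.items()}
print(scores)  # the maximum parsimony tree is the one with the lowest score
```

On this toy alignment the grouping AB|CD needs 2 changes and the other two groupings need 3 each, so AB|CD is the maximum parsimony tree.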

Some people criticized Felsenstein's maximum likelihood method, since it assumed a statistical model, whereas the maximum parsimony method did not. Felsenstein's response was to exhibit a phylogenetic tree and model of sequence change where maximum parsimony failed. Specifically, it was a tree connecting four species. When you randomly generate DNA sequences using this tree and the specified probability model for sequence change, maximum parsimony gives the wrong result. When you generate short sequences, it may give the right result by chance, but as you generate longer sequences, maximum parsimony will, with probability 1, converge on the wrong tree. In statistical terms, maximum parsimony is inconsistent: it fails in the infinite-data limit, at least when that is the data-generating process.
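A rough simulation sketch of this kind of failure follows. Everything specific in it is an illustrative assumption rather than Felsenstein's original example: the true tree ((A,B),(C,D)), long branches leading to A and C, a simple rule where each branch substitutes a uniformly chosen different base with a fixed probability, and the values 0.6 and 0.05. For four species, only sites where the bases split two-and-two can distinguish the three possible groupings, so parsimony reduces to counting which grouping those sites support.

```python
import random

BASES = "ACGT"


def evolve(base, p_change, rng):
    """Base at the far end of a branch: with probability p_change it is
    replaced by a uniformly chosen different base (assumed model)."""
    if rng.random() < p_change:
        return rng.choice([b for b in BASES if b != base])
    return base


def simulate_site(p_long, p_short, rng):
    """One alignment column generated on the true tree ((A,B),(C,D))."""
    x = rng.choice(BASES)        # common ancestor of A and B
    y = evolve(x, p_short, rng)  # common ancestor of C and D (short internal branch)
    a = evolve(x, p_long, rng)   # long branch
    b = evolve(x, p_short, rng)
    c = evolve(y, p_long, rng)   # long branch
    d = evolve(y, p_short, rng)
    return a, b, c, d


def parsimony_tree(sites):
    """Return the four-species grouping supported by the most informative
    sites, together with the support counts."""
    support = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for a, b, c, d in sites:
        if a == b and c == d and a != c:
            support["AB|CD"] += 1
        elif a == c and b == d and a != b:
            support["AC|BD"] += 1
        elif a == d and b == c and a != b:
            support["AD|BC"] += 1
    return max(support, key=support.get), support


rng = random.Random(0)
for n_sites in (20, 200, 5000):
    sites = [simulate_site(0.6, 0.05, rng) for _ in range(n_sites)]
    best, support = parsimony_tree(sites)
    print(n_sites, best, support)
```

Under these assumed parameters, parallel changes on the two long branches produce far more sites grouping A with C than sites supporting the true AB|CD split, so more data only makes parsimony more confident in the wrong tree. Setting `p_long` equal to `p_short` removes the effect.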

What does this mean for the criticism that maximum likelihood makes assumptions? Well, it's true that maximum likelihood works when the data-generating process matches our assumptions, and may not work otherwise. But maximum parsimony also works for a limited set of data-generating processes. Can users of maximum parsimony, then, be accused of making the assumption that the data-generating process is one on which maximum parsimony is consistent?

The field of phylogenetic inference has since become very simulation-heavy. Researchers assume a data-generating process, and test the output of maximum likelihood, maximum parsimony, and other methods against it. The concern is, therefore, not so much how many assumptions a statistical method makes, but the range of data-generating processes on which it gives correct results.

This is an important distinction because, while we can assume that the maximum likelihood method works when its assumptions are true, it may also work when its assumptions are false. We have to explore with theory and simulations what is the set of data-generating processes on which it is effective, just like we do with "assumption-free" methods like maximum parsimony.

For more info, some of this story can be found in Felsenstein's book "Inferring Phylogenies", which also contains references to many of the original papers.

If you generate predictions by looking up market values, that means three things:

  1. You're never going to make any money on the market, not even a tiny bit
  2. You're never going to contribute to the market
  3. You're never really going to learn what the market knows about underlying causes

Because of (2), if everyone did this, there would be no market.

You need your own machine for generating predictions. Here, I'm borrowing a metaphor from Ronny Fernandez, who says that he keeps track of predictions because he needs to debug the program that generates them.

You don't need to believe the resulting predictions. You can make predictions like, "I think that the probability of a Trump re-election is .99 times the probability assigned by this prediction market, plus .01 times the probability output by my machine." You basically believe the market's prediction, but have some small way of contributing.
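The blended prediction above is just a weighted average, which can be sketched as follows. The function name, the default weight, and the example numbers are hypothetical, chosen only to mirror the .99/.01 split in the text.

```python
def blended_probability(market_p, own_p, own_weight=0.01):
    """Mix a market probability with your own machine's output.

    own_weight is how much you trust your own machine; the remaining
    (1 - own_weight) goes to the market.
    """
    for value in (market_p, own_p, own_weight):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities and weights must lie in [0, 1]")
    return (1.0 - own_weight) * market_p + own_weight * own_p


# Hypothetical numbers: the market says 0.40, your machine says 0.70.
print(blended_probability(0.40, 0.70))
```

With a weight of .01 the result stays within a hundredth of the market price no matter what your machine says, but it still differs from the market whenever your machine disagrees, which is what gives you something to track and debug.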

I think (3) is more important though. If the market is consistently right about something, there exists a machine that outputs good predictions. That machine may even be a causal model that gives you some deep insight about the world. And you're never going to have it in your head if all you do is form your opinions by looking up market prices.

So I think that's another sense of "you have a right to be wrong." You have a right to keep tinkering with your own prediction-machine in your garage, even if it's not yet competitive with the finished products on the market.