Younes Kamel

MSc student in Computer and Data Science

Your intuition is a model.

Sure, you can use a broad definition of "model" that includes any decision-making process. But I used the word "model" to refer to probabilistic and quantitative models.

A few examples of topics where you really don't include any cost/benefit estimates in your decision (as opposed to strawman examples of INCORRECT cost/benefit use) would go a long way.

Sure. An example from my life is "I refrain from investing in the stock market because we do not understand how it works and it is too uncertain". I don't rely on cost-benefit analysis in this case. It is more of a qualitative analysis. I do not use cost-benefit analysis because I am unable to quantify the expected utility I would derive from investing in the stock market. I do not have the necessary information to compute it.

it seems intuitively to me, someone who admittedly doesn't know much about the subject, like unknown unknowns could be accurately modeled most of the time with a long-tailed normal distribution

How fat-tailed do you make it? You said you use past extreme events to choose a distribution. But what if the largest past event is not the largest possible event? What if the past does not predict the future in this case?

You can say "the largest truck that ever crossed my bridge was 20 tons, therefore my bridge has to be able to sustain 20 tons", but that is a logical fallacy. The fact that the largest truck you ever saw was 20 tons does not mean a 30-ton truck could not come by one day. This amounts to saying "I have observed the queen of England for 600 days and she hasn't died on any of them, therefore the queen of England will never die".
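To make the truck example concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the loads are drawn independently from a Pareto distribution with a hypothetical tail index `alpha=1.5`, and the past and future windows are both 100 observations long. It counts how often the largest future observation exceeds the largest past one.

```python
import random

def future_exceeds_past(n_past=100, n_future=100, alpha=1.5, trials=2000, seed=0):
    """Fraction of trials in which the future maximum exceeds the past maximum.

    Loads are i.i.d. Pareto(alpha) draws; alpha and the window sizes are
    illustrative assumptions, not estimates of any real process.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        past_max = max(rng.paretovariate(alpha) for _ in range(n_past))
        future_max = max(rng.paretovariate(alpha) for _ in range(n_future))
        exceed += future_max > past_max
    return exceed / trials

exceed_rate = future_exceeds_past()
print(f"future record beats past record in {exceed_rate:.0%} of trials")
```

With windows of equal size and independent draws from the same continuous distribution, the past record gets broken about half the time, whatever the distribution. What the tail changes is by how much it gets broken, and that margin is exactly what the past record cannot tell you.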

I wrote a post summarizing misuses of statistics here. You can read that if you want a short version. If you want to learn to evaluate studies and gauge their rigor, then read *Intuitive Biostatistics* by Harvey Motulsky and *Statistics Done Wrong* by Alex Reinhart. These were my main sources for my post. After reading them you should have a good intuitive understanding of statistics, without necessarily knowing the math. If you have to read only one, then definitely go for *Intuitive Biostatistics*. It includes perhaps 90% of the content of the other book and more, because *Statistics Done Wrong* assumes you've taken at least an introductory class in statistics, but *Intuitive Biostatistics* doesn't. Read it at your own pace and take time to understand every point, especially the "common mistakes" parts. Don't speed-read the book; understanding statistics will improve your critical thinking more than reading 100 other books, so don't worry if it takes time. If you don't understand the explanations for the "common mistakes", then check out Wikipedia or the explanation in *Statistics Done Wrong*. If you prefer online courses to books, then you can take an introductory statistics MOOC and then read *Statistics Done Wrong*, which is shorter than *Intuitive Biostatistics*. Good luck!

Perhaps the most important takeaway from our study is hidden in plain sight: the field is in danger of being drowned by noise. Different optimizers exhibit a surprisingly similar performance distribution compared to a single method that is re-tuned or simply re-run with different random seeds. It is thus questionable how much insight the development of new methods yields, at least if they are conceptually and functionally close to the existing population.

This is from the authors' conclusion. They do also acknowledge that a couple of optimizers seem to be better than others across tasks and datasets, and I agree with them (and with you if that's your point). But most optimizers do not meet the "significant improvement" claims their authors have been making. They also say most tuned algorithms can be equaled by trying several un-tuned algorithms. So the point is twofold:

1. Most new algorithms can be equaled or beaten by re-tuning of most old algorithms.

2. Their tuned versions can be equaled or beaten by many un-tuned versions of old algorithms.

This seems to be consistent with there being no overwhelming winner and low variance in algorithm performance.

If I understand your model correctly, and let me know if I do, if an algorithm Y improves performance by 1 std on a specific task, it would still get beaten by an unimproved algorithm 16% of the time. Sure, but you have to compute the probability of the Y algorithm (mean=1, std=1) being beaten by the X1, X2, X3, X4 algorithms (all mean=0, std=1), which is what is happening in the authors' experiment, and it is much lower.
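These probabilities are easy to check by simulation. The sketch below is only an illustration under assumptions that are mine, not the paper's: performances are independent normal draws, Y ~ N(1, 1) and each old algorithm X_i ~ N(0, 1), with four old algorithms. Because "beaten by X1, X2, X3, X4" can be read more than one way, it reports three rates: beaten by one given competitor, by at least one of the four, and by all four at once.

```python
import random
from statistics import NormalDist

def simulate_beat_rates(n_trials=100_000, n_old=4, seed=0):
    """Estimate how often Y ~ N(1, 1) is beaten by old algorithms X_i ~ N(0, 1)."""
    rng = random.Random(seed)
    by_single = by_any = by_all = 0
    for _ in range(n_trials):
        y = rng.gauss(1.0, 1.0)                       # improved algorithm
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_old)]  # unimproved algorithms
        by_single += xs[0] > y    # one specific competitor beats Y
        by_any += max(xs) > y     # at least one competitor beats Y
        by_all += min(xs) > y     # every competitor beats Y
    return {
        "single": by_single / n_trials,
        "any": by_any / n_trials,
        "all": by_all / n_trials,
    }

# The 16% figure: chance that a standard normal draw exceeds a fixed score of 1.
fixed_gain = 1 - NormalDist().cdf(1.0)

rates = simulate_beat_rates()
print(f"fixed 1-std gain: {fixed_gain:.3f}")
print(rates)
```

The 16% figure corresponds to treating Y's score as fixed at 1. Once Y is itself noisy, the single-competitor rate is higher than that, and the answer for four competitors depends heavily on whether "beaten" means by any of them or by all of them, so it is worth stating which one is meant.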

You're right, I should have written "but it turns out most of them could be beaten by the untuned version of *several* competitors on the five datasets", as one can see in the figures. Thank you for pointing it out, I'll edit the post.

I'm not as versed in the mistakes of meta-analysis yet, but I'm working on it! Once I compile enough meta-analysis misuses I will add them to the post. Here is one that's pretty interesting:

https://crystalprisonzone.blogspot.com/2016/07/the-failure-of-fail-safe-n.html

Many studies still use fail-safe N to account for publication bias when it has been shown to be invalid. If you see a study that uses it you can act as if they did not account for publication bias at all.

100% agree with defaulting to non-Gaussian distributions. That is what rigorous statistics would look like imo.

I'm starting to realize that as well. It can give you the intuition without having to memorize theorems. I think I'm going to start using simulations a lot more.

Yes, for sure. You can still fall for selective skepticism, where you scrutinize studies you don't like much more than studies you "like". You can deal with that by systematically applying the same checklist to every study you read, but that might be time-consuming. The real solution is probably a community that is versed in statistics and that has open debates on the quality of studies. Perhaps, cumulatively, biases will cancel each other out if the community has enough diversity of thought. Hence the value of pluralism.

I know about index funds. Even those are not nearly as safe as people think. It is a fallacy to assume that because the S&P 500 grows 7% a year on average, you will get a 7%/year return on your investment. Your true expected return is lower than that. People have a hard time predicting how they will behave in particular situations. They swear they won't sell after a crash, and yet they do. You might say you are not like that, but probabilistically speaking you probably are. You might get sick, need cash quickly, and have to sell while the market is down. You might need to buy a house because of an unexpected child. The fact that the group gets a 7% return does not mean that an individual will get a 7% return in the long run. This is called the ergodicity fallacy. There are also tracking error and fees, depending on your broker.
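One narrow piece of this, the gap between the group average and the typical individual path, can be shown with a toy simulation. The numbers are hypothetical and chosen only so that the arithmetic mean yearly return is 7%: each year the portfolio is multiplied by either 1.30 or 0.84 with equal probability. This is not a model of the S&P 500, and it does not capture forced selling, fees, or tracking error.

```python
import random
import statistics

UP, DOWN = 1.30, 0.84  # hypothetical yearly multipliers, equally likely

def terminal_wealth(years, rng):
    """Wealth after `years` of compounding, starting from 1.0."""
    wealth = 1.0
    for _ in range(years):
        wealth *= UP if rng.random() < 0.5 else DOWN
    return wealth

rng = random.Random(0)
outcomes = [terminal_wealth(30, rng) for _ in range(20_000)]

mean_yearly_return = (UP + DOWN) / 2 - 1      # 7% by construction
mean_wealth = statistics.mean(outcomes)       # the group average
median_wealth = statistics.median(outcomes)   # the typical individual

print(f"mean yearly return: {mean_yearly_return:.1%}")
print(f"mean wealth after 30 years:   {mean_wealth:.2f}")
print(f"median wealth after 30 years: {median_wealth:.2f}")
```

In this toy setup the average across investors compounds at 7% a year, but the median investor's wealth compounds at roughly 4.5% a year, because returns multiply rather than add. A few lucky paths pull the group average up, and most individuals end up below it.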