The Lindy effect (or Lindy's Law).
The Lindy effect is a theory that the future life expectancy of some non-perishable things like a technology or an idea is proportional to their current age, so that every additional period of survival implies a longer remaining life expectancy. Where the Lindy effect applies, mortality rate decreases with time.
Example: you have two books to choose from (assume both seem equally interesting), and you know little about them except how long they've been in print. The first came out this year, and the other has been in print for 40 years.
Using Lindy, you can expect the first book's sales to drop off either this year or next, and you can expect the second to stay in print for about 40 more years. In other words, the older book is likely to be more relevant, so that's the one you'll choose.
I suggest Nassim Taleb's 'Antifragile' if you wish to read more about it.
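To make the proportionality claim concrete, here is a minimal sketch. It assumes lifetimes follow a Pareto (power-law) distribution, which is one standard way to model the Lindy effect but is not something the comment above specifies. The function names and the shape parameter `alpha` are hypothetical, chosen only for illustration. Under that assumption, the expected remaining life at age `t` is `t / (alpha - 1)` and the mortality (hazard) rate is `alpha / t`, which falls as the thing gets older.

```python
def expected_remaining_life(age: float, alpha: float = 2.0) -> float:
    """Expected additional lifetime given survival to `age`.

    Assumes Pareto-distributed lifetimes with shape `alpha` (an assumption,
    not part of the original comment). With alpha = 2 the expected remaining
    life equals the current age, as in the 40-year-old book example.
    """
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 for the expectation to be finite")
    return age / (alpha - 1.0)


def hazard_rate(age: float, alpha: float = 2.0) -> float:
    """Instantaneous mortality rate of a Pareto lifetime at `age` (age > 0)."""
    return alpha / age


for age in (1, 40):
    print(age, expected_remaining_life(age), hazard_rate(age))
```

The choice `alpha = 2` is only there to reproduce the "40 years in print, about 40 more to go" reading. Other values of `alpha` keep the proportionality but change the constant.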
The Lindy Effect gives no insight about which of the two books will be more “relevant”. For example, you could be comparing two political biographies, one on Donald Trump and the other on Jimmy Carter. They might both look equally interesting, but the Trump biography will make you look better informed about current affairs.
Choosing the timely rather than the timeless book is a valid rule. There'll always be time for the timeless literature later, but the timely literature gives you the most bang for your buck if you read it now.
The Lindy Effect only tells y...
This is personal to me, but I once took a class at school where all the problems were multiple choice, required a moderate amount of thought, and were relatively easy. I got 1/50 wrong, giving me a 2% base rate for making the class of dumb mistakes like misreading inequalities or circling the wrong answer.
This isn't quite a meta-prior, but it seemed sort of related?
I'd imagine publication bias priors are helpful, especially with increasing specificity of research area, and especially where you can think of any remote possibility for interference.
Just as an example I'm familiar with (note this is probably a somewhat more extreme example than for most research areas due to the state of pharmacological research): If you see 37 RCTs in favour of a given drug, and 3 that find no significant impact (i.e. 93% in favour), it is not unfounded to assume that the trials actually performed are roughly equal in favour and against, and that there may be a missing 34-odd studies.
A 2009 analysis found that this was almost exactly the case: the registered studies were split 38:36 in favour of the drug. One positive RCT went missing before publication, along with twenty-two non-significant studies that were missing altogether, and a further 11 were so poorly analysed as to appear significant.
(Bad Pharma, by Ben Goldacre, is a pretty sound resource for this topic in general)
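The arithmetic in this example can be tallied as a quick sanity check. This is a rough sketch that takes the counts quoted above at face value and assumes the 11 poorly analysed studies are counted separately from the 37 and the 3, which is the reading under which the registered totals come to 38 and 36. The variable names are mine, not from the analysis.

```python
# Counts as quoted in the comment above (not independently verified).
published_positive = 37      # RCTs in favour of the drug
published_null = 3           # RCTs finding no significant impact
unpublished_positive = 1     # positive RCT that went missing
unpublished_null = 22        # non-significant studies missing altogether
null_spun_as_positive = 11   # non-significant studies analysed so as to appear significant

# What the registry contains once everything is counted.
registered_positive = published_positive + unpublished_positive
registered_null = published_null + unpublished_null + null_spun_as_positive
registered_total = registered_positive + registered_null

# Share in favour as seen by a reader of the 37-vs-3 literature,
# versus the share in favour among all registered studies.
apparent_share = published_positive / (published_positive + published_null)
registered_share = registered_positive / registered_total

print(f"registered: {registered_positive} positive, {registered_null} not, {registered_total} total")
print(f"apparent share in favour:   {apparent_share:.1%}")
print(f"registered share in favour: {registered_share:.1%}")
```

On these numbers the visible literature is about 93% in favour while the registered trials are close to an even split, which is the gap a publication-bias prior is meant to anticipate.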
Scott Aaronson's Ten Signs a Claimed Mathematical Breakthrough is Wrong
(abridged from Aaronson)
- The authors don’t use TeX. This simple test (suggested by Dave Bacon) already catches at least 60% of wrong mathematical breakthroughs. David Deutsch and Lov Grover are among the only known false positives.
- The authors don’t understand the question. Maybe they mistake NP≠coNP for some claim about psychology or metaphysics. Or maybe they solve the Grover problem in O(1) queries, under some notion of quantum computing lifted from a magazine article. I’ve seen both.
- The approach seems to yield something much stronger and maybe even false (but the authors never discuss that).
- The approach conflicts with a known impossibility result (which the authors never mention).
- The authors themselves switch to weasel words by the end. ... Personally, I happen to be a big fan of heuristic algorithms, honestly advertised and experimentally analyzed. But when a “proof” has turned into a “plausibility argument” by page 47 — release the hounds!
- The paper jumps into technicalities without presenting a new idea. If a famous problem could be solved only by manipulating formulas and applying standard reductions, then it’s overwhelmingly likely someone would’ve solved it already.
- The paper doesn’t build on (or in some cases even refer to) any previous work.
- The paper wastes lots of space on standard material. If you’d really proved P≠NP, then you wouldn’t start your paper by laboriously defining 3SAT, in a manner suggesting your readers might not have heard of it.
- The paper waxes poetic about “practical consequences,” “deep philosophical implications,” etc. Note that most papers make exactly the opposite mistake: they never get around to explaining why anyone should read them. But when it comes to something like P≠NP, to “motivate” your result is to insult your readers’ intelligence.
- The techniques just seem too wimpy for the problem at hand. Of all ten tests, this is the slipperiest and hardest to apply — but also the decisive one in many cases. As an analogy, suppose your friend in Boston blindfolded you, drove you around for twenty minutes, then took the blindfold off and claimed you were now in Beijing. Yes, you do see Chinese signs and pagoda roofs, and no, you can’t immediately disprove him — but based on your knowledge of both cars and geography, isn’t it more likely you’re just in Chinatown? I know it’s trite, but this is exactly how I feel when I see (for example) a paper that uses category theory to prove NL≠NP. We start in Boston, we end up in Beijing, and at no point is anything resembling an ocean ever crossed.
One idea that comes to mind is that surface-level information sources (e.g. news articles) are often *'correct'* on a basic level, but really more like 'yes, but it's complicated' on a deeper level.
The best illustration of this is if you've ever seen a surface-level description of something you know about at a deep level, and you realise how wrong it is, or at least how much nuance it's missing. The next step is to realise that it's like that with everything - i.e. all the things you're not an expert on.
A geography documentary mentioned that in Japan, if you are prosecuted for a crime, you are found guilty 97% of the time. Read one way, as information about a particular case, this works in the opposite direction: with this distribution, the outcome tells you less than a prosecution in a random country would. Read a second way, this very limited factoid gives reason to suspect that something is very amiss with the system.
The documentary put forth that the Japanese hate for the state to be proven wrong and go to inappropriate lengths to avoid such outcomes. It also seems that the culture gives greater weight to the gravity of not fitting in, so a lot of the conflict resolution might be done "informally" before it becomes police business. With active witch-hunters, the officials may only handle the most extreme cases, or the most excusable gray-area cases may be actively hidden if they are otherwise socially desirable.
Probably, if one were interested in tweaking the system, there would have to be details on who has the authority to do what based on what level of proof. And the case that the system is working correctly could be argued at great length, with many details, so as to seem okay. I would guess that true progress would be very slippery, hard to detect, and very resistant to trivial solution attempts. Yet the case that there is something to be found seems pretty strong.
Where did you get the 97% number from? https://www.youtube.com/watch?v=OINAk2xl8Bc suggests that 60% of those who are investigated for a crime by the police don't get charged with the crime.
Gwern's essay about how everything is correlated seems related/relevant: https://www.gwern.net/Everything
Pithily: "The null hypothesis is always false; statistical significance is not equal to significance"
cross-posted on the EA Forum
I'm interested in questions of the form, "I have a bit of metadata/structure to the question, but I know very little about the content of the question (or alternatively, I'm too worried about biases/hacks to how I think about the problem or what pieces of information to pay attention to). In those situations, what prior should I start with?"
I'm not sure if there is a more technical term than "low-information prior."
Some examples of what I found useful recently:
1. Laplace's Rule of Succession, for when the underlying mechanism is unknown.
2. Percentage of binary questions that resolve as "yes" on Metaculus. It turns out that of all binary (Yes-No) questions asked on the prediction platform Metaculus, 29% resolved yes. This means that even if you know nothing about the content of a Metaculus question, a reasonable starting point for answering a randomly selected binary Metaculus question is 29%.
In both cases, obviously there are reasons to override the prior (for example, you can arbitrarily flip all questions on Metaculus such that your prior is now 71%). However (I claim), having a decent prior is nonetheless useful in practice, even if it's theoretically unprincipled.
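For readers who haven't seen the first example written out, here is a minimal sketch of Laplace's Rule of Succession: after `s` successes in `n` trials, with a uniform prior over the unknown success rate, the estimated probability that the next trial succeeds is `(s + 1) / (n + 2)`. The function name and the sample inputs are hypothetical, picked only for illustration.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's estimate that the next trial succeeds: (s + 1) / (n + 2)."""
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# With no data at all, the estimate is an even chance.
print(rule_of_succession(0, 0))

# Illustrative inputs: 9 successes in 10 trials.
print(rule_of_succession(9, 10))
```

Using `Fraction` keeps the result exact, which makes it easier to see how the two pseudo-observations (one success, one failure) pull small samples toward 1/2.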
I'd be interested in seeing something like 5-10 examples of low-information priors as useful as the rule of succession or the Metaculus binary prior.