Extremely cool.
As you say, looking at treatment effect heterogeneity is hard given power. But -- if I understand correctly, you did not take any melatonin between nights in which you randomized -- have you looked at "treatment effect vs. length of time since last experimental night"? This would be a very crude way of getting at tolerance effects.
I have a (completed) 5-year melatonin self-experiment that I will hopefully write up later this year (although... I have been saying that for 12+ months at this point); it will be fun to compare notes.
~3x slower after funding stops growing at current rates in 2027-2029
Have you (or others) written about where this estimate comes from?
Very cool!
To deal with the imperfect compliance of the randomization, you could use the "instrumental variables" approach. In this case, since it is (one-sided) noncompliance in an experiment, this amounts to:
I emphasize that this is a very simple econometric technique and does not rely on unreasonable assumptions ("Wald estimator" is another search term here).
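For concreteness, here is a minimal sketch of the Wald estimator in Python. The variable names (`z` for the randomized assignment, `d` for whether the treatment was actually taken, `y` for the outcome) and the toy numbers are made up for illustration; they are not from the experiment. It relies on the usual IV assumptions, in particular that assignment affects the outcome only through the treatment actually taken.

```python
from statistics import mean


def wald_estimate(z, d, y):
    """Wald (instrumental variables) estimate with a binary instrument.

    z: 1 if the night was randomized to treatment, else 0
    d: 1 if the treatment was actually taken, else 0
    y: outcome for that night
    """
    y_assigned = [yi for zi, yi in zip(z, y) if zi == 1]
    y_control = [yi for zi, yi in zip(z, y) if zi == 0]
    d_assigned = [di for zi, di in zip(z, d) if zi == 1]
    d_control = [di for zi, di in zip(z, d) if zi == 0]

    # Intention-to-treat effect: difference in mean outcomes by assignment.
    itt = mean(y_assigned) - mean(y_control)

    # First stage: how much assignment shifts actual treatment uptake.
    # With one-sided noncompliance, mean(d_control) is 0, so this is just
    # the compliance rate among assigned nights.
    first_stage = mean(d_assigned) - mean(d_control)
    if first_stage == 0:
        raise ValueError("assignment does not shift uptake; estimate undefined")

    return itt / first_stage


# Toy data (hypothetical): 4 assigned nights, 3 of which complied.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 0]
y = [8, 8, 8, 6, 6, 6, 6, 6]

effect = wald_estimate(z, d, y)  # ITT of 1.5 divided by compliance of 0.75
```

In the toy data the intention-to-treat difference understates the effect on nights where the treatment was actually taken, and dividing by the compliance rate scales it back up.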
Isn't the fact that Manifold is not really a real-money prediction market very important here? If there were real money on the table, for example, it's less likely that the 1/1/26 market would have been "forgotten" -- the original traders would have had money on the line to discipline their attention.
Every time someone calls Manifold (or Metaculus) a "prediction market", god kills an arbitrageur [even though both platforms are still great!].
This isn't REALLY the point of your (nice) piece, but the title provides an opportunity to plant a flag and point out: "predictably updating" is not necessarily bad or irrational. Unfortunately I don't have time to write up the full argument right now, hopefully eventually, but, TLDR:
In macroeconomics, this has recently been discussed in detail by Farmer, Nakamura, and Steinsson in the context of "medusa charts" that seem to show financial markets 'predictably updating' about interest rates.
But I imagine this issue has been discussed elsewhere -- this is not an 'economic phenomenon' per se, it's just a property of Bayesian updating on processes with a slow-moving nonstationary component.
Scott Sumner offers some comments here FWIW, copying and pasting:
I certainly believe the BOJ policy had the effect of boosting Japan’s real GDP, but the figure cited by Yudkowsky (“trillions of dollars”) seems excessive.
A few points:
1. In the long run, money is neutral. Hence monetary stimulus won’t impact the long run level of Japan’s RGDP or employment.
2. There’s a lot of evidence that Kuroda’s policies boosted Japan’s NGDP.
So here’s the issue. How much evidence is there that faster NGDP growth boosted Japan’s real economy (and employment) for a period of time? (Alternatively, how flexible are Japanese wages and prices?)
I’d say there is substantial evidence. Japanese stocks responded as if the policy was boosting growth. Unemployment fell to levels well below the 2006 boomlet. Also, keep in mind that growth in Japan’s working age population slowed sharply in the past decade, so trend RGDP growth is slowing substantially. Growth held up better after the 2014 tax increase than after the previous (1997?) version. Thus if Yudkowsky’s evidence was too cursory, so is this critique.
To summarize, the article makes some good points, but only shows that Yudkowsky might be wrong, not that he is wrong. I still think there’s lots of evidence that he was right and the pessimists at the BOJ were wrong, even if he exaggerates the benefits.
As an aside, he mentions my name. But people with very different views on monetary policy effectiveness—such as Paul Krugman (2018)—also see the evidence as clearly suggesting that Kuroda’s policy worked to some extent.
(There’s lots more I could say, but I’m on vacation.)
1. Very interesting, thanks, I think this is the first or second most interesting comment we've gotten.
2. I see that you are suggesting this as a possibility rather than a likelihood, but I'll note, at least for other readers, that I would bet against this occurring, given central banks' somewhat successful record at maintaining stable inflation and their desire to avoid deflation. But it's possible!
3. Also, I don't know if inflation-linked bonds in the other countries we sample -- UK/Canada/Australia -- have the deflation floor. Maybe they avoid this issue.
4. Long-term inflation swaps (or better yet, options) could test this hypothesis! i.e. by showing the market's expectation of future inflation (or the full [risk-neutral] distribution, with options).
(A confusing way of writing "probability")
Very nice. I recently did a similar exercise, and -- because as you note the Epoch data (understandably) doesn't have estimates of training compute for reasoning models -- I had o3 guesstimate "effective training compute" by OpenAI model (caveat: this doesn't really make sense!). You can see the FLOP by model in the link. And:
By this metric, it's ~3.5 more OOMs from o3 to 1-month-AGI. If -- as was often said to be the case pre-reasoning models -- effective compute can still be said to be growing at ~10x a year, then 1-month-AGI arrives around early 2029.
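The arithmetic behind that date is just the gap in OOMs divided by OOMs gained per year. A rough sketch in Python, where the function names are made up and `start_year` is a placeholder assumption (the comment does not state a start date; 2025.5 is used purely for illustration):

```python
import math


def years_to_close_gap(ooms_needed, growth_per_year):
    """Years needed to gain `ooms_needed` orders of magnitude of effective
    compute, given a constant multiplicative growth rate per year."""
    return ooms_needed / math.log10(growth_per_year)


def arrival_year(start_year, ooms_needed, growth_per_year):
    """Calendar year at which the gap closes, counting from `start_year`."""
    return start_year + years_to_close_gap(ooms_needed, growth_per_year)


# ~3.5 OOMs at ~10x per year is 3.5 years of growth.
gap_years = years_to_close_gap(3.5, 10.0)

# Assumed start of mid-2025 (hypothetical) puts arrival at the start of 2029.
arrival = arrival_year(2025.5, 3.5, 10.0)
```

The result is very sensitive to the growth rate: anything slower than ~10x a year pushes the date out more than proportionally in calendar terms.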