it's notable that ilya only caved very late into the incident, when it was all but certain that the board had lost
like, suppose last year i anecdotally noticed a few people be visibly confused when i said the phrase AGI in normal conversation, and then this year i noticed that many fewer people were visibly confused by it. this would tell me almost nothing about whether name recognition of AGI increased or decreased; at n=10, it is nearly impossible to say anything whatsoever.
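to make the n=10 point concrete, here's a quick sketch (the counts are hypothetical: say 3/10 people seemed confused last year vs 6/10 this year) showing how wide 95% Wilson confidence intervals are at that sample size:

```python
import math

def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# hypothetical anecdote: 3/10 confused last year, 6/10 this year
lo1, hi1 = wilson_interval(3, 10)
lo2, hi2 = wilson_interval(6, 10)
print(f"last year: ({lo1:.2f}, {hi1:.2f})")  # roughly (0.11, 0.60)
print(f"this year: ({lo2:.2f}, {hi2:.2f})")  # roughly (0.31, 0.83)
```

even though the observed rates doubled, the two intervals overlap heavily, so n=10 genuinely can't distinguish the two years.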
my mental model of how a pop triggers a broader crash is something like: a lot of people are taking money and investing it into AI stuff, directly (by investing in openai, nvidia, tsmc, etc) or indirectly (by investing in literally anything else; like, cement companies that make a lot of money by selling cement to build datacenters or whatever). this includes VCs, sovereign wealth funds, banks, etc. if it suddenly turned out that the datacenters and IP were worth a lot less than everyone thought, then those equity (or debt) stakes would suddenly be worth a lot less too, and some of those institutions might become insolvent. and lots of financial institutions becoming insolvent at once is pretty bad.
running the agi survey really reminded me just how brutal statistical significance is, and how unreliable anecdotes are. even setting aside the sampling bias of anecdotes, the sheer sample size you need to answer a question like "do more people this year know what agi is than last year" is kind of depressing - you need like 400 samples for each year just to have an 80% chance of noticing a 10 percentage point increase if it did exist, and even then, if there were no real effect you'd still think there was one 5% of the time. this makes me a lot more bearish on vibes in general.
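a back-of-the-envelope version of that sample size calculation, using the standard normal approximation for a two-proportion z-test (assuming a 50% baseline rate, two-sided alpha of 0.05, and 80% power - all assumptions on my part):

```python
import math

def n_per_group(p1: float, p2: float,
                alpha_z: float = 1.96, power_z: float = 0.8416) -> int:
    """Approximate sample size per group for a two-proportion z-test.

    alpha_z = 1.96 is the critical value for two-sided alpha = 0.05;
    power_z = 0.8416 corresponds to 80% power.
    """
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (alpha_z + power_z) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)

# detecting a 10 percentage point rise from a 50% baseline
print(n_per_group(0.5, 0.6))  # 385, i.e. roughly 400 per year
```

the answer moves around with the assumed baseline rate (variance is maximized at 50%), but for effects in the tens-of-percentage-points range you're stuck with hundreds of samples per group.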
to be more precise, I mean worthless for decreasing p(doom)
some reasons why it matters
I think people in these parts are not taking sufficiently seriously the idea that we might be in an AI bubble. this doesn't necessarily mean that AI isn't going to be a huge deal - just because there was a dot com bubble doesn't mean the Internet died - but it does very substantially affect the strategic calculus in many ways.
I think another reason why people procrastinate is that each minute spent right before the deadline is both obviously high value on net and immediately rewarding. this makes the decision to put in effort in each moment really easy - obviously it makes sense to spend a minute working on something that will make a big impact on tomorrow. whereas each minute long before the deadline has a longer delay until payoff, and if you already put in a ton of work early on, then the minutes right before the deadline have lower marginal value because of diminishing returns. so this creates a perverse incentive to end-load the effort
to be clear, this post is just my personal opinion, and is not necessarily representative of the beliefs of the openai interpretability team as a whole
sure, you can notice extremely large effect sizes through vibes. but the claim is that even for "smaller" effect sizes (still tens of percentage points, e.g. 50%->75%), you need pretty big sample sizes. obviously 0%->100% doesn't need a very large sample size.
I agree that chatgpt obviously has lots of name recognition, but I also separately think chatgpt has less name recognition than you might guess. I predict that only 85% of Americans would get a multiple choice question right about what kind of app chatgpt is (choices: artificial intelligence; social media; messaging and calling; online dating), whereas a control question about e.g. Google would get like 97%, or whatever the lizardman constant dictates