Running Lightcone Infrastructure, which runs LessWrong and Lighthaven.space. You can reach me at habryka@lesswrong.com.
(I have signed no contracts or agreements whose existence I cannot mention, which I am mentioning here as a canary)
Aww, thank you! And yep, I'll keep posting one a day for about another week!
We are killing the popular comments section early next week! I was waiting on doing that until we had shipped the new frontpage feed to everyone. The feed gives us much more ability to adjust what kind of context is attached to comments, and to decide which comments to show.
the bar at MATS has risen every program for 4 years now
What?! Something terrible must be going on in your mechanisms for evaluating people (which, to be clear, isn't surprising; indeed, you are the central target of the optimization that is happening here, but to me it illustrates the risks here quite cleanly).
It is very very obvious to me that median MATS participant quality has gone down continuously for the last few cohorts. I thought this was somewhat clear to y'all and you thought it was worth the tradeoff of having bigger cohorts, but you thinking it has "gone up continuously" shows a huge disconnect.
Like, these days at the end of a MATS program half of the people couldn't really tell you why AI might be an existential risk at all. Their eyes glaze over when you try to talk about AI strategy. IDK, maybe these people are better ML researchers, but obviously they are worse contributors to the field than the people in the early cohorts.
Goodfire, AIUC, Lucid Computing, Transluce, Seismic, AVERI, Fathom
Yeah, I mean, I do think I am a lot more pessimistic about all of these. If you want, we can make a bet on how well things have played out with these in 5 years, deferring to some small panel of trusted third parties.
To date, I know of only two such RL dataset startups that spawned via AI safety
Agree. Making RL environments/datasets has only very recently become a highly profitable thing, so you shouldn't expect much! I am happy to make bets that we will see many more in the next 1-2 years.
Nor do I think academia's losing credit in any straightforward sense, as it's widely considered too big to fail even by many dissenters, who e.g. are extremely disappointed with standards in scientific academia but still automatically equate academia with science in general.
Huh, I do think our world models must differ here. My current sense is that societal trust in and reliance on academia are dropping pretty sharply, partially though not centrally as a result of things like this, and I similarly expect the market value of things like PhDs to drop substantially in the coming decade (barring major AI disruption making that question moot). I would be happy to bet on this, if you disagree.
What happens as a result of the kinds of failures you describe is not at all like a decline in price, a little bit like a decline in the aggregate purchasing power of money, somewhat more like increased vulnerability to speculative attack, and most similar to a decrease in transaction volume as people see fewer and fewer opportunities for profitable transactions within the system.
I found this set of potential analogies helpful! I do think I still disagree about how well each of these analogies fits the situation. Not sure how much value I would provide by going through them all in this comment thread, though I might take the opportunity to do it in a top-level post.
I mean, insofar as one is worried about Goodhart's law, and the issue in contention is adversarial selection, the acceptance rate going down over time is kind of the premise of the conversation. Like, it would be evidence against my model of the situation if the acceptance rate had been going up (since that would imply MATS is facing less adversarial pressure over time).
Mentor ratings are the most interesting category to me. As you can imagine, I don't care much for ML skill at the margin. CodeSignal is a bit interesting, though I am not familiar enough with it to interpret the scores; I might look into it.
I don't know whether you have any plots of mentor ratings over time, broken out by individual mentor. My best guess is that mentor ratings are going up because you have more mentors who are looking for basically just ML skill, and you have successfully found a way to connect people to ML roles.
This is of course where most of your incentive gradient was pointing in the first place: the entities that are just trying to hire ML researchers have the most resources, and you will get the most applicants for industry ML roles, which are currently among the most prestigious and most highly paid roles in the world (while also being centrally responsible for the risk from AI that we are working on).