habryka

Running Lightcone Infrastructure, which runs LessWrong and Lighthaven.space. You can reach me at habryka@lesswrong.com. 

(I have signed no contracts or agreements whose existence I cannot mention, which I am mentioning here as a canary)

Sequences

The Lightcone Principles
A Moderate Update to your Artificial Priors
A Moderate Update to your Organic Priors
Concepts in formal epistemology

Comments (sorted by newest)
AI safety undervalues founders
habryka · 2h

> The MATS acceptance rate was 33% in Summer 2022 (the first program with open applications) and decreased to 4.3% (in terms of first-stage applicants; ~7% if you only count those who completed all stages) in Summer 2025. Similarly, our mentor acceptance rate decreased from 100% in Summer 2022 to 27% for the upcoming Winter 2026 Program.

I mean, inasmuch as one is worried about Goodhart's law, and the issue in contention is adversarial selection, the acceptance rate going down over time is kind of the premise of the conversation. Like, it would be evidence against my model of the situation if the acceptance rate had been going up (since that would imply MATS is facing less adversarial pressure over time).

> I don't have plots prepared, but measures of scholar technical ability (e.g., mentor ratings, placements, CodeSignal score) have consistently increased. I feel very confident that MATS is consistently improving in our ability to find, train, and place ML (and other) researchers in AI safety roles, predominantly as "Iterators".

Mentor ratings are the most interesting category to me. As you can imagine, I don't care much for ML skill at the margin. CodeSignal is a bit interesting, though I am not familiar enough with it to interpret it; I might look into it.

I don't know whether you have any plots of mentor ratings over time broken out by individual mentor. My best guess is that mentor ratings are going up because you have more mentors who are looking for basically just ML skill, and you have successfully found a way to connect people into ML roles.

This is of course where most of your incentive gradient was pointing in the first place: the entities that are just trying to hire ML researchers have the most resources, and you will get the most applicants for highly paid industry ML roles, which are currently among the most prestigious and best-paid roles in the world (while being centrally responsible for the risk from AI that we are working on).

Put numbers on stuff, all the time, otherwise scope insensitivity will eat you
habryka · 3h

Aww, thank you! And yep, I'll keep posting one a day for about another week!

The Charge of the Hobby Horse
habryka · 3h

We are killing the popular comments section early next week! I was waiting to do that until we had shipped the new frontpage feed to everyone. The feed gives us much more ability to adjust what kind of context is attached to comments and how we decide which comments to show.

AI safety undervalues founders
habryka · 4h

> the bar at MATS has raised every program for 4 years now

What?! Something terrible must be going on in your mechanisms for evaluating people (which, to be clear, isn't surprising; indeed, you are the central target of the optimization that is happening here, but to me it illustrates the risks quite cleanly).

It is very, very obvious to me that median MATS participant quality has gone down continuously for the last few cohorts. I thought this was somewhat clear to y'all and that you thought it was worth the tradeoff of having bigger cohorts, but the fact that you think it has "gone up continuously" shows a huge disconnect.

Like, these days at the end of a MATS program half of the people couldn't really tell you why AI might be an existential risk at all. Their eyes glaze over when you try to talk about AI strategy. IDK, maybe these people are better ML researchers, but obviously they are worse contributors to the field than the people in the early cohorts. 

> Goodfire, AIUC, Lucid Computing, Transluce, Seismic, AVERI, Fathom

Yeah, I mean, I do think I am a lot more pessimistic about all of these. If you want, we can make a bet on how well things have played out with these in 5 years, deferring to some small panel of trusted third-party people.

> To date, I know of only two such RL dataset startups that spawned via AI safety

Agree. Making RL environments/datasets has only very recently become a highly profitable thing, so you shouldn't expect much! I am happy to make bets that we will see many more in the next 1-2 years.

Don't let people buy credit with borrowed funds
habryka · 4h

> Nor do I think academia's losing credit in any straightforward sense, as it's widely considered too big to fail even by many dissenters, who e.g. are extremely disappointed with standards in scientific academia but still automatically equate academia with science in general.

Huh, I do think our world models must differ here. My current sense is that societal trust in and reliance on academia are dropping pretty sharply, partially though not centrally as a result of things like this, and I similarly expect the market value of things like PhDs to drop substantially in the coming decade (barring major AI disruption making that question moot). I would be happy to bet on this, if you disagree.

> What happens as a result of the kinds of failures you describe is not at all like a decline in price, a little bit like a decline in the aggregate purchasing power of money, somewhat more like increased vulnerability to speculative attack, and most similar to a decrease in transaction volume as people see fewer and fewer opportunities for profitable transactions within the system.

I found this set of potential analogies helpful! I do think I still disagree about the relative appropriateness of each of these analogies to the situation. Not sure how much value I would provide by going through them all in this comment thread, though I might take the opportunity to do it in a top-level post.

Posts

Put numbers on stuff, all the time, otherwise scope insensitivity will eat you (39 points · 3h · 2 comments)
Increasing returns to effort are common (67 points · 1d · 3 comments)
Don't let people buy credit with borrowed funds (78 points · 2d · 19 comments)
Tell people as early as possible it's not going to work out (111 points · 2d · 11 comments)
Paranoia: A Beginner's Guide (200 points · 6h · 45 comments)
Two can keep a secret if one is dead. So please share everything with at least one person. (68 points · 3d · 0 comments)
Do not hand off what you cannot pick up (119 points · 4d · 17 comments)
Question the Requirements (81 points · 5d · 12 comments)
Banning Said Achmiz (and broader thoughts on moderation) (250 points · 3mo · 399 comments)
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (97 points · 4mo · 43 comments)
Wikitag Contributions

CS 2881r (2 months ago, +204)
Roko's Basilisk (4 months ago)
Roko's Basilisk (4 months ago)
AI Psychology (a year ago, +58/-28)