Curious what makes you think this.
Because there is a reason why Cursor and Claude Code exist. I'd suggest looking at what they do for more details.
METR is not in the business of building code agents. Why is their work informing so much of your views on the usefulness of Cursor or Claude Code?
This is literally the point I make above.
Either you fail to capture the relevant capabilities and build unwarranted confidence that things are ok, or you are doing public competitive elicitation & amplification work.
(I can't really tell if this post is trying to argue that the overhang is increasing, or just that there is an ongoing, moderately sized overhang.)
It has increased on some axes (companies are racing as fast as they can, and the capital and research are overwhelmingly LONG on scaling), and decreased on others (low-hanging fruit gets plucked first).
The main point is that the overhang is there and is consistently underestimated.
For instance, there are still massive returns to spending an hour learning and experimenting with prompt engineering techniques, let alone more advanced approaches.
This leads to a bias toward overestimating the safety of our systems, unless you expect our evaluators to be better elicitors than not only today's AI research engineers, but also the ones of the next two, five, or ten years.
Code agents (Cursor or Claude Code) are much better at performing code tasks than the bare fine-tuned models they wrap, mainly because of the scaffolding.
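To make "scaffolding" concrete, here is a minimal, hypothetical sketch (in Python), not how Cursor or Claude Code are actually built: the same base model, wrapped in a loop that lets it run commands and read their output. `call_model` and `run_shell` are placeholder names standing in for a real chat API and a real tool.

```python
# Minimal sketch of agent scaffolding: a loop that wraps a base model with a tool.
# Nothing here is specific to Cursor or Claude Code; it only illustrates the idea.

def call_model(messages):
    """Placeholder for a chat-completion call to whichever provider you use."""
    raise NotImplementedError("plug in a real model API here")

def run_shell(command):
    """Placeholder tool: run a shell command and return its combined output."""
    import subprocess
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

def code_agent(task, max_steps=10):
    messages = [
        {"role": "system", "content": (
            "You are a coding agent. Reply with 'SHELL: <command>' to act, "
            "or 'DONE: <answer>' when finished.")},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("DONE:"):
            return reply[len("DONE:"):].strip()
        if reply.startswith("SHELL:"):
            output = run_shell(reply[len("SHELL:"):].strip())
            # Feeding tool output back into the context is the whole trick:
            # the model itself is unchanged, only the loop around it differs.
            messages.append({"role": "user", "content": "Output:\n" + output})
    return "step budget exhausted"
```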
When I told you that we should not put 4% of the global alignment budget into AI Village, you asked whether I thought METR should also not get as much funding as it does.
It should now be more legible why.
From my point of view, both AI Village and METR, on top of not doing the straightforward thing of advocating for a pause, are bad on their own terms.
Either you fail to capture the relevant capabilities and build unwarranted confidence that things are ok, or you are doing public competitive elicitation & amplification work.
What is an example of something useful you think could in theory be done with current models but isn't being elicited in favor of training larger models?
Better prompt engineering, fine-tuning, interpretability, scaffolding, sampling.
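As one hedged illustration of the "sampling" item (placeholder names, not any particular library's API): best-of-n sampling with a cheap verifier, one of the cheapest ways to get more out of a fixed model.

```python
import random

def generate(prompt):
    """Placeholder: one sampled completion from a fixed model."""
    return f"candidate answer to {prompt!r} (seed {random.random():.3f})"

def score(prompt, candidate):
    """Placeholder verifier: unit tests, a grader model, or any cheap check."""
    return random.random()

def best_of_n(prompt, n=8):
    # Draw n candidates from the same model and keep the highest-scoring one.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))

print(best_of_n("Write a function that parses ISO-8601 dates."))
```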
Fast-forward button
I think you may be implicitly making an argument along the lines of "But people are already working on this! Markets are efficient!"
To which my response is: "And thus, 3 years from now, we'll know much better what to do with models than we do now, even if you personally can't come up with an example today. The same way we now know much better what to do with models than we did 3 years ago."
Unless you expect this not to be the case, for instance because you have directly worked on juicing models, or have followed people who did so and failed, you shouldn't really expect your failure to come up with such examples to be informative.
If you are on Paul/Quentin's side, "lots of slack" would be enough to concede, but they do not think there is lots of slack.
If you are on Eliezer/Nate's side, "little slack" is far from enough to concede: it's about whether humanity can and will do something with that slack.
So this is not a crux.
Nevertheless, this concept could help prevent a very common failure mode in the debate.
Namely, at any point in the debate, either side could ask, "Are you arguing that there is lots/little slack, that we are willing/unwilling to use that slack, or that we are able/unable to use that slack?", which I expect could clear up some amount of the talking past each other.
A simple example where understanding an underlying problem doesn't solve the problem: I understand fairly well why I'm tempted to eat too many potato chips, and why this is bad for me, and what I could do instead. And yet, sometimes I still eat more potato chips than I intend.
This is a great example.
Some people, specifically thanks to their better understanding of themselves, do not find themselves eating more potato chips than they intend.
There is more.
I believe...
Our society does seem to inculcate in its members the idea that certain things are only for super-smart people to do, and whoever you are, you are not smart enough to do an impactful thing.
Most people would fail to pass the bar exam or the USMLE. This is why most people do not attempt them, and this is why our society tells them not to.
I believe it is load-bearing, but in the straightforward way: it would be catastrophic if everyone tried to study things far beyond their abilities and wasted their time.
Most people have no hope of understanding complex topics.
No. Strongly disagree. "Most people don't understand X" is a thing I could accept, but "most people can't understand X" is usually false, with only rare exceptions.
You are confusing "Most people can't understand X" with "Most people have no hope of understanding X". Only the latter matters for the psychological toll it takes on people.
Hopelessness might be warranted or not, but it's there.
---
Separately, I believe that quite often, their hopelessness is warranted.
Everyone hits their ceilings.
I know many mathematically talented people who struggle to express themselves in ways that are legible to others, or to move their bodies in a natural way. They will get better if they train, but it's pretty clear to them and to everyone else that their ceiling in those domains is low.
In general, I know many people talented [at a field] with clear limitations in [some other field]. Arts, maths, oral expression, style, empathy, physical strength, body awareness, and so on.
Over time, they learn to acknowledge their talents and their limitations.
Nope.
I meant that even if the Evil People are Evil, and thus decide not to make all the bad things go away, the fact that they could make them go away is reassuring in itself.
I should have been clearer.
(I have edited it with the hope of making it clearer. Thanks.)
Nice comment.
This deals with a lot of the themes from the follow-up essay, which I expect you may be interested in.