Thanks for responding.
For instance, in the past, it would have been conceivable for a single G20 country to unilaterally make banning the development of ASI and its precursors its priority.
In the past, it would have been conceivable for any country in the West to decide to fight off Big Tech and lead the collective fight.
I think unilateralism + leadership is quite inconceivable right now.
I am interested in any scenario you have in mind (not with the intent to fight whatever you suggest, just to see if there are ideas or mechanisms I may be missing).
And geopolitical will is something that can fluctuate: right now there is no geopolitical will to do so, but in the future it might emerge (and then disappear again, and so on).
This is a failure of my writing: I should have made it clear that it's a PNR (point of no return) precisely when there's no going back.
My point with "when there's not enough geopolitical will left" was that no, we can reach a point where there's just not enough left. Not only "right now nobody wants to regulate AI", but "right now, everything is so captured that there isn't really any independent will to govern left anymore".
This is not a problem, this is completely within the framework!
Even with a single AI accelerationist corporation and a single employee, you may reason about bargaining power (Wikipedia).
What is true regardless of the number of accelerationist corps is that the influx of safety-branded researchers willing to work for them will drive down the safety premium.
For instance, 80,000 Hours tried to increase that supply, and is proud to have done so. As they say themselves, their advisees now work at DeepMind and Anthropic!
Nice comment.
This deals with a lot of the themes from the follow-up essay, which I expect you may be interested in.
Curious what makes you think this.
Because there is a reason why Cursor and Claude Code exist. I'd suggest looking at what they do for more details.
METR is not in the business of building code agents. Why is their work informing so much of your views on the usefulness of Cursor or Claude Code?
This is literally the point I make above.
Either you fail to capture the relevant capabilities and build unwarranted confidence that things are ok, or you are doing public competitive elicitation & amplification work.
(I can't really tell if this post is trying to argue the overhang is increasing or just that there is some moderately sized overhang ongoingly.)
It has increased on some axes (companies are racing as fast as they can, and capital and research are by far long on scaling), and decreased on some others (low-hanging fruit gets plucked first).
The main point is that it is there and consistently under-estimated.
For instance, there are still massive returns to spending an hour on learning and experimenting with prompt engineering techniques. Let alone more advanced approaches.
This thus leads to a bias of over-estimating the safety of our systems, unless you expect our evaluators to be better elicitors than not only existing AI research engineers, but the ones of the next two, five or ten years.
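To make the prompt-engineering example concrete, here is a minimal sketch of the kind of cheap elicitation I mean (`call_model` is a hypothetical stand-in for whatever chat API you use; the gains come purely from the prompt, with zero extra training):

```python
# Minimal sketch of cheap elicitation via prompt engineering.
# `call_model` is a hypothetical placeholder for any chat-completion API.

def call_model(prompt: str) -> str:
    # Stand-in so the sketch runs; swap in a real API call.
    return f"[model output for a prompt of {len(prompt)} chars]"

task = "A train leaves at 14:37 and arrives at 16:05. How long is the trip?"

# Baseline: just hand the model the question.
baseline = call_model(task)

# An hour of prompt engineering buys you things like role framing,
# step-by-step instructions, a worked example, and a fixed answer format,
# all of which typically raise measured capability with no extra training.
engineered = call_model(
    "You are a careful assistant. Work through the problem step by step,\n"
    "showing intermediate arithmetic, then give the final answer on its own\n"
    "line prefixed with 'ANSWER:'.\n\n"
    "Example:\n"
    "Question: A meeting starts at 09:10 and ends at 10:45. How long is it?\n"
    "09:10 to 10:10 is 60 minutes; 10:10 to 10:45 is 35 minutes.\n"
    "ANSWER: 1 hour 35 minutes\n\n"
    f"Question: {task}"
)

print(baseline)
print(engineered)
```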
Code agents (Cursor or Claude Code) are much better at performing code tasks than their fine-tuned equivalents, mainly because of the scaffolding.
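To be explicit about what I mean by scaffolding, here is a minimal sketch of the generic generate-run-repair loop such tools wrap around the same underlying model (this is not Cursor's or Claude Code's actual architecture, and `call_model` is again a hypothetical stand-in):

```python
import subprocess
import sys
import tempfile

def call_model(prompt: str) -> str:
    # Hypothetical placeholder for the same underlying model the
    # "fine-tuned equivalent" would use; the gap comes from the loop around it.
    return "def add(a, b):\n    return a + b\n"

def run_tests(candidate: str, tests: str) -> tuple[bool, str]:
    # Execute the candidate plus its tests in a subprocess, capturing errors.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate + "\n" + tests)
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True, text=True)
    return result.returncode == 0, result.stderr

def solve(task: str, tests: str, max_iters: int = 3) -> str:
    prompt = f"Write Python code for this task:\n{task}"
    candidate = call_model(prompt)
    for _ in range(max_iters):
        ok, errors = run_tests(candidate, tests)
        if ok:
            return candidate
        # Feed execution feedback back to the model and retry. This loop (plus
        # repo context, file edits, search, etc.) is the scaffolding doing the work.
        candidate = call_model(f"{prompt}\n\nYour last attempt failed with:\n{errors}\nFix it.")
    return candidate

print(solve("Add two numbers.", "assert add(2, 3) == 5"))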
When I told you that we should not put 4% of the global alignment spending budget into AI Village, you asked me whether I thought METR should also not get as much funding as it does.
It should now be more legible why.
From my point of view, both AI Village and METR, on top of not doing the straightforward thing of advocating for a pause, are bad on their own terms.
Either you fail to capture the relevant capabilities and build unwarranted confidence that things are ok, or you are doing public competitive elicitation & amplification work.
What is an example of something useful you think could in theory be done with current models but isn't being elicited in favor of training larger models?
Better prompt engineering, fine-tuning, interpretability, scaffolding, sampling.
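As a sketch of what "sampling" alone can buy (not any lab's actual methodology; `call_model` and `score` are hypothetical placeholders for a sampling-capable API and a cheap verifier such as unit tests or a reward model), best-of-n with a verifier often recovers capability that a single greedy sample misses:

```python
import random

def call_model(prompt: str, temperature: float = 0.8) -> str:
    # Hypothetical placeholder for a sampling-capable completion API.
    return random.choice(["candidate A", "candidate B", "candidate C"])

def score(task: str, candidate: str) -> float:
    # Hypothetical verifier: unit tests, a reward model, self-consistency
    # voting, etc. All far cheaper than training a bigger model.
    return random.random()

def best_of_n(task: str, n: int = 16) -> str:
    # Draw n diverse samples and keep the one the verifier likes best.
    candidates = [call_model(task) for _ in range(n)]
    return max(candidates, key=lambda c: score(task, c))

print(best_of_n("Write a function that parses ISO 8601 dates."))
```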
Fast-forward button
I think you may be trying to make an argument along the lines of "But people are already working on this! Markets are efficient!"
To which my response is: "And thus, 3 years from now, we'll know how to do much more with models than we do now, even if you personally can't come up with an example today. The same way we now know how to do much more with models than we did 3 years ago."
Unless you have reasons to expect this not to be the case, like having directly worked on juicing models, or having followed people who have done so and failed, you shouldn't really expect your failure to come up with such examples to be informative.
If you are on the Paul/Quentin side, "lots of slack" would be enough to concede, but they do not think there's lots of slack.
If you are on the Eliezer/Nate side, "little slack" is far from enough to concede: it's about whether humanity can and will do something with that slack.
So this is not a crux.
Nevertheless, this concept could help prevent a very common failure mode in the debate.
Namely, at any point in the debate, either side could ask "Are you arguing that there is lots/little slack, that we are willing/unwilling to use that slack, or that we are able/unable to use that slack?", which I expect could clear up some of the talking past each other.
Clearly relevant, thanks.