Since one of the concerns is "will there actually be a 2028 election?", it's not obvious that this is happening fast enough to actually matter. I'm worried about a bunch of important institutions getting eroded in ways that are hard to recover from.
To be clear, do you think this is mainly because of AI x-risk or Trump couping the government? What probability would you assign to there not being an election?
It's in the top 7 things I consider dedicating this year to, maybe in the top 4.
What are the other 6?
Before accepting my current job, I was thinking about returning to Hungary and starting a small org with some old friends who have more coding experience, living on Eastern European salaries, and just churning out one simple experiment after another.
I think such an org should focus on automating simple safety research and paper replications with coding agents (e.g. Claude Code). My guess is that the models aren't capable enough yet to autonomously do Ryan's experiments but they may be in a generation or two, and working on this early seems valuable.
If this were true, wouldn't it imply that quantizing the model (or at least the KV cache) would improve performance?
I note that the hype has been almost entirely Claude Code in particular, skipping over OpenAI’s Codex or Google’s Jules. Claude Code with Opus 4.5 is, for now, special.
One reason I don't use Codex is that I don't want to pay a subscription to OpenAI. I'm ambivalent about Jules but I think it's a bit worse than the alternative.
Training an explanation assistant by automatically scoring descriptions. Given a set of descriptions, an LM-based scorer determines how well each one matches reality. We use this to re-rank and then train a specialized explainer model on high-scoring descriptions. Adapted from Choi et al. (2024).
This second example illustrates a key point: specialized models can outperform larger general-purpose models. In our case, a specialized explainer fine-tuned from Llama-8B outperformed the general-purpose GPT-4 model, which is likely an order of magnitude larger. This is direct evidence that we can decouple oversight capabilities from general capabilities.
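A minimal sketch of the re-rank-then-fine-tune step described above, assuming a hypothetical `score_fn` that calls the LM-based scorer; the names, the keep fraction, and the prompt format are illustrative, not the exact pipeline from Choi et al. (2024).

```python
# Illustrative sketch (not the exact pipeline): score candidate descriptions
# with an LM judge, keep the top-scoring ones, and format them for fine-tuning
# the specialized explainer.
from dataclasses import dataclass

@dataclass
class Candidate:
    feature_id: int
    description: str
    score: float = 0.0  # filled in by the LM-based scorer


def rerank(candidates, score_fn, keep_fraction=0.25):
    """Score each description for how well it matches reality, then keep the top slice."""
    for c in candidates:
        c.score = score_fn(c.feature_id, c.description)  # LM judge call (assumed interface)
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]


def to_sft_examples(kept):
    """Turn high-scoring (feature, description) pairs into supervised fine-tuning examples."""
    return [
        {"prompt": f"Explain feature {c.feature_id}:", "completion": c.description}
        for c in kept
    ]

# Usage (the explainer, e.g. an 8B base model, would then be fine-tuned on
# `examples` with any standard SFT trainer):
# kept = rerank(all_candidates, score_fn)
# examples = to_sft_examples(kept)
```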
Did you have independent human evaluations of your explainer model? With a sufficient amount of training data, such methods have a tendency to reward hack the LM judge, generating explanations that sound good to the LM but not to humans.
On the one hand, one would hope they are capable of resisting this pressure (these continual learners are really difficult to control, and even mundane liability might be really serious).
I share this hope, though I think not all labs are equally capable of doing so. For example, after the GPT-4o sycophancy incident I don’t have much confidence in OpenAI, but from private conversations and the RSP I have more confidence in Anthropic.
But on the other hand, it might be “not releasable” for purely technical reasons.
Seems plausible, but I would put this under the category of not having fully solved continual learning yet. A sufficiently capable continual learning agent should be able to do its own maintenance, short of a hardware failure.
If they had fully solved it, there would be large commercial pressure to release it as soon as possible, e.g. because they could start charging > $10K/month for remote worker subscriptions or increase their valuations in future funding rounds. It’s true that everyone is working on it; my guess is that they’ve made some progress but haven’t solved it yet.
May you be compassionate enough that your agency doesn’t narrow your circle of moral concern.
Claude Opus 4.5 can make decent Triton kernels these days; I'd recommend using that if attention is a bottleneck.