LESSWRONG

AI
Frontpage


AlphaGo Moment for Model Architecture Discovery (arXiv)

by Person
26th Jul 2025
1 min read
brambleboy (2mo, karma 16)

Looks like BS. They basically just prompted ChatGPT to churn out a bunch of random architectures that ended up with similar performance. It seems likely that the ones they claim to be "SoTA" just had good numbers due to random variation. ChatGPT probably had a big role in writing the paper, too. The grandiose claims reek of its praise.
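
The random-variation concern can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the paper's evaluation: every candidate is assumed to have exactly the same true score as the baseline, measurement noise is assumed Gaussian, and the function name `count_lucky_winners` and its parameters (`noise_std`, `margin`) are hypothetical. The candidate count of 1,773 is borrowed from the number of experiments reported in the abstract; the noise level and margins are arbitrary.

```python
import random


def count_lucky_winners(n_candidates, noise_std, margin, seed=0):
    """Count candidates that appear to beat the baseline by `margin`
    even though every candidate's true score equals the baseline's.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    baseline = 0.0  # true score shared by the baseline and all candidates
    winners = 0
    for _ in range(n_candidates):
        # Measured score = true score + Gaussian evaluation noise.
        measured = baseline + rng.gauss(0.0, noise_std)
        if measured > baseline + margin:
            winners += 1
    return winners


if __name__ == "__main__":
    n = 1773  # number of experiments reported in the abstract
    for margin in (0.0, 0.5, 1.0, 1.5):
        lucky = count_lucky_winners(n, noise_std=1.0, margin=margin)
        print(f"margin={margin}: {lucky} of {n} look better than baseline")
```

Under these assumptions, a non-trivial number of candidates clear the bar by chance alone, so a count of "winners" says little unless it is compared with the evaluation noise, for example through repeated seeds or held-out benchmarks.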

ceba (2mo, karma 3)

Metaphors can get the shape of an idea across quickly, but this metaphor isn't used to describe the paper. It's used to convey the magnitude of their discovery, in their own opinion. Was comparison to others' achievements the most effective and honest way to introduce their work?

mishka (2mo, karma 2)

It’s probably too early to say (there have been some tempting neural architecture search papers in recent months; this one is on my “to look closer” list).

Anyway, we probably need a link to the paper: https://arxiv.org/abs/2507.18074

Person (2mo, karma 1)

Thanks for the link, will add it to the post. I originally included just the arXiv PDF viewer link, and I'm not sure why it disappeared.

A new paper is picking up steam in Twitter/X AI discourse, mostly thanks to its absurdly boastful title and abstract. I'm trying to figure out how important the paper is and whether the methodology and results are sound, but it's hard to find good analysis through all the noise.

While AI systems demonstrate exponentially improving capabilities, the pace of AI research itself remains linearly bounded by human cognitive capacity, creating an increasingly severe development bottleneck. We present ASI-ARCH, the first demonstration of Artificial Superintelligence for AI research (ASI4AI) in the critical domain of neural architecture discovery—a fully autonomous system that shatters this fundamental constraint by enabling AI to conduct its own architectural innovation. Moving beyond traditional Neural Architecture Search (NAS), which is fundamentally limited to exploring human-defined spaces, we introduce a paradigm shift from automated optimization to automated innovation. ASI-ARCH can conduct end-to-end scientific research in the challenging domain of architecture discovery, autonomously hypothesizing novel architectural concepts, implementing them as executable code, training and empirically validating their performance through rigorous experimentation and past human and AI experience. ASI-ARCH conducted 1,773 autonomous experiments over 20,000 GPU hours, culminating in the discovery of 106 innovative, state-of-the-art (SOTA) linear attention architectures. Like AlphaGo’s Move 37 that revealed unexpected strategic insights invisible to human players, our AI-discovered architectures demonstrate emergent design principles that systematically surpass human-designed baselines and illuminate previously unknown pathways for architectural innovation (Fig. 2). Crucially, we establish the first empirical scaling law for scientific discovery itself—demonstrating that architectural breakthroughs can be scaled computationally, transforming research progress from a human-limited to a computation-scalable process. We provide comprehensive analysis of the emergent design patterns and autonomous research capabilities that enabled these breakthroughs, establishing a blueprint for self-accelerating AI systems. 
To democratize AI-driven research, we open-source the complete framework, discovered architectures, and cognitive traces.