Frontier Coding Agents Can Now Implement an AlphaZero Self-Play Machine Learning Pipeline For Connect Four That Performs Comparably to an External Solver
by Baybar, jsherwood, and Benjamin Kaplan
TLDR: We tested whether frontier coding agents could autonomously implement AlphaZero for Connect Four in three hours. Some of them could do this very well, with Opus 4.7 sometimes performing better, by Bradley-Terry rating, than an external solver. In GPT-5.4's evaluations, it used much less of its time budget than...
Apr 28