Transcript of Beren Millidge's keynote at the Post-AGI Workshop, San Diego, December 2025. How might human values survive in a very multifarious AI world where there are lots of AIs competing? This is the kind of Moloch world that Scott Alexander talks about. And then I realized that to...
Crossposted from my personal blog. Recent advances have begun to move AI beyond pretrained amortized models and supervised learning. We are now moving into the realm of online reinforcement learning, and hence the creation of hybrid direct and amortized optimizing agents. While we have generally found that purely amortized pretrained...
Crossposted from my personal blog. I was inspired to cross-post this here given the discussion elicited by this post on the role of capital in an AI future. When discussing the future of AI, I semi-often hear an argument along the lines that, in a slow-takeoff world, despite AIs...
Just an ML linguistic quirk I have wondered about for a while. When I started learning ML (in the 2016–2017 period), everybody referred to the period of training models as just 'training', which could then (optionally) be followed by finetuning. This usage makes sense to me and as far as I...
Executive summary. Our mission at Apollo Research is to reduce catastrophic risks from AI by auditing advanced AI systems for misalignment and dangerous capabilities, with an initial focus on deceptive alignment. In our announcement post, we presented a brief theory of change for our organization which explains why we expect...
Crossposted from my personal blog. A fundamental problem in AI alignment, as well as in many social sciences, is the problem of preference aggregation. Given a number of different actors who each have specific preferences, what is a consistent way of making decisions that ensures the outcome is fair and...