tl;dr We’re incubating an academic journal for AI alignment: rapid peer review of foundational alignment research that the current publication ecosystem underserves. Key bets: paid attributed review, reviewer-written synthesis abstracts, and targeted automation. Contact us if you’re interested in participating as an author, reviewer, or editor, or if you know someone...
> TLDR: We made substantial progress in 2024:
>
> * We published a series of papers that verify key predictions of Singular Learning Theory (SLT) [1, 2, 3, 4, 5, 6].
> * We scaled key SLT-derived techniques to models with billions of parameters, eliminating our main concerns around...
Introduces the idea of cognitive work as a parallel to physical work, and explains why concentrated sources of cognitive work may pose a risk to human safety. Acknowledgements. Thanks to Echo Zhou and John Wentworth for feedback and suggestions. Some of these ideas were presented originally in a talk in...
A short story, best enjoyed in-context... try exploring possible meanings of some of the unusual terms with your favourite commercially available Solomonoff inductor. Acknowledgements. Thanks to Simon Pepin Lehalleur and Ziling Ye for feedback and suggestions. This post is the "fun" half of a pair, for the serious version see...
Our large learning machines find patterns in the world and use them to predict. When these machines exceed us and become superhuman, one of those patterns will be relative human incompetence. How comfortable are we with the incorporation of this pattern into their predictions, when those predictions become the actions...
We're excited to announce the inaugural Australian AI Safety Forum, taking place on November 7-8, 2024, in Sydney, Australia. This event aims to foster the growth of the AI safety community within Australia. Apply now!

Key Details

* Dates: November 7-8, 2024
* Location: Sydney Knowledge Hub, The University of...
TLDR: We’re hiring two research assistants to work on advancing developmental interpretability and other applications of singular learning theory to alignment.

About Us

Timaeus’s mission is to empower humanity by making breakthrough scientific progress on alignment. Our research focuses on applications of singular learning theory to foundational problems within alignment,...