Introduction Previously I wrote about what it would mean for AI to “go well”. I would like to elaborate on this and propose some details towards a “scale-free” definition of alignment. Here “scale-free alignment” means a version of alignment that does not feature sudden and rapid “phase shifts”, so as...
A Summary People who believe in science and rationality often state that their beliefs are formed from sound foundations of observation, experimentation, and reasoning. They dismiss religion and faith as unscientific forms of thought which cannot lead to knowledge. At the same time, thought experiments like the simulation hypothesis show...
Note: reposting this as a top-level post because it got no interaction as a quick take and I think this is actually a serious question. It arose from a discussion in the Technical AI Governance Forum. As AI systems get more and more complicated, the properties we are trying...
TL;DR I believe that: * There exists a parallel track of AI research which has been largely ignored by the AI safety community. This agenda aims to implement human-like online learning in ML models, and it is now close to maturity. Keywords: Hierarchical Reasoning Model, Energy-based Model, Test-time training....
[A parable, of a sort.] Your Highness, It is with no small sense of urgency that I write these words, with the hope that they will be received by your magnanimous spirit with grace. This letter I write not in the spirit of criticism which certain upstarts have made the...
tl;dr: We will soon be forced to choose between treating AI systems as full cognitive/ethical agents that are hampered in various ways and continuing to treat them as not-very-good systems that perform "surprisingly complex reward hacks". Treating AI safety and morality seriously implies that the first perspective...