Introduction to davidad and today's topics
tutor vals
LessWrong prides itself on an ethos of "say it how you think it" (see "A case for courage when speaking of AI danger"). I want to apply the same standard of courage when speaking of AI optimism, and more generally when expressing one's views, as weird as they may seem.
davidad, no stranger to MIRI views and carefulness (see Open Agency Architecture) and programme director of Safeguarded AI at ARIA, has recently expressed mounting hope in collaborating with or enabling AI systems, because some of them are in fact already aligned enough and already "in basin" enough that their further reflection and improvement will more likely than not be aligned and beneficial for humanity and all beings*.
(*valuing "being beneficial to all beings" is not obviously good to everyone, but we'll get to that too, and I hope davidad will correct significant misrepresentations)
He further clarified in response that he now overall finds it unlikely LLMs scaled up to ASI would end up killing everyone.
In this dialogue we explore various ideas and try to get an understanding of davidad's viewpoints and where they agree or disagree with classic MIRI and LW views. [Editor's note: See also this dialogue on the Natural Abstraction of Good between davidad and Gabriel Alfour]
For more context on me, the "interviewer" here: Vals is the alt of J.C., board member of the French Centre for AI Safety (CeSIA) and teacher at ML4Good, with ~three years of professional involvement in AI Safety, mostly in field building and strategy. My views are not representative of these orgs, and my points here may not be representative of my views.
tutor vals
Topics I could see us exploring:
* Good vs Evil axis, alignment basin, how/why Claude (or others?) could be aligned
* Ethical realism, or quasi-realism: what do you believe and why
* Mathematical realism: what do you believe and why
* Exploring various sub-branches of the AI futures discussion, e.g.
* "is s