1. There will be Good and Bad AIs.
2. More than 80% of AIs will be Good.
3. Superintelligence will be achieved before 2035.
4. The robot population will exceed the human population by 2045.
A Curious Puzzle

Yesterday, I listened to Roman Yampolskiy on The Diary of a CEO discuss AI risks with genuine passion. His concerns about potential suffering and extinction resonated with me, until he mentioned his strong belief that we're living in a simulation, possibly run by indifferent operators. This got...
Introduction

Over the last two decades, the AI Safety community has engaged in several debates on AI morality and alignment: from Eliezer Yudkowsky's early proposal of Coherent Extrapolated Volition (2004), to later reflections on the complexity and fragility of human values (2008; 2009), to ongoing discussions of moral uncertainty (MacAskill, 2020),...
This work was carried out independently, essentially cloning Will Any Crap Cause Emergent Misalignment? but rewriting the fine-tuning dataset into non-dual language while keeping the scatological theme. The result is a notable reduction in misaligned outputs, especially in the more interesting classes. Since Non-Dual Crap is still a form of...
The existential threat posed by Artificial Superintelligence (ASI) is arguably the most critical challenge facing humanity. The argument, compellingly summarized in the LessWrong post titled "The Problem," outlines a stark reality: the development of ASI, based on current optimization paradigms, is overwhelmingly likely to lead to catastrophe. This outcome is...