"Wait, dignity points?" you ask. "What are those? In what units are they measured, exactly?"
And to this I reply: "Obviously, the measuring units of dignity are over humanity's log odds of survival - the graph on which the logistic success curve is a straight line. A project that doubles humanity's chance of survival from 0% to 0% is helping humanity die with one additional information-theoretic bit of dignity."
"But if enough people can contribute enough bits of dignity like that, wouldn't that mean we didn't die at all?" "Yes, but again, don't get your hopes up."
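The unit in the quote can be made concrete with a small sketch. It assumes that "doubling" means doubling the odds p / (1 - p), which for very small probabilities is nearly the same as doubling the probability itself, and that bits are measured with a base-2 logarithm. The function names `log_odds_bits` and `dignity_gained` are hypothetical, chosen for illustration only.

```python
import math


def log_odds_bits(p: float) -> float:
    """Return the log odds of probability p, in bits (base-2 logarithm)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log2(p / (1.0 - p))


def dignity_gained(p_before: float, p_after: float) -> float:
    """Change in log odds, in bits, when survival probability moves from p_before to p_after."""
    return log_odds_bits(p_after) - log_odds_bits(p_before)


# Doubling the odds from 1:999 to 2:999 is a gain of one bit.
one_bit = dignity_gained(1 / 1000, 2 / 1001)

# Doubling a small probability (0.1% to 0.2%) is a gain of slightly more than one bit.
roughly_one_bit = dignity_gained(0.001, 0.002)

print(one_bit, roughly_one_bit)
```

On this scale, each doubling of the odds adds the same amount no matter how low the starting point is, which is why a jump between two probabilities that both round to 0% can still count as a full bit.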
Historically, people worried about extinction risk from artificial intelligence have not seriously considered deliberately slowing down AI progress as a solution. Katja Grace argues that this strategy deserves more serious consideration, and that common objections to it are incorrect or exaggerated.
A "good project" in AGI research needs: 1) trustworthy command, 2) research closure, 3) strong operational security, 4) commitment to the common good, 5) an alignment mindset, and 6) requisite resource levels.
The post goes into detail on what minimal, adequate, and good performance looks like.
A fictional story about an AI researcher who leaves an experiment running overnight.
"Human feedback on diverse tasks" could lead to transformative AI while requiring little innovation on current techniques. But it seems likely that the natural course of this path leads to full-blown AI takeover.
Katja Grace provides a list of counterarguments to the basic case for existential risk from superhuman AI systems. She examines potential gaps in arguments about AI goal-directedness, AI goals being harmful, and AI superiority over humans. While she sees these as serious concerns, she doesn't find the case for overwhelming likelihood of existential risk convincing based on current arguments.
The field of AI alignment is growing rapidly, attracting more resources and mindshare each year. As it grows, more people will be incentivized to misleadingly portray themselves or their projects as more alignment-friendly than they are. Adam proposes "safetywashing" as the term for this.
Some people believe AI development is extremely dangerous, but are hesitant to directly confront or dissuade AI researchers. The author argues we should be more willing to engage in activism and outreach to slow down dangerous AI progress. They give an example of their own intervention with an AI research group.