Cross-posted from my website. You may have seen this graph from Chris Olah illustrating a range of views on the difficulty of aligning superintelligent AI: Evan Hubinger, an alignment team lead at Anthropic, says: > If the only thing that we have to do to solve alignment is train away...
Last year I gave my reasoning on cause prioritization and did shallow reviews of some relevant orgs. I'm doing it again this year. Cross-posted to my website. Cause prioritization: In September, I published a report on the AI safety landscape, specifically focusing on AI x-risk policy/advocacy. The prioritization section of...
Cross-posted to my website. AI companies want to bootstrap weakly-superhuman AI to align superintelligent AI. I don't expect them to succeed. I could give various arguments for why alignment bootstrapping is hard and why AI companies are ignoring the hard parts of the problem; but I don't need any specific...
Even if we solve the AI alignment problem, we still face non-alignment problems, which are all the other existential problems [1] that AI may bring. People have written research agendas on various imposing problems that we are nowhere close to solving, and that we may need to solve before developing...
Cross-posted from my website. One day, I was at my grandma's house reading the Sunday funny pages, when I suddenly felt myself getting sucked into a Garfield comic. I looked down at my body and saw that I had become fully cartoonified. My hands had four fingers and I had...
Political advocacy is an important lever for reducing existential risk. One way to make political change happen is to support candidates for Congress. In October, Eric Neyman wrote "Consider donating to Alex Bores, author of the RAISE Act". He created a cost-effectiveness analysis to estimate how donations to Bores's campaign...
Cross-posted from my website. Last year, I wrote a list of things I've changed my mind on. But good truth-seeking doesn't just require you to consider where you might be wrong; you must also consider where you might be right. In this post, I provide some beliefs I used to...