> One important property for a style of thinking and argumentation to have is what I call galaxy brain resistance: how difficult is it to abuse that style of thinking to argue for pretty much whatever you want - something that you already decided elsewhere for other reasons? The spirit here is similar to falsifiability in science: if your arguments can justify anything, then your arguments imply nothing.
>
> In this post, I will argue that patterns of reasoning that are very low in galaxy brain resistance are a common phenomenon, some with mild consequences and others with extreme ones. I will also describe some patterns that are high in galaxy brain resistance, and advocate for their use.
Vitalik discusses styles of argument that prove too much, drawing examples from the AI, EA, and cryptocurrency communities. As defenses, he recommends holding deontological principles that override slick reasoning, and avoiding incentives that would distort your judgment.
His advice to would-be AI safety researchers:
- Don't work for a company that's accelerating progress toward frontier fully-autonomous AI capabilities
- Don't live in the San Francisco Bay Area