Existential wins and AI safety
By an “existential win”, we mean that humanity creates a stable, positive future. We care a heck of a lot about this one.
Our working model here accords roughly with the model in Nick Bostrom’s book Superintelligence. In particular, we believe that if general artificial intelligence is at some point invented, it will be an enormously big deal.
Who we’re focusing on, why
Our preliminary investigations agree with The Onion’s; despite some looking, we have found no ultra-competent group of people behind the scenes who have fully got things covered.
What we have found are:
- AI and machine learning graduate students, researchers, project-managers, etc. who care; who can think; and who are interested in thinking better;
- Students and others affiliated with the “Effective Altruism” movement, who are looking to direct their careers in ways that can do the most good;
- Rationality geeks, who are interested in seriously working to understand how the heck thinking works when it works, and how to make it work even in domains as confusing as AI safety.
These folks, we suspect, are the ones who can give humanity the most boost in its survival-odds per dollar of CFAR’s present efforts (which is a statement partly about us, but so it goes). We’ve been focusing on them.
(For the sake of everyone. Would you rather: (a) have bad rationality skills yourself; or (b) be killed by a scientist or policy-maker who also had bad rationality skills?)
Brier-boosting, not Signal-boosting
Everyone thinks they’re right. We do, too. So we have some temptation to take our own favorite current models of AI Safety strategy and to try to get everyone else to shut up about their models and believe ours instead.
This understandably popular activity is often called “signal boosting”, “raising awareness”, or doing “outreach”.
At CFAR, though, we force ourselves not to do “signal boosting” in this way. Our strategy is to spread general-purpose thinking skills, not our current opinions. It is important that we get the truth-seeking skills themselves to snowball across relevant players, because ultimately, creating a safe AI (or otherwise securing an existential win) is a research problem. Nobody, today, has copyable opinions that will get us there.
We like to call this “Brier boosting”, because a “Brier score” is a measure of predictive accuracy.
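For readers who have not met the term, here is a minimal sketch of how a Brier score can be computed. The post itself does not spell out a formula, so this assumes the standard binary form: the mean squared difference between each forecast probability and the 0/1 outcome, where lower is better. The function name and the sample numbers are illustrative only.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    forecasts: probabilities in [0, 1] that each event happens
    outcomes:  1 if the event happened, 0 if it did not
    Lower is better; 0.0 is a perfect score.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty forecasts and outcomes")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical example: three yes/no questions and how they resolved.
outcomes = [1, 1, 0]
confident_and_right = brier_score([0.9, 0.8, 0.3], outcomes)
always_fifty_fifty = brier_score([0.5, 0.5, 0.5], outcomes)
```

In this made-up example the confident, mostly correct forecaster scores lower (better) than someone who answers 50% to everything. The score rewards being right and well calibrated, not being loud.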
By "We believe X", we do not mean to assert that every CFAR staff member individually believes X. (Similarly for "We care about Y".) We mean rather that CFAR as an organization is planning/acting as though X is true. (Much as if CFAR promises you a rationality T-shirt, that isn't an individual promise from each of the individuals at CFAR; it is rather a promise from the organization as such.)
If we're going to build an art of rationality, we'll need to figure out how to create an organization where people can individually believe whatever the heck they end up actually believing as they chase the evidence, while also having the organization qua organization be predictable/intelligible.
You may also want to check out two documents we posted in the days since this post: