CFAR’s new focus, and AI Safety


31


AnnaSalamon

A bit about our last few months:

  • We’ve been working on getting a simple clear mission and an organization that actually works.  We think of our goal as analogous to the transition that the old Singularity Institute underwent under Lukeprog (during which chaos was replaced by a simple, intelligible structure that made it easier to turn effort into forward motion).
  • As part of that, we’ll need to find a way to be intelligible.
  • This is the first of several blog posts aimed at causing our new form to be visible from outside.  (If you're in the Bay Area, you can also come meet us at tonight's open house.) (We'll be talking more about the causes of this mission-change; the extent to which it is in fact a change, etc. in an upcoming post.)

Here's a short explanation of our new mission:
  • We care a lot about AI Safety efforts in particular, and about otherwise increasing the odds that humanity reaches the stars.

  • Also, we[1] believe such efforts are bottlenecked more by our collective epistemology, than by the number of people who verbally endorse or act on "AI Safety", or any other "spreadable viewpointdisconnected from its derivation.

  • Our aim is therefore to find ways of improving both individual thinking skill, and the modes of thinking and social fabric that allow people to think together.  And to do this among the relatively small sets of people tackling existential risk. 


To elaborate a little:

Existential wins and AI safety

By an “existential win”, we mean humanity creates a stable, positive future.  We care a heck of a lot about this one.

Our working model here accords roughly with the model in Nick Bostrom’s book Superintelligence.  In particular, we believe that if general artificial intelligence is at some point invented, it will be an enormously big deal.

(Lately, AI Safety is being discussed by everyone from The Economist to Newsweek to Obama to an open letter from eight thousand.  But we’ve been thinking on this, and backchaining partly from it, since before that.)

Who we’re focusing on, why

Our preliminary investigations agree with The Onion’s; despite some looking, we have found no ultra-competent group of people behind the scenes who have fully got things covered.

What we have found are:
  • AI and machine learning graduate students, researchers, project-managers, etc. who care; who can think; and who are interested in thinking better;
  • Students and others affiliated with the “Effective Altruism” movement, who are looking to direct their careers in ways that can do the most good;
  • Rationality geeks, who are interested in seriously working to understand how the heck thinking works when it works, and how to make it work even in domains as confusing as AI safety.
These folks, we suspect, are the ones who can give humanity the most boost in its survival-odds per dollar of CFAR’s present efforts (which is a statement partly about us, but so it goes).  We’ve been focusing on them.  

(For the sake of everyone.  Would you rather: (a) have bad rationality skills yourself; or (b) be killed by a scientist or policy-maker who also had bad rationality skills?)

Brier-boosting, not Signal-boosting

Everyone thinks they’re right.  We do, too.  So we have some temptation to take our own favorite current models of AI Safety strategy and to try to get everyone else to shut up about their models and believe ours instead.

This understandably popular activity is often called “signal boosting”, “raising awareness”, or doing “outreach”.

At CFAR, though, we force ourselves not to do “signal boosting” in this way.  Our strategy is to spread general-purpose thinking skills, not our current opinions.  It is important that we get the truth-seeking skills themselves to snowball across relevant players, because ultimately, creating a safe AI (or otherwise securing an existential win) is a research problem.  Nobody, today, has copyable opinions that will get us there.

We like to call this “Brier boosting”, because a “Brier score” is a measure of predictive accuracy. 

Lever and World


[1] By "We believe X", we do not mean to assert that every CFAR staff member individually believes X.  (Similarly for "We care about Y).  We mean rather that CFAR as an organization is planning/acting as though X is true.  (Much as if CFAR promises you a rationality T-shirt, that isn't an individual promise from each of the individuals at CFAR; it is rather a promise from the organization as such.)

If we're going to build an art of rationality, we'll need to figure out how to create an organization where people can individually believe whatever the heck they end up actually believing as they chase the evidence, while also having the organization qua organization be predictable/intelligible.

ETA:
You may also want to check out two documents we posted in the days since this post: