CFAR’s new focus, and AI Safety


by AnnaSalamon, 3rd Dec 2016


A bit about our last few months:

  • We’ve been working on getting a simple, clear mission and an organization that actually works.  We think of our goal as analogous to the transition that the old Singularity Institute underwent under Lukeprog (during which chaos was replaced by a simple, intelligible structure that made it easier to turn effort into forward motion).
  • As part of that, we’ll need to find a way to be intelligible.
  • This is the first of several blog posts aimed at causing our new form to be visible from outside.  (If you're in the Bay Area, you can also come meet us at tonight's open house.) (We'll be talking more about the causes of this mission-change; the extent to which it is in fact a change, etc. in an upcoming post.)

Here's a short explanation of our new mission:
  • We care a lot about AI Safety efforts in particular, and about otherwise increasing the odds that humanity reaches the stars.

  • Also, we[1] believe such efforts are bottlenecked more by our collective epistemology than by the number of people who verbally endorse or act on "AI Safety", or any other "spreadable viewpoint" disconnected from its derivation.

  • Our aim is therefore to find ways of improving both individual thinking skill, and the modes of thinking and social fabric that allow people to think together.  And to do this among the relatively small sets of people tackling existential risk. 

To elaborate a little:

Existential wins and AI safety

By an “existential win”, we mean that humanity creates a stable, positive future.  We care a heck of a lot about this one.

Our working model here accords roughly with the model in Nick Bostrom’s book Superintelligence.  In particular, we believe that if general artificial intelligence is at some point invented, it will be an enormously big deal.

(Lately, AI Safety is being discussed by everyone from The Economist to Newsweek to Obama to an open letter from eight thousand.  But we’ve been thinking on this, and backchaining partly from it, since before that.)

Who we’re focusing on, why

Our preliminary investigations agree with The Onion’s; despite some looking, we have found no ultra-competent group of people behind the scenes who have fully got things covered.

What we have found are:
  • AI and machine learning graduate students, researchers, project-managers, etc. who care; who can think; and who are interested in thinking better;
  • Students and others affiliated with the “Effective Altruism” movement, who are looking to direct their careers in ways that can do the most good;
  • Rationality geeks, who are interested in seriously working to understand how the heck thinking works when it works, and how to make it work even in domains as confusing as AI safety.
These folks, we suspect, are the ones who can give humanity the most boost in its survival-odds per dollar of CFAR’s present efforts (which is a statement partly about us, but so it goes).  We’ve been focusing on them.  

(For the sake of everyone.  Would you rather: (a) have bad rationality skills yourself; or (b) be killed by a scientist or policy-maker who also had bad rationality skills?)

Brier-boosting, not Signal-boosting

Everyone thinks they’re right.  We do, too.  So we have some temptation to take our own favorite current models of AI Safety strategy and to try to get everyone else to shut up about their models and believe ours instead.

This understandably popular activity is often called “signal boosting”, “raising awareness”, or doing “outreach”.

At CFAR, though, we force ourselves not to do “signal boosting” in this way.  Our strategy is to spread general-purpose thinking skills, not our current opinions.  It is important that we get the truth-seeking skills themselves to snowball across relevant players, because ultimately, creating a safe AI (or otherwise securing an existential win) is a research problem.  Nobody, today, has copyable opinions that will get us there.

We like to call this “Brier boosting”, because a “Brier score” is a measure of predictive accuracy. 
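For readers who have not met the term, here is a minimal sketch of how a Brier score can be computed for yes/no forecasts: it is the mean squared difference between the probabilities someone stated and what actually happened, so lower is better. The function name and the example forecasts are illustrative assumptions, not a description of any tool CFAR uses.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between stated probabilities and 0/1 outcomes.

    forecasts: probabilities in [0, 1] that each event would happen.
    outcomes:  1 if the event happened, 0 if it did not.
    A score of 0.0 is perfect; always answering 0.5 scores 0.25.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("need at least one forecast")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: three predictions and whether each event happened.
outcomes = [1, 1, 0]
confident_and_right = brier_score([0.9, 0.8, 0.1], outcomes)
always_fifty_fifty = brier_score([0.5, 0.5, 0.5], outcomes)
```

In this made-up example the well-calibrated, confident forecaster gets the lower (better) score, while the forecaster who always says 50% gets 0.25. "Boosting" someone's Brier score, in the sense used here, means helping them move toward the first kind of forecaster.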

Lever and World

[1] By "We believe X", we do not mean to assert that every CFAR staff member individually believes X.  (Similarly for "We care about Y".)  We mean rather that CFAR as an organization is planning/acting as though X is true.  (Much as if CFAR promises you a rationality T-shirt, that isn't an individual promise from each of the individuals at CFAR; it is rather a promise from the organization as such.)

If we're going to build an art of rationality, we'll need to figure out how to create an organization where people can individually believe whatever the heck they end up actually believing as they chase the evidence, while also having the organization qua organization be predictable/intelligible.

You may also want to check out two documents we posted in the days since this post: