Q&A with new Executive Director of Singularity Institute

Today I was appointed the new Executive Director of Singularity Institute.

Because I care about transparency, one of my first projects as an intern was to begin work on the organization's first Strategic Plan. I researched how to write a strategic plan, tracked down the strategic plans of similar organizations, and met with each staff member, progressively iterating the document until it was something everyone could get behind.

I quickly learned why there isn't more of this kind of thing: transparency is a lot of work! After 100+ hours of my own work, plus dozens of hours from others, the strategic plan was finally finished and ratified by the board. It doesn't accomplish much by itself, but it's one important stepping stone in building an organization that is more productive, more trusted, and more likely to help solve the world's biggest problems.

I spent two months as a researcher, and was then appointed Executive Director.

In further pursuit of transparency, I'd like to answer (on video) submitted questions from the Less Wrong community just as Eliezer did two years ago.

 

The Rules

1) One question per comment (to allow voting to carry more information about people's preferences).

2) Try to be as clear and concise as possible. If your question can't be condensed into one paragraph, you should probably ask in a separate post. Make sure you have an actual question somewhere in there (you can bold it to make it easier to scan).

3) I will generally answer the top-voted questions, but will skip some of them. I will tend to select questions about Singularity Institute as an organization, not about the technical details of some bit of research. You can read some of the details of the Friendly AI research program in my interview with Michael Anissimov.

4) If you reference certain things that are online in your question, provide a link.

5) This thread will be open to questions and votes for 7 days, at which time I will decide which questions to begin recording video responses for.

 

I might respond to certain questions within the comments thread and not on video; for example, when there is a one-word answer.

177 comments

If someone as capable as Terence Tao approached the SIAI, asking if they could work full-time and for free on friendly AI, what would you tell them to do? In other words, are there any known FAI sub-problems that demand some sort of expertise that the SIAI is currently lacking?

What message about FAI/MIRI should I take away from the fact that this very important question isn't answered?

How are you going to address the perceived and actual lack of rigor associated with SIAI?

There are essentially no academics who believe that high-quality research is happening at the Singularity Institute. This is likely to pose problems for your plan to work with professors to find research candidates. It is also likely to be an indicator of little high-quality work happening at the Institute.

In his recent Summit presentation, Eliezer states that "most things you need to know to build Friendly AI are rigorous understanding of AGI rather than Friendly parts per se". This suggests that researchers in AI and machine learning should be able to appreciate high-quality work done by SIAI. However, this is not happening, and the publications listed on the SIAI page--including TDT--are mostly high-level arguments that don't meet this standard. How do you plan to change this?

There are essentially no academics who believe that high-quality research is happening at the Singularity Institute.

I believe that high-quality research is happening at the Singularity Institute.

James Miller, Associate Professor of Economics, Smith College.

PhD, University of Chicago.

To distinguish the above from the statement "I like the Singularity Institute", could you be specific about what research activities you have observed in sufficient detail to confidently describe as "high-quality"?

ETA: Not a hint of sarcasm or snark intended, I'm sincerely curious.

I'm currently writing a book on the Singularity and have consequently become extremely familiar with the organization's work. I have gone through most of EY's writings and have an extremely high opinion of them. His research on AI plays a big part in my book. I have also been ending my game theory classes with "rationality shorts" in which I present some of EY's material from the sequences.

I also have a high opinion of Carl Shulman's (an SI employee) writings, including "How Hard is Artificial Intelligence? The Evolutionary Argument and Observation Selection Effects" (co-authored with Bostrom) and Shulman's paper on AGI and arms races.

There are essentially no academics who believe that high-quality research is happening at the Singularity Institute.

David Chalmers has said that the decision theory work is a major advance (along with various other philosophers), although he is frustrated that it hasn't been communicated more actively to the academic decision theory and philosophy communities. A number of current and former academics, including David, Stephen Omohundro, James Miller (above), and Nick Bostrom have reported that work at SIAI has been very helpful for their own research and writing in related topics.

Evan Williams, now a professor of philosophy at Purdue, cites in his dissertation three inspirations leading to the work: John Stuart Mill's "On Liberty," John Rawls' "Theory of Justice," and Eliezer Yudkowsky's "Creating Friendly AI" (2001), discussed at greater length than the others. Nick Beckstead, a Rutgers (#2 philosophy program) philosophy PhD student who works on existential risks and population ethics, reported large benefits to his academic work from discussions with SIAI staff.

These folk are a minority, and SIAI is not well integrated with academia (no PhDs on staff, publishing, etc), but also not negligible.

In his recent Summit presentation, Eliezer states that "most things you need to know to build Friendly AI are rigorous understanding of AGI rather than Friendly parts per se". This suggests that researchers in AI and machine learning should be able to appreciate high-quality work done by SIAI.

I think that work in this area has been disproportionately done by Eliezer Yudkowsky, and to a lesser extent Marcello Herreshoff. Eliezer has been heavily occupied with Overcoming Bias, Less Wrong, and his book for the last several years, in part to recruit a more substantial team for this. He also is reluctant to release work that he thinks is relevant to building AGI. Problems in recruiting and the policies of secrecy seem like the big issues here.

Eliezer has been heavily occupied with Overcoming Bias, Less Wrong, and his book for the last several years, in part to recruit a more substantial team for this.

Eliezer's investment into OB/LW apparently hasn't returned even a single full-time FAI researcher for SIAI after several years (although a few people are almost certainly doing more and better FAI-related research than if the Sequences didn't happen). Has this met SIAI's initial expectations? Do you guys think we're at the beginning of a snowball effect, or has OB/LW pretty much done as much as it can, as far as creating/recruiting FAI researchers is concerned? What are your current expectations for the book in this regard?

I have noticed increasing numbers of very talented math and CS folk expressing interest or taking actions showing significant commitment. A number of them are currently doing things like PhD programs in AI. However, there hasn't been much of a core FAI team and research program to assimilate people into. Current plans are for Eliezer to switch back to full time AI after his book, with intake of more folk into that research program. Given the mix of people in the extended SIAI community, I am pretty confident that with abundant funding a team of pretty competent researchers (with at least some indicators like PhDs from the top AI/CS programs, 1 in 100,000 or better performance on mathematics contests, etc) could be mustered over time, based on people I already know.

I am less confident that a team can be assembled with so much world-class talent that it is a large fraction of the quality-adjusted human capital applied to AGI, without big gains in recruiting (e.g. success with the rationality book or communication on AI safety issues, better staff to drive recruiting, a more attractive and established team to integrate newcomers, relevant celebrity endorsements, etc). The Manhattan Project had 21 then- or future Nobel laureates. AI, and certainly FAI, are currently getting a much, much smaller share of world scientific talent than nukes did, so that it's easier for a small team to loom large, but it seems to me like there is still a lot of ground to be covered to recruit a credibly strong FAI team.

Thanks. You didn't answer my questions directly, but it sounds like things are proceeding more or less according to expectations. I have a couple of followup questions.

At what level of talent do you think an attempt to build an FAI would start to do more (expected) good than harm? For simplicity, feel free to ignore the opportunity cost of spending financial and human resources on this project, and just consider the potential direct harmful effects, like accidentally creating an UFAI while experimenting to better understand AGI, or building a would-be FAI that turns out to be an UFAI due to a philosophical, theoretical or programming error, or leaking AGI advances that will allow others to build an UFAI, or starting an AGI arms race.

I have a serious concern that if SIAI ever manages to obtain abundant funding and a team of "pretty competent researchers" (or even "world-class talent", since I'm not convinced that even a team of world-class talent trying to build an FAI will do more good than harm), it will proceed with an FAI project without adequate analysis of the costs and benefits of doing so, or without continuously reevaluating the decision in light of new information. Do you think this concern is reasonable?

If so, I think it would help a lot if SIAI got into the habit of making its strategic thinking more transparent. It could post answers to questions like the ones I asked in the grandparent comment without having to be prompted. It could publish the reasons behind every major strategic decision, and the metrics it keeps to evaluate its initiatives. (One way to do this, if such strategic thinking often occurs or is presented at board meetings, would be to publish the meeting minutes, as I suggested in another comment.)

At what level of talent do you think an attempt to build an FAI would start to do more (expected) good than harm?

I'm not sure that scientific talent is the relevant variable here. More talented folk are more likely to achieve both positive and negative outcomes. I would place more weight on epistemic rationality, motivations (personality, background checks), institutional setup and culture, the strategy of first testing the tractability of robust FAI theory and then advancing FAI before code (with emphasis on the more-FAI-less-AGI problems first), and similar variables.

Do you think this concern is reasonable?

Certainly it's a reasonable concern from a distance. Folk do try to estimate and reduce the risks you mentioned, and to investigate alternative non-FAI interventions. My personal sense is that these efforts have been reasonable but need to be bolstered along with the FAI research team. If it looks like a credible (to me) team may be assembled my plan would be (and has been) to monitor and influence team composition, culture, and exposure to information. In other words, I'd like to select folk ready to reevaluate as well as to make progress, and to work hard to build that culture as researchers join up.

If so, I think it would help a lot if SIAI got into the habit of making its strategic thinking more transparent.

I can't speak for everyone, but I am happy to see SIAI become more transparent in various ways. The publication of the strategic plan is part of that, and I believe Luke is keen (with encouragement from others) to increase communication and transparency in other ways.

publish the meeting minutes

This one would be a decision for the board, but I'll give my personal take again. Personally, I like the recorded GiveWell meetings and see the virtues of transparency in being more credible to observers, and in providing external incentives. However, I would also worry that signalling issues with a diverse external audience can hinder accurate discussion of important topics, e.g. frank discussions of the strengths and weaknesses of potential Summit speakers, partners, and potential hires that could cause hurt feelings and damage valuable relationships. Because of this problem I would be more wholehearted in supporting other forms of transparency, e.g. more frequent and detailed reporting on activities, financial transparency, the strategic plan, things like Luke's Q&A, etc. But I wouldn't be surprised if this happens too.

I'm not sure that scientific talent is the relevant variable here. More talented folk are more likely to achieve both positive and negative outcomes.

Let's assume that all the other variables are already optimized to minimize the risk of creating an UFAI. It seems to me that the relationship between the ability level of the FAI team and the probabilities of the possible outcomes must then look something like this:

This chart isn't meant to communicate my actual estimates of the probabilities and crossover points, but just the overall shapes of the curves. Do you disagree with them? (If you want to draw your own version, click here and then click on "Modify This Chart".)

Folk do try to estimate and reduce the risks you mentioned, and to investigate alternative non-FAI interventions.

Has anyone posted SIAI's estimates of those risks?

I would also worry that signalling issues with a diverse external audience can hinder accurate discussion of important topics

That seems reasonable, and given that I'm more interested in the "strategic" as opposed to "tactical" reasoning within SIAI, I'd be happy for it to be communicated through some other means.

Do you disagree with them?

If we condition on having all other variables optimized, I'd expect a team to adopt very high standards of proof, and recognize limits to its own capabilities, biases, etc. One of the primary purposes of organizing a small FAI team is to create a team that can actually stop and abandon a line of research/design (Eliezer calls this "halt, melt, and catch fire") that cannot be shown to be safe (given limited human ability, incentives and bias). If that works (and it's a separate target in team construction rather than a guarantee, but you specified optimized non-talent variables) then I would expect a big shift of probability from "UFAI" to "null."

What I'm afraid of is that a design will be shown to be safe, and then it turns out that the proof is wrong, or the formalization of the notion of "safety" used by the proof is wrong. This kind of thing happens a lot in cryptography, if you replace "safety" with "security". These mistakes are still occurring today, even after decades of research into how to do such proofs and what the relevant formalizations are. From where I'm sitting, proving an AGI design Friendly seems even more difficult and error-prone than proving a crypto scheme secure, probably by a large margin, and there aren't decades of time to refine the proof techniques and formalizations. There's a good recent review of the history of provable security, titled Provable Security in the Real World, which might help you understand where I'm coming from.

Your comment has finally convinced me to study some practical crypto because it seems to have fruitful analogies to FAI. It's especially awesome that one of the references in the linked article is "An Attack Against SSH2 Protocol" by W. Dai.

From where I'm sitting, proving an AGI design Friendly seems even more difficult and error-prone than proving a crypto scheme secure, probably by a large margin, and there aren't decades of time to refine the proof techniques and formalizations.

Correct me if I'm wrong, but it doesn't seem as though "proofs" of algorithm correctness fail as frequently as "proofs" of cryptosystem unbreakableness.

Where does your intuition that friendliness proofs are on the order of reliability of cryptosystem proofs come from?

Interesting question. I guess proofs of algorithm correctness fail less often because:

  1. It's easier to empirically test algorithms to weed out the incorrect ones, so there are fewer efforts to prove conjectures of correctness that are actually false.
  2. It's easier to formalize what it means for an algorithm to be correct than for a cryptosystem to be secure.

In both respects, proving Friendliness seems even worse than proving security.
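The asymmetry in point 1 is easy to see in practice: an incorrect algorithm is usually exposed by cheap random testing, while there is no analogous empirical check for a cryptosystem's security (or for Friendliness). A minimal sketch in Python, using a hypothetical buggy sort purely for illustration:

```python
import random

def buggy_sort(xs):
    # Hypothetical incorrect "sort": a single bubble-sort pass,
    # which fails on most inputs of length > 2.
    xs = list(xs)
    for i in range(len(xs) - 1):
        if xs[i] > xs[i + 1]:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs

def found_counterexample(sort_fn, trials=1000):
    # Cheap empirical check: random testing weeds out incorrect algorithms
    # long before anyone attempts a correctness proof.
    for _ in range(trials):
        xs = [random.randint(0, 100) for _ in range(random.randint(0, 10))]
        if sort_fn(xs) != sorted(xs):
            return True
    return False

print(found_counterexample(buggy_sort))  # True with overwhelming probability
```

No comparably cheap test exists for "this scheme resists all attackers," which is part of why false security proofs survive longer than false correctness proofs.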

What I'm afraid of is that a design will be shown to be safe, and then it turns out that the proof is wrong, or that the formalization of the notion of "safety" used by the proof is wrong.

Thanks for clarifying.

This kind of thing happens a lot in cryptography,

I agree.

Could you elaborate on the ability axis? Could you name some people that you perceive to be of world-class ability in their field? Could you further explain whether you believe there are people who are sufficiently above that class?

For example, what about Terence Tao? What about the current SIAI team?

However, I would also worry that signalling issues with a diverse external audience can hinder accurate discussion of important topics

Basically, it ensures that all serious discussion and decision-making happens prior to any meeting, in informal conversations, so that the meeting sounds good. Such a record should be considered a work of fiction regardless of whether it is a video transcript or a typed document. (Only to the extent that the subject of the meeting matters - harmless or irrelevant things wouldn't change.)

Because of this problem I would be more wholehearted in supporting other forms of transparency, e.g. more frequent and detailed reporting on activities, financial transparency, the strategic plan, things like Luke's Q&A, etc. But I wouldn't be surprised if this happens too.

That's more like it!

Personally, I like the recorded GiveWell meetings and see the virtues of transparency in being more credible to observers, and in providing external incentives. However, I would also worry that signalling issues with a diverse external audience can hinder accurate discussion of important topics, e.g. frank discussions of the strengths and weaknesses of potential Summit speakers, partners, and potential hires that could cause hurt feelings and damage valuable relationships. Because of this problem I would be more wholehearted in supporting other forms of transparency, e.g. more frequent and detailed reporting on activities, financial transparency, the strategic plan, things like Luke's Q&A, etc. But I wouldn't be surprised if this happens too.

I'll take this opportunity to mention that I'm against publishing SIAI's board meeting minutes. First, for the reasons Carl gave above. Second, because then we'd have to invest a lot of time explaining the logic behind each decision, or else face waves of criticism for decisions that appear arbitrary when one merely publishes the decision and not the argument.

However, I'm definitely making big effort to improve SIAI transparency. Our new website (under development) has a page devoted to transparency, where you'll be able to find our strategic plan, our 990s, and probably other links. I'm also publishing the monthly progress reports, and recently co-wrote 'Intelligence Explosion: Evidence and Import', which for the first time (excepting Chalmers) summarizes many of our key pieces of reasoning with the clarity of mainstream academic form. We're also developing an annual report, and I'm working toward developing some other documents that will make SIAI strategy more transparent. But all this takes time, especially when starting from pretty close to 0 on transparency, and having lots of other problems to fix, too.

Second, because then we'd have to invest a lot of time explaining the logic behind each decision, or else face waves of criticism for decisions that appear arbitrary when one merely publishes the decision and not the argument.

Are the arguments not made during the board meetings? Or do you guys talk ahead of time and just formalize the decisions during the board meetings?

In any case, I think you should invest more time explaining the logic behind your decisions, and not just make the decisions themselves more transparent. If publishing board meeting minutes is not the best way to do that, then please think about some other way of doing it. I'll list some of the benefits of doing this, in case you haven't thought of some of them:

  • encourage others to emulate you and think strategically about their own choices
  • allow outsiders to review your strategic thinking and point out possible errors
  • assure donors and potential donors that there is good reasoning behind your strategic decisions
  • improve exchange of strategic ideas between everyone working on existential risk reduction

The arguments are strewn across dozens of conversations in and out of board meetings (mostly out).

As for finding other ways to explain the logic behind our decisions, I agree, and I'm working on it. One qualification I would add, however, is that I predict more benefit to my strategic thinking from one hour with Paul Christiano and one hour with Nick Bostrom than from spending four hours to write up my strategic thinking on subject X and publishing it so that passersby can comment on it. It takes a lot of effort to be so well-informed about these issues that one can offer valuable strategic advice. But for some X we have already spent those many productive hours with Christiano and Bostrom and so on, and it's a good marginal investment to write up our strategic thinking on X.

This reminds me a bit of Eliezer's excuse when he was resisting calls for him to publish his TDT ideas on LW:

Unfortunately this "timeless decision theory" would require a long sequence to write up

I suggest you may be similarly overestimating the difficulty of explaining your strategic ideas/problems to a sufficiently large audience to get useful feedback. Why not just explain them the same way that you would explain to Christiano and Bostrom? If some among the LW community don't understand, they can ask questions and others could fill them in.

The decision theory discussions on LW generated significant progress, but perhaps more importantly created a pool of people with strong interest in the topic (some of whom ended up becoming your research associates). Don't you think the same thing could happen with Singularity strategies?

I suggest you may be similarly overestimating the difficulty of explaining your strategic ideas/problems to a sufficiently large audience to get useful feedback...

Yes, I would get some useful feedback, but I also predict a negative effect: When people don't have enough background knowledge to make what I say sound reasonable to them, I'll get penalized for sounding crazy in the same way that I'm penalized when I try to explain AGI to an intuitive Cartesian dualist.

By penalized, I mean something like the effect that Scott Adams (author of Dilbert) encountered while blogging:

I hoped that people who loved the blog would spill over to people who read Dilbert, and make my flagship product stronger. Instead, I found that if I wrote nine highly popular posts, and one that a reader disagreed with, the reaction was inevitably “I can never read Dilbert again because of what you wrote in that one post.” Every blog post reduced my income, even if 90% of the readers loved it. And a startling number of readers couldn’t tell when I was serious or kidding, so most of the negative reactions were based on misperceptions.

Anyway, you also wrote:

The decision theory discussions on LW generated significant progress, but perhaps more importantly created a pool of people with strong interest in the topic (some of whom ended up becoming your research associates). Don't you think the same thing could happen with Singularity strategies?

If so, then not for the same reasons. I think people got interested in decision theory because they could see results. But it's hard to feel you've gotten a result in something like strategy, where we may never know whether or not one strategy was counterfactually better, or at least won't be confident about that for another 5 years. Decision theory offers the opportunity for results that most people in the field can agree on.

The "results" in decision theory we've got so far are so tenuous that I believe their role is primarily to somewhat clarify the problem statement for what remains to be done (a big step compared to complete confusion in the past, but not quite clear (-ly motivated) math). The ratchet of science hasn't clicked yet, even if rational evidence is significant, which is the same problem you voice for strategy discussion.

If so, then not for the same reasons. I think people got interested in decision theory because they could see results. But it's hard to feel you've gotten a result in something like strategy, where we may never know whether or not one strategy was counterfactually better, or at least won't be confident about that for another 5 years. Decision theory offers the opportunity for results that most people in the field can agree on.

At FHI they sometimes sit around a whiteboard and discuss weird AI-boxing ideas or weird acquire-relevant-influence ideas, and feel as though they are making progress when something sounds more-promising than usual, leads to other interesting ideas, etc. We could too. I suspect it would create a similar set of interested people capable of having strategy ideas, though probably less math-inclined than the decision theory folk, and with more surrounding political chaos.

Okay; that changes my attitude a bit. But FHI's core people are unlikely to produce the Scott Adams effect in response to strategic discussion. Do you or Wei think it's reasonable for me to worry about that when discussing strategy in detail amongst, say, LWers — most of whom have far less understanding of the relevant issues (by virtue of not working on them every week for months or years)?

I agree that detailed exploration of Singularity strategies would alienate some LW-ers, and some in the SingInst fan base. It is possible that this is reason enough to avoid such discussion; my guess is that it is not, but I could easily be wrong here, and many think it is.

I was mostly responding to the [paraphrased] "we can't discuss it publicly because it would take too long", and "it wouldn't work to create an informed set of strategists because there wouldn't be a sense of progress"; I've said sentences like that before, and, when I said them, they were excuses/rationalizations. My actual reason was something like: "I'd like to avoid alienating people, and I'd like to avoid starting conflicts whose outcomes I cannot predict."

I agree that detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers.

It'll alienate some SingInst-ers? That's a troubling sign. Aren't most SingInst-ers at least vaguely competent rationalists who are actually interested in Singularity options? Yet they will be alienated by mere theoretical exploration of the domain? What has your HR department been doing?

I agree that detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers.

From a public relations viewpoint this sentence alone is worse than any particular detail could possibly be. Because it not only allows, but forces people to imagine what horrible strategies you could possibly explore and pursue. Strategies that are bad enough that you not only believe that even the community most closely related to SI would be alienated by them, but that you are also unable to support those explorations with rational arguments.

Personally I don't want to contribute anything to an organisation which admits to exploring strategies that are unacceptable to most people. And I wouldn't suggest that anyone else do so. Yet I would neither be willing to contribute if you were secretive about your strategic explorations. I just don't trust you people, I never did. And I am still horrified by how people who actually believe that what you are saying is true and possible are willing to trust your small group blindly to shape the universe.

A paperclip maximizer is just a transformation of the universe into a state of almost no suffering. But a friendly AI that isn't quite friendly, or one that is biased by the ideas of a small group of abnormal and psychopathic people, could increase negative utility dramatically.

I agree that detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers.

From a public relations viewpoint this sentence alone is worse than any particular detail could possibly be.

No, I don't agree with this. I predict that whatever strategies AnnaSalamon has in mind would alienate someone unless those strategies were very anodyne or vague. If the sample of listeners is big enough there will usually be someone to take issue with just about any idea one voices.

Because it not only allows, but forces people to imagine what horrible strategies you could possibly explore and pursue.

How true is that? In my case it just makes me try to imagine whether there are any strategies AnnaSalamon could propose that wouldn't perturb anyone. When it comes to the singularity I draw a blank, as it's a big enough issue that just about anything she or I or you could say about it will bother somebody.

I disagree that AS's weak statement that "detailed exploration of Singularity strategies would alienate some LW-ers" tells you very much at all about the nature of those strategies. I expect most conceivable strategies would piss someone off, so I'd say her claim communicates less than 1 bit of information about those strategies.

Based on the rest of your comment I think you've read AnnaSalamon's statement as one implying that SI's strategies are unusually objectionable or alienating; maybe that's what she meant but it doesn't seem to be what she wrote.

Based on the rest of your comment I think you've read AnnaSalamon's statement as one implying that SI's strategies are unusually objectionable or alienating;

Which is the right strategy. Humans are unfriendly. The group around AnnaSalamon is trying to take over and shape the universe according to their idea of what is right and good.

If you are making decisions based on the worst case scenario - as you are clearly doing when it comes to artificial intelligence, if you support friendly AI research - then you should do the same when it comes to human beings.

It isn't enough to talk to them, to review their output and conclude that they are most likely friendly. Doing so and contributing money is akin to letting an AI that is not provably friendly out of the box. They either have to prove that they are friendly or make all their work transparent. Otherwise the right thing to do is to label them as terrorists and tell them to fuck off.

You could just as reasonably have written that comment if AnnaSalamon had never posted in this thread, though. My argument here isn't with your broader attitude to FAI/SI, it's that I think it's unfair to pounce on a very low-information statement like "detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers" and write it off as terrible PR that implies SI's considering horrible strategies.

...it's unfair to pounce on a very low-information statement like "detailed exploration of Singularity strategies would alienate some LW-ers, and some SingInst-ers"...

I think that it does convey quite a lot of information. I already know that people associated with SI and LW accept a lot of strategic thinking that would be considered everything from absurd to outright psychopathic within different circles. If she says that the strategies they explore would even alienate some people associated with LW, let alone SI, then that's really bad.

I think you underestimate the amount of information that a natural language sentence can carry and signal.

...and write it off as terrible PR that implies SI's considering horrible strategies.

It is abundantly clear that SI is really bad at PR. I assign a high probability to the possibility that she and other members of SI are revealing a lot of what is going on behind the scenes by being careless about their communication.

If she says that the strategies they explore would even alienate some people associated with LW, let alone SI, then that's really bad.

I disagree. LWers have a range of opinions on AI & the singularity (yes, those opinions are less diverse than the general population's, but I don't see them being sufficiently less diverse for your argument to go through). There are already quite a few LWers who're SI sceptics to a degree. I'm also sure there are LWers who, at the moment, basically agree with SI but would spurn it if it announced a more specific strategy for handling AI/the singularity. I think this would be true for most possible strategies SI could announce. I'd expect the same basic argument to hold for SI members (though I'm less sure because I know less about SI).

I think you underestimate the amount of information that a natural language sentence can carry and signal.

Quite possible! But in any case, a sentence can carry lots of information about one thing, but not another. One has to look at the probability of a sentence or claim conditional on a specific thing. As I see it, P(AS says some people would be alienated | SI has a terrible secret strategy) is about equal to P(AS says some people would be alienated | SI has an un-terrible secret strategy), so the likelihood ratio is about one, and AnnaSalamon's belief discriminates poorly between those two particular hypotheses.
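The likelihood-ratio point above can be sketched in a few lines of code. This is an illustration I'm adding, not something from the thread, and the numbers are purely hypothetical; the point is just that an observation which is about equally probable under two hypotheses leaves the odds between them unchanged.

```python
# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
# If an observation is roughly as likely under hypothesis H1 as under H2,
# the likelihood ratio is ~1 and the observation barely moves our beliefs.

def posterior_odds(prior_odds, p_obs_given_h1, p_obs_given_h2):
    """Update the odds of H1 vs. H2 on an observation."""
    likelihood_ratio = p_obs_given_h1 / p_obs_given_h2
    return prior_odds * likelihood_ratio

# Hypothetical numbers: "AS says some people would be alienated" is about
# equally likely whether SI's strategy is terrible or un-terrible.
prior = 0.1  # illustrative prior odds of "terrible strategy"
post = posterior_odds(prior, p_obs_given_h1=0.9, p_obs_given_h2=0.9)
print(post)  # likelihood ratio is 1.0, so the odds are unchanged: 0.1
```

A statement only discriminates between two hypotheses to the extent that the conditional probabilities differ; that is the sense in which the sentence carries little information *about this particular question*, however much it carries about others.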

It is abundantly clear that SI is really bad at PR. I assign a high probability to the possibility that her and other members of the SI are revealing a lot of what is going on behind the scenes by being careless about their communication.

Plausible, but I doubt it's true for this specific example.

As I see it, P(AS says some people would be alienated | SI has a terrible secret strategy) is about equal to P(AS says some people would be alienated | SI has an un-terrible secret strategy), so the likelihood ratio is about one...

If I were to accept your estimate, then the utilities associated with P(people alienated | terrible strategy) and P(people alienated | un-terrible strategy) would force you to act as if the first possibility were true.

I don't follow. Do you mean that the potential disutility of SI having a terrible strategy is so much bigger than the potential utility of SI having an un-terrible strategy that, given equal likelihoods, I should act against SI? If so, I disagree.

Quite possible! But in any case, a sentence can carry lots of information about one thing, but not another. One has to look at the probability of a sentence or claim conditional on a specific thing. As I see it, P(AS says some people would be alienated | SI has a terrible secret strategy) is about equal to ...

Blah blah blah...full stop. We're talking about the communication of primates with other primates. Evolution honed your skills to detect the intention and possible bullshit in the output of other primates. Use your intuition!

I disagree. LWers have a range of opinions on AI & the singularity ...

I am not sure what you are getting at. If she thinks that there are strategies that should be kept secret for political reasons or whatever and admits it, that's bad from any possible viewpoint.

Use your intuition!

I have. My gut didn't raise a red flag when I read AnnaSalamon's post, but it did when I read yours.

I am not sure what you are getting at.

I was giving a reason for my claim that there'd be someone on LW/in SI who'd be alienated by all but the blandest of strategies.

If she thinks that there are strategies that should be kept secret for political reasons or whatever and admits it, that's bad from any possible viewpoint.

Maybe she thinks that and maybe she doesn't, but either way she didn't admit it. (At least not in the post I'm talking about. I haven't read AS's whole comment history.)

To my intuitions you sound exactly like a bitter excluded nobody attacking someone successful and popular. You DON'T talk like someone who sees through the lies of an evil greedy deceiver and honestly wants people to examine what he says and come to the correct opinion.

It isn't enough to talk to them, to review their output and conclude that they are most likely friendly. Doing so and contributing money is akin to letting an AI that is not provably friendly out of the box. They either have to prove that they are friendly or make all their work transparent. Otherwise the right thing to do is to label them as terrorists and tell them to fuck off.

I think the "mostly harmless" phrase still applies. These look like kids with firecrackers. The folk we should watch out for are more likely to be the Chinese, the military, hedge funds - and so on.

Maybe you can give an example of the kind of thing that you're worried about? What might you say that could get you penalized for sounding crazy?

(Maybe we could take this discussion private; I'm also curious what kinds of questions these considerations apply to.)

...most of whom have far less understanding of the relevant issues (by virtue of not working on them every week for months or years)?

Right, better to hide in your ivory tower only talking to people who agree with you. A perfect recipe to reinforce crazy ideas and amplify any biases.

signalling issues with a diverse external audience can hinder accurate discussion

Minutes can be much more general than (video) transcripts.

I would be surprised if the optimal solution isn't a third alternative and is instead total secrecy or manipulable complete transcription.

Eliezer's investment into OB/LW apparently hasn't returned even a single full-time FAI researcher...

I believe that the SIAI has been very successful in using OB/LW not only to raise awareness of risks from AI but to lend credence to the idea. From the very beginning I admired that feat.

Eliezer Yudkowsky's homepage is a perfect example of its type. Just imagine he had concentrated solely on spreading the idea of risks from AI and the necessity of a friendliness theory. Without any background relating to business or an academic degree, to many people he would appear to be yet another crackpot spreading prophecies of doom. But someone who is apparently well-versed in probability theory, who studied cognitive biases and tries to refine the art of rationality? Someone like that can't possibly be deluded enough to hold some complex beliefs that are completely unfounded; there must be more to it.

That's probably the biggest public relations stunt in the history of marketing extraordinary ideas.

Certainly, by many metrics LW can be considered wildly successful, and my comment wasn't meant to be a criticism of Eliezer or SIAI. But if SIAI was intending to build an FAI using its own team of FAI researchers, then at least so far LW has failed to recruit them any such researchers. I'm trying to figure out if this was the expected outcome, and if not, how updating on it has changed SIAI's plans. (Or to remind them to update in case they forgot to do so.)

Most of your analysis seems right, but the last sentence seems likely to be off. There have been a lot of clever PR stunts in history.

There have been a lot of clever PR stunts in history.

Most of them have not been targeting smart and educated nonconformists. Eliezer successfully changed people's minds by installing a way of thinking (a framework of heuristics, concepts and ideas) that is fine-tuned to non-obviously culminate in one inevitable conclusion: that you want to contribute money to his charity because it is rational to do so.

Take a look at the sequences in the light of the Singularity Institute. Even the Quantum Sequence helps to drive home a point that is indispensable for convincing people, who would otherwise be skeptical, that it is rational to take risks from AI seriously. The Sequences promulgate that logical implications of general beliefs you already have do not cost you extra probability, and that it would be logically rude to demand some knowably unobtainable evidence.

A true masterpiece.

I have been informally probing the smart people I meet about whether they're aware of LW. A surprisingly high number of the answers have been 'Yes'. I expect this is already making an impact on, at the very least, a less risky distribution of funding sources, and will probably produce a good increase in funding once some of them (as many are in startups) hit paydirt.

He also is reluctant to release work that he thinks is relevant to building AGI.

Sooner or later he will have to present some results. As the advent of AGI moves closer, people will start to panic and demand hard evidence that the SIAI is worth their money. Even someone who has published a lot of material on rationality and a popular fanfic will run out of credit, and people will stop taking his word for it.

Luke discussed this a while back here.

I agree that this is an important question.

the publications listed on the SIAI page--including TDT--are mostly high-level arguments that don't meet this standard. How do you plan to change this?

This is my favorite of the questions so far.

How are you going to address the perceived and actual lack of rigor associated with SIAI?

A clarifying question. By 'rigor', do you mean the kind of rigor that is required to publish in journals like Risk Analysis or Minds and Machines, or do you mean something else by 'rigor'?

A clarifying question. By 'rigor', do you mean the kind of rigor that is required to publish in journals like Risk Analysis or Minds and Machines, or do you mean something else by 'rigor'?

I mean the kind of precise, mathematical analysis that would be required to publish at conferences like NIPS or in the Journal of Philosophical Logic. This entails development of technical results that are sufficiently clear and modular that other researchers can use them in their own work. In 15 years, I want to see a textbook on the mathematics of FAI that I can put on my bookshelf next to Pearl's Causality, Sipser's Introduction to the Theory of Computation and MacKay's Information Theory, Inference, and Learning Algorithms. This is not going to happen if research of sufficient quality doesn't start soon.

In 15 years, I want to see a textbook on the mathematics of FAI that I can put on my bookshelf next to Pearl's Causality, Sipser's Introduction to the Theory of Computation and MacKay's Information Theory, Inference, and Learning Algorithms.

My day brightened imagining that!

Thanks for clarifying.

Addendum: Since the people who upvoted the question were in the same position as you with respect to its interpretation, it would be good to not only address my intended meaning, but all major modes of interpretation.

By 'rigor', do you mean the kind of rigor that is required to publish in journals like Risk Analysis or Minds and Machines, or do you mean something else by 'rigor'?

I can't speak for the original questioner, but take for example the latest post by Holden Karnofsky from GiveWell. I would like to see a response by the SIAI that applies the same amount of mathematical rigor to show that it actually is the rational choice from the point of view of charitable giving.

A potential donor might currently get the impression that the SIAI has written a lot of rather colloquial posts on rationality rather than rigorous papers on the nature of AGI, not to mention friendly AI. In contrast, GiveWell appears to concentrate on its main objective, the evaluation of charities. In doing so they are being strictly technical, an approach that introduces a high degree of focus by tabooing colloquial language and thereby reducing ambiguity, while allowing others to review their work.

Some of the currently available papers might, in a less favorably academic context, be viewed as some amount of handwaving mixed with speculations.

I'd like to answer (on video) submitted questions from the Less Wrong community just as Eliezer did two years ago.

That was the most horribly designed thing I've ever seen anyone do on LessWrong, as I once described here. So please, please, no video.

The questions are text. Have your answers in text too, so that we can actually read them -- unless there's some particular question which would actually be enhanced by the use of video (e.g. you'd like to show an animated graph or a computer simulation or something).

If there's nothing I can say to convince you against using video, then I beg you to at least take the time to read my more specific problems in the link above and correct those particular flaws: a single audio file that we can at least play and listen to in the background while we're doing something else, instead of 30 videos that we must click individually. If not that, at least a clear description of the questions on the same page (AND repeated clearly in the audio itself), so that we can see the questions that interest us, instead of a link to a different page.

But please, just consider text instead. Text has the highest signal-to-noise ratio. We can actually read it in our leisure. We can go back and forth and quote things exactly. TEXT IS NIFTY.

I disagree completely, as video has value not present in text, and text is easily derived from video. If this has not been done for Eliezer's videos, I volunteer to transcribe them - please let me know.

I just tried to find a transcript for Eliezer's Q&A and couldn't find one. So I'm taking you up on your offer!

Also, video is easily derived from text and I would actually enjoy watching a SingInst Q&A made with that sort of app :-)

Looks like you're right. I commit to working on this over the next few weeks. Please check in with me every so often (via comment here would be fine) to gauge my progress and encourage completion.

It's approximately 120 minutes of video; taking a number from wikipedia gives me 150 spoken wpm, divided by my typing wpm gives me about 6 hours, which will be optimistic - let's double it to 12, at let's say an average of 30 mins per day gives me 24 days. Let's see how it goes!
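The back-of-envelope estimate above checks out; here it is as a quick calculation. The 50 wpm typing speed is my assumption (the commenter didn't state theirs), inferred from the comment's "about 6 hours" figure:

```python
# Back-of-envelope check of the transcription estimate in the comment.
video_minutes = 120     # ~2 hours of video
speaking_wpm = 150      # spoken words per minute, figure from Wikipedia
typing_wpm = 50         # assumed typing speed, implied by the 6-hour estimate

total_words = video_minutes * speaking_wpm      # 18,000 words to transcribe
typing_hours = total_words / typing_wpm / 60    # raw typing time
pessimistic_hours = typing_hours * 2            # doubled for overhead
days = pessimistic_hours * 60 / 30              # at 30 minutes per day

print(typing_hours, pessimistic_hours, days)    # 6.0 12.0 24.0
```

The numbers reproduce the comment's estimate: about 6 hours of raw typing, 12 hours with overhead, 24 days at half an hour per day.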

Checking in. Do you have the first 750 words done?

I have the first four, and six of the shortest answers done, so yes. I had a lot of spare time yesterday so I thought I'd get a head start. Today may be similar.

I am now roughly 60% done. I've been spending more time each day than I anticipated; I have been known to overcompensate for the planning fallacy :)

Relative to manifesting video of the person speaking the answers in a genuine manner after the fact, yes. But point taken, the irony of manually transcribing videos from an AI researcher is not lost on me. I feel somewhat like a monk in the Bayesian monastery.

Why not just play the audio to something like the Dragon Dictation app on an iPhone and then go back and proof it?

I'm skeptical of the time it would save. The app won't work for the length of the videos, but if you're aware of another great, free program, let me know.

What would the SIAI do given various amounts of money? Would it make a difference if you had 10 or 100 million dollars at your disposal, would a lot of money alter your strategic plan significantly?

The staff and leadership at the SIAI seems to be undergoing a lot of changes recently. Is instability in the organisation something to be concerned about?

What is each member of the SIAI currently doing and how is it related to friendly AI research?

The Team page can answer much of this question. Is there any staff member in particular for whom the connection between their duties and our mission is unclear?

(Carl isn't on the page yet; we need to get his photo.)

The Team page can answer much of this question. Is there any staff member in particular for whom the connection between their duties and our mission is unclear?

Louie Helm is Singularity Institute's Director of Development. He manages donor relations, grant writing, and talent recruitment.

Here are some of the actions that I would take as a director of development:

  • Talk to Peter Thiel and ask him why he donated more money to the Seasteading Institute than to the SIAI.
  • Sit down with other SIAI members and ask what talents we need so I can actually get in touch with them.
  • Visit various conferences and ask experts how they would use their expertise if they were told to ensure the safety of artificial general intelligence.

Michael Anissimov is responsible for compiling, distributing, and promoting SIAI media materials.

What I would do:

  • Ask actual media experts what they would do, like those who created the creationist viral video Expelled or the trailer for the book You Are Not So Smart.
  • Ask Kurzweil if he would be willing to concentrate more strongly on the negative effects of a possible Singularity and to promote the Singularity Institute.
  • Ask Peter Thiel and Jaan Tallinn if they could use their influence or companies to promote the Singularity Institute.
  • Talk with other members about the importance of public relations and teach them how to deal with the media.

Anna Salamon is a full-time SIAI researcher.

What is she researching right now? With all due respect, the Uncertain Future web project doesn't look like something that a researcher who is capable of making progress on the FAI problem could spend 3 years working on.

Eliezer Yudkowsky is the foremost researcher on Friendly AI and recursive self-improvement.

He's still writing his book on rationality? How is it going? Is he planning a book tour? Does he already know who he is going to send the book to for free, e.g. Richard Dawkins or other people who could promote it on their blogs?

Edwin Evans is the Chairman of the Singularity Institute Board of Directors

No clue what he is doing, or could be doing, right now.

Ray Kurzweil

It looks like he's doing nothing except being part of the team page.

Amy Willey, J.D., is the Singularity Institute's Chief Operating Officer, and is responsible for institute operations and legal matters.

What I would do:

  • Try to figure out and make a detailed plan on how to stop possible dangerous AGI projects by all legal means (there are various researchers who believe that superintelligence could happen before 2030).
  • Devise a plan on how to deal with legal challenges arising from possible terrorist attacks carried out by people who loosely associated themselves with the mission of the SIAI, without its knowledge. For example, how to deal with a house search.

Michael Vassar is SIAI's President, and provides overall leadership of the SIAI

As president, one of the first actions I would take is to talk with everyone about the importance of data security. I would further make sure that there are encrypted backups of my organisation's work on different continents and under different jurisdictions, so that various kinds of catastrophes, including an obligation of disclosure imposed by a government, can be mitigated or avoided.

A lot of Eliezer's work has not been strongly related to FAI, but rather to popularizing rational thinking. In your view, should the SIAI focus exclusively on AI issues, or should it also care about rationality? In that context, how does Eliezer's ongoing work relate to the SIAI?

In general, what will you be doing as Executive Director?

(This might be a question you could answer briefly as a reply to this comment.)

And how will your duties differ from those of the President?

Congrats Luke!

Just a form/media comment : I would personally greatly prefer a text Q&A page rather than a video, for many reasons (my understanding of written English is higher than of spoken English, text is easier to re-read or read at your own speed, much less intrusive media that I can for example read during small breaks at work while I can't for video, poor Internet bandwidth at home making downloading video always painful to me, ...).

One serious danger for organizations is that they can easily outlive their usefulness or can convince themselves that they are still relevant when they are not. Essentially this is a form of lost purpose. This is not a bad thing if the organizations are still doing useful work, but this isn't always the case. In this context, are there specific sets of events (other than the advent of a Singularity) which you think will make the SIAI need to essentially reevaluate its goals and purpose at a fundamental level?

Congratulations, but why do you think your comparative advantage lies in being an executive director? Won't that cut into your time budget for reading, writing, and thinking?

To the extent that SIAI intends to work directly on FAI, potential donors (and many others) need to evaluate not only whether the organization is competent, but whether it is completely dedicated to its explicitly altruistic goals.

What is SIAI doing to ensure that it is transparently trustworthy for the task it proposes?

(I'm more interested in structural initiatives than in arguments that it'd be silly to be selfish about Singularity-sized projects; those arguments are contingent on SIAI's presuppositions, and the kind of trustworthiness I'm asking about encompasses the veracity of SIAI on these assumptions.)

For example, have we heard anything about that big embezzlement?

Some of the money has been recovered. The court date that concerns most of the money is currently scheduled for January 2012.

As I understand it, we won a stipulated judgment for repayment of $40k+ of it. Another court date has been scheduled (I think for late March?) to give us a chance to argue for the rest of what we're owed.

Late March has passed. How did things pan out?

We won some more repayment in another stipulated judgment and there's another court date this month.

Good question. And for people who missed it, this refers to money that was reported stolen on SI's tax documents a few years ago. (relevant thread)

I'm more interested in structural initiatives

Can you give any examples of what you're thinking of, so I can be clearer about what you have in mind when you ask your question?

I'm actually not coming up with any- it seems to be a tough problem. Here's an elaborate hypothetical that I'm not particularly worried about, but which serves as a case study:

Suppose that Robin Hanson is right about the Singularity (no discontinuity, no singleton, just rapid economic doubling until technology reaches physical limits, at which point it's a hardscrapple expansion through the future lightcone for those rich enough to afford descendants), and that furthermore, EY knows it and has been trying to deceive the rest of us in order to fund an early AI, and thus grab a share of the Singularity pie for himself and a few chosen friends.

The things that make this seem implausible right now are that the SIAI people I know don't seem to be the sort of people who are into long cons, and that their object-level arguments about the Singularity make sense to me. But, uh, I'm not sure that I can stake the future on my ability to play a game of Mafia. So I'm wondering if SIAI has come up with any ideas (stronger than a mission statement) to make credible their dedication to a fair Singularity.

Right.

I haven't devoted much time to this because I don't think anybody who has interacted with us in person has ever thought this was likely, and I'm not sure if anyone even on the internet has ever made the accusation - though of course some have raised the vague possibility, as you have. In other words, I doubt this worry is anyone's true rejection, whereas I suspect the lack of peer-reviewed papers from SIAI is many people's true rejection.

Skepticism about SIAI's competence screens off skepticism about SIAI's intentions, so of course that's not the true rejection for the vast majority of people. But it genuinely troubles me if nobody's thought of the latter question at all, beyond "Trust us, we have no incentive to implement anything but CEV".

If I told you that a large government or corporation was working hard on AGI plus Friendliness content (and that they were avoiding the obvious traps), even if they claimed altruistic goals, wouldn't you worry a bit about their real plan? What features would make you more or less worried?

I think the key point is that we're not there yet. Whatever theoretical tools we shape now are either generally useful or generally useless, irrespective of considerations of motive; the currently relevant question is (potential) competence. Only at some point in the (moderately distant) future, conditional on current and future work bearing fruit, might motive become relevant.

What features would make you more or less worried?

I'd worry about selfish institutional behavior, or explicit identification of the programmers' goals with the nation/corporation's selfish interests. Also, I guess, belief in the moral infallibility of some guru.

Otherwise I wouldn't worry about motives, not unless I thought one programmer could feasibly deceive the others and tell the AI to look only at this person's goals. Well, I have to qualify that -- if everyone in the relevant subculture agreed on moral issues and we never saw any public disagreement on what the future of humanity should look like, then maybe I'd worry. That might give each of them a greater expectation of getting what they want if they go with a more limited goal than CEV.

An "outside view" might be to put the SI in the reference class of "groups who are trying to create a utopia" and observe that previous such efforts that have managed to gain momentum have tended to make the world worse.

I think the reality is more complicated than that, but that might be part of what motivates these kind of questions.

I think the biggest specific trust-related issue I have is with CEV - getting the utility function generation process right is really important, and in an optimal world I'd expect to see CEV subjected to a process of continual improvement and informed discussion. I haven't seen that, but it's hard to tell whether the SI are being overly protective of their CEV document or whether it's just really hard getting the right people talking about it in the right way.

Am I to take this as a general answer to the overall question of trustworthiness or is this intended just as an answer to the specific example?

Suppose that Robin Hanson is right about the Singularity (no discontinuity, no singleton, just rapid economic doubling until technology reaches physical limits, at which point it's a hardscrapple expansion through the future lightcone for those rich enough to afford descendants), and that furthermore, EY knows it and has been trying to deceive the rest of us in order to fund an early AI, and thus grab a share of the Singularity pie for himself and a few chosen friends.

It would be clearer to say that Robin is right about the future, that there will not be a singularity. A hardscrapple race through the frontier basically just isn't one.

If you want to hypothesize that SingInst has secrets plus an evil plan, the secrets and plan have to combine in such a way that it's a good plan.

In June you indicated that exciting developments are happening right now but that it will take a while for things to happen and be announced. Are those developments still in progress?