Former AI safety research engineer, now AI governance researcher at OpenAI.


The thing I'm picturing here is a futures contract where charizard-shirt-guy is obligated to deliver 3 trillion paperclips in exchange for one soul. And, assuming a reasonable discount rate, this is a better deal than only receiving a handful of paperclips now in exchange for the same soul. (I agree that you wouldn't want to invest in a current-market-price paperclip futures contract.)

Damn, MMDoom is a good one. New lore: it won the 2055 technique award.

Judges' ratings:

Technique: 5/10

The training techniques used here are in general very standard ones (although the dissonance filters were a nice touch). For a higher score on this metric, we would have expected more careful work to increase the stability of self-evaluation and/or the accuracy of the judgments.

Novelty: 7/10

While the initial premise was a novel one to us, we thought that more new ideas could have been incorporated into this entry in order for it to score more highly on this metric. For example, the "outliers" in the entry's predictions were a missed opportunity to communicate an underlying pattern. Similarly, the instability of the self-evaluation could have been incorporated into the entry in some clearer way.

Artistry: 9/10

We consider the piece a fascinating concept—one which forces the judges to confront the automatability of their own labors. Holding a mirror to the faces of viewers is certainly a classic artistic endeavor. We also appreciated the artistic irony of the entry's inability to perceive itself.

I think we have failed, thus far. I'm sad about that. When I began posting in 2018, I assumed that the community was careful and trustworthy, and that undeserved connotations would not easily sneak into our work and discourse. I no longer believe that, and no longer hold that trust.

I empathize with this, and have complained similarly (e.g. here).

I have also been trying to figure out why I feel quite a strong urge to push back on posts like this one. E.g. in this case I do in fact agree that only a handful of people actually understand AI risk arguments well enough to avoid falling into "suggestive names" traps. But I think there's a kind of weak man effect where if you point out enough examples of people making these mistakes, it discredits even those people who avoid the trap.

Maybe another way of saying this: of course most people are wrong about a bunch of this stuff. But the jump from that to claiming the community or field has failed isn't a valid one, because the success of a field is much more dependent on max performance than mean performance.

In this particular case, Ajeya does seem to lean on the word "reward" pretty heavily when reasoning about how an AI will generalize. Without that word, it's harder to justify privileging specific hypotheses about what long-term goals an agent will pursue in deployment. I've previously complained about this here.

Ryan, curious if you agree with my take here.

Copying over a response I wrote on Twitter to Emmett Shear, who argued that "it's just a bad way to solve the problem. An ever more powerful and sophisticated enemy? ... If the process continues you just lose eventually".

I think there are (at least) two strong reasons to like this approach:

1. It’s complementary to alignment.

2. It’s iterative and incremental. The frame where you need to just “solve” alignment is often counterproductive. When thinking about control you can focus on gradually ramping up from setups that would control human-level AGIs, to setups that would control slightly superhuman AGIs, to…

As one example of this: as you get increasingly powerful AGI you can use it to identify more and more vulnerabilities in your code. Eventually you’ll get a system that can write provably secure code. Of course that’s still not a perfect guarantee, but if it happens before the level at which AGI gets really dangerous, that would be super helpful.

This is related to a more general criticism I have of the P(doom) framing: that it’s hard to optimize because it’s a nonlocal criterion. The effects of your actions will depend on how everyone responds to them, how they affect the deployment of the next generation of AIs, etc. An alternative framing I’ve been thinking about: the g(doom) framing. That is, as individuals we should each be trying to raise the general intelligence threshold at which bad things happen.

This is much more tractable to optimize! If I make my servers 10% more secure, then maybe an AGI needs to be 1% more intelligent in order to escape. If I make my alignment techniques 10% better, then maybe the AGI becomes misaligned 1% later in the training process.

You might say: “well, what happens after that”? But my point is that, as individuals, it’s counterproductive to each try to solve the whole problem ourselves. We need to make contributions that add up (across thousands of people) to decreasing P(doom), and I think approaches like AI control significantly increase g(doom) (the level of general intelligence at which you get doom), thereby buying more time for automated alignment, governance efforts, etc.

I originally found this comment helpful, but have now found other comments pushing back against it to be more helpful. Upon reflection, I don't think the comparison to MATS is very useful (a healthy field will have a bunch of intro programs), the criticism of Remmelt is less important given that Linda is responsible for most of the projects, the independence of the impact assessment is not crucial, and the lack of papers is relatively unsurprising given that it's targeting earlier-stage researchers/serving as a more introductory funnel than MATS.

my guess is the brain is highly redundant and works on ion channels that would require actually a quite substantial amount of matter to be displaced (comparatively)

Neurons are very small, though, compared with the size of a hole in a gas pipe that would be necessary to cause an explosive gas leak. (Especially because you then can't control where the gas goes after leaking, so it could take a lot of intervention to give the person a bunch of away-from-building momentum.)

I would probably agree with you if the building happened to have a ton of TNT sitting around in the basement.

The resulting probability distribution of events will reflect the shape of the wave-function, not your prior probability distribution, so I think Thomas' argument still doesn't go through.

This is a good point. But I don't think "particles being moved the minimum necessary distance to achieve the outcome" actually favors explosions. I think it probably favors the sensor hardware getting corrupted, or it might actually favor messing with the firemen's brains to make them decide to come earlier (or messing with your mother's brain to make her jump out of the building)—because both of these are highly sensitive systems where small changes can have large effects.

Does this undermine the parable? Kinda, I think. If you built a machine that samples from some bizarre inhuman distribution, and then you get bizarre outcomes, then the problem is not really about your wish any more, the problem is that you built a weirdly-sampling machine. (And then we can debate about the extent to which NNs are weirdly-sampling machines, I guess.)

The outcome pump is defined in a way that excludes the possibility of active subversion: it literally just keeps rerunning until the outcome is satisfied, which is a way of sampling based on (some kind of) prior probability. Yudkowsky is arguing that this is equivalent to a malicious genie. But this is a claim that can be false.
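To make the "keeps rerunning until the outcome is satisfied" mechanism concrete, here is a minimal rejection-sampling sketch. The names (`sample_prior`, `outcome_satisfied`, `outcome_pump`) are hypothetical stand-ins for the physics; the point is that the output distribution is just the prior conditioned on the outcome set, with no adversarial optimization anywhere in the loop.

```python
import random

def outcome_pump(sample_prior, outcome_satisfied, max_resets=1_000_000):
    """Resample trajectories until the outcome predicate holds.

    The returned trajectory is distributed according to the prior
    restricted to the outcome set — conditioning, not optimization.
    """
    for _ in range(max_resets):
        trajectory = sample_prior()
        if outcome_satisfied(trajectory):
            return trajectory
    raise RuntimeError("outcome too improbable under the prior")

# Toy example: condition a fair die roll on landing >= 5. The result
# is uniform over {5, 6}, i.e. the prior restricted to the outcome set.
roll = outcome_pump(lambda: random.randint(1, 6), lambda x: x >= 5)
```

Whether this behaves like a malicious genie then depends entirely on what the prior concentrates on within the outcome set, which is the empirical question about the sampling machine.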

In this specific case, I agree with Thomas that whether or not it's actually false will depend on the details of the function: "The further she gets from the building's center, the less the time machine's reset probability." But there's probably some not-too-complicated way to define it which would render the pump safe-ish (since this was a user-defined function).
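The dependence on the user-defined function can be sketched the same way: a pump whose reset probability varies over trajectories samples from the prior reweighted by (1 - reset probability), so its safety is exactly as good as the shape of that function. The names below (`soft_outcome_pump`, `reset_prob`) are hypothetical illustrations, not anything from the original parable.

```python
import random

def soft_outcome_pump(sample_prior, reset_prob, max_resets=1_000_000):
    """Keep a sampled trajectory with probability 1 - reset_prob(trajectory).

    The output distribution is the prior reweighted by (1 - reset_prob),
    so whether the pump is safe depends entirely on how reset_prob is
    shaped over outcomes — a badly chosen function recreates the genie.
    """
    for _ in range(max_resets):
        t = sample_prior()
        if random.random() >= reset_prob(t):
            return t
    raise RuntimeError("reset probability too close to 1 everywhere")

# A hard cutoff: always reset die rolls below 5, never reset otherwise.
# The output is then uniform over {5, 6}.
roll = soft_outcome_pump(
    lambda: random.randint(1, 6),
    lambda x: 0.0 if x >= 5 else 1.0,
)
```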
