Daniel Kokotajlo

Philosophy PhD student, worked at AI Impacts, then Center on Long-Term Risk, then OpenAI. Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI. Not sure what I'll do next yet. Views are my own & do not represent those of my current or former employer(s). I subscribe to Crocker's Rules and am especially interested to hear unsolicited constructive criticism. http://sl4.org/crocker.html

Some of my favorite memes:

[meme image, by Rob Wiblin]

My EA Journey, depicted on the whiteboard at CLR:

[whiteboard photo, h/t Scott Alexander]

Sequences

Agency: What it is and why it matters
AI Timelines
Takeoff and Takeover in the Past and Future

Wiki Contributions

Comments

I agree that 0.7% is the number to beat for people who mostly focus on helping present humans and who don't take acausal or simulation argument stuff or cryonics seriously. I think that even if I was much more optimistic about AI alignment, I'd still think that number would be fairly plausibly beaten by a 1-year pause that begins right around the time of AGI. 

What are the mechanisms people have given and why are you skeptical of them?

Perhaps. I don't know much about the yields and so forth at the time, nor about the specific plans if any that were made for nuclear combat.

But I'd speculate that dozens of kiloton-range fission bombs would have enabled the US and its allies to win a war against the USSR, perhaps by destroying dozens of cities, or perhaps by preventing concentrations of defensive force sufficient to stop an armored thrust.

OK, thanks for clarifying.

Personally I think a 1-year pause right around the time of AGI would give us something like 50% of the benefits of a 10-year pause. That's just an initial guess, not super stable. And quantitatively I think it would improve overall chances of AGI going well by double-digit percentage points at least. Such that it makes sense to do a 1-year pause even for the sake of an elderly relative avoiding death from cancer, not to mention all the younger people alive today.

Big +1 to that. Part of why I support (some kinds of) AI regulation is that I think they'll reduce the risk of totalitarianism, not increase it.

So, it sounds like you'd be in favor of a 1-year pause or slowdown then, but not a 10-year?

(Also, I object to your side-swipe at longtermism. According to Wikipedia, longtermism is "the ethical view that positively influencing the long-term future is a key moral priority of our time." "A key moral priority" doesn't mean "the only thing that has substantial moral value." If you had instead dunked on classic utilitarianism, I would have agreed.)

Hmm, I'm a bit surprised to hear you say that. I feel like I myself brought up regulatory capture a bunch of times in our conversations over the last two years. I think I even said it was the most likely scenario, in fact, and that it was making me seriously question whether what we were doing was helpful. Is this not how you remember it? Wanna hop on a call to discuss?

As for arguments of that form... I didn't say X is terrible, I said it often goes badly. If you round that off to "X is terrible" and fit it into the argument-form you are generally against, then I think to be consistent you'd have to give similar treatment to a lot of common-sense good things, e.g. doing surgery on a patient who seems likely to die absent treatment.

I also think we might be talking past each other re regulation. As I said elsewhere in this discussion (on the OP's blog), I am not in favor of generically increasing the amount of AI regulation in the world. I would instead advocate for something more targeted -- regulation that I actually think would really work, if implemented well. And I'm very concerned about the "if implemented well" part and have a lot to say about that too.

What are the regulations that you are concerned about, that (a) are being seriously advocated by people with P(doom) >80%, and (b) that have a significant chance of causing x-risk via totalitarianism or technical-alignment-incompetence?

Also, the US did consider the possibility of waging a preemptive nuclear war on the USSR to prevent it from getting nukes. (von Neumann advocated for this I think?) If the US was more of a warmonger, they might have done it, and then there would have been a more unambiguous world takeover.

Who is pushing for totalitarianism? I dispute that AI safety people are pushing for totalitarianism.

Reporting requirements, especially requirements to report to the public what your internal system capabilities are, so that it's impossible to have a secret AGI project. Also reporting requirements of the form "write a document explaining what capabilities, goals/values, constraints, etc. your AIs are supposed to have, justifying those claims, and submit it to public scrutiny." So e.g. if your argument is "we RLHF'd it to have those goals and constraints, and that probably works because there's No Evidence of deceptive alignment or other speculative failure modes," then at least the world can see that no, you don't have any better arguments than that.

That would be my minimal proposal. My maximal proposal would be something like "AGI research must be conducted in one place: the United Nations AGI Project, with a diverse group of nations able to see what's happening in the project and vote on each new major training run and have their own experts argue about the safety case etc."

There's a bunch of options in between.

I'd be quite happy with an AGI Pause if it happened; I just don't think it's going to happen, because the corporations are too powerful. I also think that some of the other proposals are strictly better while also being more politically feasible. (They are more complicated and more easily corrupted, though, which to me is the appeal of calling for a pause: a simple pause is harder to regulatory-capture than something more nuanced.)


I think this post doesn't engage with the arguments for anti-x-risk AI regulation, to a degree that I am frustrated by. Surely you know that the people you are talking to -- the proponents of anti-x-risk AI regulation -- are well aware of the flaws of governments generally and the way in which regulation often stifles progress or results in regulatory capture that is actively harmful to whatever cause it was supposed to support. Surely you know that e.g. Yudkowsky is well aware of all this... right?

The core argument these regulation proponents are making and have been making is "Yeah, regulation often sucks or misfires, but we have to try, because worlds that don't regulate AGI are even more doomed."

What do you think will happen if there is no regulation of AGI? Seriously, how do you think it will go? Maybe you just don't think AGI will happen for another decade or so?
