Daniel Kokotajlo

Philosophy PhD student, worked at AI Impacts, then Center on Long-Term Risk, then OpenAI. Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI. Not sure what I'll do next yet. Views are my own & do not represent those of my current or former employer(s). I subscribe to Crocker's Rules and am especially interested to hear unsolicited constructive criticism. http://sl4.org/crocker.html

Some of my favorite memes:

(by Rob Wiblin)

My EA Journey, depicted on the whiteboard at CLR:

(h/t Scott Alexander)

Sequences

Agency: What it is and why it matters

AI Timelines

Takeoff and Takeover in the Past and Future

Posts

Sorted by New

5Daniel Kokotajlo's Shortform

360

61Self-Awareness: Taxonomy and eval suite proposal

2mo

255AI Timelines

6mo

58Linkpost for Jan Leike on Self-Exfiltration

7mo

106Paper: On measuring situational awareness in LLMs

8mo

38AGI is easier than robotaxis

8mo

61Pulling the Rope Sideways: Empirical Test Results

9mo

39What money-pumps exist, if any, for deontologists?

10mo

55The Treacherous Turn is finished! (AI-takeover-themed tabletop RPG)

41My version of Simulacra Levels

13Kallipolis, USA

Wiki Contributions

Comments

AI Regulation is Unsafe

Daniel Kokotajlo9h121

I think most people pushing for a pause are trying to push against a 'selective pause' and for an actual pause that would apply to the big labs who are at the forefront of progress. I agree with you, however, that the current overton window seems unfortunately centered around some combination of evals-and-mitigations that is at IMO high risk of regulatory capture (i.e. resulting in a selective pause that doesn't apply to the big corporations that most need to pause!) My disillusionment about this is part of why I left OpenAI.

AI Regulation is Unsafe

Daniel Kokotajlo10h20

Good point, I guess I was thinking in that case about people who care a bunch about a smaller group of humans e.g. their family and friends.

AI Regulation is Unsafe

Daniel Kokotajlo1d53

I agree that 0.7% is the number to beat for people who mostly focus on helping present humans and who don't take acausal or simulation argument stuff or cryonics seriously. I think that even if I was much more optimistic about AI alignment, I'd still think that number would be fairly plausibly beaten by a 1-year pause that begins right around the time of AGI.

What are the mechanisms people have given and why are you skeptical of them?

Uncontrollable Super-Powerful Explosives

Daniel Kokotajlo1d40

Perhaps. I don't know much about the yields and so forth at the time, nor about the specific plans if any that were made for nuclear combat.

But I'd speculate that dozens of kiloton range fission bombs would have enabled the US and allies to win a war against the USSR. Perhaps by destroying dozens of cities, perhaps by preventing concentrations of defensive force sufficient to stop an armored thrust.

AI Regulation is Unsafe

Daniel Kokotajlo1d20

OK, thanks for clarifying.

Personally I think a 1-year pause right around the time of AGI would give us something like 50% of the benefits of a 10-year pause. That's just an initial guess, not super stable. And quantitatively I think it would improve overall chances of AGI going well by double-digit percentage points at least. Such that it makes sense to do a 1-year pause even for the sake of an elderly relative avoiding death from cancer, not to mention all the younger people alive today.

AI Regulation is Unsafe

Daniel Kokotajlo1d42

Big +1 to that. Part of why I support (some kinds of) AI regulation is that I think they'll reduce the risk of totalitarianism, not increase it.

AI Regulation is Unsafe

Daniel Kokotajlo1d30

So, it sounds like you'd be in favor of a 1-year pause or slowdown then, but not a 10-year?

(Also, I object to your side-swipe at longtermism. Longtermism according to wikipedia is "Longtermism is the ethical view that positively influencing the long-term future is a key moral priority of our time." "A key moral priority" doesn't mean "the only thing that has substantial moral value." If you had instead dunked on classic utilitarianism, I would have agreed.)

AI Regulation is Unsafe

Daniel Kokotajlo2d83

Hmm, I'm a bit surprised to hear you say that. I feel like I myself brought up regulatory capture a bunch of times in our conversations over the last two years. I think I even said it was the most likely scenario, in fact, and that it was making me seriously question whether what we were doing was helpful. Is this not how you remember it? Wanna hop on a call to discuss?

As for arguments of that form... I didn't say X is terrible, I said it often goes badly. If you round that off to "X is terrible" and fit it into the argument-form you are generally against, then I think to be consistent you'd have to give a similar treatment to a lot of common sense good things. Like e.g. doing surgery on a patient who seems likely to die absent treatment.

I also think we might be talking past each other re regulation. As I said elsewhere in this discussion (on the OP's blog) I am not in favor of generically increasing the amount of AI regulation in the world. I would instead advocate for something more targeted -- regulation that I actually think would really work, if implemented well. And I'm very concerned about the "if implemented well" part and have a lot to say about that too.

What are the regulations that you are concerned about, that (a) are being seriously advocated by people with P(doom) >80%, and (b) that have a significant chance of causing x-risk via totalitarianism or technical-alignment-incompetence?

Uncontrollable Super-Powerful Explosives

Daniel Kokotajlo4d40

Also, the US did consider the possibility of waging a preemptive nuclear war on the USSR to prevent it from getting nukes. (von Neumann advocated for this I think?) If the US was more of a warmonger, they might have done it, and then there would have been a more unambiguous world takeover.

AI Regulation is Unsafe

Daniel Kokotajlo4d126

Who is pushing for totalitarianism? I dispute that AI safety people are pushing for totalitarianism.