Daniel Kokotajlo — LessWrong

Was a philosophy PhD student, left to work at AI Impacts, then Center on Long-Term Risk, then OpenAI. Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI. Now executive director of the AI Futures Project. I subscribe to Crocker's Rules and am especially interested to hear unsolicited constructive criticism. http://sl4.org/crocker.html

Some of my favorite memes:

(by Rob Wiblin)

Comic. Megan & Cueball show White Hat a graph of a line going up, not yet at, but heading towards, a threshold labelled "BAD". White Hat: "So things will be bad?" Megan: "Unless someone stops it." White Hat: "Will someone do that?" Megan: "We don't know, that's why we're showing you." White Hat: "Well, let me know if that happens!" Megan: "Based on this conversation, it already has."
(xkcd)

My EA Journey, depicted on the whiteboard at CLR:

(h/t Scott Alexander)

Alex Blechman @AlexBlechman Sci-Fi Author: In my book I invented the Torment Nexus as a cautionary tale Tech Company: At long last, we have created the Torment Nexus from classic sci-fi novel Don't Create The Torment Nexus 5:49 PM Nov 8, 2021. Twitter Web App

For my part I just somehow never saw this. Dunno why. Maybe I was busy and didn't check LW for a while idk.

Wow somehow this article never came to my attention, sorry I missed it, thanks Eli Lifland for sending it to me!

OK, thanks!

FWIW I do think that superintelligent AI, when it eventually exists, will be more analogous to a superhuman god doing research beyond the comprehension of mere humans, than to a difficult enemy that could in theory be beaten by the main character. Like, yes, in theory it could be beaten, in the same sense that in theory I could beat Magnus Carlsen in chess or in theory Cuba could invade and conquer the USA.

I've bought it and plan to read it, you are the second person to recommend it to me recently.

Curious about how you think it's more realistic than AI 2027? AI 2027 doesn't really feature any superpersuasion, much less intelligence-as-mind-control. It does have bioweapons but they aren't at all important to the plot. As for boiling the oceans... I mean I think that'll happen in the next decade or two unless the powers that be decide to regulate economic growth to prevent it, which they might or might not.

Huh. Seems super wrong to me fwiw. How would you rank AIFP on the list?

No, alas. :( iirc I had a half-finished redraft and then ended up prioritizing other things.

For context, how would you rank the AI safety community w.r.t. integrity and honor, compared to the following groups:

1. AGI companies
2. Mainstream political parties (the organizations, not the voters, so e.g. the politicians and their staff)
3. Mainstream political movements e.g. neoliberalism, wokism, china hawks, BLM,
4. A typical university department
5. Elite opinion formers (e.g. the kind of people whose Substacks and op-eds are widely read and highly influential in DC, silicon valley, etc.)
6. A typical startup
7. A typical large bloated bureaucracy or corporation
8. A typical religion e.g. christianity, islam, etc.
9. The US military

This ontology allows clearer and more nuanced understanding of what's going on and dispels some confusions.

The ontology seems good to me, but what confusions is it dispelling? I'm out of the loop here.

IIUC the core of this post is the following:

There are three forces/reasons pushing very long-sighted ambitious agents/computations to make use of very short-sighted, unambitious agents/computations:

1. Schelling points for acausal coordination

2. Epistemology works best if you just myopically focus on answering correctly whatever question is in front of you, rather than e.g. trying to optimize your long-run average correctness or something.

3. Short-sighted, unambitious agents/computations are less of a threat, more easily controlled.

I'll ignore 1 for now. For 2 and 3, I think I understand what you are saying on a vibes/intuitive level, but I don't trust my vibes/intuition enough. I'd like to see 2 and 3 spelled out and justified more rigorously. IIRC there's an academic philosophy literature on 2, but I'm afraid I don't remember it much. Are you familiar with it? As for 3, here's a counterpoint: Yes, more longsighted agents/computations carry with them risk of various kinds of Evil. But, the super myopic ones also have their drawbacks. (Example: Sometimes you'd rather have a bureaucracy staffed with somewhat agentic people who can make exceptions to the rules when needed, than a bureaucracy staffed with apathetic rule-followers.) You haven't really argued that the cost-benefit analysis systematically favors delegating to myopic agents/computations.

The Takeoff Speeds Model Predicts We May Be Entering Crunch Time