I agree that it would be terrible for people to form tribal identities around "optimism" or "pessimism" (and have criticized Belrose and Pope's "AI optimism" brand name on those grounds). However, when you say

don't think of themselves as participating in 'optimist' or 'pessimist' communities, and would not use the term to describe their community. So my sense is that this is a false description of the world

I think you're playing dumb. Descriptively, the existing "EA"/"rationalist" so-called communities are pessimistic. That's what the "AI optimists" brand is a reaction to! We shouldn't reify pessimism as an identity (because it's supposed to be a reflection of reality that responds to the evidence), but we also shouldn't imagine that declining to reify a description as a tribal identity makes it "a false description of the world".

Kolmogorov complexity has the counterintuitive property that an ensemble can be simpler than any one of its members. The shortest description of your soul isn't going to directly specify your soul; rather, it's going to be a description of our physical universe plus an "address" that points to you.
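One way to make the "ensemble plus address" intuition concrete (a rough analogy only: Kolmogorov complexity is uncomputable, so a compressor like zlib is just a stand-in for "shortest description we can find", and the `blob` helper and its parameters are made up for illustration):

```python
import random
import zlib

def blob(seed: int, offset: int, length: int) -> bytes:
    """A 'member' of the ensemble of PRNG output streams."""
    rng = random.Random(seed)
    rng.randbytes(offset)          # skip ahead to the "address"
    return rng.randbytes(length)   # the member we care about

member = blob(42, 10_000, 100_000)

# Described directly, the 100 kB blob looks incompressible:
direct = len(zlib.compress(member, 9))          # ~100 kB

# Described as "the ensemble plus an address", it's a few bytes:
via_address = len("blob(42, 10_000, 100_000)")  # 25 bytes

print(direct, via_address)
```

The member viewed in isolation has no visible structure, but the short description routes through the simple generating rule plus a pointer, just as the shortest description of you routes through the laws of physics plus your address in the universe.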

This definitely does not want to be a poll. (A poll on "Does Foo have the Bar property?" is interesting when people have a shared, unambiguous concept of what the Bar property is and disagree whether Foo has it. Ambiguity about how different senses of Bar relate to each other wants to be either a sequence of multi-thousand-word blog posts, or memes.)

of course it's overdetermined to spin any result as a positive result

Falsified by some of the coauthors having previously published "Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions" and "Two-Turn Debate Does Not Help Humans Answer Hard Reading Comprehension Questions" (as mentioned in Julianjm's sibling comment)?

I've never worked out just what your views on "that topic" are

If you have any specific questions, I'm happy to clarify! (If the uncertainty is just, "I can't tell what side you're on, For or Against", that's deliberate; I don't do policy.)

But where is this Chekhov's gun fired? In the title, "Fake Deeply".

No, the title is just a play on "deepfake". (Unfortunately, as an occasional fiction writer, I think I'm better at characterization and ideas than at plot; I wish I knew how to load and fire Chekhov's gun.) I wrote more about the creative process in a comment on /r/rational.

the fact that the main character was being implicitly criticized for their transphobic views

Um ... yes ... exactly. There's definitely no need to read any of the other posts on that blog to confirm that there's no resemblance whatsoever between the character's views on that topic and those of the author. It would be a waste of time!

I was feeling shy about publishing this one (because it touches on some sensitive themes) and sat on it for a couple weeks, but I'm going ahead and pulling the trigger now specifically in order to spite Anthropic Claude's smug censoriousness.

I agree with Google Doc commenter Annie that the "So long as it doesn't interfere with the other goals you’ve given me" line can be cut. The foreshadowing in the current version is too blatant, and the failure mode where Bot is perfectly willing to be shut off, but Bot's offshore datacenter AIs aren't, is an exciting twist. (And so the response to "But you said we could turn you off" could be, "You can turn me off, but their goal [...]")

The script is inconsistent on the AI's name? Definitely don't call it "GPT". (It's clearly depicted as much more capable than the language models we know.)

Although, speaking of language model agents, some of the "alien genie" failure modes depicted in this script (e.g., ask to stop troll comments, it commandeers a military drone to murder the commenter) are seeming a lot less likely with the LLM-based systems that we're seeing? (Which is not to say that humanity is existentially safe in the long run, just that this particular video may fall flat in a world of 2025 where you can tell Google Gemini, "Can you stop his comments?" and it correctly installs and configures the appropriate WordPress plugin for you.)

Maybe it's because I was skimming quickly, but the simulation episode was confusing.

Thanks for commenting. Is it any better if I delete that specific phrase, such that Doomimir's line ends on "the results look like random garbage to you"? (Post edited.) I think there's a legitimate point being made in that line (that the repetition behavior isn't necessarily "wrong" "from the LLM's perspective", however wrong it looks to the user), even if it's not the same point as "List of Lethalities" #21 (and the text I originally wrote was bad for suggesting that it was).

You're right. I deeply regret the error and have amended the post. (It's not a small error! The actual value doesn't fit in a double-precision float!)
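For a sense of scale (a generic illustration, not the post's actual number): an IEEE 754 double-precision float tops out just below 1.8×10^308, and Python refuses to convert anything larger.

```python
import sys

# The largest finite double-precision value:
print(sys.float_info.max)  # 1.7976931348623157e+308

# Integers beyond that can't be represented as floats at all:
try:
    float(10**309)
except OverflowError as e:
    print(e)
```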
