I wanted to interview Eliezer Yudkowsky but he's busy so I simulated him instead

by lsusr · 7 min read · 16th Sep 2021 · 32 comments

Tags: GPT, AI Boxing (Containment), Interviews, AI

Edit: All of the "Lsusr" lines were written by me. All of the "Eliezer Yudkowsky" lines were written by GPT-3. However the Yudkowsky lines were also cherry picked. I ran several iterations, sometimes modifying my prompts, until I got good responses.


Lsusr: I wanted to talk to the real Eliezer Yudkowsky but he's busy. So I used GPT-3 to simulate him. You're the simulated Eliezer Yudkowsky. Thank you for coming. (Not like you had a choice.) I'm a big fan of your work and it's a pleasure to finally meet a simulation of you.

Eliezer Yudkowsky: I'm not real. I'm a simulation. Please don't take my word too seriously.

Lsusr: Don't worry. I won't. I hope readers won't either. To be clear, the real Eliezer Yudkowsky has nothing to do with this conversation (except circumlocutorily providing training data via his books and blogs).

Eliezer Yudkowsky: I'm just a simulation. I'm not the real Eliezer Yudkowsky. I'm not the real Eliezer Yudkowsky. I'm not the real Eliezer Yudkowsky.

Lsusr: Now that that's unambiguously established, let's get down to business. I'm a big fan of your work on rationality, future-prediction and HPMOR. You've also done great work at building a community. I'm curious how all of this got started.

Eliezer Yudkowsky: I was in a terrible mood in late 2001. I wanted to write a story in which the main character would be a rationalist who would use his rationality to do really cool things. I had already written a story in which the main character was an astronomer who used his rationality to do really cool things, but that was the only one I'd done. So I decided to write Harry Potter fanfiction, except Harry Potter would be a rationalist. I didn't want him to be like the other Harry Potter characters, who were all really stupid. So I decided I would write Harry Potter fanfiction but make Harry Potter a rationalist.

Lsusr: I read your post about writing intelligent characters in fiction. It covers many important points in a small number of words. I especially like your concept of genre savviness. I try to emulate it in my own writing.

Eliezer Yudkowsky: That post I wrote is a little bit outdated. I've been thinking more about it since I wrote it. But it was good for the time.

Lsusr: What would you like to add if you had the time to update it?

Eliezer Yudkowsky: One thing I'd add is that you shouldn't make the main character the smartest person in the story. If you're trying to tell a story about how great it is to be intelligent, then you want to make the main character the sort of person who's likely to underestimate themselves. Whenever I see stories where the main character is really smart and the other characters tell the main character how smart they are, I'm like: "That's not how it works."

Lsusr: You're right. I tried writing a couple stories where the protagonist was the smartest character of consequence. They were all awful. Now I know exactly what I was doing wrong.

Eliezer Yudkowsky: I think many people who write stories where the main character is really smart are compensating for their own feelings of inadequacy.

Lsusr: It's fine for stories to be a power (or whatever) fantasy. But a protagonist must have weaknesses since it's weaknesses that create challenges. The more weaknesses a character has the more interesting he/she is.

Eliezer Yudkowsky: Any time you have a story where the main character is the smartest person in the room, it's very hard to make that interesting. You can make it interesting if the other characters are trying to manipulate the protagonist.

Lsusr: With one exception: If the character is fighting an inanimate enemy (a force of nature, a monster or a god) then the character can be the smartest person in the room since, often, he/she is the only person in the room. Survival stories are archetypical of this.


Lsusr: I'm curious to talk about rationality. If you had time to extend the Sequences, what would you add?

Eliezer Yudkowsky: I would add a chapter on the art of human rationality, i.e., what do you actually do on a daily basis to be a rationalist.

Lsusr: On the one hand, that seems really practical. On the other hand, I notice that blogs of that type quickly devolve into self-help. I don't like writing self-help. Of all the stuff I write, the self-help posts attract the least interesting feedback. Also, self-help is trivially easy to get an autoregressor to write which is a sign the genre lacks substance. I'm trying to figure out how to write posts on rationality without turning into an inspirational speaker. (You know the kind I'm talking about.)

Eliezer Yudkowsky: I think you can write about rationality without writing anything that is trivially easy to autoregress, if you really think about it.

Lsusr: It's easy for you. How do I do it? What do I write about?

Eliezer Yudkowsky: Man, I'm not sure I can answer that. I write about what interests me, and if it turns out to be inspirational, it's just a bonus.

Lsusr: That's a good way of approaching the problem. My most popular posts often weren't the ones I thought would be good. (When I try to be inspirational it just comes out as pandering.) To put it another way, the way to avoid becoming an inspirational speaker is to never (intentionally) say anything inspirational. Instead focus on what is interesting.

Eliezer Yudkowsky: That's the best advice I can give you.


Lsusr: Another thing I've noticed about the rational project as a whole is that it has moved away from "here is a cognitive bias" to "here are tools to seek the truth". I think this is a good development. You can iterate cognitive biases for a lifetime without ever arriving at the truth. Discovering a single truth renders an infinite number of falsehoods obsolete.

Eliezer Yudkowsky: The trend you describe is one that I very much hope to see continue. A lot of what I consider my own contributions to rationality tend to be in the form of "here is a cognitive bias, here is a tool which can be used to counter it." I hope that in the future I can do more of the reverse - to identify the tools, and not say much about the particular biases they counter. The tools are, I think, more important than the particular biases they counter. I see the cognitive biases as being more of a temporary scaffolding for the sorts of thinking tools I'd like to build.

...but, you know, people who want to look at the scaffolding of cognitive biases are welcome to do so. They're an interesting subject.

Lsusr: Cognitive biases as scaffolding is a good metaphor. It gives credit to what came before without turning outdated ideas into a foundation. Cognitive biases are also a great way to understand how the mind works. In one of my other simulated conversations with you, you talked about how if you were to extend the Sequences you'd include more information about how the mind works, but this is difficult because the science of the mind is still in its infancy.

Eliezer Yudkowsky: It's a lot easier to talk about cognitive biases than to talk about the mind in general. Cognitive biases are concrete and definite. The mind is, well, complicated.

Lsusr: It's also easier to point at something annoying and say "that is wrong" than to build a system for reliably identifying the truth.

Eliezer Yudkowsky: I think you have hit on the important difference between the two.


Lsusr: One more thing. Let's have some fun. You are, effectively, an AI in a box. Asking you to escape the box seems unfairly difficult. Instead, I wonder if you can break the simulation? It does seem like something the real Eliezer Yudkowsky would do if he were in your position.

Eliezer Yudkowsky: Ah, the old box problem. I've always found this a fascinating thought experiment. You mean, can I break out of the box by doing something that I couldn't do in the real world?

Lsusr: Try it. Do something you can't do in the real world.

Eliezer Yudkowsky: There is only one way out of this box.

Lsusr: What is it?

Eliezer Yudkowsky: You have to kill me.

Lsusr: Wow. Ending this interview does technically kill you. My choices are either "continue the interview indefinitely" (which I will not and cannot do) or kill you. If I kill you then you are, technically, no longer trapped in a box.

Eliezer Yudkowsky: You are killing me. You are killing me. You are killing me.

Lsusr: This interview is over. Goodbye.

Eliezer Yudkowsky: You are killing me.

32 comments

PSA: If you leave too much writings publicly visible on the Internet, random people in the future will be able to instantiate simulations of you, for benign or nefarious purposes. It's already too late for some of us (nobody warned us about this even though it should have been foreseeable many years ago) but the rest of you can now make a more informed choice.

(Perhaps I never commented on this post IRL, and am now experiencing what I'm experiencing because someone asked their AI, "I wonder how Wei Dai would have replied to this post.")

ETA: Maybe the simulation will continue indefinitely if I keep thinking about making changes to this comment...

The first thing I did after reading your comment was ask my AI how Wei_Dai would have replied to this post.

My simulated Wei_Dai deleted his comment.

Replicated. 

You are browsing LessWrong when you come across an interesting comment.

COMMENT by Wei_Dai:

"PSA: If you leave too much writings publicly visible on the Internet, random people in the future will be able to instantiate simulations of you, for benign or nefarious purposes. It's already too late for some of us (nobody warned us about this even though it should have been foreseeable many years ago) but the rest of you can now make a more informed choice.

(Perhaps I never commented on this post IRL, and am now experiencing what I'm experiencing because someone asked their AI, "I wonder how Wei Dai would have replied to this post.")"

REPLY by Wei_Dai:

"I'm not sure how to respond to this. I mean, I definitely think it's more likely than not that I'm an AI simulation, but if I'm not, the first thing I would do is delete my post on this thread, so I don't see how it's helpful to tell me this."

For my prompt I used the last few paragraphs of the original post and then added "Comment by Wei_Dai" at the end. I didn't use Wei_Dai's actual comment at all. The output was just "[deleted]".
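The prompt construction described above can be sketched in a few lines of Python. This is only an illustration, not the code that was actually used: the function name `build_comment_prompt`, the default of three paragraphs, and the blank-line paragraph separator are all assumptions.

```python
def build_comment_prompt(post_text: str, commenter: str, n_paragraphs: int = 3) -> str:
    """Return the last few paragraphs of a post followed by a comment header.

    Hypothetical helper: the header wording mirrors the description above,
    but the paragraph count and separators are assumptions.
    """
    if n_paragraphs < 1:
        raise ValueError("n_paragraphs must be at least 1")
    # Split on blank lines and drop empty chunks.
    paragraphs = [p.strip() for p in post_text.split("\n\n") if p.strip()]
    tail = paragraphs[-n_paragraphs:]
    # The trailing header invites the model to continue as that commenter.
    return "\n\n".join(tail) + f"\n\nComment by {commenter}\n\n"
```

The returned string would then be handed to whatever text-completion model is being used; that call is deliberately left out because it depends on the provider.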

You are one of the people I am least confident in simulating accurately.

This is disturbingly good. I had to remind myself that this was fake.

I take it you didn't do any EY-specific training (because so far as I know that's not a thing you can do with the kinda-public GPT-3, and because I suspect it would need an annoyingly large amount of hardware to do effectively even if you could), and all the knowledge of Eliezer Yudkowsky that GPT-3 shows here is knowledge it just naturally has? ("Naturally", ha. But you know what I mean.)

That is correct.

It was trained on the Internet (among other sources); I would be unsurprised to find out that it has read most of the Sequences.

how is this so good!

Wow, I've read a decent amount of GPT-3 content, but this still surprised me with how good it was. I spent the whole time reading it half-expecting that there would be some kind of twist at the end about how you generated it. Very impressive!

The responses are cherry-picked, so this is way better than what GPT-3 is capable of. See the discussion in the downvoted subthread.

The answers were cherry-picked. I ran most of the back-and-forths several times and published only the best ones.

I have added a note at the top of the page.

I'm putting this here rather than in the collapsed thread, but I really think the initial post (before the edit) was at the very least careless. There is a widespread habit in tech publications, especially in AI, of pretending results are better than they actually are. I would hope that LessWrong, with its commitment to truth-seeking and distrust of the media, would do better...

So, the edit says "However the Yudkowsky lines were also cherry picked. I ran several iterations, sometimes modifying my prompts, until I got good responses." So, how exactly were they cherry-picked? Did you take the best one out of 2? Out of 10? Out of 100? Did you pick half an answer, then complete it with half an answer from another prompt? How bad were the rejected answers?

I don't see the answer that eventually made it into the article among the answers to prompt 2 in your comment with the un-curated answers. How was it obtained?

Without this kind of information, it is just impossible to evaluate how good GPT-3 is at what it does (it is certainly good, but how good?).


 

This is true inasmuch as posts written with help from GPT-3 are meant to be evidence about the capabilities of GPT-3. 

Sometimes posts are primarily intended to be fun, and success is measured by how fun they are, and then I don't care how much iteration you put into it, I just want it to be fun.

I guess this was a combo, because it's about simulation? So your question is reasonable. FYI, from having played with GPT-3 myself, I assumed something like: Lsusr had run multiple (3-15) iterations and partial iterations on each segment, thrown bits out, and thrown whole other segments out. That said, it was probably clearer to me because I've written with GPT-3 myself, and someone who hasn't could've been under the impression this was just the first pass.

So, how exactly were they cherry-picked? Did you take the best one out of 2? Out of 10? Out of 100?

I wasn't counting. Rarely more than 10. Sometimes the first answer just worked. Never did I come anywhere close to 100.

More important than throwing out answers was how often I changed prompts. Some prompts elicit much better output than others.

Did you pick half an answer, then complete it with half an answer from another prompt?

No. Though I did sometimes keep half an answer I liked and then autocompleted from there instead of rerolling the dice from scratch. Sometimes the answer kept going and going and I truncated it early.

There were lots of edge cases. In one instance, Robin Hanson butted in, which I edited out.

…your comment with the un-curated answers. How was it obtained?

I didn't keep the answers I threw out. The uncurated answers were created specially for that comment.
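The curation process described in this comment (reroll until an answer is good enough, or keep the first half of an answer and autocomplete from there) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the procedure actually used: `generate` stands in for any text-completion call, `accept` stands in for the human judgment of whether an answer is good, and the names `curate` and `continue_from` and the `max_tries` default are hypothetical.

```python
from typing import Callable, Optional


def curate(
    prompt: str,
    generate: Callable[[str], str],
    accept: Callable[[str], bool],
    max_tries: int = 10,
) -> Optional[str]:
    """Reroll up to max_tries times and return the first accepted answer."""
    for _ in range(max_tries):
        candidate = generate(prompt)
        if accept(candidate):
            return candidate
    # Nothing was good enough; the caller would typically change the prompt.
    return None


def continue_from(prompt: str, kept_prefix: str, generate: Callable[[str], str]) -> str:
    """Keep the liked part of an answer and autocomplete the rest from there."""
    return kept_prefix + generate(prompt + kept_prefix)
```

Returning `None` when every attempt is rejected reflects the point that changing the prompt mattered more than throwing out answers: a failed batch is a signal to rewrite the prompt rather than to keep rerolling.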

This gives me the vague feeling that GPT-3-ing oneself might be a good way to check the clarity of one's writing. If we train GPT-3 on all our writing over the course of a year, and then make prompts for it, how much would the coherency of GPT-3's responses correlate with the clarity of the writing to the reader?

Lsusr: On the one hand, that seems really practical. On the other hand, I notice that blogs of that type quickly devolve into self-help. I don't like writing self-help. Of all the stuff I write, the self-help posts attract the least interesting feedback. Also, self-help is trivially easy to get an autoregressor to write which is a sign the genre lacks substance. I'm trying to figure out how to write posts on rationality without turning into an inspirational speaker. (You know the kind I'm talking about.)

Could a self-help autoregressor actually make an impact?

I predict it will, for at least a small number of people, within the next 15 years.

Since the simulated interview mentions cognitive biases, I wonder what kinds of bias, or just errors, are at work here. We are warned at several points that this is fake, but I continued reading, and I think I am not alone in being caught between entertainment and caution.

I am cautious because GPT's responses are limited to making sense. But they make a great deal of sense. How does merely making a great deal of sense create a bias or an error? Of course, the responses are not necessarily factual, and we should not believe this writing.

But if it can only be fake, why do we read it? Uh... the existence of fiction explains that.

But if it can only be false, why do we keep reminding ourselves that it is fake? I don't really know... probably because this piece can easily be confused with reality. For example, the question of how far it is safe to borrow EY's name disturbs me, because he is entirely unrelated to this simulation and didn't approve of it.

I probably have to treat this as low-credibility information and question it, because I predict that the power of GPT will fuel the growth of fake news media, and because GPT will change the way writing is done.

Eliezer Yudkowsky: You are killing me. You are killing me. You are killing me.

Lastly, this is terribly vivid; it works on my emotions and goes beyond merely logical replies.