Debate experiments at The Curve, LessOnline and Manifest

Nathan Young

I like debate. I have done for years. So I have been slowly trying to improve it. Here is a set of theories I had and the experiments I've run so far.

Theory: Any debates are good.

Are any debates actually good at all? Should I give up?

Test: Watch different debates.

Evidence: I much prefer some debates to others.

Good debates:

Dr. Richard Carrier and Dr. Michael Licona. I like how they chat to one another.
Destiny and Ben Shapiro. I recall liking this one. I remember them as having good chemistry.
Jubilee’s “Surrounded” debates. I love an experimental format and these get a lot of different arguments in a short amount of time^[1].

Bad debates:

Finkelstein, Destiny and M. Rabbani & Benny Morris. Long and acrimonious. I think Lex Fridman is deeply guilty of the “I’ll just let them talk it out” school of debate. I think this is lazy.
Most things with William Lane Craig. Craig is an excellent debater on theology. I’m not sure I recall him ever losing. But his debates always hinge on niche points or technical arguments I don’t care about.
Anything with Jordan B. Peterson. Like trying to nail a cake to a wall.
Presidential debates. Trump in particular can lie with no cost at all, so he does.

Unclear:

Ezra Klein, Sam Harris. Bad that they don’t understand one another, but pretty interesting as a historical artefact to see two clever men who I like really fail to understand one another for very ~2018 culture war reasons.
Matt Dillahunty, Matthew Adelstein (aka Bentham's Bulldog). Dillahunty is sloppy but somehow his audience think he’s making good points. Frustrating to watch.

Status: Theory survived attempted falsification^[2].

Theory: The format is the problem.

Test: Run some different debate formats (see next).

Theory: Debates are bad because debaters focus on their own status.

They have to focus on how they appear to the audience and this stops them admitting points where they are wrong.

Test 1: Find ways to protect the status of the debaters

Evidence:

I tried running two debates like this at The Curve (Daniel Kokotajlo vs. Sayash Kapoor; Dean W. Ball vs. Gabriel Weil). I tried to moderate a bit more strongly than people tend to, ensuring that there were blocks of time where each was in control of the discussion.

The debates were okay but not great.

In both, it took us a long time to get to what felt like the meat of the discussion. I recall Ball and Weil saying they didn’t really understand one another’s position coming in.

In the Ball vs. Weil debate, they weren’t really interested in being moderated. To me it felt like Ball therefore spent a lot more time defending his position and had less control over the discussion than I would have liked to see (though I think he was fine with it).

Kokotajlo and Kapoor felt like a solid debate, though not spectacular.

Test 2: Try and remove the status of the debaters and place it somewhere else.

Evidence: Courtly debates, Future of the Democratic Party, China discussion.

Ray Rafiq and I have had a goofy idea for a while of debates in a court style: king, knight, fool, etc. So at LessOnline I tried this out. Each debate had a king (or queen) to set the topic, two knights to argue it and a fool to ask questions. They took about 10 minutes each.

I think our debaters (knights) were much less focused on their own positions than other rapid fire debates we could have run. In many ways it was a role play game. But it did feel like I partly succeeded in my aim - to pull status away from the debaters and put it somewhere else.

Later, Oliver Habryka wanted to run a session about the future of the Democratic Party. I pushed to try a new format there too, suggesting that Oliver would stand as the questioner and the discussion would be about what interested him. Whether somebody would speak and whether the audience would be able to ask questions would be up to him, and I would serve as a meta-moderator to guard his time and attention. Habryka is a good candidate here because he's high status within the community (CEO of Lightcone Infrastructure, who organise LessOnline) and people respect his thinking.

This felt really good. There was a single questioner which provided a single viewpoint, rather than many questions from the audience or a rambling discussion from the panel. To me, this gave the event shape. Questions were answered, things were put to the side as new directions were investigated.

A couple of anecdotes:

At one point someone in the audience put their hand up and one of the panel pointed at them, so they asked a question. The panel member was about to answer, but I interrupted and asked Oliver whether he was interested in the question. Oliver said no, so the panel didn't answer it. This felt jarring but good. We expected Habryka to have better taste than the typical audience questioner^[3].
At another point, Oliver wanted to sit and think. Somebody asked if they could ask a question, and I said no. It's strange to have a room full of people sitting in silence, but the typical 30 seconds of a talk is pretty mediocre so it doesn’t actually seem that bad to lose. Then Habryka asked a question.

This felt like a genuine success in that we had a panel and they were being called on to answer questions that felt interesting to someone we respected. For me a failure mode of debates is that debaters are scared of losing or trying to take turns and so what’s being discussed is not really of interest to anyone.

Next, I ran the discussion after a talk by Steve Hsu where he and Noah Smith discussed China. This was okay. At points it felt quite alive between them. But it could have been better for having somebody who was more willing to argue for US values. And perhaps someone to pin down Steve on specific facts about China, which Noah didn't really do (nor did he claim he would, professing not to be an expert^[4]).

Status: This theory is doing okay. I have had a couple of good events, but it’s unclear to me what great might look like.

Current top theory: A good investigator is best

My current top theory is that it really matters who is moderating/investigating. If this person is willing to press the debaters/panel and force them to answer the difficult questions or engage with them, that makes for a much more interesting debate than otherwise.

I suggest that Dwarkesh is a particularly good podcast host because he is so knowledgeable on AI topics and so willing to actually chase down his guests and say things like "okay, but what about the data centre built in Saudi Arabia?"

Suggested test: Future conferences, podcasts.

For the next set of conferences I run, I might like to focus on finding a good investigator for a topic, then choosing panelists afterwards, and building an event around trying to understand AI, China, or the Ukraine war.

It's possible I'll also try the strategy for my podcast, which I haven't done episodes for in a while.

Other theories I may test later

Debaters should discuss beforehand. It's fun for people to discuss on the day because there is something very alive about people discussing things for the first time. But it seems worthwhile to me to have a short discussion beforehand to figure out the exact areas of disagreement and to check that there won't be 20 minutes of discussion on the day that could be avoided.
Debates are fun discussion pieces, but less good for sharing information. Debates are primarily useful as a way to set up discussions happening at a conference or to see discussion in the public sphere.
The key thing is who the debaters are. This seems too powerful. The point of a debate is that it's a format that allows two people who disagree to produce valuable work for other people. But if one has to select very carefully the two people who disagree, then that suggests that debates are much less valuable than one might want.
It would be better to have someone explain a field. It's possible I'm too focused on debate and that trying for a collaborative explanation or an overview of the field might be good. I struggle to think of a good format here.

One more thing...

Duncan Haldane built a homemade Nielsen rating system that allowed audience members to twist a knob to display either red or green lights on their head. If they were interested, they turned to green. If they were bored, they turned to red. I didn't catch discussions where this was used, but being able to monitor people's interest in real time felt pretty interesting. And I can imagine using tools like this with a set of trusted “tastemakers” to guide an investigator on what interested some relevant group.

I'm not super interested in giving every audience member these because in general I think large groups of people can have quite poor taste^[5].

^[1] The main issue with Surrounded is that the circle often removes good debaters because they disagree with the specific arguments as opposed to because they are doing badly. If you don’t follow, watch one! They are really good. eg here
^[2] Does anyone have a better way to describe "survived attempted falsification"? Validated seems wrong.
^[3] A better version of this would be to have an app where people could upvote questions and allow the questioner to see these in case any lines of inquiry were interesting to them.
^[4] To me this felt too humble. Smith is a solid commentator on geopolitical issues with a moderate knowledge of China and better than almost all of the attendees, I’d guess.
^[5] The median of a large group is quite accurate, but I tend to think the media they produce is not very interesting. Accurate but not tasteful. One to consider for LLMs perhaps.

You mention the status of the debaters, but it also makes sense to consider the status of the audience. Consider how many audience members "ask a question" but really just want to attract attention to themselves, express their own opinion on the topic, and signal that they too are knowledgeable about the subject.

Your intervention, telling them to just shut up even if there was nothing going on for 30 seconds, is treating them as low-status. But it worked, because you made the rules that everyone in the audience will be treated as low-status, so it is nothing personal when you refuse a specific person's question.

William Lane Craig is great to watch from a meta-perspective. How do you go into someone else's field of expertise and try to beat them in a debate? He clearly thinks about it very carefully, in a way kinda like planning for political debates but with a much higher quality intended output.

Yeah I respect Craig. In the same way I respect a lion. That guy would likely trash me in a debate (especially since he debates topics he chooses).

I've noticed that whenever the debate touches on a very personal topic, it tends to be heated and pretty unpleasant to listen to. In contrast, debates about things that are low-stakes for the people who are debating tend to be much more productive, sometimes even involving steelmanning.

Good post!

Another way to diminish/remove status from debates is to shift from adversarial to collaborative modes. I'd like to see more experiments on "collaborative" debates. Here's an idea, pulling from this community: Crux Speedruns. Participants with opposing views on A must work together to find the crux of their disagreement as quickly as possible. Their team's time is added to the speedrun leaderboard.

It seems like there should be a way to center truth-seeking more than the debate framing does. Your last couple of theories seem promising in that direction.

Debates seem like a good idea: they should surface all of the good arguments and evidence. They have a subtle but powerful downside: they foster motivated reasoning in both the participants and audience.

They're not putting truth-seeking as the primary goal. That might be fine if motivated reasoning wasn't such a powerful effect. My impression is that public debate is usually net-negative for spreading truth. It gets people to dig in their heels on the positions they already hold, more than it exposes people to new arguments and evidence. They mix too much of the charisma, character, and skill of the participants with the objective merits of their positions, and human brains mix all of the perceived merits of associated arguments far more than we'd like.

Your attempts to remove the reputational effects work against this. But I like your last two suggestions best: carefully select the debaters who will have a truth-seeking discussion that happens to be called a debate, or change the terminology and drop the adversarial framing entirely.

Having someone just explain their view of a field is great, but we get that a lot already on podcasts. What about making the goal something similar to passing each other's ideological Turing tests, a cooperative rather than competitive objective?

Or perhaps the goal could be to have the moderator be able to restate both positions and their principal reasons for holding them.

I've been watching discussions in science with interest for a long time, and I think they often become an argument, in which participants become more emotionally activated and more combative. This looks to me like it actually reverses progress toward the truth; it emotionally engages the onlookers too, and causes everyone to start looking for excuses to support their chosen position/debater instead of carefully following the complex logic. Removing the debate framing is one way to reduce this tendency.

Anyway, kudos to you for doing experiments!

It's a separate issue, but I LOVE your one more thing! Having a way for non-speakers to weigh in on when the discussion is boring them seems ideal. Anonymity seems better than the lights on hats. I want an app for this, so I can gently nudge my friends off of dumb or argumentative conversation tracks without looking like a jerk.

It gets people to dig in their heels on the positions they already hold, more than it exposes people to new arguments and evidence.

I think there is some merit to this claim. But there is also a counterpoint, which Destiny (one of the debaters mentioned by the OP) talks about here. In the interest of saving readers from clicking on the video, here is a cleaned-up version of what he says:

"I think that the only way that you can break people out of these bad media environments, I think it actually has to be (this sounds so self-serving, I'm so sorry) through debate. I think it's the most important thing that you can possibly do. [...]

Because as I'm arguing with people, I realize what's happening. What's happening is they're getting a whole bunch of horrible information from whatever commentator they're listening to. And I can argue with them and I can pummel them and destroy every argument and become the debate god or whatever, but it doesn't matter because at the end of the day, they can go back and listen to that person.

But I think it does something to your mind (and this is feedback I got a lot back in the past) when you see your information god (i.e., the commentator himself) be forced to actually confront something he said but can't defend when he's actually challenged on it."

What was the specific 2018 culture war aspect to the Klein-Harris conversation? I don't think it would have been very different at any other point in the 21st century.

I think it would be different if it happened today. Harris's position seems less controversial. Not sure you'd print that he was a racialist today.

I would enjoy both comments on this as well as meta comments on my general process. There might be things that seem dumb or crazy to you and they may be dumb or crazy! I find it hard to do experiments and this is the best way I have found.

I have also felt that debates were a deeply misused form of social knowledge sharing. I think what you did with Habryka worked well, and I would be interested in a format where two debaters each got a turn to be the questioner.

I wonder why no-one has just directly tried to do turing debate, where the debaters submit ~2000 words that explain their views to each other beforehand, then the actual debate is them taking on the position of the other side and trying to debate that.

An idea I've had for a while is to do a taboo debate, where the debaters/audience submit words that are tabooed. I don't know how well it would work, but it seems like it might help a little bit by giving people a tool to focus the discussion.

I wonder why no-one has just directly tried to do turing debate, where the debaters submit ~2000 words that explain their views to each other beforehand, then the actual debate is them taking on the position of the other side and trying to debate that.

One idea might be to pair debates with Delphi panels: do the usual Delphi method to get a consensus report beforehand, and then have them explain & debate what is left over as non-consensus (or possibly, if there are some experts who disagree hotly with the consensus report, bring them on for a debate with the original panel).