Summary: Parapsychology (the study of unorthodox mental phenomena such as hypnosis or telepathy) is widely considered to be the study of something that doesn’t exist. Many parapsychology experiments are uncommonly good ways to practice science, since the experiments can be performed without expensive apparatus and require some understanding of statistics. Let’s do a few experiments and see if these things exist! 

Tags: medium

Purpose: To learn how to do science on a confusing phenomenon. 

Materials: Zener cards are useful, but a standard 52-card deck will do in a pinch. Some variations may call for special decks or other tools entirely.

Announcement Text: “The Control Group is Out of Control is a popular Scott Alexander article. In it, he examines parapsychology, the field of science that investigates such phenomena as hypnosis, telepathy, and astral projection. If you think that psychic powers aren't real, then this is a fascinating field: it's a good example of how attempts to do science can give you the wrong answer, returning false positives. My science classes mostly focused on easily replicable experiments like testing acceleration due to gravity or the resistance different materials had to electricity. When studying something like telepathy or antidepressants where not every person responds the same way, how do you sort actual effects from statistical noise?

Designed by Karl Zener in the 1930s, Zener Cards are a set of twenty-five cards with one of five different visually distinct symbols on one side. Zener used them as part of a series of parapsychology experiments attempting to prove psychic powers existed; the typical experiment involved a test subject attempting to guess which card was being held by the test administrator. The experiments initially showed that some people are naturally psychic, guessing correctly more often than chance. Later experiments concluded that this was entirely down to poor experimental design. So my question to you: are any of us psychic? What do you think you know, and how do you think you know it?

I'm bringing a couple packs of Zener cards and a couple packs of regular 52 decks. Let's do some science!”

Image from Wikipedia. https://en.wikipedia.org/wiki/Zener_cards

Description: The basic Zener card experiment works like this: one person (the Tester) takes a deck of 25 cards and shuffles it, then draws cards one by one from the top of the deck and asks another person (the Subject) to guess which symbol is on each card. If the Subject guesses correctly more often than chance would predict, this is theoretically an indication of telepathic ability.

Describe the experiment, ask for a volunteer, and do a couple of draws to show people how it works. Now, ask people how they would determine whether something weird was going on.

From here on, the meetup is largely in their hands. Helpful things to point out include:

  • Before an experiment is run, ask people to state what they think will happen. If your audience is familiar with Bayes' Theorem, ask for a couple of priors. If you want to do a more directed variation, this is a great time to explain how Bayes' Theorem works. Explain that if people have complaints about the experimental procedure, now is the best time to voice them, and that pointing out flaws after the fact is poor form.
  • Emphasize the distinction between “this proves psychic powers are real” and “this is evidence something weird is going on.” People can get very fixated on trying to explain why this doesn’t prove psychic powers are real and spend a lot of time trying to explain away statistics when the answer can be that nobody shuffled the deck. If you’re using the thumb on the scale variation, try not to smirk too much when you keep saying “evidence for something unusual” instead of “evidence for psychic powers.” If you're not using the thumb on the scale variation, smirking a bunch when you say "evidence for something unusual" provides a little bit of extra cover for people who are.
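If you do want to walk through a Bayes update at this point, here is a minimal Python sketch of one way it could go. It compares just two hypotheses: pure chance (a 20% hit rate) and "something weird" (a 30% hit rate). The 30% figure, the 1% prior, and the function names are all illustrative assumptions, not anything the original experiments specify, and it treats every guess as independent.

```python
from math import comb

def binom_pmf(hits, draws, p):
    """Probability of exactly `hits` correct guesses in `draws` independent tries."""
    return comb(draws, hits) * p**hits * (1 - p)**(draws - hits)

def posterior_weird(prior_weird, hits, draws, p_chance=0.2, p_weird=0.3):
    """Posterior probability of the 'something weird' hypothesis.

    p_weird = 0.3 is a made-up alternative for illustration only.
    """
    like_weird = binom_pmf(hits, draws, p_weird)
    like_chance = binom_pmf(hits, draws, p_chance)
    numerator = prior_weird * like_weird
    return numerator / (numerator + (1 - prior_weird) * like_chance)

# Start from a (hypothetical) 1% prior and see how different scores move it.
for hits in (5, 7, 10, 15):
    print(hits, "hits of 25 ->", round(posterior_weird(0.01, hits, 25), 4))
```

Asking attendees to pick their own prior and their own alternative hit rate before any cards are drawn is a good way to make the "state what you think will happen" step concrete.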

Variations: As written, once the basic experiment is explained, the meetup is largely in the attendees' hands. However, you can take a stronger hand directing them or correcting them. This can ultimately look like a statistics lecture, and there isn’t anything particularly wrong with that if that’s what you’re going for. Still, try and at least have it be a hands-on lecture, and try to make it a good lecture. Practice the math yourself beforehand, have your notes handy if you don't work with it regularly, and use an app like Fatebook or a Google Sheet if you think that would help.

The thumb on the scale variation involves making sure something weird is happening. If you want to mess with your attendees, there are lots of ways to cheat the Zener test. I suggest throwing in any of the following: stacking the deck such that it’s poorly shuffled and predictable, arranging a conspiracy between the Tester and the Subject to communicate via blinks or foot taps under the table, getting decks of cards with marked backs, arranging a reflective surface to be behind the Tester such that the Subject can see the cards, counting cards such that by the end of the deck it’s easier to guess, spotting the card in the reflection of the Tester's glasses, or snapping your fingers and developing actual psychic powers.

Some of these are actually in-scope for scientific problems! In particular, poor shuffling or counting cards can happen by accident and be picked up on by accident. 
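To get a feel for how much card counting alone can shift the results, here is a rough Python simulation. It assumes the Subject is shown each card after guessing and always guesses whichever symbol has the most copies left in the deck; the symbol names, function names, and deck count are just choices for this sketch.

```python
import random
from collections import Counter

SYMBOLS = ["circle", "cross", "waves", "square", "star"]

def run_deck(rng, count_cards):
    """Play through one shuffled 25-card deck and return the number of hits."""
    deck = SYMBOLS * 5
    rng.shuffle(deck)
    remaining = Counter(deck)  # copies of each symbol not yet revealed
    hits = 0
    for card in deck:
        if count_cards:
            # Guess the symbol with the most copies still in the deck.
            guess = max(SYMBOLS, key=lambda s: remaining[s])
        else:
            guess = rng.choice(SYMBOLS)
        hits += guess == card
        remaining[card] -= 1  # the card is revealed after each guess
    return hits

def average_hits(count_cards, decks=2000, seed=0):
    """Average hits per deck over many simulated decks."""
    rng = random.Random(seed)
    return sum(run_deck(rng, count_cards) for _ in range(decks)) / decks

print("random guessing:", average_hits(False))
print("card counting:  ", average_hits(True))
```

Random guessing should land near five hits per deck, while the counting strategy does noticeably better, with no telepathy and no deliberate cheating required.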

I have two important suggestions for the thumb on the scale variation. First, have some kind of distinguishing feature between you normally and you when cheating at cards. I use a nametag or index card stuck to my hat that says “The Hat Of Lies” but could probably stand to be more obvious about it. I think it's fine to wear The Hat Of Lies even if you're not using the thumb on the scale version. The important thing is that people don't feel like you're more likely to lie to them under normal circumstances. 

Second, if you want your subterfuge to last a lot longer in the face of skeptics, cheat to lose. That is, instead of stacking the deck and then guessing correctly every time, stack the deck and guess wrong every time. That’s also statistically weird, and someone paying attention to base rates should notice it’s weird, but in my experience people are way less likely to notice this even if I say something like “gosh, it’s like I’m anti-psychic or something.” I suspect that making sure to have each trial come out exactly normal (that is, getting exactly the number of correct guesses that basic odds suggest you should, no more, no less) would be even sneakier, but you need to be sure people are going to repeat the experiment a lot (or have started by deciding there should be a lot of trials) for them to pick up on it. If you flipped a coin one thousand times and got exactly five hundred heads and five hundred tails, that is a little bit weird.
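How weird is "exactly normal"? A quick Python sketch, assuming independent guesses and a fair coin, puts numbers on it. The function name is just for this example.

```python
from math import comb

def prob_exactly(hits, draws, p):
    """Probability of exactly `hits` successes in `draws` independent tries."""
    return comb(draws, hits) * p**hits * (1 - p)**(draws - hits)

# Exactly 500 heads in 1000 fair coin flips.
coin = prob_exactly(500, 1000, 0.5)

# Exactly 5 hits in a 25-card run at a 1-in-5 chance per guess.
zener = prob_exactly(5, 25, 0.2)

print(f"exactly 500/1000 heads: {coin:.4f}")
print(f"exactly 5/25 hits:      {zener:.4f}")

# Hitting the expected score in every one of several runs gets unlikely fast.
for runs in (1, 3, 5, 10):
    print(f"exactly 5/25 in all {runs} runs: {zener**runs:.6f}")
```

The coin result works out to about a 2.5% chance, and a single 25-card run lands on exactly five hits a bit under 20% of the time, which is why it takes a lot of repeated runs before "suspiciously average" becomes detectable.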

Other parapsychology experiments! Some of these are easier or harder to reproduce with materials you're likely to have in a normal meetup.

A larger variation whose details are for the moment left as an exercise for the reader is doing any kind of science on any kind of effect. Ideally it’s something you can test in an average living room with equipment you have in your house, but plotting field trips is totally encouraged. Submitting a paper for publication is overkill, but if your group does produce a paper as a result of a meetup I declare that you win meetups for the day and you should totally brag about it.

Notes: The thing I like about this experiment is that the results are going to vary a bit. If someone gets the answer right six times out of twenty-five, is that a sign of psychic powers? (Or cheating?) How about seven times? How about ten? It’s possible someone just got lucky, but surely a proper Bayesian should update at some point towards something weird going on. This suggests a variation where you outright explain how Bayes' Rule works before you begin.
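For the six, seven, and ten questions, here is a small Python sketch of the chance baseline. It assumes each guess is an independent one-in-five shot, which is a simplification: a real 25-card Zener deck is drawn without replacement, so this is an approximation rather than the exact distribution.

```python
from math import comb

def prob_at_least(hits, draws=25, p=0.2):
    """Chance of getting `hits` or more correct by luck alone."""
    return sum(
        comb(draws, k) * p**k * (1 - p)**(draws - k)
        for k in range(hits, draws + 1)
    )

for hits in (6, 7, 10):
    print(f"{hits} or more out of 25: {prob_at_least(hits):.4f}")
```

Under these assumptions, six or more comes up by luck well over a third of the time, while ten or more happens in roughly one run in sixty.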

A worthwhile note is that at some point when attempting to test if anyone is psychic, the attendees may suggest experiments totally unrelated to the Zener cards, like asking the subjects to guess what number they’re thinking of or to draw the scene they’re imagining. Some of these (like drawing the scene) are harder to quantify and do statistics to. That’s basically fine and can be fun. If you’re inclined to use the thumb on the scale variation then this might escalate into a game between you and the crowd. Lose gracefully when you get caught out. A good chunk of the fun here is in designing good experiments, so I suggest putting more energy into pointing out flaws in experimental design even if it’s obvious what you’re doing.

A further note on tone is that “haha, obviously ESP is fake, there’s no test that could change my mind” is not really the suggested angle to come into this with. “People who believe ESP might be real are idiots” is, I claim, completely wrong. Encourage people to take the idea seriously, at least long enough to test. What experimental result would change their mind? Invite them to perform the experiment, if it can be arranged! Ideally, people who believe in ESP and people who don’t believe in ESP should sit next to each other, talk about how science ought to be done, do some experiments, and if not change their mind then at least update in one direction or another. Plus, while this could be pure confirmation bias I’ve found that saying “only idiots believe in X” in a meetup is the surest way I know of to have someone standing right next to you who believes in X.

Credits: Zener cards (and the original experiment) were designed by Karl Zener.


Materials: Zenner cards are useful but a 52 deck can do in a pinch. 

Saying that you can just exchange one deck that's designed for the purpose with a standard 52-card deck seems to me like cargo-culting if you don't believe in the trick working with the 52-card deck in the first place.

Description: The basic zener card experiment works like this; one person (the Tester) takes a deck of 25 cards and shuffles them, then one by one draws a card from the top of the deck and asks another person (the Subject) to guess which symbol is on the card. If the Subject guesses correctly, this is theoretically an indication of telepathic ability.

This again looks like a cargo-cult description. For contrast, the ChatGPT description is "In the classic experiment with Zener cards, a "sender" and a "receiver" are used. The sender would draw a card from the shuffled pack of Zener cards and attempt to mentally transmit the image (without any physical cues) to the receiver, who would then try to identify the card. The receiver's guess was recorded and the process was repeated numerous times."

If you cut out the step of attempting to mentally transmit the image, that sounds like cutting out one of the primary telepathy steps. If you want to prove that ESP doesn't work, skipping that step seems quite fishy. Yes, if you run the experiment properly you are likely to get a null result but 

If the purpose is "To learn how to do science on a confusing phenomenon." then cargo-culting some experiments is not useful. If you are already certain that no ESP works, you might run the experiment to see how easy it is to get bad data, but if you feel that there's anything confusing going on, you would likely want to actually do the experiments with the steps that some people claim produce results.

Quoth wikipedia,

In a test for ESP, the experimenter picks up a card in a shuffled pack, observes the symbol, and records the answer of the person being tested, who would guess which of the five designs is on the card. The experimenter continues until all the cards in the pack are tested.

K. Zener and his colleague for whom he made the cards, J. B. Rhine, published a number of experiments using Zener cards. (I think Rhine's name is on them and Zener's isn't.) The exact methodology changed over time as other people failed to replicate the effects or pointed out potential ways the subject/receiver could have been getting information other than telepathy. There isn't a singular canonical experiment; instead there's a back and forth between various researchers trying to figure out what's going on. 

The part of this that seems to me to make for a good meetup isn't the full literature review, it's thinking about how to design an experiment that provides evidence one way or another. Hence "do a couple of draws" and then "the meetup is largely in [the attendees] hands."

Do the cards matter? The Zener Card symbols are clearly differentiated and easy to visually recognize, so maybe if extrasensory perception is blurry and indistinct, a subject/receiver can perceive the difference between a red cross and a yellow circle, but not the difference between a heart and a diamond. If you didn't have Zener Cards handy and that's the objection, maybe you can filter the 52-card deck to contain only aces, tens, and kings, which are fairly visually distinct from one another. Or maybe the cards matter for some other reason. Does the tester/sender actively trying to transmit information matter? Maybe ESP functions like sight, and the tester/sender doesn't need to actively transmit any more than they need to actively try and be visible. Are we testing the ability to send, the ability to receive, or both? Alternately (as the person who did believe in ESP who showed up to my test meetup thought it worked) does this work by looking at your own future, so it doesn't matter if the tester even sees the cards themselves before the reveal? In all of these cases it's good form to point out these objections before running the experiment, not after. 

I agree, doing experiments by rote without thinking about what you're testing isn't a good way to practice doing science on something confusing. Hence my suggestion to have people think about what experiments would provide interesting evidence. "Maybe it works like this, and we could test it by doing that. Hrm, nope, that looks like normal variation."

I think the cards are designed to have symbols that are easy to visualize and be distinct. My general impression of the telepathy idea in relation to the cards is that it's about picking up what someone else visualized.

In a scenario where someone could consistently get 30% right instead of 20%, that would indeed look to me like there's a lot of blur involved.
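As a rough aside on how many guesses it would take to tell a steady 30% hit rate from the 20% chance rate, here is a Python sketch. It assumes independent guesses and a one-sided 5% significance threshold; both the threshold and the function names are choices made for this illustration.

```python
from math import comb

def prob_at_least(hits, draws, p):
    """Chance of `hits` or more successes in `draws` independent tries."""
    return sum(
        comb(draws, k) * p**k * (1 - p)**(draws - k)
        for k in range(hits, draws + 1)
    )

def critical_hits(draws, p_chance=0.2, alpha=0.05):
    """Smallest score that chance alone reaches with probability <= alpha."""
    for hits in range(draws + 2):
        if prob_at_least(hits, draws, p_chance) <= alpha:
            return hits

def power(draws, p_true=0.3, p_chance=0.2, alpha=0.05):
    """Chance that a Subject with hit rate `p_true` reaches the critical score."""
    return prob_at_least(critical_hits(draws, p_chance, alpha), draws, p_true)

for draws in (25, 50, 100, 200):
    print(f"{draws} guesses: need {critical_hits(draws)}+ hits, power {power(draws):.2f}")
```

With a single 25-card deck, a genuine 30% guesser would clear the bar only about a third of the time, so a blurry effect of that size needs many more guesses before it stands out from luck.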

Alternately (as the person who did believe in ESP who showed up to my test meetup thought it worked) does this work by looking at your own future, so it doesn't matter if the tester even sees the cards themselves before the reveal? 

I'm a bit surprised that you would find someone holding that position who comes to an LW meetup. 

How likely did they think the experiment would show evidence for ESP?

For clarity, it was listed as an ACX meetup, not an LW meetup. Still, I'm not that surprised; the rationalist community seems to select for eccentricity, and sometimes the way someone's eccentric is they believe in ESP. Do meetups you attend not have this?

Despite making sure Bayes was fresh in everyone's minds, the believer couldn't be prompted into using numbers. If I recall correctly, they were "pretty sure that it might" show up after enough trials. After three sets of ten cards each, I believe they were 2, 2, and 3 for correct guesses. (Quote marks and correctness are to the best of my memory.) A different attendee worked out the base rates for different levels of correct guesses; most attendees made a rough guess at the base rates and were pretty confident the results would be one to three correct guesses. I wish we'd done more trials with them!

There's a huge gap between believing that ESP is possible in principle and believing that this particular setup will show effects. This setup is about people making guesses without strongly developed skills and without much intentionality on the part of the person whose thoughts are guessed.

Even if a person would generally believe that ESP is possible in principle, I would expect a rationalist to be more into Leverage-style intention phenomena experiments than to have beliefs about this kind of card guessing in experimental setups that look like what you described.

It seems to me that there's a large body of papers on hypnosis that do suggest that it's not in the same reference class as telepathy. 

If you want to prove hypnosis, one problem is that it needs some skill to do it. As an experiment, I think suggesting color blindness and letting people do the Stroop test would be one of the best setups. 

On the English Wikipedia article on Parapsychology, hypnosis is listed between false awakening and ideomotor phenomenon. When I google "parapsychology" I get 

"the study of mental phenomena which are excluded from or inexplicable by orthodox scientific psychology (such as hypnosis, telepathy, etc.)." -Definitions from Oxford Languages.

On the one hand, that's obviously argument by definition and I maybe shouldn't do that. On the other hand, the reference class I'm trying to construct isn't "things Screwtape is very confident are false" it's "things which Screwtape expects meetup attendees are skeptical of and which a local meetup could test, ideally requiring a bit of statistics to investigate." For that use case, the remote detection of staring is the odd one out since the equipment to measure electrodermal activity is kind of niche. It got included on the list essentially as a hat tip to the Scott Alexander article that gave me the idea for this meetup. The worst thing about that list to me is that halfway through I switch from specific experiments on telepathy to one-word parapsychology or pseudoscience effects. That's just messy categorization.

I realized after I hit post that I forgot to say thank you for suggesting an experiment for hypnosis, so, thank you! I'll try and get ahold of a copy to read myself, but from reading the abstract I suspect once I do I'll edit that in as a link with Hypnosis.

I'm not sure how easy it is for someone without hypnosis experience to run a hypnosis induction to create the effect, but in general the Stroop experiment is a good way to move hypnosis effects from subjective effects to objective effects.

A lot of skepticism toward hypnosis is about whether someone who's hypnotized is just role-playing, and the good thing about the Stroop test is that you can't just role-play not feeling any interference in it. 

Once can be a misprint but three in a row is enemy action a mistaken belief. Stroop!

You are right, it should be Stroop. I edited the typo.