So you think you want to be rational, to believe what is true even when sirens tempt you? Great, get to work; there's lots you can do. Do you want to justifiably believe that you are more rational than others, smugly knowing your beliefs are more accurate? Hold on; this is hard.

Humans nearly universally find excuses to believe that they are more correct than others, at least on the important things. They point to others' incredible beliefs, to biases afflicting others, and to estimation tasks where they are especially skilled. But they forget most everyone can point to such things.

But shouldn't you get more rationality credit if you spend more time studying common biases, statistical techniques, and the like?  Well this would be good evidence of your rationality if you were in fact pretty rational about your rationality, i.e., if you knew that when you read or discussed such issues your mind would then systematically, broadly, and reasonably incorporate those insights into your reasoning processes. 

But what if your mind is far from rational?  What if your mind is likely to just go through the motions of studying rationality to allow itself to smugly believe it is more accurate, or to bond you more closely to your social allies? 

It seems to me that if you are serious about actually being rational, rather than just believing in your rationality or joining a group that thinks itself rational, you should try hard and often to test your rationality.  But how can you do that? 

To test the rationality of your beliefs, you could sometimes declare beliefs, and later score those beliefs via tests where high scoring beliefs tend to be more rational.  Better tests are those where scores are more tightly and reliably correlated with rationality.  So, what are good rationality tests?
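One standard way to score declared beliefs is a proper scoring rule such as the Brier score, which rewards stated probabilities that match how often you turn out to be right. The sketch below is only an illustration: the function name and the sample beliefs are hypothetical, and the post does not commit to any particular scoring rule.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and 0/1 outcomes.

    `forecasts` is a list of (probability, came_true) pairs.
    Lower scores are better.
    """
    if not forecasts:
        raise ValueError("need at least one scored belief")
    total = 0.0
    for probability, came_true in forecasts:
        outcome = 1.0 if came_true else 0.0
        total += (probability - outcome) ** 2
    return total / len(forecasts)


# Hypothetical declared beliefs: (stated probability, did it come true?)
declared = [(0.9, True), (0.7, True), (0.6, False), (0.2, False)]
score = brier_score(declared)
print(f"Brier score: {score:.3f}")
```

Someone who always answers 50% scores 0.25 no matter what happens, so beating that on questions you care about is a minimal bar.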

76 comments

Play poker for significant amounts of money. While it only tests limited and specific areas of rationality, and of course requires some significant domain-specific knowledge, poker is an excellent rationality test. The main difficulty of playing the game well, once one understands the basic strategy, is in how amazingly well it evokes and then punishes our irrational natures. Difficulties updating (believing the improbable when new information comes in), loss aversion, takeover by the limbic system (anger / jealousy / revenge / etc.): it tests many such failings.

Agreed, but I think it is easier to see yourself confront your irrational impulses with blackjack. For instance, you're faced with a 16 versus a 10; you know you have to hit, but your emotions (at least mine do) tell you not to. Anyone else experience this same accidental rationality test?

For amateur players, sure. But there is an easily memorizable table by which to play BJ perfectly, either basic strategy or counting cards. So you always clearly know what you should do. If you are playing BJ to win, it stops being a test of rationality.

Whereas even when you become skilled at poker, it is still a constant test of rationality both because optimal strategy is complex (uncertainty about correct strategy means lots of opportunity to lie to yourself) and you want to play maximally anyway (uncertainty about whether opponent is making a mistake gives you even more chances to lie to yourself). Kinda like life...

Whether a person memorizes and uses the table is still a viable test. No rational person playing to win would take an action incompatible with the table, and acting only in ways compatible with the table is unlikely to be accidental for an irrational person.

A way of determining whether people act rationally when it is relatively easy to do so can be quite valuable, since most people don't.

A problem here is that it takes something like tens or hundreds of thousands of hands for the signal to emerge from the noise.
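A rough back-of-the-envelope check of that figure, under assumed numbers: if results per 100 hands have mean w and standard deviation s, the average over n such blocks has standard error s/sqrt(n), so the edge sits z standard errors clear of zero only once n >= (z*s/w)^2. The win rate and standard deviation below are illustrative assumptions, not data from this thread.

```python
import math


def hands_needed(winrate_per_100, stdev_per_100, z=2.0):
    """Hands required before a true edge is z standard errors above zero.

    Both inputs are in the same units (e.g. big blinds) per 100 hands.
    """
    if winrate_per_100 <= 0:
        raise ValueError("win rate must be positive")
    blocks = math.ceil((z * stdev_per_100 / winrate_per_100) ** 2)
    return blocks * 100


# Assumed, illustrative values: a 5-unit edge and a 100-unit
# standard deviation per 100 hands.
n = hands_needed(winrate_per_100=5.0, stdev_per_100=100.0)
print(n)
```

With these assumed inputs the answer is 160,000 hands, in line with the estimate above; halving the edge quadruples the requirement.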

Poker isn't just about calculating probabilities, it's also about disguising your reactions and effectively reading others'. Being rational has nothing to do with competence at social interaction and deception.

A good test has no confounding variables. Poker, then, is not a good test of rationality.

I understand Annoyance's point to be: Prefer online poker to in-person.

An excellent point and suggestion.

Any test in which there are confounding variables should be suspect, and every attempt should be made to eliminate them. Looking at 'winners' isn't useful unless we know the way in which they won indicates rationality. Lottery winners got lucky. Playing the lottery has a negative expected return. Including lottery winners in the group you scrutinize means you're including stupid people who were the beneficiaries of a single turn of good fortune.

The questions we should be asking ourselves are: What criteria distinguish rationality from non-rationality? What criteria distinguish between degrees of rationality?

The thought occurs to me that the converse question of "How do you know you're rational?" is "Why do you care whether you have the property 'rationality'?" It's not unbound - we hope - so for every occasion when you might be tempted to wonder how rational you are, there should be some kind of performable task that relates to your 'rational'-ness. What kind of test could reflect this 'rationality' should be suggested from consideration of the related task. Or conversely, we ask directly what associates to the task.

Prediction markets would be suggested by the task of trying to predict future variables; and then conversely we can ask, "If someone makes money on a prediction market, what else are they likely to be good at?"

I think there is likely a distinction between being rational at games and rational at life. In my experience those who are rational in one way are very often not rational in the other. I think it highly unlikely that there is a strong correlation between "good at prediction markets" or "good at poker" and "good at life". Do we think the best poker players are good models for rational existence? I don't think I do and I don't even think THEY do.

A suggestion:

List your goals. Then give the goals deadlines along with probabilities of success and estimated utility (with some kind of metric, not necessarily numerical). At each deadline, tally whether or not the goal is completed and give an estimation of the utility.

From this information you can take at least three things.

  1. Whether or not you can accurately predict your ability.
  2. Whether or not you are picking the right goals (lower than expected utility would be bad, I think)
  3. With enough data points you could determine your ratio of success to utility. Too much success and not enough utility means you need to aim higher. Too little success for goals with high predicted utility means either aim lower or figure out what you're doing wrong in pursuing the goals. If both are high you're living rationally, if both are low YOU'RE DOING IT WRONG.

The process could probably be improved if it was done transparently and cooperatively. Others looking on would help prevent you from cheating yourself.

Not terribly rigorous, but that's the idea.
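The bookkeeping described above could be sketched as follows. Field names, units, and the sample goals are hypothetical; the suggestion only asks for a probability, a utility estimate, and a tally at each deadline.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    name: str
    p_success: float         # probability of success, stated up front
    expected_utility: float  # utility predicted up front (arbitrary units)
    completed: bool          # tallied at the deadline
    actual_utility: float    # estimated at the deadline


def review(goals):
    """Compare up-front predictions with what was tallied at the deadlines."""
    n = len(goals)
    return {
        "predicted_success_rate": sum(g.p_success for g in goals) / n,
        "actual_success_rate": sum(g.completed for g in goals) / n,
        # Negative means goals delivered less utility than expected.
        "mean_utility_surprise": sum(
            g.actual_utility - g.expected_utility for g in goals
        ) / n,
    }


# Hypothetical log of four goals.
log = [
    Goal("finish draft", 0.5, 8.0, True, 6.0),
    Goal("learn chords", 0.75, 4.0, True, 4.0),
    Goal("run a 10k", 0.25, 2.0, False, 0.0),
    Goal("ship side project", 0.5, 6.0, False, 6.0),
]
summary = review(log)
print(summary)
```

A predicted success rate close to the actual one speaks to point 1; a negative mean utility surprise speaks to point 2.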

I'm not sure if this works, since a lazy person could score very low on utility yet still be quite rational.

Are there cognitive scientists creating tests like this? If not, why not?

This almost seems too obvious to mention in one of Robin's threads, but I'll go ahead anyway: success on prediction markets would seem to be an indicator of rationality and/or luck. Your degree of success in a game like HubDub may give some indication as to the accuracy of your beliefs, and so (one would hope) the effectiveness of your belief-formation process.

I would expect success in a prediction market to be more correlated with amount of time spent researching than with rationality. At best, rationality would be a multiplier to the benefit gained per hour of research; alternatively, it could be an upper bound to the total amount of benefit gained from researching.

Prediction markets tend to be zero-sum games. Most rational agents would prefer to play in a real stock market - where you can at least expect to make money in line with inflation.

The relevant category is constant-sum games, and stock markets are that as well if liquidity traders are included in the relevant trader set. One can subsidize prediction markets so that all traders can gain by revealing info.

Tim: but don't prediction markets have a lot of benefits compared to stock markets? They terminate on usually set dates, they're very narrowly focused (compare 'will the Democrats win in 2008' to 'will GE's stock go up on October 11, 2008' - there are so many fewer confounding factors for the former), and they're easier to use.

Prediction markets as implemented in the real world mostly use fake money, which is a drawback.

Well, you don't have to use the fake-money ones. Intrade and Betfair have always seemed perfectly serviceable to me, and they're real money prediction markets.

On a related point, fake money could actually be good. There's less motivation to bet what you really truly think, but not wagering real money means you can make trades on just about everything in that market - you aren't so practically or mentally constrained. You're more likely to actually play, or play more.

(Suppose I don't have $500 to spare or would prefer not to risk $500 I do have? Should I not test myself at all?)

This is the fundamental question that determines whether we can do a lot of things - if we can't come up with evidence-based metrics that are good measures of the effect of rationality-improving interventions, then everything becomes much harder. If the metric is easily gamed once people know about it, everything becomes much harder. If it can be defeated by memorization like school, everything becomes much harder. I will post about this myself at some point.

This problem is one we should approach with the attitude of solving as much as possible, not feeling delightfully cynical about how it can't be solved, but at least you know it. It's too important for that. It sets up the incentives in the whole system. If the field of hedonics can try to measure happiness, we can at least try to measure rationality.

...but not to derail the discussion, Robin's individual how-do-you-know? stance is a valid perspective, and I'll post about the scientific measurement / institutional measurement problems later.

Prediction markets seem like the obvious answer, but the range of issues currently available as contracts is too narrow to be of much use. Most probability calibration exercises focus on trivial issues. I think they are still useful, but the real test is how you deal with emotional issues, not just neutral ones.

This might not be amenable to a market, but I would like to see a database collected of the questions being addressed by research in-progress. Perhaps when a research grant is issued, if a definite conclusion is anticipated, the question can be entered in the database. The question would have to be constructed so that users could enter in definite predictions. At first glance, I think the predictions would have to remain private until after a result is published, but I'm unsure. In contrast to existing prediction sites, this would have the benefits of a broad range of questions formulated by experts who are concerned about precisely defining the issue at hand. How would a standard procedure of formulating a question for a prediction database influence the type of research done?

Another broad test I've considered is whether your judgment of the quality of an individual's claims is correlated with their social club affiliations. To me, political party stands out as the most relevant example of a social club for this purpose. If you find yourself disagreeing with Republicans more frequently than with Democrats over factual issues, that appears to be a sign of confirmation bias. Because association with social clubs tends to be caused by how you were raised, social class, or the sheer desire to be part of a group, there is no reason to think that affiliation should be a strong predictor of quality. Any thoughts?

"If you find yourself disagreeing with Republicans more frequently than with Democrats over factual issues, that appears to be a sign of confirmation bias."

Only to the extent that you think Republicans and Democrats are equally wrong. I don't see any rule demanding this.

Since all accurate maps are consistent with each other, everyone with accurate political beliefs is going to be consistent, and you might as well use a new label for this regularity. It's fine to be a Y if the causality runs from X is true -> you believe X is true -> you're labeled "member of group Y".

Tests for "Group Y believes this-> I believe this" that can rule out the first causal path would be harder to come up with, especially since irrational group beliefs are chosen to be hard to prove (to the satisfaction of the group members).

The situation gets worse when you realize that "Group Y believes this-> I believe this" can be valid to the extent that you have evidence that Group Y gets other things right.

ISO quality certification doesn't look primarily at the results, but primarily at the process. If the process has a good argument or justification that it consistently produces high quality, then it is deemed to be compliant. For example "we measure performance in [this] way, the records are kept in [this] way, quality problems are addressed like [this], compliance is addressed like [such-and-so]".

I can imagine a simple checklist for rationality, analogous to the software carpentry checklist.

  1. Do you have a procedure for making decisions?
  2. Is the procedure available at the times and locations that you make decisions?
  3. How do you prevent yourself from making decisions without following this procedure?
  4. If your procedure depends on calibration data, how do you guarantee the quality of your calibration data?
  5. How does your procedure address (common rationality failure #1)?
  6. et cetera

Sorry, it's just a sketch of a checklist, not a real proposal, but I think you get the idea. Test the process, not the results. Of course, the process should describe how it tests the results.

How do you know whether the checklist actually works or if it's just pointless drudgery?

Sorry, I said "Test the process, not the results", which is a strictly wrong misstatement. It is over-strong in the manner of a slogan.

A more accurate statement would be "Focus primarily on testing process, and secondarily on testing results."

Consult someone else who commented on this article. I didn't have an idea for how to solve Dr. Hanson's original question. I was trying to pull the question sideways.

Set up a website where people can submit artistic works - poetry, drawings, short stories, maybe even pictures of themselves - along with the rating they expect each to receive on a 1-10 scale.

The works would be publicly displayed, but anonymously, and visitors could rate them ("anonymously" is to make sure the ratings are "global" and not "compared to other work by the same guy" - so maybe the author could be displayed once you rated it).

You could then compare the expected rating of a work to the actual ratings it received, and see how much the author under- or over-estimates himself.

(for extra measurement of calibration, you could also ask the author to give a confidence factor, though I'm not sure how exactly it should be presented and calculated)
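As a sketch of that comparison, assuming each submission is stored as the author's expected rating plus the list of visitor ratings (a hypothetical layout, since no format is specified here):

```python
from statistics import mean


def self_rating_bias(expected, ratings):
    """Expected rating minus the mean rating actually received.

    Positive: the author overestimated the work. Negative: underestimated.
    """
    return expected - mean(ratings)


def author_bias(works):
    """Average bias over all of one author's (expected, ratings) pairs."""
    return mean([self_rating_bias(e, r) for e, r in works])


# Hypothetical submissions on the 1-10 scale.
works = [
    (8, [6, 7, 5]),
    (5, [5, 6, 4]),
    (7, [4, 5, 6]),
]
bias = author_bias(works)
print(f"average overestimate: {bias:.2f} points")
```

A bias that stays positive across many works suggests systematic overestimation rather than one unlucky submission.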

Your own art has the advantage of being something about which you might be systematically biased, and which can still be evaluated pretty easily (as opposed to predictions about how to get out of the financial crisis).

Another test.

  1. Find out the general ideological biases of the test subject

  2. Find two studies, one (Study A) that supports the ideological biases of the test subject, but is methodologically flawed. The other (Study B) refutes the ideological biases of the subject, but is methodologically sound.

  3. Have the subject read/research information about the studies, and then ask them which study is more correct.

If you randomize this a bit (sometimes the study is both correct and "inline with one's bias") and run this multiple times on a person, you should get a pretty good read on how rational they are.

Some people might decide "Because I want to show off how rational I am, I'll accept that study X is more methodologically sound, but I'll still believe in my secret heart that Y is correct"

I'm not sure any amount of testing can handle that much self-deception, although I'm willing to be convinced otherwise :)

  1. How do you know your determination of "ideological bias" isn't biased itself?
  2. All experiments are flawed in one way or another to some degree. Are you saying one study is more methodologically flawed than another? How do you measure the degree of the flaws? How do you know your determination of flaws isn't biased?
  3. Again, you've already decided which study is "correct" based on your own biased interpretations. How do you prove the other person is wrong and it's not you that is biased?

I agree with the randomize and repeat bit though.

However, I would like to propose that this test methodology for rationality is deeply flawed.

Keep track of when you change your mind about important facts based on new evidence.

a) If you rarely change your mind, you're probably not rational.

b) If you always change your mind, you're probably not very smart.

c) If you sometimes change your mind, and sometimes not, I think that's a pretty good indication that you're rational.

Of course, I feel that I fall into category (c), which is my own bias. I could test this, if there was a database of how often other people had changed their mind, cross-referenced with IQ.

Here's some examples from my own past:

  1. I used to completely discount AGW. Now I think it is occurring, but I also think that the negative feedbacks are being ignored/downplayed.

  2. I used to think that the logical economic policy was always the right one. Now, I (begrudgingly) accept that if enough people believe an economic policy is good, it will work, even though it's not logical. And, concomitantly, a logical economic policy will fail if enough people hate it.

  3. Logic is our fishtank, and we are the fish swimming in it. It is all we know. But there is a possibility that there's something outside the fishtank, that we are unable to see because of our ideological blinders.

  4. The two great stresses in ancient tribes were A) "having enough to eat" and B) "being large enough to defend the tribe from others". Those are more or less contradictory goals. But both are incredibly important. People who want to punish rulebreakers and free-riders are generally more inclined to weigh A) over B). People who want to grow the tribe, by being more inclusive and accepting of others, are more inclined to weigh B) over A).

  5. None of the modern economic theories seem to be any good at handling crises. I used to think that Chicago and Austrian schools had better answers than Keynesians.

  6. I used to think that banks should have just been allowed to die, now I'm not so sure - I see a fair amount of evidence that the logical process there would have caused a significant panic. Not sure either way.

I'm not sure about this.

The words are vague enough that I think we'll usually see ourselves as only sometimes changing our mind. That becomes the new happy medium that we all think we've achieved, simply because we're too ignorant of what it actually means to change our beliefs the right amount.

I'm having a hard time knowing how I could decide if I'm changing my beliefs the right amount; since that would be a (very rough) estimation of an indirect indicator, I feel like I have to disagree with the potential of this idea.

I am 95% confident that calibration tests are good tests for a very important aspect of rationality, and would encourage everyone to try a few.

Yes, calibration tests are rationality tests, but they are better tests on subjects where you are less likely to be rational. So what are the best subjects on which to test your calibration?

I suspect I should also be writing down calibrated probability estimates for my project completion dates. This calibration test is easy to do oneself, without infrastructure, but I'd still be interested in a website tabulating my and others' early predictions and then our actual performance -- perhaps a page within LW? Might be especially good to know about people within a group of coworkers, who could perhaps then know how much to actually estimate timelines when planning or dividing complex projects.
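Tabulating early predictions against actual performance could look something like the sketch below, which groups predictions by stated probability and reports how often each group came true. The function name and the sample completion-date predictions are hypothetical.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Map each stated probability to (observed hit rate, sample count).

    `predictions` is a list of (stated_probability, came_true) pairs.
    Probabilities are rounded to one decimal place to form groups.
    """
    buckets = defaultdict(list)
    for probability, came_true in predictions:
        buckets[round(probability, 1)].append(came_true)
    return {
        p: (sum(hits) / len(hits), len(hits))
        for p, hits in sorted(buckets.items())
    }


# Hypothetical "project done by the date I named" predictions.
history = [
    (0.9, True), (0.9, False), (0.9, False), (0.9, True),
    (0.6, True), (0.6, False),
]
table = calibration_table(history)
for stated, (hit_rate, count) in table.items():
    print(f"said {stated:.0%}, right {hit_rate:.0%} of the time (n={count})")
```

A well-calibrated forecaster's hit rate in each group sits near the stated probability; in this made-up history the 90% group comes true only half the time, which is the overconfident pattern the project-deadline discussion here worries about.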

Wouldn't making a probability estimate for your project completion dates influence your date of completion? Predicting your completion times successfully won't prove your rationality.

This is a good point. Still, it would provide evidence of rationality, especially in the likely majority of cases where people didn't try to game the system by e.g. deliberately picking dates far in advance of their actual completions, and then doing the last steps right at that date. My calibration scores on trivia have been fine for a while now, but my calibration at predicting my own project completions is terrible.

I wonder to what degree this is a problem of poor calibration vs. poor motivation. Maybe commitment mechanisms like Stikk.com would have a greater marginal benefit than better calibration. I don't know about you, but that seems to be the case with regards to similar issues on my end.

Perhaps we could make a procedure for asking your friends, coworkers, and other acquaintances (all mixed together) to rate you on various traits, and anonymizing who submitted which rating to encourage honesty? You could then submit calibrated probability estimates as to what ratings were given.

I'd find this a harder context in which to be rational than I'd find trivia.

Actually, there's probably some website out there already that lets one solicit anonymous feedback. (Which would be a rationality boost for some of us in itself, even apart from calibration -- though I'd like to try calibration on it, too.)

Does anybody know of such a site? I spent an hour looking on Google -- perhaps not with the right keywords -- and found only What Others Think, Kumquat, and a couple Facebook/Myspace apps.

Both look potentially worth using, but neither is ideal. Are there other competitors?

An ideal rationality test would be perfectly specific: there would be no way to pass it other than being rational. We can't conveniently create such a test, but we can at least make it difficult to pass our tests by utilizing simple procedures that don't require rationality to implement.

Any 'game' in which the best strategies can be known and preset would then be ruled out. It's relatively easy to write a computer program to play poker (minus the social interaction). Same goes for blackjack. It takes rationality to create such a program, but the program doesn't need rationality to function.

"Do you want to justifiably believe that you are more rational than others, smugly knowing your beliefs are more accurate?"

Is this what people want? To me it would make more sense to cultivate the belief that one is NOT more rational than others, and that one's beliefs are no more likely than theirs to be accurate, a priori. Try to overcome the instinct that a belief is probably correct merely because it is yours.

Now I can understand that for people at the cutting edge of society, pushing into new frontiers like Robin and Eliezer, this would not work. If someone came up to Robin and criticized idea futures, or to Eliezer and said that friendly AI would not work, and they responded, "oh, I guess maybe you're right, thanks" - well, then, they wouldn't get anything done.

But for most of us, this is not an issue. Factual disagreements in my experience are seldom about things that would keep us from being productive and successful in our lives. People tend to disagree most vociferously on things that don't have the slightest impact on their lives, like political and sports questions. Isn't that right?

Even for researchers, in a way it doesn't matter because we are paying them to push the boundaries. It is their job to adopt opinions and fight for them. They are obligated to assume that just because an idea is theirs, it is probably right. Researchers are paid to be irrational in this way, and indeed it is hard to see how a rational person could be successful in science.

Just a personal problem that seems to me to be a precursor to the rationality question.

Various studies have shown that a person's 'memory' of events is very much influenced by later discussion of the event, and that when put into situations such as the 'Stanford Prison Experiment' or the 'Milgram Experiment' people will commit unethical acts under pressure of authority and situation.

Yet people have a two-fold response to these experiments. A) They deny the experiments are accurate, either in whole, or in degree B) They deny that they fall into the realm of those that would be so affected.

With, of course, the obvious caveat that some people actually are not so affected in those experiments (or do remember things accurately), and will stand up for what they determine as ethical regardless.

The obvious fact seems to be that those who honestly consider the possibility that their thoughts can be affected by these outside influences have the greatest chance of successfully maintaining their own identity against them. But other than acknowledging this fact (which can certainly be faked, even self-deceptively), what self-assessments allow one to develop this?

Once we have that, it seems to me that the question of maintaining rationality itself clarifies itself greatly.

Jonnan

There is quite a gap between wanting to be rational and wanting to know how unbiased you are. Since the test is self-administered, pursuing the first desire could easily lead to a favourable, biased, seemingly rational test result. This result would be influenced by personal expectations, and its reliability is null according to Löb's Theorem. The latter desire implies being open to one's biased state and stating the purpose of assessing some sort of bias/rational balance. This endeavour is more profitable than the previous because, hopefully, it offers actionable information.

Perhaps one could have a good shot at finding out more about his biases by making quick judgements and later trying to contemplate various aspects and sequences of his or her judgement with accounting of seemingly absurd alternatives and attention paid to the smallest of details. The result should occur as a percentage of correct/faulty conclusions. Apart from discovering some sort of rational/biased ratio in a line of thought, this process should automatically bring one closer to being rational by the memorizing of judgement flaws, their sources and pattern, and by the development of a habit for righteous thinking from a rationality point of view.

This test could have a much more reliable result when performed on someone else by providing all necessary information for a right conclusion to be reached together with vague, inconclusive information for incorrect conclusions to be reached, and great incentives for reaching some of the wrong conclusions.

Speaking of incentives, I believe anyone trying to be as rational as possible within a group could be influenced by group values and beliefs. Therefore, trying to find out biases within the group's/group members' judgements could be correlated with one's affinity for that group. Rationality should be neutral, but neutrality is seldom a group value so chances are high that instinctive-rationalists will be outliers. The tendency to agree with beliefs is probably as wrong as the tendency of finding biases, the two depending on one's grade of sympathy for a specific group.

Identifying exterior biases will be an unreliable measure of one's rationality, because of the incentives which exist in interacting with others and also because there is usually little information on exterior thought processes which led to specific outcomes. Also, beliefs widely spread across a social system can have consequences that seemingly prove those beliefs even without their being rational, in which case, comparing one's judgement to facts would be an indicator of power rather than rationality.

It seems that the relevance of calibration tests is that the better calibrated you are, the better you will be at predicting how happy various outcomes will make you. Being good at this puts you at a huge advantage relative to the average person.

My concern is less with the degree to which I wear the rationality mantle relative to others (which is low to the point of insignificance, though often depressing) and more with ensuring that the process I use to approach rationality is the best one available. To that end, I'm finding that lurking on LessWrong is a pretty effective process test, particularly since I tend to come back to articles I've previously read to see what further understanding I can extract in the light of previous articles. SCORING such a test is a more squiffy concept, though correlation of my (defeasibly) rational conclusions to the evidence of reality seems an effective measure... though I've now run into a concern that my own self-assessment of confirmation bias elimination may not be satisfactorily objective. The obvious solution to THAT problem would be to start publishing process/conclusion articles to LessWrong. I think I may have to start doing so.