What attributes make a task useful for rationality verification?

After thinking it over, I believe I have identified three main components. The first is that the task should be as grounded in the real world as possible. The second is that the task should involve a wide variety of subtasks, preferably ones that involve decision making and/or forecasting. This will help ensure that the effect comes from general rationality, rather than from the rationality training helping with domain-specific skills. The third is that there should be a clear measure of success for the task.

As I am not personally involved with the field I could be missing something important, but it seems like founding a successful startup would fulfill all three components.  I propose that investigating the effect of giving startup founders rationality training would be a good basis for an experiment. Unfortunately, I do not know if it would be feasible to run such an experiment in real life.  Thus, I am turning to the LW community to see if the people reading this have any suggestions.

-addendum 

I didn't go into details about exact experimental methods for a couple of reasons. Partly because I assumed, apparently incorrectly, that it was obvious that any experiment for testing rationality would be conducted with the best experimental protocols that we could manage. But mostly because I thought that it would be good to get feedback on the basic idea of rationality verification + startups ?= good before spending time going into detail about things like control groups, random assignment, etc.

I welcome suggestions along those lines, and given the attention this has received I will try to go back and add some of my own ideas when I have time, but I wanted to make clear that I wasn't intending this post as a detailed experimental design.

Also does anyone have any idea why the first part of this post has different spacing from the second?  It's not intentional on my part.

[-][anonymous]12y50

I have thought of this as well. A startup seems like a great test of rationality. Unfortunately, doing a successful startup is hard. It would be great to get rationality to the point of it empowering any old smart person to do a startup, but I think we will need smaller tests before then.

That said, I had not thought of coming at it the other way like you did. Take potential startup people, teach rationality to half of them, and observe the outcome. What a great idea!

I think the best source of startup-caliber people who are about to start one is Y Combinator. I wonder if we could convince Paul Graham to let us test some ideas on his founders. The information feedback time is very long, though. I don't think you could get definitive results for at least half a year. Testing rationality ideas on startup people would be a great way to solidly verify the methods, but it's not fast enough to develop them.

(Nothing against your idea of a study - perhaps a control group receiving some other variety of self-help training would be better than a control of nothing at all).

Against taking lessons from startups: the big winners probably overestimated their likelihood of success.

That said, of course freedom from bias and rational-thinking-avoidance should improve your outcomes given a commitment to a particular goal.

[-][anonymous]12y00

control self-help

This is a great idea! More symmetry is always better. It would be good to test against no course as well, though.

the big winners probably overestimated their likelihood of success

It seems odd to bring this up. And so what? They won. If they used irrational methods (which they probably did), we should study those methods, pull out the rational core and use it to go start some businesses.

JG likely meant "likelihood" in the LW/Bayesian sense, where the proper estimate is the one that is justified by the available evidence at the time of estimation, not the one that is subsequently justified by the way things turned out. It's often useful to keep those concepts separate.

That's a good point. I agree with it, and that's how I try to use the term. Equivalent to your formulation: I was imagining the people who are just like the success stories you're trying to emulate, who weren't so lucky - how many of them are there, and what did they lose by trying and failing?

[-][anonymous]12y20

From what I've heard, if you fail at a startup, you come out of it with zero net worth and a lot of experience. So you don't lose much, even if it is not optimally productive.

I agree, but I consider the opportunity cost (and stress/sleep/health toll) significant.

[-][anonymous]12y00

Oops. You are right they should be separate. Fixed.

If they used irrational methods (which they probably did), we should study those methods, pull out the rational core and use it to go start some businesses.

A proposal worth trying. My point is that we should ideally be able to do the correct expected value + risk tolerance calculation in deciding whether to try a given venture, and that many of those who succeeded skipped that step or made an optimistic error. More generally, studying what properties are most frequent amongst the winners doesn't tell you enough about the value of acquiring those properties. (I'm assuming you care what happens to you if you fail.)
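To make the "expected value + risk tolerance" step concrete, here is a minimal sketch. It is only one way to do the calculation: it assumes a CRRA (constant relative risk aversion) utility function as the stand-in for risk tolerance, and every number in it (the payoffs, the 5% success chance, the founder's starting wealth) is made up for illustration, not taken from any startup data.

```python
import math

def expected_value(outcomes):
    """Probability-weighted average payoff; `outcomes` is a list of (probability, payoff)."""
    return sum(p * x for p, x in outcomes)

def certainty_equivalent(outcomes, wealth, risk_aversion):
    """Sure payoff that a CRRA-utility agent values the same as the gamble.

    risk_aversion = 0 is risk neutral (reduces to expected value);
    risk_aversion = 1 is log utility. Requires wealth + payoff > 0 for every outcome.
    """
    def utility(w):
        if risk_aversion == 1:
            return math.log(w)
        return w ** (1 - risk_aversion) / (1 - risk_aversion)

    def inverse_utility(v):
        if risk_aversion == 1:
            return math.exp(v)
        return (v * (1 - risk_aversion)) ** (1 / (1 - risk_aversion))

    expected_utility = sum(p * utility(wealth + x) for p, x in outcomes)
    return inverse_utility(expected_utility) - wealth

# Hypothetical venture: 5% chance of a large payout, 95% chance of losing
# most of one's savings and forgone salary. All figures are illustrative.
WEALTH = 100_000
venture = [(0.05, 2_000_000), (0.95, -80_000)]

ev = expected_value(venture)
ce_neutral = certainty_equivalent(venture, WEALTH, risk_aversion=0)
ce_log = certainty_equivalent(venture, WEALTH, risk_aversion=1)
print(ev, ce_neutral, ce_log)
```

With these invented numbers the expected value is positive (24,000), yet the log-utility certainty equivalent is negative. That is the gap I mean: a venture can look good on expected value alone and still be a bad bet for someone who cares a lot about what happens if they fail.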

I'm hopeful that if overconfidence is necessary in a method-acting or emotional-battery sense, someone who understands that they're being overconfident can nonetheless knowingly push their affect in that direction (an open question).

Testing rationality ideas on startup people would be a great way to solidly verify the methods, but it's not fast enough to develop them.

Right. I picture us developing tests of how well people have learned specific methods, like recognizing mysterious answers to mysterious questions, and using those to develop the training program. Then we would perform the startup experiment to see if the separate training techniques were synergizing into something useful IRL. Do you think I should have mentioned that in the main post? It seems relevant, but I didn't want to obscure the main point with unnecessary detail.

That was one of the reasons I was thinking about the conjunction of rationality and startups.

[-]knb12y20

I like this idea a lot, but I have a few questions.

  1. Approximately how many hours of instruction are you imagining?
  2. Would you do any placebo controlling? Maybe give a week of generic business classes to the control group?
  3. How would you measure success? Just whether the company still exists? If they make a profit? Maybe some kind of subjective evaluation by the owners (i.e. do they consider the business to have met their own definition of success)?

How would you measure success? Just whether the company still exists? If they make a profit? Maybe some kind of subjective evaluation by the owners (i.e. do they consider the business to have met their own definition of success)?

Reading Paul Graham's essays gave me the impression that measuring startup success was a (mostly) solved problem, but looking back I can't see exactly what caused me to think that. If he does have some magic yet publicly available method, I would use that. Otherwise, I would look at all three of the things you mentioned.

Approximately how many hours of instruction are you imagining?

However many are usually included in the training program we are testing. If we test multiple programs there could be a variable number of hours.

Would you do any placebo controlling?

Yes, if possible.

The spinoff rationality training org is developing training materials, and their target level is that of the Silicon Valley startup crowd.

I'd say you're probably thinking along very similar lines as Eliezer et al.

The trouble is that you don't want to also measure a bunch of other stuff. Luck, connections, skills specific to your company, etc. This stuff makes your error bars really big. In order to get your error bars down, you might be able to measure some of these extra factors separately. But ultimately, I don't think running a startup will work as a standalone measure of rationality.

You could also do the lower-cost test of (simulated) venture-capital decisions, which presumably tests a similar set of epistemic skills.

A cheap(er) way to get a less-clear look at the same information would be to find as many aspiring rationalist startup founders as possible (ones who have attended CFAR classes would be preferable, but we'd probably want to look at anyone associated with the rationalist community to have a larger sample), get data on startups without rationalist founders, and compare how well each group did. It would be really useful to know enough about the startups in the control group to control for things that could influence chances of success, especially things where rationalists tend to differ from the general population - IQ, autism spectrum quotient, etc.

So few startups succeed that it might not be possible to get a meaningful result out of this - I'm not sure how many rationalists have founded startups, but probably not enough to make a decent sample size.
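As a rough way to put a number on the sample-size worry, here is a sketch of the standard normal-approximation formula for comparing two proportions. The success rates in the example (10% without training, 15% with it) are purely hypothetical placeholders, and the 5% significance level and 80% power are conventional defaults rather than anything the thread settled on.

```python
import math
from statistics import NormalDist

def founders_per_group(p_control, p_treated, alpha=0.05, power=0.8):
    """Approximate number of founders needed in EACH group to detect a
    difference between two success rates with a two-sided z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_treated) / 2            # pooled rate under the null
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    ) ** 2
    return math.ceil(numerator / (p_control - p_treated) ** 2)

# Hypothetical rates: 10% of untrained founders succeed vs. 15% of trained ones.
print(founders_per_group(0.10, 0.15))
```

Under those assumed rates the formula asks for several hundred founders per group, which supports the concern above: unless the effect of being a rationalist is very large, a handful of rationalist-founded startups will not be enough.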

Startups are hit and miss. There is a fair amount of luck involved, as well as other skills not related to rationality, and sometimes conflicting with it (using dark arts to successfully sell your idea/product, hiring employees on the cheap with the promise of a future big payout).

So no, it's not a good test, unless you control for all other factors, which means you have to have many startups running at the same time and analyze the statistics after a few years.

So no, it's not a good test, unless you control for all other factors, which means you have to have many startups running at the same time and analyze the statistics after a few years.

I think we have an illusion of transparency/dialect problem. I referred to the test as an experiment, and in the usage I'm familiar with that would necessitate a control group and imply following other generally accepted experimental protocols (e.g. random assignment) as closely as is feasible.
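For what it's worth, the random-assignment part is the easy bit. A minimal sketch, assuming three arms (the rationality training, a placebo course such as generic business classes, and no course at all, as suggested elsewhere in the comments); the founder IDs, arm names, and seed are all hypothetical:

```python
import random

def assign_arms(founder_ids, arms, seed):
    """Shuffle founders with a fixed seed, then deal them round-robin into arms
    so that group sizes differ by at most one."""
    rng = random.Random(seed)      # seeded so the assignment can be audited later
    shuffled = list(founder_ids)
    rng.shuffle(shuffled)
    assignment = {arm: [] for arm in arms}
    for i, founder in enumerate(shuffled):
        assignment[arms[i % len(arms)]].append(founder)
    return assignment

# Hypothetical cohort of 20 founders and three illustrative arms.
ARMS = ["rationality_training", "placebo_course", "no_course"]
founders = [f"founder_{n:02d}" for n in range(20)]
groups = assign_arms(founders, ARMS, seed=2013)
for arm, members in groups.items():
    print(arm, len(members))
```

The hard parts are everything around this: recruiting enough founders, keeping them in the study for years, and agreeing on the success measure before the results come in.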

This would only be useful if there is skill in tech entrepreneurship, and it's not just luck. The fact that previously successful entrepreneurs have a higher success rate points to that, but it's hardly conclusive.