Comments

Answer by omark, Jul 23, 2023

I have no data and all I'll talk about is my experience and my gut feelings on this topic.

The first question I ask myself is: what problem am I trying to solve, or what am I trying to improve? The answer for me is that I suspect that I am vastly overconfident in my predictions and that I selectively forget my worst forecasts. For example, I remember being skeptical about someone after interviewing them for a job and writing as much to my supervisor; the applicant got the job anyway. A few years later the person was doing an excellent job and I was surprised when I stumbled upon my own e-mail. I had forgotten about it. On the other hand, I believe I have almost never forgotten cases where I made a good call about something.

So the first problem I want to solve is to become more humble in my predictions by making my failures more visible to myself.

The second improvement I would like to achieve is determining whether the probability numbers I attach to future events are reliable indicators or completely useless. That is a question of calibration, which can be tracked with a measure such as the Brier score. I suspect these values have to be interpreted by "category" (e.g. you might have good calibration in politics but bad calibration in personal relationships) and that you only start getting useful results after one or two years and a few hundred forecasts.
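
To make the per-category idea concrete, here is a minimal sketch of a Brier score computed separately for each category. The record layout (category, stated probability, outcome as 0 or 1) and the example numbers are made up for illustration; this is not how any particular tool stores forecasts.

```python
from collections import defaultdict


def brier_score(forecasts):
    """Mean squared difference between stated probability and outcome (0 or 1).

    0.0 is a perfect score; always answering 50% yields 0.25.
    """
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)


def brier_by_category(records):
    """Group (category, probability, outcome) records and score each group."""
    grouped = defaultdict(list)
    for category, p, outcome in records:
        grouped[category].append((p, outcome))
    return {category: brier_score(f) for category, f in grouped.items()}


# Hypothetical resolved forecasts: (category, stated probability, outcome).
records = [
    ("politics", 0.9, 1),
    ("politics", 0.7, 1),
    ("relationships", 0.9, 0),
    ("relationships", 0.6, 1),
]

scores = brier_by_category(records)
for category, score in scores.items():
    print(f"{category}: {score:.3f}")
```

With only a handful of forecasts per category the numbers are mostly noise, which is why I would expect to need a few hundred forecasts before reading much into them.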

I find it difficult to be sufficiently detailed and specific that allows for unambiguous resolution of the questions

Future-you is presumably not maliciously trying to deceive you, right? So the only case you need to worry about is future-you misunderstanding what you meant when you wrote the forecast.

I quite dislike how fuzzy the resolution becomes this way: did I really undertake a reasonable effort? Did I undershoot or overshoot it?

Do you think it very likely that present-you and future-you will have a very different perspective on what "reasonable effort" means? I would only clarify things up to the point where you trust future-you to do the right thing.

Perilous feedback loops can also creep in nonetheless: my reasonable effort for a 90% prediction might mean being more relaxed than otherwise: it’s a done deal, I might think.

I agree that these feedback loops exist. My perspective is that you should not strive for perfection but try to improve upon the status quo. Even without writing down explicit predictions you will have such feedback loops. Do you think they become worse when you write the predictions down, i.e. worse than when you just have a gut feeling that something is a "done deal"?

You are right that making predictions and asking yourself questions that you might not have asked yourself otherwise might change your behavior. I would even say that it's not uncommon to use predictions as a motivational tool because you don't want to be proven wrong in front of yourself or others. The feedback loop is then purposefully built in.

One way of minimizing this might be to make predictions that are farther in the future and then try to forget about them. For example, make a lot of predictions so that you forget the particulars and then only look at the file a year later. This is a trade-off against updating the predictions on a regular basis with new information, which to me is more important.

Another potential solution is to ask other people (friends) to make predictions about you without telling you the details. They could give you a happiness questionnaire once every 3 months and not tell you until after resolution what they do with the data. In this case they are the ones working on their calibration. If you want to work on your own, you can make predictions about them.

Or take this simpler version:

If I switch to another job in the next 12 months, how likely is it that I’ll be more satisfied with it in the first two months than I’m now?

Hoo boy, where do we even start with this one, even though lots of people make major life decisions on exactly these kinds of hinges! What if I am just a little bit happier afterward, and it’s hard to say? Can I grade this as 60% passed (and 40% failed)?

No, I don't think you should grade it as 60% passed. It was a yes/no question. As long as you are even a little bit happier, the answer is yes.

At evaluation, I need only concern myself with how sure I am that I’m below 9 and above 7--"or am I at only 6.8?"

When making the prediction you already knew that your judgement at resolution was going to be subjective. If you dislike that, maybe it's not a useful prediction to make.

One way around this might be to make "job satisfaction" something you derive from multiple variables (e.g. take the mean value of "how nice is the office", "how nice are the colleagues", "how challenging are the tasks", ...). Then it won't be obvious at resolution time how to get the result you wanted; instead you aggregate and roll with whatever comes out.
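
As a rough sketch of what I mean, assuming you rate each variable on a 1-10 scale now and again at resolution time (the variable names, the scale and the ratings below are only illustrative):

```python
from statistics import mean


def satisfaction(ratings):
    """Aggregate job satisfaction as the plain mean of the individual ratings."""
    return mean(ratings.values())


def is_more_satisfied(before, after):
    """Resolve the yes/no question: any improvement at all counts as 'yes'."""
    if before.keys() != after.keys():
        raise ValueError("rate the same variables at both points in time")
    return satisfaction(after) > satisfaction(before)


# Hypothetical ratings on a 1-10 scale.
current_job = {"office": 6, "colleagues": 8, "tasks": 5}
new_job = {"office": 7, "colleagues": 7, "tasks": 6}

print(is_more_satisfied(current_job, new_job))
```

Fixing the variables and the aggregation rule when you make the prediction is the point: at resolution time you only fill in the individual ratings.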

I am really interested in forecasting and getting better at it so I am developing Cleodora, a free and open-source tool to track such forecasts. I encourage you and other readers to have a look, leave me your thoughts and help me improve it!

I think in many cases people would have been happy to continue after the timers were done. There was no really heated interaction and I assume it was also because of the time limit. The results would probably look different (better?) with more time.

The thing I'm worried about (and the source of the suggestion not to do controversial topics early) is people searching for reasons they think are respectable, not things they have as cruxes.

Sounds plausible. We did not really try to dig into that.

I would like to repeat the event in the future and maybe I'll introduce some variation to figure some of these questions out :-)

We ran this in Freiburg, Germany.

It worked very well! 11 people came and it was suitable for that group size. I made some small changes.

I started with steps 1 - 3 as described above. People had 5 minutes to come up with the reasons why they (dis-)believed the statement ("The earth is a flat disk") and then 10 minutes to poke holes (5 minutes per partner).

Then we came back to a big circle and ensured everything was clear. I then projected a list of statements onto the wall and gave each person 5 minutes to choose one statement and come up with reasons for (dis-)believing it. I explained that making assumptions or having only a certain degree of conviction was perfectly fine (e.g. I am 60% certain that capitalism has a net positive effect on the world); they should just make that clear to their partner later. Some people struggled to make a choice, so I offered to choose for them, which they accepted (so having some mechanism for assigning statements at random may be valuable).

Then people split again into (different!) pairs and had 15 minutes (7.5 minutes per person) to poke holes in each other's reasoning. For the most part people happened to choose different statements from their partner, which made the exercise more interesting because you had to poke holes in something that you hadn't thought about much yourself. Funnily, in one case, a pair chose the same statement but had opposing opinions on it.

We repeated it two more times with new statements and new partners.

This is the list of statements I projected onto the wall. Note that I don't necessarily endorse any of these statements as listed here. They are intentionally meant to be thought-provoking and controversial.

Many people preferred the more controversial statements at the bottom.

Easy

  • Earth is a flat disk.
  • The moon is made of cheese.
  • Humans have walked on the moon.
  • COVID is a dangerous disease that killed many people.
  • Vaccines cause autism.
  • Things fall because of gravity, which accelerates objects at 9.8 m/s^2
  • Humans use only 10% of their brain.
  • Humans are the most intelligent in the animal kingdom because they have the biggest total brain size.
  • The scientific method is the most effective way of achieving knowledge about the world and the universe.
  • Seeds are the spicy part of chili peppers.
  • Microwave ovens heat food from the inside out.
  • (Biological) evolution necessarily causes organisms to evolve from less complex to more complex.
  • Glass is a high-viscosity liquid at room temperature (e.g. old cathedral windows are thicker at the bottom)
  • Muscle soreness after exercising is caused by lactic acid.
  • Rusty metal can cause tetanus infections.
  • Cold temperatures in winter can cause people to catch a cold.
  • 0.999.... is the same as 1
  • Human beings have five senses.

Hard

  • Artificial Intelligence very seriously risks extinguishing all human life within the next 30 years.
  • Capitalism has a net positive effect on the world.
  • Colonialism had a net positive effect on the ex-colonies.
  • Open borders (no travel restrictions between countries) would have a net positive effect on the world.
  • Human beings are intrinsically more valuable than plants or animals.
  • Spending significant taxpayer resources (e.g. $1 million) to save five people lost in a submarine has a net positive effect on the world.
  • Investing significant resources (e.g. $1 million / year) to fight the extinction of animals such as the Sumatran rhinoceros (~ 80 left in existence) has a net positive effect on the world.
  • God exists.
  • Religion has a net positive effect on the world.
  • Gender is (almost) entirely a social construct.
  • Nuclear energy has a net positive effect on the world.
  • Every controversial topic should be publicly debated instead of censored.
  • Liberal democracy is the form of government with the highest net positive effect on the world.
  • Voting in national elections is a duty for the individual.
  • If a foreign nation starts a military invasion of your home country you have a moral duty to take up arms.
  • Objective reality exists.

Very nice exercise! We were a group of 11 and split into two groups of 5 and 6. After playing a few times we made one big round with all 11. Interestingly, not once did we hunt a stag! The exercise led to some very interesting discussions afterwards.

I understand the idea of trading one thing for another, e.g. sleep for pleasure, and I understand the idea that if you break something complex down into components and understand those components, you understand the whole (reductionism). What I do not understand is the relation between the two things. Could someone explain it differently? It feels like the "Lego principle" section is disconnected and could have been omitted without losing anything.

We ran this in Freiburg, Germany in January 2023 with a total of 10 people.

Great:

  • It was very fun!
  • Many people had never explicitly thought about such rates of exchange and they found the implications interesting
  • Some people were able to identify inconsistencies at the end while going over their list of bookings

Worthy of improvement / Unclear things:

  • In the trade I am giving up what I have on my card. Can I give up more than one item? Do I already own that item beforehand? If I am giving up several, is it because I own all of them or do I need to get them somehow (e.g. by paying money)? For example, if my index card says "An apple" and the other person's index card says "1 million USD", would I trade my single apple for a fraction of the million USD or would I trade many tons of apples for the entire 1 million USD? Do I already own those tons of apples?
    • We solved this by specifying that you own your item once and always trading for a fraction of what the other person owns.
  • Including monetary values like "1 million USD" makes it less fun because those exchange rates are more common anyway
  • Some things are really hard to multiply or subdivide such as "the ability to breathe underwater".
    • We solved this by specifying that the fraction was having this ability for a certain amount of time instead of the rest of your life
  • Since my booking card always compares everything to my own "index card" I don't ever get any obvious inconsistencies. e.g. I only have trades involving the "mediocre laptop" with the plane ticket, with the bicycle, with everything else. But never an exchange between the plane ticket and the bicycle.
    • The inconsistencies only become clear once you start comparing the things on your booking card among themselves, which is made more complicated by the fact that you had to use fractions or multiples in the other trades. It would be great if these inconsistencies became directly obvious as part of the game rules, e.g. by trading index cards at some point.
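
The comparison described in the last point can be done mechanically after the game. Below is a minimal sketch, assuming the booking card is written down as "one unit of my own item is worth this many units of the other item". The item names, the numbers, the directly stated rate and the factor-of-two threshold are all invented for illustration.

```python
from itertools import combinations


def implied_cross_rates(booking):
    """Derive the rate between every pair of other items from my own trades.

    booking maps item -> units of that item traded for ONE unit of my own item.
    The result maps (a, b) -> units of b that one unit of a should be worth.
    """
    return {
        (a, b): booking[b] / booking[a]
        for a, b in combinations(booking, 2)
    }


def flag_inconsistencies(booking, direct, factor=2.0):
    """Return pairs whose directly stated rate is off by more than `factor`.

    `direct` must use the same (a, b) key order as implied_cross_rates.
    """
    implied = implied_cross_rates(booking)
    flagged = []
    for pair, stated in direct.items():
        ratio = implied[pair] / stated
        if ratio > factor or ratio < 1 / factor:
            flagged.append(pair)
    return flagged


# Hypothetical booking card for someone holding the "mediocre laptop":
# 1 laptop = 0.5 plane tickets = 2 bicycles = 400 apples.
booking = {"plane ticket": 0.5, "bicycle": 2.0, "apple": 400.0}

# Rate the same person states when asked directly, without the laptop involved.
direct = {("plane ticket", "bicycle"): 1.5}

rates = implied_cross_rates(booking)
print(rates[("plane ticket", "bicycle")])
print(flag_inconsistencies(booking, direct))
```

Here the laptop trades imply that a plane ticket is worth four bicycles, while the direct answer says 1.5, so the pair gets flagged. During the game itself, trading index cards would surface the same gap without any arithmetic.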

We tried this exercise at the Freiburg, Germany meetup a few days ago. We were 12 people, divided into two groups of 6.

The martial arts analogy with falling works really well and was well received.

Most people, myself included, felt that the exercise did not work as well as we were hoping. As a participant one is aware that the setting is artificial and admitting you are wrong is therefore quite easy. We even tried the harder variants with "making fun" and such.

One suggestion made in the feedback round was to ask participants to also provide confidence intervals. Then making fun could also be about having chosen those to be too wide.

I feel like it would be necessary to find questions people think they know the answers to and use those. Then admitting you are wrong would be more painful. Maybe use a list of common misconceptions and have people provide answers before they know what the game is about. Then during the game people read their answers aloud with as much conviction as possible.

I wondered about the same thing.

Just to clarify: Did the LW team discover a bug and take the site down while the bug was being fixed or did someone with zero karma actually push the button?

If it's the second case:

  • How did you discover this given that no information about the person pressing (or rather entering the code) is being collected?
  • Shouldn't this count as having the nukes launched and the site simply staying down? Just like a real-life system where the security clearance system is severely buggy and a random janitor launches the missiles by simply trying some knobs. Sure, it would suck, but it wouldn't change the outcome.