It's probably worth figuring out what went wrong in Approach 1 to Example 1, which I think is this part:
[300 cities of 10,000 or more people per county] × [2500 counties in the USA]
Note that this gives 750,000 cities of 10,000 or more people in the US, for a total of at least 7.5 billion people in the US. So it's already clearly wrong here. I'd say 300 cities of 10,000 people or more per county is way too high; I'd put it at more like 1 (Edit: note that this gives at least 250 million people in the US and that's about right). This brings down the final estimate from this approach by a factor of 300, or down to 3 million, which is much closer.
(Verification: I just picked a random US state and a random county in it from Wikipedia and got Bartow County, Georgia, which has a population of 100,000. That means it has at most 10 cities with 10,000 or more people, and going through the list of cities it actually looks like it only has one such city.)
This gives about 2,500 cities in the US total with population 10,000 or more. I can't verify this number, but according to Wikipedia there are about 300 cities in the US with population 100,000 or more. Assuming the populations of cities a...
I feel compelled to repeat this old physics classic:
How Fermi could estimate things!
Like the well-known Olympic ten rings,
And the one-hundred states,
And weeks with ten dates,
And birds that all fly with one... wings.
:-)
I've run meetups on this topic twice now. Every time I do, it's difficult to convince people it's a useful skill. More words about when estimation is useful would be nice.
In most exercises that you can find on Fermi calculations, you can also actually find the right answer, written down somewhere online. And, well, being able to quickly find information is probably a more useful skill to practice than estimation; because it works for non-quantified information too. I understand why this is; you want to be able to show that these estimates aren't very far off, and for that you need to be able to find the actual numbers somehow. But that means that your examples don't actually motivate the effort of practicing, they only demonstrate how.
I suspect the following kinds of situations are fruitful for estimation:
Fermis seem essential for business to me. Others agree; they're taught in standard MBA programs. For example:
Can our business (or our non-profit) afford to hire an extra person right now? E.g., if they require the same training time before usefulness that others required, will they bring in more revenue in time to make up for the loss of runway?
If it turns out that product X is a success, how much money might it make -- is it enough to justify investigating the market?
Is it cheaper (given the cost of time) to use disposable dishes or to wash the dishes?
Is it better to process payments via paypal or checks, given the fees involved in paypal vs. the delays, hassles, and associated risks of non-payment involved in checks?
And on and on. I use them several times a day for CFAR and they seem essential there.
They're useful also for one's own practical life: commute time vs. rent tradeoffs; visualizing "do I want to have a kid? how would the time and dollar cost actually impact me?"; realizing that macadamia nuts are actually a cheap food and not an expensive food (once I think "per calorie" and not "per apparent size of the container"); and so on and so on.
Oh, right! I actually did the commute time vs. rent computation when I moved four months ago! And wound up with a surprising enough number that I thought about it very closely, and decided that number was about right, and changed how I was looking for apartments. How did I forget that?
Thanks!
The main use I put Fermi estimates to is fact-checking: when I see a statistic quoted, I would like to know if it is reasonable (especially if I suspect that it has been misquoted somehow).
There's a free book on this sort of thing, under a Creative Commons license, called Street-Fighting Mathematics: The Art of Educated Guessing and Opportunistic Problem Solving. Among the fun things in it:
Chapter 1: Using dimensional analysis to quickly pull correct-ish equations out of thin air!
Chapter 2: Focusing on easy cases. It's amazing how many problems become simpler when you set some variables equal to 1, 0, or ∞.
Chapter 3: An awful lot of things look like rectangles if you squint at them hard enough. Rectangles are nice.
Chapter 4: Drawing pictures can help. Humans are good at looking at shapes.
Chapter 5: Approximate arithmetic in which all numbers are either 1, a power of 10, or "a few" -- roughly 3, which is close to the geometric mean of 1 and 10. A few times a few is ten, for small values of "is". Multiply and divide large numbers on your fingers!
... And there's some more stuff, too, and some more chapters, but that'll do for an approximate summary.
XKCD's What If? has some examples of Fermi calculations, for instance at the start of working out the effects of "a mole of moles" (similar to a mole of choc donuts, which is what reminded me).
Thanks, Luke, this was helpful!
There is a sub-technique that could have helped you get a better answer for the first approach to example 1: perform a sanity check not only on the final value, but on any intermediate value you can think of.
In this example, when you estimated that there are 2500 counties, and that the average county has 300 towns with population greater than 10,000, that implies a lower bound for the total population of the US: assuming that all towns have exactly 10,000 people, that gets you a US population of 2,500 × 300 × 10,000 = 7,500,000,000! That's 7.5 billion people. Of course, in real life, some people live in smaller towns, and some towns have more than 10,000 people, which makes the true implied estimate even larger.
At this point you know that either your estimate for number of counties, or your estimate for number of towns with population above 10,000 per county, or both, must decrease to get an implied population of about 300 million. This would have brought your overall estimate down to within a factor of 10.
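Here is a minimal sketch of that intermediate sanity check in Python. The numbers are the ones from the post; the variable names and the 300 million reference population are my own choices for illustration.

```python
# Intermediate sanity check: what US population do the sub-estimates imply?
counties = 2500               # estimated number of US counties (from the post)
cities_per_county = 300       # estimated cities of 10,000+ people per county
min_people_per_city = 10_000  # every such city has at least this many people

# Lower bound on population implied by the two estimates above
implied_population = counties * cities_per_county * min_people_per_city

known_population = 300_000_000  # rough US population, used as the reference
overshoot = implied_population / known_population

print(f"Implied population: at least {implied_population:,}")
print(f"That is {overshoot:.0f}x the known population")
```

Any overshoot well above 1 means at least one of the two sub-estimates has to come down before it is worth multiplying further.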
I had the pleasure the other day of trying my hand at a slightly unusual use of Fermi estimates: trying to guess whether something unlikely has ever happened. In particular, the question was "Has anyone ever been killed by a falling piano as in the cartoon trope?" Others nearby at the time objected, "but you don't know anything about this!" which I found amusing because of course I know quite a lot about pianos, things falling, how people can be killed by things falling, etc. so how could I possibly not know anything about pianos falling and killing people? Unfortunately, our estimate came out at around 1-10 deaths by piano-falling, so we weren't able to draw a strong conclusion either way about whether this has happened. I would be interested to hear if anyone got a significantly different result. (We only considered falling grands or baby grands to count, as upright pianos, keyboards, etc. just aren't humorous enough for the cartoon trope.)
I'll try. Let's see, grands and baby grands date back to something like the 1700s; I'm sure I've heard of Mozart or Beethoven using pianos, so that gives me a time-window of 300 years for falling pianos to kill people in Europe or America.
What was their total population? Well, Europe+America right now is, I think, something like 700m people; I'd guess back in the 1700s, it was more like... 50m feels like a decent guess. How many people in total? A decent approximation to exponential population growth is to simply use the average of 700m and 50m, which is 375m, times 300 years, 112500m person-years, and a lifespan of 70 years, so 1607m persons over those 300 years.
How many people have pianos? Visiting families, I rarely see pianos; maybe 1 in 10 had a piano at any point. If families average a size of 4 and 1 in 10 families has a piano, then we convert our total population number to, (1607m / 4) / 10, 40m pianos over that entire period.
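The arithmetic so far, as a small Python sketch. All inputs are the guesses stated above; the variable names are only illustrative.

```python
# Person-years and piano count for the falling-piano estimate (units: millions)
pop_now = 700     # guessed Europe + America population today
pop_1700s = 50    # guessed population around the 1700s
years = 300       # time window in which pianos have existed
lifespan = 70     # assumed years per person

avg_population = (pop_now + pop_1700s) / 2  # crude stand-in for exponential growth
person_years = avg_population * years
persons = person_years / lifespan           # distinct people over the window

family_size = 4              # assumed people per family
families_per_piano = 10      # 1 in 10 families ever owns a piano
pianos = persons / family_size / families_per_piano

print(f"{person_years:,.0f}m person-years, {persons:,.0f}m persons, {pianos:,.0f}m pianos")
```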
But wait, this is for falling pianos, not all pianos; presumably a falling piano must be at least on a second story. If it simply crushes a mover's foot while on the porch, that's not very comedic at all. We want genuine verticality, real free fall. So...
0 is within an order of magnitude of 3.9, after all.
No it's not! Actually it's infinitely many orders of magnitude away!
I disagree that the lower bound is 0; the right range is [-39,39]. Because after all, a falling piano can kill negative people: if a piano had fallen on Adolf Hitler in 1929, then it would have killed -5,999,999 people!
Cecil Adams tackled this one. Although he could find no documented cases of people being killed by a falling piano (or a falling safe), he did find one case of a guy being killed by a RISING piano while having sex with his girlfriend on it. What would you have estimated for the probability of that?
From the webpage:
The exception was the case of strip-club bouncer Jimmy Ferrozzo. In 1983 Jimmy and his dancer girlfriend were having sex on top of a piano that was rigged so it could be raised or lowered for performances. Apparently in the heat of passion the couple accidentally hit the up switch, whereupon the piano rose and crushed Jimmy to death against the ceiling. The girlfriend was pinned underneath him for hours but survived. I acknowledge this isn’t a scenario you want depicted in detail on the Saturday morning cartoons; my point is that death due to vertical piano movement has a basis in fact.
You did much better in Example #2 than you thought; the conclusion should read
60 fatalities per crash × 100 crashes with fatalities over the past 20 years = 6000 passenger fatalities from passenger-jet crashes in the past 20 years
which looks like a Fermi victory (albeit an arithmetic fail).
There are 3141 counties in the US. This is easy to remember because it's just the first four digits of pi (which you already have memorised, right?).
Thanks for writing this! This is definitely an important skill and it doesn't seem like there was such a post on LW already.
Some mild theoretical justification: one reason to expect this procedure to be reliable, especially if you break up an estimate into many pieces and multiply them, is that you expect the errors in your pieces to be more or less independent. That means they'll often more or less cancel out once you multiply them (e.g. one piece might be 4 times too large but another might be 5 times too small). More precisely, you can compute the variance of the logarithm of the final estimate and, as the number of pieces gets large, it will shrink compared to the expected value of the logarithm (and even more precisely, you can use something like Hoeffding's inequality).
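A rough sketch of the cancellation argument, under assumptions that are mine rather than the comment's: each piece has a zero-mean error in log10 space with the same standard deviation `sigma`, and the errors are either fully independent or (for contrast) perfectly aligned. The function name `log10_spread` is made up for this illustration.

```python
import math

def log10_spread(n_pieces, sigma_per_piece, independent=True):
    """Std. dev. of log10(product of n pieces), given each piece's log10 error std.

    Independent errors: variances add, so the spread grows like sqrt(n).
    Perfectly aligned errors: the errors themselves add, so it grows like n.
    """
    if independent:
        return sigma_per_piece * math.sqrt(n_pieces)
    return sigma_per_piece * n_pieces

sigma = 0.5  # assumption: each piece is typically off by a factor of about 3
for n in (1, 4, 9, 16):
    indep = log10_spread(n, sigma)
    aligned = log10_spread(n, sigma, independent=False)
    print(f"{n:2d} pieces: typical error x{10 ** indep:,.0f} if independent, "
          f"x{10 ** aligned:,.0f} if every error points the same way")
```

The absolute spread still grows with the number of pieces, but only like the square root of that number, which is the sense in which independent errors "more or less cancel out".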
Another mild justification is the notion of entangled truths. A lot of truths are entangled with the truth that there are about 300 million Americans and so on, so as long as you know a few relevant true facts about the world your estimates can't be too far off (unless the model you put those facts into is bad).
Tip: frame your estimates in terms of intervals with confidence levels, i.e. "90% probability that the answer is within [lower bound] and [upper bound]". Try to work out both a 90% and a 50% interval.
I've found interval estimates to be much more useful than point estimates, and they combine very well with Fermi techniques if you keep track of how much rounding you've introduced overall.
In addition, you can compute a Brier score when/if you find out the correct answer, which gives you a target for improvement.
I will note that I went through the mental exercise of cars in a much simpler (and I would say better) way: I took the number of cars in the US (300 million was my guess for this, which is actually fairly close to the actual figure of 254 million claimed by the same article that you referenced) and guessed about how long cars typically ended up lasting before they went away (my estimate range was 10-30 years on average). To have 300 million cars, that would suggest that we would have to purchase new cars at a sufficiently high rate to maintain that number ...
Alternatively, you might allow yourself to look up particular pieces of the problem — e.g. the number of Sikhs in the world, the formula for escape velocity, or the gross world product — but not the final quantity you're trying to estimate.
Would it bankrupt the global economy to orbit all the world's Sikhs?
So, this isn't quite appropriate for Fermi calculations, because the math involved is a bit intense to do in your head. But here's how you'd actually do it:
Age-related mortality follows a Gompertz curve, which has much, much shorter tails than a normal distribution.
I'd start with order statistics. If you have a population of 5 billion people, then the expected percentile of the top person is 1-(1/10e9), and the expected percentile of the second best person is 1-(3/10e9). (Why is it a 3, instead of a 2? Because each of these expectations is in the middle of a range that's 1/5e9, or 2/10e9, wide.)
So, the expected age* of death for the oldest person is 114.46, using the numbers from that post (and committing the sin of reporting several more significant figures), and the expected age of death for the second oldest person is 113.97. That suggests a gap of about six months between the oldest and second oldest.
* I should be clear that this is the age corresponding to the expected percentile, not the expected age, which is a more involved calculation. They should be pretty close, especially given our huge population size.
But terminal age and current age are different- it could actually be...
Fermi estimates can help you become more efficient in your day-to-day life, and give you increased confidence in the decisions you face. If you want to become proficient in making Fermi estimates, I recommend practicing them 30 minutes per day for three months. In that time, you should be able to make about (2 Fermis per day)×(90 days) = 180 Fermi estimates.
I'm not sure about this claim about day-to-day life. Maybe there are some lines of work where this skill could be useful, but in general it's quite rare in day-to-day life where you have to come up w...
The guys at last.fm are usually very willing to help out with interesting research (or at least were when I worked there a couple of years ago), so if you particularly care about that information it's worth trying to contact them.
One of my favorite numbers to remember to aid in estimations is this: 1 year = pi * 10^7 seconds. It's really pretty accurate.
Of course for Fermi estimation just remember 1 Gs (gigasecond) = 30 years.
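A quick check of both rules of thumb in Python, assuming a year of 365.25 days (that year length is my assumption, not something stated above):

```python
import math

# Exact-ish seconds in a year, assuming 365.25 days
seconds_per_year = 365.25 * 24 * 3600

# Rule of thumb 1: one year is about pi * 10^7 seconds
approx_seconds = math.pi * 1e7
rel_error = (approx_seconds - seconds_per_year) / seconds_per_year

# Rule of thumb 2: one gigasecond is about 30 years
gigasecond_years = 1e9 / seconds_per_year

print(f"pi * 10^7 s is off by {rel_error:.2%}")
print(f"1 Gs = {gigasecond_years:.1f} years")
```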
I spend probably a pretty unusual amount of time estimating things for fun, and have come to use more or less this exact process on my own over time from doing it.
One thing I've observed, but haven't truly tested, is my geometric means seem to be much more effective when I'm willing to put a more tight guess on them. I started off bounding them with what I thought the answer conceivably could be, which seemed objective and often felt easier to estimate. The problem was that often either the lower or upper bound was too arbitrary relative to its weight on ...
To help remember this post and its methods, I broke it down into song lyrics and used Udio to make the song.
Play Fermi Questions: 2100 Fermi problems and counting.
http://www.fermiquestions.com/ link doesn't work anymore. It is also not on wayback machine. :(
I just wanted to say, after reading the Fermi estimate of cars in the US, I literally clapped - out loud. Well done. And I highly appreciate the honest poor first attempt - so that I don't feel like such an idiot next time I completely fail.
I'm happy to see that the Greatest Band of All Time is the only rock band I can recall ever mentioned in a top-level LessWrong post. I thought rationalists just sort of listened only to Great Works like Bach or Mozart, but I guess I was wrong. Clearly lukeprog used his skills as a rationalist to rationally deduce the band with the greatest talent, creativity, and artistic impact of the last thirty years and then decided to put a reference to them in this post :)
From On Things That Are Awesome:
Whenever someone compliments "Eliezer Yudkowsky", they are really complimenting "Eliezer Yudkowsky's writing" or "Eliezer Yudkowsky's best writing that stands out most in my mind". People who met me in person were often shocked at how much my in-person impression departed from the picture they had in their minds. I think this mostly had to do with imagining me as being the sort of actor who would be chosen to play me in the movie version of my life—they imagined way too much dignity. That forms a large part of the reason why I occasionally toss in the deliberate anime reference, which does seem to have fixed the divergence a bit.
I recently ran across an article describing how to find a rough estimate of the standard deviation of a population, given a number of samples, which seems like it would be suitable for Fermi estimates of probability distributions.
First of all, you need a large enough population that the central limit theorem applies, and the distribution can therefore be assumed to be normal. In a normal distribution, 99.73% of the samples will be within three standard deviations of the mean (either above or below; a total range of six standard deviations). Therefore, one ...
How long can the International Space Station stay up without a boost? I can think of a couple of ways to estimate that.
Out of the price of a new car, how much goes to buying raw materials? How much to capital owners? How much to labor?
I recommend trying to take the harmonic mean of a physical and an economic estimate when appropriate.
I recommend doing everything when appropriate.
Is there a particular reason why the harmonic mean would be a particularly suitable tool for combining physical and economic estimates? I've spent only a few seconds trying to think of one, failed, and had trouble motivating myself to look harder because on the face of it it seems like for most problems for which you might want to do this you're about equally likely to be finding any given quantity as its reciprocal, which suggests that a general preference for the harmonic mean is unlikely to be a good strategy -- what am I missing?
Just before the Trinity test, Enrico Fermi decided he wanted a rough estimate of the blast's power before the diagnostic data came in. So he dropped some pieces of paper from his hand as the blast wave passed him, and used this to estimate that the blast was equivalent to 10 kilotons of TNT. His guess was remarkably accurate for having so little data: the true answer turned out to be 20 kilotons of TNT.
Fermi had a knack for making roughly-accurate estimates with very little data, and therefore such an estimate is known today as a Fermi estimate.
Why bother with Fermi estimates, if your estimates are likely to be off by a factor of 2 or even 10? Often, getting an estimate within a factor of 10 or 20 is enough to make a decision. So Fermi estimates can save you a lot of time, especially as you gain more practice at making them.
Estimation tips
These first two sections are adapted from Guesstimation 2.0.
Dare to be imprecise. Round things off enough to do the calculations in your head. I call this the spherical cow principle, after a joke about how physicists oversimplify things to make calculations feasible.
By the spherical cow principle, there are 300 days in a year, people are six feet (or 2 meters) tall, the circumference of the Earth is 20,000 mi (or 40,000 km), and cows are spheres of meat and bone 4 feet (or 1 meter) in diameter.
Decompose the problem. Sometimes you can give an estimate in one step, within a factor of 10. (How much does a new compact car cost? $20,000.) But in most cases, you'll need to break the problem into several pieces, estimate each of them, and then recombine them. I'll give several examples below.
Estimate by bounding. Sometimes it is easier to give lower and upper bounds than to give a point estimate. How much time per day does the average 15-year-old watch TV? I don't spend any time with 15-year-olds, so I haven't a clue. It could be 30 minutes, or 3 hours, or 5 hours, but I'm pretty confident it's more than 2 minutes and less than 7 hours (400 minutes, by the spherical cow principle).
Can we convert those bounds into an estimate? You bet. But we don't do it by taking the average. That would give us (2 mins + 400 mins)/2 = 201 mins, which is within a factor of 2 of our upper bound, but a factor of 100 greater than our lower bound. Since our goal is to estimate the answer within a factor of 10, we'll probably be way off.
Instead, we take the geometric mean — the square root of the product of our upper and lower bounds. But square roots often require a calculator, so instead we'll take the approximate geometric mean (AGM). To do that, we average the coefficients and exponents of our upper and lower bounds.
So what is the AGM of 2 and 400? Well, 2 is 2×10^0, and 400 is 4×10^2. The average of the coefficients (2 and 4) is 3; the average of the exponents (0 and 2) is 1. So, the AGM of 2 and 400 is 3×10^1, or 30. The precise geometric mean of 2 and 400 turns out to be 28.28. Not bad.
What if the sum of the exponents is an odd number? Then we round the resulting exponent down, and multiply the final answer by three. So suppose my lower and upper bounds for how much TV the average 15-year-old watches had been 20 mins and 400 mins. Now we calculate the AGM like this: 20 is 2×10^1, and 400 is still 4×10^2. The average of the coefficients (2 and 4) is 3; the average of the exponents (1 and 2) is 1.5. So we round the exponent down to 1, and we multiply the final result by three: 3(3×10^1) = 90 mins. The precise geometric mean of 20 and 400 is 89.44. Again, not bad.
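For anyone who wants to check their mental arithmetic, here is one way the AGM recipe could be written in Python. This is just a sketch: the helper names `split_sci` and `agm` are mine, and the function assumes both bounds are positive.

```python
import math

def split_sci(x):
    """Split a positive number into (coefficient, exponent), e.g. 400 -> (4.0, 2)."""
    exponent = math.floor(math.log10(x))
    return x / 10 ** exponent, exponent

def agm(lower, upper):
    """Approximate geometric mean: average the coefficients and the exponents."""
    c_lo, e_lo = split_sci(lower)
    c_hi, e_hi = split_sci(upper)
    coefficient = (c_lo + c_hi) / 2
    exponent_sum = e_lo + e_hi
    rounded_down = exponent_sum // 2  # floor of the average exponent
    if exponent_sum % 2 == 0:
        return coefficient * 10 ** rounded_down
    # Odd sum: the average exponent ends in .5, so round it down and
    # multiply by 3 (roughly the square root of 10) to compensate.
    return 3 * coefficient * 10 ** rounded_down

for lo, hi in [(2, 400), (20, 400)]:
    exact = math.sqrt(lo * hi)
    print(f"AGM of {lo} and {hi}: {agm(lo, hi):g} (exact geometric mean {exact:.2f})")
```

The factor of 3 works because half a power of ten is the square root of 10, about 3.16.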
Sanity-check your answer. You should always sanity-check your final estimate by comparing it to some reasonable analogue. You'll see examples of this below.
Use Google as needed. You can often quickly find the exact quantity you're trying to estimate on Google, or at least some piece of the problem. In those cases, it's probably not worth trying to estimate it without Google.
Fermi estimation failure modes
Fermi estimates go wrong in one of three ways.
First, we might badly overestimate or underestimate a quantity. Decomposing the problem, estimating from bounds, and looking up particular pieces on Google should protect against this. Overestimates and underestimates for the different pieces of a problem should roughly cancel out, especially when there are many pieces.
Second, we might model the problem incorrectly. If you estimate teenage deaths per year on the assumption that most teenage deaths are from suicide, your estimate will probably be way off, because most teenage deaths are caused by accidents. To avoid this, try to decompose each Fermi problem by using a model you're fairly confident of, even if it means you need to use more pieces or give wider bounds when estimating each quantity.
Finally, we might choose a nonlinear problem. Normally, we assume that if one object can get some result, then two objects will get twice the result. Unfortunately, this doesn't hold true for nonlinear problems. If one motorcycle on a highway can transport a person at 60 miles per hour, then 30 motorcycles can transport 30 people at 60 miles per hour. However, 10^4 motorcycles cannot transport 10^4 people at 60 miles per hour, because there will be a huge traffic jam on the highway. This problem is difficult to avoid, but with practice you will get better at recognizing when you're facing a nonlinear problem.
Fermi practice
When getting started with Fermi practice, I recommend estimating quantities that you can easily look up later, so that you can see how accurate your Fermi estimates tend to be. Don't look up the answer before constructing your estimates, though! Alternatively, you might allow yourself to look up particular pieces of the problem — e.g. the number of Sikhs in the world, the formula for escape velocity, or the gross world product — but not the final quantity you're trying to estimate.
Most books about Fermi estimates are filled with examples done by Fermi estimate experts, and in many cases the estimates were probably adjusted after the author looked up the true answers. This post is different. My examples below are estimates I made before looking up the answer online, so you can get a realistic picture of how this works from someone who isn't "cheating." Also, there will be no selection effect: I'm going to do four Fermi estimates for this post, and I'm not going to throw out my estimates if they are way off. Finally, I'm not all that practiced doing "Fermis" myself, so you'll get to see what it's like for a relative newbie to go through the process. In short, I hope to give you a realistic picture of what it's like to do Fermi practice when you're just getting started.
Example 1: How many new passenger cars are sold each year in the USA?
The classic Fermi problem is "How many piano tuners are there in Chicago?" This kind of estimate is useful if you want to know the approximate size of the customer base for a new product you might develop, for example. But I'm not sure anyone knows how many piano tuners there really are in Chicago, so let's try a different one we probably can look up later: "How many new passenger cars are sold each year in the USA?"
As with all Fermi problems, there are many different models we could build. For example, we could estimate how many new cars a dealership sells per month, and then we could estimate how many dealerships there are in the USA. Or we could try to estimate the annual demand for new cars from the country's population. Or, if we happened to have read how many Toyota Corollas were sold last year, we could try to build our estimate from there.
The second model looks more robust to me than the first, since I know roughly how many Americans there are, but I have no idea how many new-car dealerships there are. Still, let's try it both ways. (I don't happen to know how many new Corollas were sold last year.)
Approach #1: Car dealerships
How many new cars does a dealership sell per month, on average? Oofta, I dunno. To support the dealership's existence, I assume it has to be at least 5. But it's probably not more than 50, since most dealerships are in small towns that don't get much action. To get my point estimate, I'll take the AGM of 5 and 50. 5 is 5×10^0, and 50 is 5×10^1. Our exponents sum to an odd number, so I'll round the exponent down to 0 and multiply the final answer by 3. So, my estimate of how many new cars a new-car dealership sells per month is 3(5×10^0) = 15.
Now, how many new-car dealerships are there in the USA? This could be tough. I know several towns of only 10,000 people that have 3 or more new-car dealerships. I don't recall towns much smaller than that having new-car dealerships, so let's exclude them. How many cities of 10,000 people or more are there in the USA? I have no idea. So let's decompose this problem a bit more.
How many counties are there in the USA? I remember seeing a map of counties colored by which national ancestry was dominant in that county. (Germany was the most common.) Thinking of that map, there were definitely more than 300 counties on it, and definitely less than 20,000. What's the AGM of 300 and 20,000? Well, 300 is 3×10^2, and 20,000 is 2×10^4. The average of coefficients 3 and 2 is 2.5, and the average of exponents 2 and 4 is 3. So the AGM of 300 and 20,000 is 2.5×10^3 = 2500.
Now, how many towns of 10,000 people or more are there per county? I'm pretty sure the average must be larger than 10 and smaller than 5000. The AGM of 10 and 5000 is 300. (I won't include this calculation in the text anymore; you know how to do it.)
Finally, how many car dealerships are there in cities of 10,000 or more people, on average? Most such towns are pretty small, and probably have 2-6 car dealerships. The largest cities will have many more: maybe 100-ish. So I'm pretty sure the average number of car dealerships in cities of 10,000 or more people must be between 2 and 30. The AGM of 2 and 30 is 7.5.
Now I just multiply my estimates:
[15 new cars sold per month per dealership] × [12 months per year] × [7.5 new-car dealerships per city of 10,000 or more people] × [300 cities of 10,000 or more people per county] × [2500 counties in the USA] = 1,012,500,000.
A sanity check immediately invalidates this answer. There's no way that 300 million American citizens buy a billion new cars per year. I suppose they might buy 100 million new cars per year, which would be within a factor of 10 of my estimate, but I doubt it.
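The multiplication and the sanity check above can be reproduced in a few lines of Python. The dictionary keys are just labels I made up for the five factors.

```python
import math

# The five factors from Approach #1
factors = {
    "new cars per dealership per month": 15,
    "months per year": 12,
    "dealerships per city of 10,000+": 7.5,
    "cities of 10,000+ per county": 300,
    "counties in the USA": 2500,
}

estimate = math.prod(factors.values())

# Sanity check: new cars bought per American per year
us_population = 300_000_000
cars_per_person = estimate / us_population

print(f"Estimate: {estimate:,.0f} new cars per year")
print(f"That is {cars_per_person:.1f} new cars per person per year")
```

More than three new cars per person per year is what makes the estimate fail the sanity check.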
As I suspected, my first approach was problematic. Let's try the second approach, starting from the population of the USA.
Approach #2: Population of the USA
There are about 300 million Americans. How many of them own a car? Maybe 1/3 of them, since children don't own cars, many people in cities don't own cars, and many households share a car or two between the adults in the household.
Of the 100 million people who own a car, how many of them bought a new car in the past 5 years? Probably less than half; most people buy used cars, right? So maybe 1/4 of car owners bought a new car in the past 5 years, which means 1 in 20 car owners bought a new car in the past year.
100 million / 20 = 5 million new cars sold each year in the USA. That doesn't seem crazy, though perhaps a bit low. I'll take this as my estimate.
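The same chain of guesses as a short Python sketch (the variable names are mine; the numbers are the guesses above):

```python
# Approach #2: start from the population and narrow down
us_population = 300_000_000
owner_fraction = 1 / 3          # guessed share of Americans who own a car
new_in_5_years = 1 / 4          # guessed share of owners who bought new in the past 5 years

car_owners = us_population * owner_fraction
new_buyers_per_year = car_owners * new_in_5_years / 5  # 1 in 20 owners per year

print(f"{car_owners:,.0f} car owners")
print(f"{new_buyers_per_year:,.0f} new cars sold per year")
```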
Now is your last chance to try this one on your own; in the next paragraph I'll reveal the true answer.
…
…
...
Now, I Google new cars sold per year in the USA. Wikipedia is the first result, and it says "In the year 2009, about 5.5 million new passenger cars were sold in the United States according to the U.S. Department of Transportation."
Boo-yah!
Example 2: How many fatalities from passenger-jet crashes have there been in the past 20 years?
Again, there are multiple models I could build. I could try to estimate how many passenger-jet flights there are per year, and then try to estimate the frequency of crashes and the average number of fatalities per crash. Or I could just try to guess the total number of passenger-jet crashes around the world per year and go from there.
As far as I can tell, passenger-jet crashes (with fatalities) almost always make it on the TV news and (more relevant to me) the front page of Google News. Exciting footage and multiple deaths will do that. So working just from memory, it feels to me like there are about 5 passenger-jet crashes (with fatalities) per year, so maybe there were about 100 passenger jet crashes with fatalities in the past 20 years.
Now, how many fatalities per crash? From memory, it seems like there are usually two kinds of crashes: ones where everybody dies (meaning: about 200 people?), and ones where only about 10 people die. I think the "everybody dead" crashes are less common, maybe 1/4 as common. So the average crash with fatalities should cause (200×1/4)+(10×3/4) = 50+7.5 = 60, by the spherical cow principle.
60 fatalities per crash × 100 crashes with fatalities over the past 20 years = 6000 passenger fatalities from passenger-jet crashes in the past 20 years.
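Written out in Python, the estimate is a weighted average followed by one multiplication. This sketch uses `round(..., -1)` as a stand-in for the spherical-cow rounding of 57.5 up to 60.

```python
# Crashes with fatalities over the period
crashes_per_year = 5
years = 20
crashes = crashes_per_year * years

# Two kinds of crash, weighted by how common I guessed each is
deaths_everybody = 200    # "everybody dies" crashes
deaths_small = 10         # crashes where only about 10 people die
share_everybody = 1 / 4

avg_deaths = deaths_everybody * share_everybody + deaths_small * (1 - share_everybody)
avg_deaths_rounded = round(avg_deaths, -1)  # spherical cow: 57.5 becomes 60

total_fatalities = avg_deaths_rounded * crashes
print(f"{avg_deaths} deaths per crash, rounded to {avg_deaths_rounded:.0f}")
print(f"{total_fatalities:,.0f} fatalities over {years} years")
```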
Last chance to try this one on your own...
…
…
…
A Google search again brings me to Wikipedia, which reveals that an organization called ACRO records the number of airline fatalities each year. Unfortunately for my purposes, they include fatalities from cargo flights. After more Googling, I tracked down Boeing's "Statistical Summary of Commercial Jet Airplane Accidents, 1959-2011," but that report excludes jets lighter than 60,000 pounds, and excludes crashes caused by hijacking or terrorism.
It appears it would be a major research project to figure out the true answer to our question, but let's at least estimate it from the ACRO data. Luckily, ACRO has statistics on which percentage of accidents are from passenger and other kinds of flights, which I'll take as a proxy for which percentage of fatalities are from different kinds of flights. According to that page, 35.41% of accidents are from "regular schedule" flights, 7.75% of accidents are from "private" flights, 5.1% of accidents are from "charter" flights, and 4.02% of accidents are from "executive" flights. I think that captures what I had in mind as "passenger-jet flights." So we'll guess that 52.28% of fatalities are from "passenger-jet flights." I won't round this to 50% because we're not doing a Fermi estimate right now; we're trying to check a Fermi estimate.
According to ACRO's archives, there were 794 fatalities in 2012, 828 fatalities in 2011, and... well, from 1993-2012 there were a total of 28,021 fatalities. And 52.28% of that number is 14,649.
So my estimate of 6000 was off by less than a factor of 3!
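The check against the ACRO figures can be sketched the same way. The numbers are the ones quoted above; the helper `off_by_factor` is a hypothetical name of mine for the usual way of scoring a Fermi estimate, as a multiplicative error rather than a difference.

```python
# ACRO accident shares (in percent) that the text counts as passenger-jet flights.
passenger_shares_pct = {
    "regular schedule": 35.41,
    "private": 7.75,
    "charter": 5.10,
    "executive": 4.02,
}
passenger_share = sum(passenger_shares_pct.values()) / 100

# Total fatalities for 1993-2012, as quoted from ACRO's archives.
total_fatalities_1993_2012 = 28_021
reference = passenger_share * total_fatalities_1993_2012

def off_by_factor(estimate, reference):
    """Multiplicative error of an estimate; always >= 1, and 1 means exact."""
    return max(estimate / reference, reference / estimate)

fermi_estimate = 6000
factor = off_by_factor(fermi_estimate, reference)
print(round(reference), round(factor, 2))
```

Taking the larger of the two ratios means an estimate that is too low and one that is too high by the same proportion get the same score.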
Example 3: How much does the New York state government spend on K-12 education every year?
How might I estimate this? First I'll estimate the number of K-12 students in New York, and then I'll estimate how much this should cost.
How many people live in New York? I seem to recall that NYC's greater metropolitan area is about 20 million people. That's probably most of the state's population, so I'll guess the total is about 30 million.
How many of those 30 million people attend K-12 public schools? I can't remember what the United States' population pyramid looks like, but I'll guess that about 1/6 of Americans (and hopefully New Yorkers) attend K-12 at any given time. So that's 5 million kids in K-12 in New York. The number attending private schools probably isn't large enough to matter for factor-of-10 estimates.
How much does a year of K-12 education cost for one child? Well, I've heard teachers don't get paid much, so after benefits and taxes and so on I'm guessing a teacher costs about $70,000 per year. How big are class sizes these days, 30 kids? By the spherical cow principle, that's about $2,000 per child, per year on teachers' salaries. But there are lots of other expenses: buildings, transport, materials, support staff, etc. And maybe some money goes to private schools or other organizations. Rather than estimate all those things, I'm just going to guess that about $10,000 is spent per child, per year.
If that's right, then New York spends $50 billion per year on K-12 education.
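Here is the same chain of guesses as a small sketch. The variable names are mine, and each value is a guess taken from the paragraphs above, not a looked-up figure.

```python
# Guessed inputs from the text (not looked-up figures).
state_population = 30e6      # guess for New York's population
k12_share = 1 / 6            # guessed share of people attending K-12
students = state_population * k12_share

# Sanity check on per-child cost: teacher cost spread over one class.
teacher_cost_per_year = 70_000
class_size = 30
teacher_cost_per_child = teacher_cost_per_year / class_size  # "about $2,000"

# Teachers are only part of the bill, so the text guesses a higher all-in figure.
total_cost_per_child = 10_000

state_spending = students * total_cost_per_child
print(f"{students:,.0f} students, ${state_spending / 1e9:,.0f} billion per year")
```

Note that the teacher-salary figure never enters the final product; it only serves as a lower bound that makes the $10,000 guess plausible.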
Last chance to make your own estimate!
…
…
…
Before I did the Fermi estimate, I had Julia Galef check Google to find this statistic, but she didn't give me any hints about the number. Her two sources were Wolfram Alpha and a web chat with New York's Deputy Secretary for Education, both of which put the figure at approximately $53 billion.
Which is definitely within a factor of 10 of $50 billion. :)
Example 4: How many plays of My Bloody Valentine's "Only Shallow" have been reported to last.fm?
Last.fm makes a record of every audio track you play, if you enable the relevant feature or plugin for the music software on your phone, computer, or other device. Then, the service can show you charts and statistics about your listening patterns, and make personalized music recommendations from them. My own charts are here. (Chuck Wild / Liquid Mind dominates my charts because I used to listen to that artist while sleeping.)
My Fermi problem is: How many plays of "Only Shallow" have been reported to last.fm?
My Bloody Valentine is a popular "indie" rock band, and "Only Shallow" is probably one of their most popular tracks. How can I estimate how many plays it has gotten on last.fm?
What do I know that might help?
I would guess that track plays obey a power law, with the most popular tracks getting vastly more plays than tracks of average popularity. I'd also guess that there are maybe 10,000 tracks more popular than "Only Shallow."
Next, I simulated being good at math by having Qiaochu Yuan show me how to do the calculation. I also allowed myself to use a calculator. Here's what we do:
Plays(rank) = C / rank^P
P is the exponent for the power law, and C is the proportionality constant. We'll guess that P is 1, a common power law exponent for empirical data. And we calculate C like so:
So now, assuming the song's rank is 10,000, we have:
That seems high, but let's roll with it. Last chance to make your own estimate!
…
…
…
And when I check the answer, I see that "Only Shallow" has about 2 million plays on last.fm.
My answer was off by less than a factor of 10, which for a Fermi estimate is called victory!
Unfortunately, last.fm doesn't publish all-time track rankings or other data that might help me to determine which parts of my model were correct and incorrect.
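For readers who want to experiment with the model, here is a minimal sketch of a power-law calculation of this kind. It rests on assumptions of mine rather than on anything confirmed above: plays at a given rank are C / rank (exponent P = 1), and C is chosen so that plays summed over all ranks equal the service's total plays. The total-plays and track-count inputs are not given in the text here, so the values below are placeholders for illustration only.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def harmonic(n):
    """Approximate H_n = 1 + 1/2 + ... + 1/n, accurate for large n."""
    return math.log(n) + EULER_GAMMA + 1 / (2 * n)

def plays_at_rank(rank, total_plays, n_tracks):
    """Plays for the track at `rank` under plays(r) = C / r (P = 1).

    C is set so that the plays of all n_tracks tracks sum to total_plays.
    """
    c = total_plays / harmonic(n_tracks)
    return c / rank

# Placeholder inputs (assumptions, not figures from the text).
total_plays = 50e9   # hypothetical total plays recorded by the service
n_tracks = 100e6     # hypothetical number of distinct tracks
rank = 10_000        # the rank guessed in the text

estimate = plays_at_rank(rank, total_plays, n_tracks)
print(f"{estimate:,.0f} plays")
```

Changing `rank`, `total_plays`, or `n_tracks` one at a time shows which guess the result is most sensitive to: the estimate scales linearly with total plays and inversely with rank, but only logarithmically with the number of tracks.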
Further examples
I focused on examples that are similar in structure to the kinds of quantities that entrepreneurs and CEOs might want to estimate, but of course there are all kinds of things one can estimate this way. Here's a sampling of Fermi problems featured in various books and websites on the subject:
Play Fermi Questions: 2100 Fermi problems and counting.
Guesstimation (2008): If all the humans in the world were crammed together, how much area would we require? What would be the mass of all 10^8 MongaMillions lottery tickets? On average, how many people are airborne over the US at any given moment? How many cells are there in the human body? How many people in the world are picking their nose right now? What are the relative costs of fuel for NYC rickshaws and automobiles?
Guesstimation 2.0 (2011): If we launched a trillion one-dollar bills into the atmosphere, what fraction of sunlight hitting the Earth could we block with those dollar bills? If a million monkeys typed randomly on a million typewriters for a year, what is the longest string of consecutive correct letters of The Cat in the Hat (starting from the beginning) that they would likely type? How much energy does it take to crack a nut? If an airline asked its passengers to urinate before boarding the airplane, how much fuel would the airline save per flight? What is the radius of the largest rocky sphere from which we can reach escape velocity by jumping?
How Many Licks? (2009): What fraction of Earth's volume would a mole of hot, sticky, chocolate-jelly doughnuts be? How many miles does a person walk in a lifetime? How many times can you outline the continental US in shoelaces? How long would it take to read every book in the library? How long can you shower and still make it more environmentally friendly than taking a bath?
Ballparking (2012): How many bolts are in the floor of the Boston Garden basketball court? How many lanes would you need for the outermost lane of a running track to be the length of a marathon? How hard would you have to hit a baseball for it to never land?
University of Maryland Fermi Problems Site: How many sheets of letter-sized paper are used by all students at the University of Maryland in one semester? How many blades of grass are in the lawn of a typical suburban house in the summer? How many golf balls can be fit into a typical suitcase?
Stupid Calculations: a blog of silly-topic Fermi estimates.
Conclusion
Fermi estimates can help you become more efficient in your day-to-day life, and give you increased confidence in the decisions you face. If you want to become proficient in making Fermi estimates, I recommend practicing them 30 minutes per day for three months. In that time, you should be able to make about (2 Fermis per day)×(90 days) = 180 Fermi estimates.
If you'd like to write down your estimation attempts and then publish them here, please do so as a reply to this comment. One Fermi estimate per comment, please!
Alternatively, post your Fermi estimates to the dedicated subreddit.
Update 03/06/2017: I keep getting requests from professors to use this in their classes, so: I license anyone to use this article noncommercially, so long as its authorship is noted (me = Luke Muehlhauser).