Use and misuse of models: case study

by Stuart_Armstrong2 min read27th Apr 20174 comments


Personal Blog

Some time ago, I discovered a post comparing basic income and basic job ideas. This sought to analyse the costs of paying everyone a guaranteed income versus providing them with a basic job with that income. The author spelt out his assumptions and put together a two models with a few components (including some whose values were drawn from various probability distributions). Then he ran a Monte Carlo simulation to get a distribution of costs for either policy.

Normally I should be very much in favour of this approach. It spells out the assumptions, it uses models, it decomposes the problem, it has stochastic uncertainty... Everything seems ideal. To top it off, the author concluded with a challenge aiming at improving reasoning around this subject:

How to Disagree: Write Some Code

This is a common theme in my writing. If you are reading my blog you are likely to be a coder. So shut the fuck up and write some fucking code. (Of course, once the code is written, please post it in the comments or on github.)

I've laid out my reasoning in clear, straightforward, and executable form. Here it is again. My conclusions are simply the logical result of my assumptions plus basic math - if I'm wrong, either Python is computing the wrong answer, I got really unlucky in all 32,768 simulation runs, or you one of my assumptions is wrong.

My assumption being wrong is the most likely possibility. Luckily, this is a problem that is solvable via code.

And yet... I found something very unsatisfying. And it took me some time to figure out why. It's not that these models are helpful, or that they're misleading. It's that they're both simultaneously.

To explain, consider the result of the Monte Carlo simulations. Here are the outputs (I added the red lines; we'll get to them soon):

The author concluded from these outputs that a basic job was much more efficient - less costly - than a basic income (roughly 1 trillion cost versus 3.4 trillion US dollars). He changed a few assumptions to test whether the result held up:

For example, maybe I'm overestimating the work disincentive for Basic Income and grossly underestimating the administrative overhead of the Basic Job. Lets assume both of these are true. Then what?

The author then found similar results, with some slight shifting of the probability masses.


The problem: what really determined the result

So what's wrong with this approach? It turns out that most of the variables in the models have little explanatory power. For the top red line, I just multiplied the US population by the basic income. The curve is slightly above it, because it includes such things as administrative costs. The basic job situation was slightly more complicated, as it includes a disabled population that gets the basic income without working, and a estimate for the added value that the jobs would provide. So the bottom red line is (disabled population)x(basic income) + (unemployed population)x(basic income) - (unemployed population)x(median added value of jobs). The distribution is wider than for basic income, as the added value of the jobs is a stochastic variable.

But, anyway, the contribution of the other variables were very minor. So the reduced cost of basic jobs versus basic income is essentially a consequence of the trivial fact that it's more expensive to pay everyone an income, than to only pay some people and then put them to work at something of non-zero value.


Trees and forests

So were the complicated extra variables and Monte Carlo runs for nothing? Not completely - they showed that the extra variables were indeed of little importance, and unlikely to change the results much. But nevertheless, the whole approach has one big, glaring flaw: it does not account for the extra value for individuals of having a basic income versus a basic job.

And the challenge - "write some fucking code" - obscures this. The forest of extra variables and the thousands of runs hides the fact that there is a fundamental assumption missing. And pointing this out is enough to change the result, without even needing to write code. Note this doesn't mean the result is wrong: some might even argue that people are better off with a job than with the income (builds pride in one's work, etc...). But that needs to be addressed.

So Chris Stucchio's careful work does show one result - most reasonable assumptions do not change the fact that basic income is more expensive than basic job. And to disagree with that, you do indeed need to write some fucking code. But the stronger result - that basic job is better than basic income - is not established by this post. A model can be well designed, thorough, filled with good uncertainties, and still miss the mark. You don't always have to enter into the weeds of the model's assumptions in order to criticise it.

Personal Blog