Response To: Who Likes Simple Rules?
Epistemic Status: Working through examples with varying degrees of confidence, to help us be concrete and eventually generalize.
Robin Hanson has, in his words, “some puzzles” that I will be analyzing. I’ve added letters for reference.
Some of these examples don’t require explanation beyond why they are bad ideas for rules. They wouldn’t work. Others would, or are even obviously correct, and require more explanation. I do think there is enough of a pattern to be worth trying to explain. People reliably dislike the rule of law, and prefer to substitute the rule of man.
You can read his analysis in the original post. His main diagnosis is that people promote discretion for two reasons:
While I do think these are related to some of the key reasons, I do not think these point at the central things going on. Below I tackle all of these cases and their specifics. Overall I think the following are the five key stories:
I’ll now work through the examples. I’ll start with C, since it seems out of place, then go in Robin’s order.
Long post is long. If a case doesn’t interest you, please do skip it, and it’s fine to stop here.
The odd policy out here is C, the failure to have the government tell citizens what it already knows about citizens’ tax returns. This results in many lost hours tracking down records, and much money lost paying for tax preparation. As a commenter points out, the argument that ‘this allows one to pay less than they owe’ doesn’t actually make sense as an explanation. The government still knows what it knows, and still cross-checks that against what you say and pay. In other countries, one can still choose to adjust the numbers the government shares.
In theory, one could make a case similar to those I’ll make in other places, that telling people what information the government knows and doesn’t know allows people to hide anything that the government doesn’t know about. But that seems quite minor.
What’s going on here is simple regulatory capture, corruption, rent seeking and criminal theft. Robin’s link explains this explicitly. Tax preparation corporations like H&R Block are the primary drivers here, because the rules generate more business for them. There is also a secondary problem: fanatical anti-tax conservatives like it when taxes annoy people.
But I’ve never heard of a regular person who thinks this policy is a good idea, and I never expect to find one. We’re not this crazy. We have a dysfunctional government.
Robin’s explanations don’t fit case A. If you’re choosing randomly, no one can benefit from discretion. If you choose the same thing for everyone, again no one can benefit from discretion. If anything, the random system allows participants to potentially cheat or find a way around a selection they dislike, whereas a universal system makes this harder. Other things must be at work here.
This is the opposite of C, a case where people do oppose the change, but the change would be obviously good.
I call this the “too important to know” problem.
To me this is a clear case of The Copenhagen Interpretation of Ethics and Asymmetric Justice interacting with sacred values.
An experiment interacts with the problem and in particular interacts with every subject of the experiment, and with every potential intervention, in a way sufficient to render you blameworthy for not doing more, or not doing the optimal thing.
The contrast between the two cases is clear.
Without an experiment, we’re forced to make a choice between options A and B. People mostly accept that in such cases potentially sacred values are up against other potentially sacred values, and one must guess and do their best.
In the cases in the study, it’s even more extreme. We’re choosing to implement A or implement B, in a place where normally one would do nothing. So we’re comparing doing something about the situation to doing nothing. It’s no surprise that ‘try to reduce infections’ comes out looking good.
With an experiment, the choice is between experimentation and non-experimentation. You are choosing to prioritize information over all the sacred values the non-objectionable choices are trading off. Even if both choices are fully non-objectionable, choosing between them via experiment still means placing the need to gather information over the needs of the people in the experiment.
The needs of specific people are, everywhere and always, a sacred value. Someone, a particular real person, is on the line. When put up against “information” what chance does this amorphous information have?
Copenhagen explains why it is pointless to say that the experiment is better for these patients than not running one. Asymmetric Justice explains why the benefits to future patients don’t make up for it.
There are other reasons, too.
People don’t like their fates coming down to coin flips. They don’t like uncertainty.
People don’t like asymmetry or inequality – if I get A and you get B, someone got the better deal, and that’s not fair.
If you choose a particular action, that provides evidence that there was a reason to choose it. So people instinctively adjust some for the fact that it was chosen. Whereas in an experiment, it’s clear you don’t know which choice is better (unless you do know and are simply out to prove it, in which case you are a monster). That doesn’t inspire confidence.
A final note is that if you look at the study in question, it suggests another important element. If you choose A, you’re blameworthy for A and for ~B, but you’re certainly not blameworthy for ~A or for B! Whereas if you choose (50% A, 50% B) then you are blameworthy for A, ~A, B and ~B, plus experimentation in general. That’s a lot of blame.
Remember Asymmetric Justice. If any element of what you do is objectionable, everything you do, together, is also objectionable. A single ‘problematic’ element ruins all.
So if we look at Figure 1 in the study, we see in case C that the objection score for the A/B test is actually below what we’d expect if we thought the chances of objecting to A and B were independent, and people were objecting to the experiment whenever they disliked either A or B (or both). In cases B and D, we see only a small additional rate of objection. It’s only in case A that we see substantial additional objection. Across the data given, it looks like this phenomenon explains about half of the increased rate of objection to the experiments.
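To make that independence baseline concrete, here is a minimal sketch with made-up objection rates (the study’s actual numbers are in its Figure 1; these are purely illustrative):

```python
# Hypothetical objection rates, purely for illustration.
p_object_a = 0.20  # assumed share who object to implementing policy A alone
p_object_b = 0.15  # assumed share who object to implementing policy B alone

# If objections were independent, and anyone who dislikes either arm objects
# to the A/B test, the baseline objection rate to the experiment would be:
p_object_either = p_object_a + p_object_b - p_object_a * p_object_b
print(f"Baseline objection rate to the A/B test: {p_object_either:.1%}")  # 32.0%
# Only objection above this baseline counts as objection to experimentation as such.
```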
It also looks like a lot of people explicitly cited things like ‘playing with people’s lives’ via experiment, and object to experimentation as such at least when the stakes are high.
I do not think Robin’s story of expectation of personal benefit is the central story here, either. The correlation isn’t even that high in his poll.
If police have discretion IN GENERAL regarding who they arrest, do you think they will on average use that discretion to arrest those who actually do more net social harm? Do you think that you will tend to be favored by this discretion?
49% Yes to both
13% No to both
10% Yes re net harm, No re me
28% No re net harm, Yes re me
— Robin Hanson (@robinhanson) May 6, 2019
If you think net harm is reduced (59%), you’re (49/59) 83% to think you’ll benefit. If you think net harm is not reduced, you are (28/41) 68% to think you’ll benefit. Given that you’d expect models to give correlated returns to the two questions – if discretion is used wisely, it should tend to benefit both most people and a typical civilian looking to avoid doing social harm, and these are Robin’s Twitter followers – I don’t think personal motivation is explaining much variance here.
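Checking the arithmetic from the poll above (a quick sketch using Robin’s reported percentages):

```python
# Poll shares from the tweet above.
yes_both = 0.49          # harm reduced, and I benefit
no_both = 0.13           # harm not reduced, and I don't benefit
yes_harm_no_me = 0.10    # harm reduced, but I don't benefit
no_harm_yes_me = 0.28    # harm not reduced, but I benefit

p_harm_reduced = yes_both + yes_harm_no_me                          # 0.59
p_benefit_given_reduced = yes_both / p_harm_reduced                 # ~0.83
p_harm_not_reduced = no_both + no_harm_yes_me                       # 0.41
p_benefit_given_not_reduced = no_harm_yes_me / p_harm_not_reduced   # ~0.68

print(f"{p_benefit_given_reduced:.0%} vs {p_benefit_given_not_reduced:.0%}")  # 83% vs 68%
```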
The question also asks about only part of the picture. Yes, we would hope that police (and prosecutors and others in the system) would use discretion, at least in part, to arrest those who do more net social harm over those who do less net social harm.
But that’s far from the only goal of criminal justice, or of punishment. We would also hope that authorities would use discretion to accomplish other goals, as well.
Some of these goals are good. Others, not so much.
What are some other goals we might have? What are other reasons to use discretion?
I think there are five broad reasons, beyond using it to judge social harm.
To this we would add Robin’s explanations, that one might want to benefit from this directly, and/or one might want to signal support for such authorities. And the more discretion they have, the more one would want to signal one’s support – see reason 1.
A case worth grappling with can be made for or against each of these five justifications being net good. So one could argue in favor with arguments like these (I am not endorsing them, nor do they necessarily argue for today’s American level of discretion):
Or, to take the opposite stances:
The best arguments I know about against discretion have nothing to do with the social harm caused by punished actions. They are arguments for rule of law, and to guard against what those with discretion will do with that power. These effects are rather important and problematic even when the system is working as designed.
The best arguments I know about in favor of discretion also have nothing to do with the social harm caused by punished actions. They have to do with the system depending on discretion in order to be able to function, and in order to ensure cooperation. A system without discretion by default makes the spread of any local information everyone’s enemy, and provides no leverage to overcome that. If we didn’t have discretion, we would have to radically re-examine all of our laws and our entire system of enforcement, lest everything fall apart.
My model says that we currently give authorities too much discretion, and (partly) as a result have punishments that are too harsh. And also that the authorities have so much discretion partly because punishments have been made too harsh. Since discretion and large punishments give those with power more power, it would be surprising if this were not the case.
The National Health Service gets criticized constantly because it is their job to deny people health care. There is not enough money to provide what we would think of as an acceptable level of care under all circumstances, because our concept of acceptable level of care is all of the health care. In such a circumstance, there isn’t much they could do.
Using deterministic rules based on numbers is the obviously correct way to ration care. Using human discretion in each case will mean either always giving out care, since the choice is between care or no care – which is a lot of why health care costs are so high and going higher – or not always giving out care when able to do so, which will have people screaming unusually literal bloody murder.
Deterministic rules let individuals avoid blame, and allow health care budgets to be used at all. But that doesn’t mean people are going to like it. If anything, they’re going to be mad about both the rules and the fact that they don’t have a human they can either blame or try to leverage to get what they want. There’s also the issue of putting a value on human life at all, which is bad enough but clearly unavoidable.
More than that, once you explicitly say what you value by putting numbers on lives and improvements in quality of life, you’re doing something both completely necessary and completely unacceptable. The example of someone in a wheelchair is pretty great. If you don’t provide some discount in value of quality of life for physical disability, then you are saying that physical disabilities don’t decrease quality of life. Which has pretty terrible implications for a health care system trying to prevent physical disabilities. If you do say they decrease quality of life, you’re saying people with disabilities have less value. There are tons of places like this.
Another way to view this is that the only way for one to make health care decisions to ration care or otherwise sacrifice sacred values to stay on budget, without blame, is to have all those decisions be seen as out of your control and not your choice. The only known way to do that is to have a system in place, and point to that. That system then becomes a way to not interact with the system, avoiding blame. Whereas proposing or considering any other system involves interaction, and thus blame.
If you are caught making a trade-off between a sacred value (life) and a non-sacred value (money), it’s not going to go well. Of course a company doing an explicit calculation here is going to get punished, as is a government policy making an explicit comparison. Humans don’t care about the transitive property.
Thus, firms and governments, who obviously need to value risk to human life at a high but finite numerical cost, will need to do this without writing the number down explicitly in any way. This is one of the more silly things one cannot consider, that one obviously must consider. In a world where we are blameworthy (to the point of being sued for massive amounts) for doing explicit calculations that acknowledge trade-offs or important facts about the world, firms and governments are forced to make their decisions in increasingly opaque ways. One of those opaque preferences will be to favor those who rely on opaqueness and destroy records, and to get rid of anyone who is transparent about their thinking or who keeps accurate records.
Tenure is about evaluating what a potential professor would bring to the university. No matter to what extent politics gets involved, this is someone you’ll have to work with for decades. After this, rule of law does attach. You won’t be able to fire them afterwards unless they violate one of a few well-defined rules – or at least, that’s how it’s supposed to work, to protect academic freedom, whether or not it does work that way. You’ll be counting on them to choose and do research, pursue funding, teach and advise, and help run the school, and be playing politics with them.
That’s a big commitment. There are lots more people who want it and are qualified on paper than there are slots we can fund. And there are a lot more things that matter than how much research one can do. Some of them are things that are illegal to consider, or would look bad if you were found to be considering them. Others simply are not about the research done. You can’t use a formula, because people bring unique strengths and weaknesses, and you’re facing other employers who consider these factors. Even if a simple system could afford to mostly ‘take its licks,’ you would face massive adverse selection, as everyone with bad intangibles would knock at your door.
You need to hold power over the new employees, so they’ll do the work that tenured employees don’t want to do, and so they’ll care about all aspects of their job, rather than doing the bare technical minimum everywhere but research.
Then there are the Goodhart factors on the papers directly. One must consider how the publications themselves would be gamed. If there were a threshold requirement for journal quality, the easiest journals that count would be the only places anything would be published. If you have a point system, they’d game that system, and spend considerable time doing it. If you don’t evaluate paper quality or value, they won’t care at all about those factors, focusing purely on being good enough to make it into a qualifying journal. Plus, being able to evaluate these questions yourself without an outside guide or authority will be part of the job you’re trying to get. We need to test that, too.
What you’re really testing for when you consider tenure, ideally, is not only skill but also virtue. You want someone who is naturally driven to scholarship and the academy, to drive forward towards important things. While also caring enough to do a passable job with other factors. Otherwise, once they can’t be fired, you won’t be able to get them to do anything. Testing for virtue isn’t something you can quantify. You want someone who will aim for the spirit rather than the letter, and who knows what the spirit is and cares about it intrinsically. If you judge by the letter, you’ll select for the opposite, and if you specify that explicitly, you’ll lose your signal that way as well.
I’d chalk this one up to power and exploitation of those lower on the totem pole, the need to test for factors that you can’t say out loud, the need to test for virtue, and the need to test for knowing what is valuable.
People rightfully don’t think this number will tell us much, even now, before it is being gamed and subjected to Goodhart effects. Robin seems to be assuming that a previous win percentage should be predictive of a lawyer’s ability to win a particular case, rather than being primarily a selection effect, or a function of when they settle cases.
I doubt this is the case, even with a relatively low level of adversarial Goodhart effects.
Most lawyers or at least their firms have great flexibility in what cases they pursue and accept. They also have broad flexibility in how and when they settle those cases, as clients largely rely on lawyers to tell them when to settle. Some of them will mostly want cases that are easy wins, and settle cases that likely lose. Others, probably better lawyers for winning difficult cases, will take on more difficult cases and be willing to roll the dice rather than settle them.
I don’t even know what counts as a ‘win’ in a legal proceeding. In a civil case you strategically choose what to ask for, which might have little relation to realistic expectations for a verdict, so getting a lesser amount might or might not be a ‘win,’ and any settlement might be a win or a loss even if you know the terms, which are often confidential.
Thus, if I was looking for a lawyer, I would continue to rely on personal recommendations, especially from lawyers I trust, rather than look at overall track records, even if those track records were easily available. I don’t think those track records are predictive. Asking questions like someone’s success in similar style cases, with richer detail in each case, seems better, but one has to pay careful attention.
If people started using win-loss records to choose lawyers, and lawyers started optimizing their win-loss records, what little information those records might have gets even less useful. You would mostly be measuring which lawyers prioritize win-loss records, by selecting winners and forcing them to verdict, while avoiding, settling or pawning off losers, and by getting onto winning teams, and so on. By manipulating the client and getting them to do what was necessary. It’s not like lawyers don’t mostly know which cases are winners. By choosing a lawyer with too good a win-loss record, you’d be getting someone who cares more about how they look in a statistic than doing what’s right for their clients, and also who has the flexibility to choose which cases they have to take.
The adverse selection here, it burns.
That’s what I’d actually expect now. Some lawyers do care a lot about their track records, they’ll have better track records, and they’re exactly who you want to avoid. I’d take anyone bragging about their win rate as a very negative sign, not a positive one.
So I don’t think this is about simple rules, or about people’s cognitive errors, or anything like that. I think Robin is just proposing a terrible measure that is not accurate, not well-defined and easily gamed, and asking why we aren’t making it available and using it.
Contrast this with evaluations of doctors or hospitals for success rates or death rates from particular surgeries. That strikes me as a much better place to implement such strategies, although they still have big problems with adversarial Goodhart if you started looking. But you can get a much better idea of what challenges are being tackled and about how hard they are, and a much better measure of the rate of success. I’d still worry a lot about doctors selecting easy cases and avoiding hard ones, both for manipulation and because of what it would do to patient care.
A general theme of simple rules is that when you reward and punish based on simple rules, one of the things you are rewarding is a willingness to prioritize maximizing for the simple rule over any other goal, including the thing you’re trying to measure. Just like any other rule you might use to reward and punish. The problem with simple rules is that they explicitly shut out one’s ability to notice such optimization and punish it, which is the natural way to keep such actions in check. Without it, you risk driving out anyone who cares about anything but themselves and gaming the system, and creating a culture where gaming the system and caring about yourself are the only virtues.
If all you care about is the ‘productivity’ of the asset and/or the revenue raised, then of course you use an auction. Easy enough, and I think people recognize this. They don’t want that. They want a previously public asset to be used in ways the public prefers, and think that we should prefer some uses to other uses because of the externalities they create.
It seems reasonable to use the opportunity of selling previously public goods to advance public policy goals that would otherwise require confiscating private property. Private sellers will also often attach requirements to sales, or choose one buyer over another, sacrificing productivity and revenue for other factors they care about.
We can point out all we like how markets create more production and more revenue, but we can’t tell people that they should care mostly about the quantity of production and revenue instead of other things. When there are assets with large public policy implications and externalities to consider, like the spectrum, it makes sense to think about monopoly and oligopoly issues, about what use the assets will be put to by various buyers, and what we want the world to look like.
That doesn’t mean that these good factors are the primary justifications. If they were, you’d see conditional contracts and the like more often, rather than private deals. The real reason is usually that other mechanisms allow insiders to extract public resources for private gains. This is largely a story of brazen corruption and theft. But if we’re going to argue for simple rules because they maximize simple priorities, we need to also argue for why those priorities cover what we care about, or we’ll be seen as tone deaf at best, allowing the corrupt to win the argument and steal our money.
Low fee index funds are growing increasingly popular each year, taking in more money and a greater share of assets. Their market share is so large that being included in a relevant index has a meaningful impact on share prices.
Managed funds are on the decline. Most of these funds are not especially prestigious and most people invested in them don’t brag about them, nor do they have much special faith in those running the funds. They’re just not enough on the ball to realize they’re being taken for a ride by professional thieves.
Nor do I think most people care about associating with high status hedge funds or anything like that. I don’t see it, at all.
Also, those simple rules? You can find them in active funds, too. A lot of them are pretty popular. Simple technical analysis, simple momentum, simple value rules, and so on. What counts as simple? That’s a matter of perspective. Index providers are often doing staggeringly complex things under the hood. And indexing off someone else’s work is a magician’s trick, free riding off the work of others in a way that gets dangerous if too many start relying on it.
Most regular investors who think about what they’re doing at all, know they should likely be in index-style funds, and increasingly that’s where they are. If there’s a mystery at all it’s at least contained at the high end, in hedge funds with large minimums.
One can split the remaining ‘mystery’ into two halves. One is, why do some people think there exist funds that have sufficient alpha to justify their fees? Two is, why do some people think they’ve found one of those funds?
The first mystery is simple. They’re right. There exist funds that have alpha, and predictably beat the market. The trick is finding them and getting your money in (or the even better trick is figuring out how to do it yourself). I don’t want to get into an argument over efficient markets here and won’t discuss it in the comments, but the world in which no one can beat the market doesn’t actually make any sense.
The second mystery is also simple. Marketing, the winner’s curse, being fooled by randomness, adverse selection, and the laws of markets. Of course a lot more people think they’ve found the winner than have actually found one.
This is a weird case in many ways, but my core take here is that the part of this that does belong on this list, is an example of complexity as justification for theft.
Google is the auction company. They were uniquely qualified to run an auction and bypass the banks, and did it (as I understand it) largely because it was on brand and they’d have felt terrible doing otherwise. A more interesting case is Spotify, who recently simply let people start trading its stock without an IPO at all. Although they still paid the banks fees, which I find super weird and don’t understand. There never was a rebellion.
How do banks extract the money?
My model is something like this, coming mostly from reading Matt Levine. The banks claim that they provide essential services. They find and line up customers to buy the stock, they vouch for the stock, they price the stock properly to ensure a nice bump so everyone feels happy, they backstop things in case something goes wrong, they handle a ton of details.
What they really do are two things. Both are centered around the general spreading by banks of FUD: Fear, Uncertainty and Doubt.
First, they prevent firms from suddenly having to navigate a legally tricky and potentially risky, and potentially quite complex, world they know nothing about, where messing up could be a disaster. One does not simply sell the company or take it public, as much as it might look simple from the outside. And while the bank’s marginal costs are way, way lower than what they charge, trying to get that expertise in house in a confident way is hard.
Second, they are what people are comfortable with. You’re not blameworthy for paying the bank. It’s the null action. If you do it, no one says ‘hey they’ve robbed us all of a huge amount of money.’ Instead, they say ‘good on you for not being too greedy and trying to maximize the price while risking the company’s future.’
They’re doing this at the crucial moment when how you look is of crucial importance, when you’re about to get a huge windfall for years or a lifetime of work and give the same to everyone who helped you. When you’re spending all your energy negotiating lots of other stuff. A disruption threatens to unravel all of that. What’s a few percent in that situation? So what if you don’t price your IPO as high as you could have so that bankers can enjoy their bounce?
Banks are conspiring with the buyers to cheat the sellers out of the value of what they bring to the table. Buyers who object are threatened with ostracism and being someone no one is comfortable with, with the other side walking away from the table after buyers put in the work to get here.
Is this all guillotine-worthy highway robbery? Hell yes. Completely.
Banks (and the buyers who are their best customers and allies) are colluding with this pricing, and that’s the nicest way to put this. Again, this is theft. Complexity is introduced to allow rent seeking and theft, exploiting a moment of vulnerability.
Interesting that Robin says the system ‘appears stable.’ To me it does not seem stable. We just had a huge college admissions scandal that damaged faith in the system and a quite-well justified lawsuit against Harvard. We have the SAT promising to introduce ‘adversity scores.’ We have increasingly selective admissions eating more and more of childhood, and the rule that what can’t go on forever, won’t. This calls for some popcorn.
What’s causing the system to be complex? We see several of the answers in play here.
We see the ‘factors you can’t cite explicitly’ problem and the ‘we don’t want something we can be sued or blamed for’ here. Admissions officers are trying to pick kids who will be successful at school and in life, as well as satisfy other goals. A lot of the things that predict good outcomes in life are not things you would want to be caught dead using as a determinant in admissions even if they weren’t illegal to use in admissions. The only solution is to make the system complex and opaque, so no one can prove what you were thinking.
We also see complexity as a way for the rich and powerful to expropriate resources, in the sense that the rich and powerful and their children are likely to be more successful, and more likely to give money to the school. And of course, if the school has discretion, that gives the school power. It can extract resources and prestige from others who want to get their kids in. Employees, especially high-up ones, can extract things even without illegal bribes. Why pass that up?
We see the Goodhart’s Law and adverse selection problems. If you admit purely on the basis of a test, and the other schools admit on the basis of a range of factors, you don’t get the best test scorers unless you’re Harvard. You get the kids who are an epic fail at those other factors.
If you give kids an explicit target, they and their parents will structure their entire lives around it. They’ll do that even with a vague implicit target, as they do now. If it’s explicit, you get things like you see in China, where (according to an eyewitness who once came to dinner) many kids are pulled from school and do nothing but cram facts into their heads for the college admissions exam for years. And why shouldn’t they?
So you get kids whose real educations are crippled, who have no life experience and no joy of childhood. The only alternative is to allow a general sense of who the kid is and what they’ve done to matter. To be able to holistically judge kids and properly adjust.
As always, the more complex and hard to understand the game, the greater the expert’s advantage. The rich and powerful who understand the system and can make themselves look good will have a large edge from that. And the more we explicitly penalize them for those advantages, but not for their gaming of the system, the more we force them to game the system even harder. If you use an adversity score to set impossibly high standards for rich kids, they’re going to use every advantage they have to make up for that even more than they already do.
And of course, part of the test is seeing how you learn to game the test and what approach you take. Can you do it with grace? Do you do too much of it, not enough or the right amount?
This is all an anti-inductive arms race. The art of gaming the system is in large part the art of making it look like you’re not gaming the system, which is an argument for simpler rules. At this point, what portion of successful admissions is twisting the truth? How much is flat out lying? How much is presenting yourself in a misleading light? To what extent are we training kids from an early age to have high simulacrum levels and sacrifice their integrity? A lot. Integrity being explicitly on the test just makes it one more thing you need to learn to fake.
I hate the current situation, and the educational system in general, but I think the alternative of a simple, single written test, with the system otherwise unchanged, would be worse. But of course, we’d never let it be that simple. That’s all before the fights over how to adjust those scores for ‘adversity’ and ‘diversity,’ and how to quantify that, and the other things we’d want to factor in. Can you imagine what happens to high schools if grades don’t matter? What if grades did matter in a formulaic manner and students and teachers were forced to confront the incentives? The endless battles over what other life activities should score points, the death of any that don’t, and the twisting into point machines of those that do?
So here we have all the Goodhart problems, and the theft problems, and the power problems, and the blameworthy considerations and justifications problems and lawsuit problems with their incentive to destroy all information. The gang’s all here.
I love me a prediction market, but you have to do it right. Would enough people and money participate? If they did, would they have the right incentives? If both of those, would you want that to be how you make decisions?
I think the answer to the first question is yes, if you structure it right. If there are only two possibilities and one of them will happen, you can make it work.
The answer to the second question is, no.
We can consider two possibilities.
In scenario one, this acts as an advisory for the board, to help them decide what to do.
In scenario two, this is the sole thing looked at, and CEOs are fired if and only if they are judged to be bad for the stock price, or can otherwise only be fired for specific causes (e.g. if found shooting a man on Fifth Avenue or stealing millions in company funds and spending them on free to play games, you need to pull the plug without stopping to look at the market).
The problem with scenario one is that you’re trading on how much the company is worth in the worlds where the board actually fires the CEO. That’s very different from how much the company would be worth if we decided, now, to fire the CEO. The scenarios where the CEO is fired are where the board is unhappy with them, which is usually because of bad things that would make us think the stock is likely to be less valuable, like the stock price having gone down or insiders knowing the CEO has done or will do hard-to-measure long-term damage. That doesn’t mean the market won’t also take into account other things, like whether the CEO is paying off the board, but the correlation we’re worried about is still super high. Giving the board discretion, which market participants would expect the board to use, hopelessly muddles things.
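To illustrate the confound, here is a toy simulation with entirely made-up numbers (not a market design): if the board mostly fires after bad news, the average value in ‘fired’ worlds looks much worse than in ‘kept’ worlds even when the firing itself is mildly beneficial.

```python
import random

# Toy simulation (all numbers made up). The board mostly fires after bad news,
# so "value given fired" reflects the bad news itself, not the effect of firing.
random.seed(0)
fired_values, kept_values = [], []
for _ in range(100_000):
    bad_news = random.random() < 0.3            # state the board observes
    fired = bad_news and random.random() < 0.8  # board fires mostly on bad news
    value = (80 if bad_news else 120) + (5 if fired else 0)  # firing itself helps a little
    (fired_values if fired else kept_values).append(value)

print(sum(fired_values) / len(fired_values))  # ~85: looks much worse...
print(sum(kept_values) / len(kept_values))    # ~117: ...even though firing added +5
```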
You could try to solve that problem by having the market trade only very close in time to the board decision. You kind of have to do that anyway, to avoid having a lame duck CEO. But it still depends on a lot of private information, and the decision will still reveal a lot about the firm. So I think that realistically this won't work.
The problem with scenario two is that you’ve taken away any ability to punish or reward the CEO for anything other than future stock prices. This effectively gives the CEO absolute power, and allows them to get away with any and all bad behavior of all kinds. Even if past behavior lowers the stock price, it only matters to the extent that it predicts future actions which would further lower the stock price. So CEOs don’t even have to care about the stock price. They only need to care about the stock price predictions in relation to each other. So the best thing the CEO can do is make getting rid of them as painful as possible. Even more than now, they want to make sure that losing them destroys the company as much as possible. Their primary jobs now are to hype themselves as much as possible to outsiders, and to spend capital manipulating these prediction markets.
Again, we’re seeing Goodhart problems, we’re seeing reinforcement of power (in this case, of the board over the CEO, so it’s a balance of power we likely welcome), and the ability to take things into consideration without needing to make them explicit or measurable, as companies both care about things they’re not legally allowed to care about and which we wouldn’t like hearing they cared about, especially explicitly, and they need to maintain confidentiality.
In the health care system, hard patients are usually sent to hospitals that have a reputation for being able to deal with hard cases well, and thus I'm not sure how well simple statistics would work.
I wrote the prediction-based medicine article to propose a system that would actually allow you to compare the skill of two medical practitioners for your individual issue.
You could do the same thing with lawyers. You shop around and tell a few lawyers about your case, and every lawyer has to give you the likelihood that they would win your case for you. With a system that tracks the predictions in the background and tells you how good each lawyer is at predicting, you would suddenly be able to shop for the best lawyer.
If implemented fully that gets some interesting things to happen and might be an improvement (I'm not sure, the system is too complex to be confident one way or another). But I don't see this as a remotely realistic path. We're talking about making predictive accuracy central to status, success and compensation, in places where getting it right is notoriously hard and everyone's bad at it, and where distortions are everywhere. For starters, you're predicting your success rate conditional on the treatment being chosen, and the treatment is chosen largely based on your conditional success rate and those quoted by others. This is great fun to think about but a nightmare to actually implement. And of course, it's fully incompatible with other existing legal requirements, ethical standards, professional practices, and so on.
A better variation on this is to let providers put in bids in some form to provide the service, that pay out to them conditional on some form of success, in a way that's a lot simpler for humans to handle and doesn't serve as a giant tax on people who don't have graduate-level probability skills. Pretty much any true market implementation would end up being a huge upgrade, and while it's outside the Overton window, it seems like it might only be a Shut-Up-And-Do-The-Impossible level impossible task.
Uber managed to be a marketplace that allows people to buy taxi services more effectively. I see no reason why it should be impossible for someone to create a new marketplace that provides a more effective way to buy medical services or legal services.
You would need to run the system in a jurisdiction that gives some freedom, but it's worth keeping in mind that most Western jurisdictions currently provide enough freedom for acupuncturists and homeopaths to be in business.
You wouldn't start by going directly after mainstream medicine, but rather with more alternative providers like hypnotists.
A hypnotist who can cure an allergy in a few hours of work would benefit a lot from a central website funneling him clients who know that he can actually cure patients of their allergies reliably. A highly effective hypnotist is going to make a lot more money when he's paid by the outcomes he produces instead of being paid by the hour.
I believe there are currently enough people who can produce great outcomes for their clients, but who don't have any trustworthy way to inform prospective buyers about their skills and distinguish themselves from the noise of empty marketing promises, that a startup providing a marketplace where those people could sell their services would be viable.
Vassar speaks in The legend of healthcare about how much value a person who provides the mirror hand treatment for amputees can provide to patients. If a person in a city specialized in doing the mirror hand treatment and could advertise their services via such a marketplace to the patients who need them, that person could create an obscene amount of value while being able to charge prices that reflect the buyer knowing this is their way to stop immense pain.
in a way that's a lot simpler for humans to handle and doesn't serve as a giant tax on people who don't have graduate-level probability skills.
I don't think you need high-level probability skills on the buyer side. If a website lists treatments and says treatment A has a 60% chance of curing you and costs $500, while treatment B has an 80% chance of curing you and costs $1,000, that doesn't need graduate-level math.
Unfortunately, you do need some statistics skills on the provider side, but teaching healthcare providers to make accurate predictions about the effects of their interventions will likely lead to them making better treatment decisions.
Taxis are Playing on Easy Mode. Taxi rides are easy to evaluate and hard to fake, mostly reliable and mostly fungible, if they mess up the costs are low. Providers don't make predictions or put in bids, in fact they agree to do any task assigned to them and take the system's price for it. There are no sacred values at stake, and while nasty regulations often raise prices, we're talking about a different order of magnitude of regulatory issues and costs. It's certainly a good sign that taxi markets can work, if they didn't we could likely safely say that the medical/legal proposals were dead on arrival. But it doesn't make me confident. And my experience at MetaMed says that people don't want the thing you're pitching here, even if they're wrong to not want it.
In terms of the tax on skills I was referring to providers needing those skills, more than consumers. Consumers have the far easier job, although it is still too hard for most people, since most cases won't see one thing dominate on all fronts.
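To make that concrete with the hypothetical numbers from the comment above: neither option dominates, so the buyer still has to put a price on the extra chance of a cure.

```python
# Hypothetical treatments from the comment above.
a = {"p_cure": 0.60, "cost": 500}
b = {"p_cure": 0.80, "cost": 1000}

cost_per_expected_cure_a = a["cost"] / a["p_cure"]  # ~$833 per expected cure
cost_per_expected_cure_b = b["cost"] / b["p_cure"]  # $1,250 per expected cure
marginal_cost = (b["cost"] - a["cost"]) / (b["p_cure"] - a["p_cure"])

print(f"A: ${cost_per_expected_cure_a:.0f}/expected cure, B: ${cost_per_expected_cure_b:.0f}/expected cure")
print(f"The extra 20% chance of a cure with B costs ${marginal_cost:.0f} per expected cure")  # $2,500
```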
As I understand the MetaMed pitch, it's:
You can pay a lot of money (high 4 figures or 5 figures), and then we will research the best treatments for you, with no promise of effectiveness.
It's my impression that people usually don't want to pay for information directly. People are not willing to pay for a ConsumerReports report before buying a new washing machine. On the other hand, people are quite willing to read Amazon reviews about what's the best washing machine and make their buying decision based on that information. They are also willing to go to the WireCutter and let them get their affiliate commission (even though most people likely don't think about the commission).
I think the success of various expensive alternative medicine treatments suggests that people are quite willing to pay money for treatments that they hope will help them with a medical issue for which they currently don't have a solution.
If a person starts answering a questionnaire about their pain, it's straightforward to funnel them to the person who does the mirror hand treatment for amputees.
In Germany, the person who does the mirror hand treatment for amputees would need to spend ~1 year on an education as a "Heilpraktiker" to be allowed to heal people, and then a bit of time to learn how the mirror hand treatment works and how to estimate the probabilities.
It's a quite different skillset than the expensive skillset of doctors.
As far as the skillset of making good predictions about treatment success goes, I think it will help in many cases with clinical decision making. David Burns (who was one of the people who popularized CBT) wrote a bit about how important testing and calibration are in his essay on why he now advocates TEAM instead of CBT. Working with calibration does require additional skills that are a barrier, but it pays off. Working that way while being supported by software will be easier than with photocopied sheets the way Burns does it.
Promoted to curated: I think this post got a lot less attention than it deserves, probably partially because it was a bit hard to follow: I had to switch back and forth reasonably frequently between various parts of the post and Robin's original post.
However, overall I think this post is quite excellent and I would love to see more posts like this.
(Sorry for the shorter and less informative curation notice than usual, the whole LW team is at a retreat, so we are a bit more time-constrained than usual)
This is another great response post from Zvi.
It takes a list of issues that Zvi didn't get to cherry pick, and then proceeds to explain them all with a couple of core tools: Goodhart's Law, Asymmetric Justice/Copenhagen Interpretation of Ethics, Forbidden Considerations, Power, and Theft. I learned a lot and put a lot of key ideas together reading this post. I think it makes a great follow-up read to some of the relevant articles (e.g. Asymmetric Justice, Goodhart Taxonomy, etc.).
The only problem is it's very long. 8.5k words. That's about 4% of last year's book, IIRC. I think it's worth a lot, but I think probably a bit less than that. So I'd like it to be shortened if it makes it in. That said I think Zvi's probably up for that if it's getting published.
I expect to vote on this between +3 and +6.
I can confirm that if this post does make the cut I will spend time working to make it shorter. This wasn't a 'toss it off quickly' post but it was definitely not given the 'as short as possible' treatment either.
The truth is, I really like Zvi response-posts. They feel like a version of Scott's book reviews but for posts, where I really get both the other person's perspective and also get a very different perspective on the same subject.
So I feel like I learned a lot here because Robin gave me an interesting perspective on all of these cases, then Zvi gave me a bunch more perspectives on them all.
The CEO proposal is to fire them at the end of the quarter if the prices just before then so indicate. This solves the problem of the market traders expecting later traders to have more info than they. And it doesn't mean that the board can't fire them at other times for other reasons.
We could expect prices prior to the end of the quarter to be strange, then, and potentially to contain very strange information, but one can also argue it shouldn't matter. So this is like the two-stage proposal. In stage 1 the board decides whether to fire or not anyway; in stage 2 the prediction market decides whether to fire him anyway with a burst of activity, which has the advantage that you get your money back fast if it doesn't happen, and if it does happen you can just be long/short the stock. Then if the board decides to fire him because it was 'too close,' the trades are void. And ideally the board will fire him if they see him doing something to manipulate the conditional values in his favor, although we'd worry the board would mostly abdicate its firing responsibilities and justify that by pointing at the market.
Definitely seems better than either of my listed scenarios. The perverse effects are at least less perverse, although I suspect they're still quite perverse. Trading in a market that only matters if it is later trading in a certain range is super weird. As would be the CEO response behavior. I wonder what one says and does to maximize share price when not fired relative to share price when fired, instead of maximizing share prices...
wonder what one says and does to maximize share price when not fired relative to share price when fired, instead of maximizing share prices...
Sabotage succession planning and otherwise maximize key person risk focused on oneself?
How much less do you expect this to happen under the current system?
Yes, that's definitely level 1, but I'm guessing the rabbit hole goes much deeper...
I still really like this post, but also still think it really could use some cleaning up. I think a cleaned up version of this post could probably make my top 10 of posts from 2019, and so it seems worth nominating for the review.
Bumping to remind Habryka to follow up with Zvi’s question
Might be helpful to say more about what it would mean to clean up this particular post?
Minor quibble on G. Where trial lawyers work on a cab-rank principle (eg barristers in England & Wales and I think advocates in Scotland) win percentages may be significantly less susceptible to being gamed. In such a situation there seems to me to be a closer analogy between trial lawyers and eg surgeons.
D is based on a serious misunderstanding of how private health insurance works.
The only limiting factor chosen by the NHS (undertaken by the NICE committee) is to determine which specific investigations and treatments are 'worth' funding.
For treatments, they use a value function called a "Quality Adjusted Life-Year" (QALY), and compare that to the cost of the treatment. At the time of writing, it's automatically approved if the cost is shown to be under £10,000 per QALY gained, more efficacious at the same price than an already-approved equivalent, or cheaper at the same efficacy.
If it's more expensive then it goes through a slower and more in-depth process to allow public and private argument about both the price and efficacy.
Thus an investigation or treatment that is extremely expensive but is proven to offer extraordinary results will be funded, while one that works but not very well, or that is cheap but ineffective, is denied.
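A rough sketch of the first-pass check described above (illustrative figures only, not NICE's actual numbers for any real treatment):

```python
# Threshold cited in the comment above for automatic approval.
THRESHOLD_GBP_PER_QALY = 10_000

def cost_per_qaly(treatment_cost_gbp: float, qalys_gained: float) -> float:
    """Cost-effectiveness ratio: pounds spent per quality-adjusted life-year gained."""
    return treatment_cost_gbp / qalys_gained

# A hypothetical drug costing £24,000 that adds 3 QALYs passes the first-pass check;
# the same drug adding only 1 QALY would instead go to the slower, in-depth process.
print(cost_per_qaly(24_000, 3.0) <= THRESHOLD_GBP_PER_QALY)  # True  (£8,000/QALY)
print(cost_per_qaly(24_000, 1.0) <= THRESHOLD_GBP_PER_QALY)  # False (£24,000/QALY)
```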
All approved treatments are approved for everyone.
In the NHS, denials are only ever of specific treatments, and never specific individuals.
In the NHS, doctors are legally required to make decisions based on the needs of the patient regardless of monetary cost. If a treatment is 'on the list' and medically indicated, it is provided.
In the NHS, the cost of treating any individual person is considered irrelevant, and in most cases the doctor does not even have any knowledge of the cost.
The systematic pressure on treatment manufacturers is thus to be more effective than existing treatments, to charge less than competitors for similar efficacy, to charge £9,999 per QALY, or to be really efficacious so that NICE will choose their product. Thus the NHS often gets really, really good prices!
The pressure on the doctors and hospitals is to give you the best treatment on the menu, because it reflects badly on them if people die too often.
You could view this as the NHS giving all doctors and patients a menu.
Private Health Insurance:
Private health insurers also decide which investigations and treatments they will fund, and under which circumstances. This part is almost exactly the same - in some cases they even follow the NICE decisions, as it's a convenient way to avoid appearing to decide.
The difference is that private health insurance also denies health care to individuals, by stating that the insurance will not pay for treatment of specific ailments (eg pre-existing conditions or effects caused by 'dangerous' activities), by refusing to cover those individuals at all, or by setting premiums outside their ability to pay (effectively the same as denial, but easier to square in their own minds).
So you, personally, may not even be permitted all the approved treatments. Or indeed, any treatments at all.
The systematic pressure is for all providers to charge as much as possible and for the insurers themselves to pass the amortised cost onto their customers and eject any customer deemed likely to want expensive payouts.
This is still a menu, except now there's a bouncer on the door who can decide not to let you in, and the waiter can decide to rip out some of the pages of your particular menu.
Pure Private Health:
You can have anything you can pay for, regardless of efficacy.
The systematic pressure on all providers is to charge the entire wealth of all patients - a sane individual is unlikely to refuse to pay if they or their loved one would otherwise die.
This is a personal chef who takes your wallet.
Note that all other schemes automatically have this as the ultimate backstop, unless explicitly prohibited by law. (eg laws regarding claims of efficacy, licencing of practitioners etc.)
Both the NHS and private health insurance systems limit the available treatments; the difference between them is that private health insurance further limits which of the 'master list' of treatments are available to individual people.
A purely private health system does not limit the treatments, but does apply extreme limits to individual people, and is always available regardless of other systems.
I think the USA antipathy to a general health service likely stems from this irrational argument:
"If I don't do anything bad, I will not become poor, lose my job, or have a chronic illness that causes me to lose my health insurance.
Thus anyone who is poor or has a chronic illness must deserve to be so.
If they deserve it, then I should not have to pay towards their care and so they should lose their health insurance."
This is of course backed up and encouraged by the insurance and private health providers who benefit greatly from the excessive fees they can charge.
For L, what would be the effect of scenario 1.5 - CEOs are fired if (but not only if) they are judged to be bad for the stock price?
There would be an option that if the CEO is fired for other reasons than the prediction market that the market doesn't pay out and all bets are refunded - not sure if this would help or hinder!
Note: There's an unfinished sentence in this section, end of 3rd to last paragraph
So I think that realistically
On note: Ah, thanks for the catch. It finishes "this won't work" as you likely guessed.
You'd have to draw that distinction cleanly. So you'd have to do this in two stages, maybe. First, you decide whether or not you want to fire the CEO for non-predictive causes. Then, you hold a prediction market, and fire the CEO if and only if the prediction market says so, perhaps with a threshold rule set during phase 1. Otherwise, the board can look at the market, and it will impact their decision, especially if they prefer one way of firing to the other (e.g. they want the credit or to avoid the blame, or the payouts are different, etc.).
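A minimal sketch of that two-stage rule (hypothetical structure and numbers, not a worked-out market design):

```python
def should_fire(fired_for_cause: bool,
                price_if_fired: float,
                price_if_retained: float,
                threshold: float = 0.0) -> bool:
    """Two-stage rule: stage 1 handles well-defined 'for cause' firings;
    stage 2 fires iff the conditional market prefers firing by at least `threshold`."""
    if fired_for_cause:
        return True  # stage 1: board decision, no market needed
    return price_if_fired - price_if_retained > threshold  # stage 2: market decides

# Hypothetical conditional prices: the market thinks the firm is worth $103/share
# if the CEO is fired and $100/share if retained, with a 2-point threshold.
print(should_fire(False, price_if_fired=103.0, price_if_retained=100.0, threshold=2.0))  # True
```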
Also, if you include cases where the market exists but gets refunded, that decreases market interest and increases distortions.