I agree this is better (the system rewards only a subset of what it does), but it is still overgeneralizing. Systems could have a different purpose but a fallible reward structure (Goodharting again). You could be analyzing the purpose-reward link at the wrong level: political parties want change, so they seek power, so they need donations. This makes it look like the purpose is just to get donations, because of the rewards to bundlers, but that ignores the rewards at other levels and confuses local rewards with global purpose. Just as a system does a lot of things, so it rewards a lot of things.
Good points. I think it's fine and reasonable for a system to reward leading metrics like "are we raising money" in the service of a higher goal. But if you aren't also doing something to reward the higher goal, or including counter-metrics to catch the obvious ways you could be moving the leading metric without serving the higher goal, then I don't think it's reasonable to claim that the higher goal is actually your system's purpose.
This is of course where things get subtle - but I think this part is important.
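To make that concrete, here is a toy sketch (every name and weight is invented for illustration, not anyone's real evaluation scheme) of the difference between rewarding only a leading metric and rewarding it alongside the higher goal plus a counter-metric:

```python
# Hypothetical performance score for a fundraising team. All names and
# weights here are made up for illustration.

def performance_score(money_raised: float,
                      policy_progress: float,
                      deception_complaints: float) -> float:
    leading = money_raised          # leading metric: "are we raising money?"
    higher_goal = policy_progress   # what the money is supposedly for
    counter = deception_complaints  # catches gaming of the leading metric

    # With the last two weights set to zero, the system rewards only the
    # leading metric -- and I'd argue the leading metric then *is* its purpose.
    return 0.5 * leading + 0.4 * higher_goal - 0.1 * counter
```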
Do the counter-metrics have to be measurable and measured, or do you see any way organizations can make room for humane intangible interactions like "Thank you!" without optimization pressure to capture those as a net promoter score?
Good question. Sometimes the counter-metric is inherently tricky to measure, and the best available metric is simply “does a reasonable person think this is causing harm”.
Even if you measure it in a way that’s totally subjective, you can still make it part of the way you reward people and thus part of the system’s purpose.
It just feels like a semantics debate between a teleological and a more empirical mindset. I think the idea behind that sentence is this: if a system meant to do A in practice does B, and after this fact has been transparent to everyone involved for long enough, with plenty of chances to dismantle or thoroughly rearrange the system, it is still allowed to do B, that's as good as a stamp of approval for B.
In practice I think this is a bit harsh in terms of how the people involved are judged, because a lot of the time this will happen due to inertia, lack of imagination, lack of accountability, general confusion about who should do what and on whose authority, etcetera, rather than genuine malice. But also, at some point, you kind of lose patience with people fumbling around that way and decide that ineptitude is as bad as malice if it leads to the same results.
There is definitely some truth in what you say.
Where it gets tricky is that it can be hard to see from the outside whether a system continues to do B because they don't care about B, because it's impossibly hard to do A without also doing B, or because they are trying really hard to avoid doing B but haven't worked out how yet.
Looking at what a system rewards lets you see which of these situations the system is actually in. If they are actively rewarding people for not doing B, then B is not the purpose of the system.
One interesting subtlety is what we should say about a system whose purpose is A, that rewards A, and yet ends up doing a lot of B. I think B isn't the purpose unless the system actively rewards B, but you could say the purpose is to "do A, and tolerate doing B in the course of doing A".
There's a distinction here. Going through your examples: Meta is an organization whose public claims are different from its real goals because its real goals are unpopular. Hospitals are organizations whose purpose is what they say it is, but some of their people have to optimize so hard for certain instrumental goals (e.g. collecting money) that those instrumental goals take on a life of their own. Political parties started out like hospitals but ended up more like Meta: the instrumental goals (e.g. power-seeking) took on so much importance that now only power-seekers are left in the leadership, and thus the stated goals of the parties have become different from their private goals.
I don’t think this works for general systems. For POSIWID, the purpose of a car is to get you from A to B. I don’t think POSIWIR returns a result there - cars aren’t rewarded.
Having said that, I do think it’s much more interesting than POSIWID and, for human systems, more likely to yield sensible answers.
My take would be that purpose comes from conscious minds - the purpose of a system is what the designer intends. This is true even if the designer is rubbish at their job and their system doesn’t achieve the purpose they intend. When we don’t know what the designer intends we try to work it out by looking at what the system does or what it rewards.
This declares by fiat that non-engineered systems have no purpose but I think I’m ok with that. The purpose of a rabbit is not to make baby rabbits - rabbits just are. Saying rabbits have a purpose is to anthropomorphise evolution.
Saying that the purpose of a system is what the designer intends seems much less insightful in the cases where it matters. Who counts as the designer of the California high speed rail system? The administrators who run it? The CA legislature? I get the vibe that the HSR is mostly a jobs program, which is a conclusion you'd get from POSIWID or POSIWIR, but it's less clear if you think it through from the "designer" point of view.
Like the whole point of these other perspectives is that it helps you notice when outcomes are different from intentions. Maybe you'll object and say "well you need intentions to use the word 'purpose'," but then it's like, ok, there's clearly a cluster in concept-space here connecting HSR->jobs program, and imo it's fine for the word "purpose" to describe multiple things.
Edit: Yeah ok I agree that HSR is a bad example.
I'd say the California high speed rail system pretty clearly doesn't have a single purpose.
As a general matter, "what's the purpose of X?" is simply the wrong question in cases where X is the result of compromise between, deliberation by, or decision-making among multiple actors.[1] It reifies a concept ("purpose") that doesn't carve reality at the joints, and you're much better off simply abandoning it and relieving yourself of the needless confusion it causes.
imo it's fine for the word "purpose" to describe multiple things
Beyond the fact that it doesn't carve reality at the joints, which makes it generally improper for epistemic rationality purposes, it's also bad instrumentally/in practice. The fact that you have so many seemingly interminable debates over semantics is a bug, not a feature. If people[2] aren't capable of coming to terms with what "purpose" refers to, the situation seems less than "fine" to me. Maybe we should stop caring about this whole "purpose" thing in these specific contexts and talk differently about what's going on.
[1] There is no canonical way of aggregating (or, in this case, disaggregating) the competing desires of multiple agents.
[2] Including rationalists, who are supposed to be better than average at this whole philosophy-of-language stuff.
Yeah, I think that clarifies my thoughts - IMO using the word purpose is not OK in these other perspectives unless you actually mean that a person had that purpose. It brings in connotations that someone has this purpose, for which we then assign moral blame, when the real question may be competence. And when there is moral blame, separating it from a competence issue becomes harder.
Much better IMO to say “this system isn’t achieving its purpose” than to say “this system has this other purpose” unless that’s what you mean - you are claiming that some designer(s) have this other purpose.
I’m not familiar specifically with HSR but my guess is that there are multiple purposes from multiple people. Some designers have the purpose to improve transport links, some want a big project to make their name and some want a jobs program. Systems can have multiple purposes and trying to narrow it down to a single one oversimplifies things.
A crux of the discussion around this topic seems to be the exact definition of "purpose" being used.
Is the purpose of a system defined as:
1. the original intent of those who designed and/or implemented it,
2. the various intents of those incentivized to maintain the system in its realized form, or
3. some combination of those?
Many times the original intent of a system will bear little resemblance to its results (hence the popular appeal of POSIWID), but it can simultaneously be true that:
1. Most systems are not deliberately designed to be bad,
2. Many systems create unintended bad incentives à la Goodhart, and
3. Systems that create bad incentives are likely to have their purposes co-opted by those benefiting from the incentives, making it very difficult to iteratively improve said systems.
See: an unnecessarily complex tax code creates a need for tax-help services and software. Subsequent movements to simplify the tax code are resisted by entrenched interests who benefit from the complexity.
TL;DR: the Original Intended Purpose and the Intended Purposes of Current Proponents can be dramatically different - in inverse proportion to how well the system fulfilled its original intent, and in direct proportion to how strong the perverse incentives it created were.
If you want to steer results, you have to keep an eye on your effects. You need closed-loop control. This is true at all levels of a system. In particular, once you have created a system, you have to pay attention to whether it is achieving its goal, or whether it needs to be redesigned. If you aren't doing that, you are doing a lousy job steering. You don't have as much optimization power as might appear at first glance. Or maybe you are doing that, but for some other goal.
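To spell out what I mean by closed-loop control, here's a minimal sketch (the `measure`, `adjust`, and `redesign` callables are hypothetical stand-ins, not anyone's real process): measure the effect, compare it to the goal, adjust, and escalate to a redesign if adjusting never closes the gap.

```python
# Minimal closed-loop steering sketch. measure/adjust/redesign are
# hypothetical stand-ins for whatever the real system does.

def steer(system, goal, measure, adjust, redesign, max_steps=100, tol=1e-3):
    for _ in range(max_steps):
        error = goal - measure(system)  # compare the actual effect to the goal
        if abs(error) < tol:
            return system               # close enough: the goal is being met
        system = adjust(system, error)  # ordinary corrective action
    # Never converged: tweaking isn't enough, so question the design itself.
    # Skipping this step is what I mean by doing a lousy job steering.
    return redesign(system, goal)
```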
I was really impressed by the Kevin Simler tweet that everything that exists is because of a positive feedback loop: it persists because it plays some role in renewing itself. You could call that positive feedback the purpose of the structure. Some things come into existence and then just wind down, failing to gain any traction; for those it is hard to isolate a purpose distinct from that of the creator. But things that persist have not just one creator but myriad forces that renew them. The purpose of the system is not what it does, but what it does that contributes to its continuing existence. I think this meaningfully distinguishes partial successes from implicit goals. The bus system emitting CO2 does not produce buses next year; successfully transporting people does. It also creates jobs, and if those jobs form a coherent lobby, they may contribute to the existence of the system, in contrast to the CO2.
This approach ("the purpose of a system is the positive feedback loop that sustains itself") is a fascinating angle and feels like it has a lot of truth to it.
The weakness is that it's easy to tell just-so stories about why some negative thing an organization does is necessary to sustain it and is thus actually its purpose. Plus, this feels disjoint from the conventional human meaning of the word "purpose", which implies something humans are doing intentionally.
One strength of "The purpose of a system is what it rewards" is that what a system rewards is often something that's concretely available (provided you can get access to performance evaluation criteria) and something humans can be held accountable for and pressured into changing.
Or to put it another way, I think Simler's definition is true and fascinating, but mine is probably more useful.
One quirk about the political donation problem is that activists respond to negative messaging but the broader base -- people who may donate/volunteer/vote based on the candidate not the party -- responds to positive messaging. I think it's fair to say this creates a "poisoning the well" problem, where the fundraising emails are doom, activists respond to doom, but the base is cut off from the party. Relevant to this forum, something like "ensure AI benefits everyone" might message better with the base, and "prevent AI from killing everyone" might message better with the activists.
Another step is assuming that "the purpose of a system is self-preservation, plus the other things it does".
It reminds me of the Chinese Communist Party, which is known for rejecting ideas that weaken its authority even when those ideas are good.
Many systems need to optimize for self-preservation and growth, especially in a very competitive environment; otherwise they will never be able to survive and accomplish their "more ethical purpose" (this has a "Meditations on Moloch" vibe).
I am sure a good doctor will still be rewarded, even in a hospital with a "money-oriented mindset".
And in your Google example, will brilliant engineers be rewarded if they always come up with great technical ideas that barely bring in money?
A balance is needed and it is tricky. Of course many times it's not just self-preservation/growth, it is pure greed.
It’s become fashionable recently to say that the purpose of a system is what it does - the true purpose of an institution is often different from what it publicly claims, and is better determined by observing what it does. Scott Alexander wrote a thoughtful takedown of this, claiming “Obviously The Purpose Of A System Is Not What It Does” - often an institution is genuinely trying to do a good thing (e.g. treat patients) and the fact they do so imperfectly (e.g. some patients die) does not mean such imperfections are the goal.
I think a better framing is that the purpose of a system is what it rewards people for doing.
Meta says that their purpose is “to build the future of human connection” and used to say their mission was to “make the world more open and connected”. But Meta employees aren’t actually judged by how well they serve those lofty goals. Employees are judged on their ability to get people to spend more time staring at their screens clicking on ads (source: I worked there). So I think it’s fair to say that “get people to spend time staring at screens clicking on ads” is the purpose of Meta.
Someone I know works as a hospital doctor. Her performance isn’t judged based on patient outcomes, but on whether patients feel cared for, whether patients get diagnosed with treatable conditions, and how much the hospital can bill insurers for treating patients. So I think it’s fair to say that the purpose of that hospital is to make people feel cared for, diagnose them with things, and charge as much as they can to insurance. Some health benefits obviously get provided along the way, but since they aren’t what people are rewarded for, they aren’t the purpose.
If you work on a political campaign, your salary depends on bringing in lots of money from donors, but is mostly disconnected from whether the country improves as a result of your party’s policies. Thus I think it’s fair to say the purpose of a political party is to get people to donate money - often by making people angry and afraid.
On the other hand, when I worked at Google Search, it was genuinely true that the primary thing people were rewarded for was making search useful for users. The ads team of course has the separate purpose of making money by getting people to click on the ads it places on the results.
Similarly, I’m pretty sure no police department rewards police officers for shooting innocent people. However I assume they do reward people for bringing down crime statistics in ways favorable to whoever is mayor, and I can imagine they may under-reward building good relationships with the community.
The purpose of a system definitely isn’t what it says it is, and it’s not fair to say all negative outcomes are intentional, but I think it is fair to judge a system by what it rewards - and what a system rewards is often actually quite legible.