Economic AI Safety

by jsteinhardt, 16th Sep 2021



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Crossposted from my blog, Bounded Regret.

There is a growing fear that algorithmic recommender systems, such as those run by Facebook, YouTube, Netflix, and Amazon, are having negative effects on society, for instance by manipulating users into behaviors that they wouldn’t endorse (e.g. getting them addicted to feeds, leading them to form polarized opinions, recommending false but convincing content).

Some common responses to this fear are to advocate for privacy or to ask that users have greater agency over what they see. I argue below that neither of these will solve the problem, and that the problem may get far worse in the future. I then speculate on alternative solutions based on audits and “information marketplaces”, and discuss their limitations.

Existing discourse often mistakenly focuses on individual decisions made by individual users. For instance, it is often argued that if a company is using private data to manipulate a user, that user should have the right to ban the company’s use of the data, thereby stopping the manipulation. The problem is that the user benefits greatly from the company using their private data: Facebook’s recommendations, without use of private information, would be mostly useless. You can see this for yourself by visiting YouTube in an incognito tab. Most users will not be willing to use severely hobbled products for the sake of privacy (and note that privacy is not itself a guaranteed defense against manipulation).

A second proposal is to provide users with greater agency. If instead of passively accepting recommendations, we can control what the algorithm shows us (perhaps by changing settings, or by having recommendations themselves come with alternatives), then we can eventually bend it to satisfy our long-term preferences. The problems here are two-fold. First, even given the option, users rarely customize their experience; a paper on Netflix’s recommender system asserts:

“Good businesses pay attention to what their customers have to say. But what customers ask for (as much choice as possible, comprehensive search and navigation tools, and more) and what actually works (a few compelling choices simply presented) are very different.”

Thus to the extent that user agency is helpful, it would be from providing information about a small and atypical subset of power users, which must then be extrapolated to other users. In addition, even those power users are at the mercy of information extracted from other sources. This is because while, in theory, an algorithm could completely personalize itself to a user’s whims, this would take an infeasibly long time without some external prior. If you’ve shown me Firefly and Star Trek, how will you guess whether I like Star Wars without e.g. information from other users or from movie reviews? Thus, while we can provide users with choices, most of the decisions--determining the user’s choice set--have already been made implicitly at the outset.
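To make the point about external priors concrete, here is a minimal sketch in Python. The ratings, the user names, and the similarity measure are all made up for illustration and are not taken from any real system. It guesses my rating for Star Wars from what similar users thought, and it has nothing to say when no other users are available.

```python
# Hypothetical ratings from other users (1-5 scale), invented for illustration.
other_users = {
    "user_a": {"Firefly": 5, "Star Trek": 4, "Star Wars": 5},
    "user_b": {"Firefly": 4, "Star Trek": 5, "Star Wars": 4},
    "user_c": {"Firefly": 1, "Star Trek": 2, "Star Wars": 1},
}


def similarity(a, b):
    """A simple, assumed similarity: 1 / (1 + mean absolute rating gap)
    over the titles both users have rated. Returns 0.0 with no overlap."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    gap = sum(abs(a[t] - b[t]) for t in shared) / len(shared)
    return 1.0 / (1.0 + gap)


def predict(target, title, others):
    """Similarity-weighted average of other users' ratings for `title`.
    Returns None when there is no outside information to draw on."""
    num = 0.0
    den = 0.0
    for other in others.values():
        if title in other:
            sim = similarity(target, other)
            num += sim * other[title]
            den += sim
    return num / den if den > 0 else None


# All the system has observed about me directly.
me = {"Firefly": 5, "Star Trek": 5}

with_prior = predict(me, "Star Wars", other_users)
without_prior = predict(me, "Star Wars", {})
```

With the other users' ratings available, the guess leans toward the users whose tastes resemble mine. With an empty set of other users, `predict` returns `None`: my own history alone does not determine what to show next.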

It helps to put these issues in the context of the rest of the economy, to understand which parts of this story are specific to recommender systems. In the economy at large, businesses offer products that they sell in stores or in online marketplaces. Users visit these stores and marketplaces and choose the items that most appeal to them relative to price. Businesses put some effort into getting their products into stores, into creating appealing packaging, and into spreading the word about how great their product is. These all implicitly affect customers’ choice sets, which in any case are restricted to the finite number of items seen in a store or on the first page or so of online results. At the same time, many default choices (such as clean drinking water) are made already by governments, at least in economically developed countries.

Under this system, while branding and marketing can influence user behavior, it is difficult for businesses to trick customers into buying clearly suboptimal products. Even if such a product gains market share, eventually news of a superior product will spread through word of mouth. There are exceptions when quality is hard to judge (as in the case of medical advice) or when negative effects are subtle or delayed (such as lead poisoning). Even in these cases, if a company is sufficiently shady then customers may decide to boycott it, although boycotts often occur on the basis of sketchy information and mood affiliation, so it isn’t clear how useful this mechanism is. Finally, some products may take advantage of psychological needs or weaknesses of customers, for instance helping them cope with sadness or anxiety in an unhealthy and self-perpetuating way (e.g. binge-watching TV episodes, eating tubs of ice cream, doing drugs). While competing products (gym memberships, meditation classes) can push in the other direction, in practice the more exploitative products seem to often win out.

Returning to algorithmic recommender systems, we can see that many of the properties we were worried about are already present in the economy at large. Specifically, most decisions are already made for customers, with a small finite choice set dictating the remaining options. Businesses do try to manipulate customers and in some cases these manipulations are successful.

There are a few differences, however. First, in a typical marketplace users can choose between different versions of the same item. This is less true for recommender systems---there is only one Facebook, and collectively a small number of companies are in charge of most social media. While my choices of what to click on influence Facebook’s recommendations, I have no obvious recourse if the recommendations remain persistently misaligned with my preferences. A second, complementary issue is that Facebook’s business model is not to make me happy, but to produce value for advertisers. This exacerbates the lack of recourse from outside options. Even for a company that is trying to produce value for users, outside options are important in case the company’s assumptions on what generates value are wrong; but when the company is not even trying to produce value, outside options are crucial.

This points to one potential solution, which is to create more competition among recommender systems. If many products could generate alternative versions of the Facebook feed, allowing users to choose among them, then Facebook would have to produce a product that users wanted more than those alternatives. Even if its business model remained ad-based, it would have to compete with other services that, for instance, offered a monthly subscription fee in exchange for a higher quality feed. (I'm ignoring the many obstacles to this--since recommender systems benefit from network effects, you would probably have to enforce compatibility or data sharing across different recommender systems to create actual competition, which is an important but nontrivial problem.)

While competition would help, it wouldn’t solve the problem entirely. On the positive side, products would succeed by convincing people to use and pay money for them, and would not survive in the long run if they eschewed obvious and accessible improvements. But the effects of recommender systems, like those of medical advice, are difficult to fully ascertain. They could induce the psychological equivalent of lead poisoning and it would take a long time to identify this. This is particularly worrying for recommender systems that affect our information diet, which itself strongly affects our choice sets. It will be even more worrying when algorithmic optimizers affect our daily environment, as is beginning to be the case with services like Alexa and Nest.

Our environment is the strongest determinant of our choice sets, and so misaligned optimization of our environment may be difficult to undo. In the short run, this likely won’t be an (apparent) concern: the immediate effect of optimized environments is that most people’s environments will become substantially better. Perhaps this will also be true in the long run: environments will be better, and optimizers won’t learn to adversarially manipulate them. However, given the ease of using environments to manipulate decisions, I don’t see what existing mechanisms would prevent such manipulation from happening.

Here’s one attempt at designing a mechanism. To recap, the problem is that people have a difficult time understanding how algorithmic optimizers affect their decisions (and so can’t provide a negative reward signal in response to being manipulated). But people certainly want to understand this, so there should be a market demand for “auditors” that examine these systems and report undesirable effects to users. So perhaps we should seek to create this market?

However, I’m not sure most users could understand these audits, or distinguish between trustworthy and untrustworthy auditors. At least today, most people seem confused about what exactly is wrong with recommender systems, and news articles--arguably a weak form of auditing--often contribute to that confusion. Is there any robust way of incentivizing useful audits? Has this ever worked out in other industries, such as medicine or food safety? It’s unclear to me. I think we want some sort of information market, consisting of both auditors and counterauditors (who expose issues with the auditors), and to think carefully about how to design incentives that converge to truthful outcomes.

In conclusion, we are running towards a future in which more and more of our choice sets will be subject to strong optimization forces. Perhaps robust agency within those choice sets will offer a way out, but we should keep in mind that most of the action is elsewhere. Optimizing these other parts--our environment and our information diet--could lead to great good, but could also lead to irreversible manipulation. None of the solutions currently discussed keep us safe from the latter, and more work is needed.



3 comments

Thanks very much, I really liked reading this essay. I concur with your arguments about why more optionality and privacy don’t solve the problem. I also came up with the idea of more competition. That sounds like the sort of solution the market is good at, but I can’t think of a schema for getting around the network effects that you talk about.

(Actually, I just thought of one. I think that if the recommender systems were open so that anyone could write an algorithm, that could lead to pretty good competition. I’d be excited to see alt FB or Twitter algorithms. That said, it sounds like more optionality, i.e. most people wouldn’t use it. So not sure.)

The auditing is an idea I hadn’t heard before. It reminds me of what Zvi Mowshowitz did for me when he “audited” the FB algorithm ( ). That was very helpful. I’d love to see more work like that.

If there were a feasible way to make the algorithm open, I think that would be good (of course FB would probably strongly oppose this). As you say, people wouldn't directly design / early adopt new algorithms, but once early adopters found an alternative algorithm that they really liked, word of mouth would lead many more people to adopt it. So I think you could eventually get widespread change this way.

You could have a meta-recommender system that aggregates recommendations from multiple algorithms, and shows which algorithm each recommendation came from. By default, when the user reinforces a recommendation's algorithm, the meta-recommender system's algorithm would also be shifted towards the reinforced approach.
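One way this meta-recommender might look is sketched below in Python. Everything in it is an assumption made for illustration: the class name, the rank-discounted scoring, the multiplicative weight update, and the two toy feed algorithms. It is not a description of any existing system.

```python
from dataclasses import dataclass, field


@dataclass
class MetaRecommender:
    """Mixes several recommenders and keeps one weight per algorithm."""
    # Maps an algorithm name to a function: user -> ranked list of item ids.
    algorithms: dict
    learning_rate: float = 0.5  # hypothetical step size for weight updates
    weights: dict = field(default_factory=dict)

    def __post_init__(self):
        # Start with equal weight on every algorithm.
        n = len(self.algorithms)
        self.weights = {name: 1.0 / n for name in self.algorithms}

    def recommend(self, user, k=3):
        """Return up to k (item, source_algorithm, score) triples, so the
        user can see which algorithm each recommendation came from."""
        best = {}
        for name, algo in self.algorithms.items():
            for rank, item in enumerate(algo(user)):
                # Assumed scoring: algorithm weight discounted by rank.
                score = self.weights[name] / (rank + 1)
                if item not in best or score > best[item][1]:
                    best[item] = (name, score)
        ranked = sorted(best.items(), key=lambda kv: -kv[1][1])
        return [(item, name, score) for item, (name, score) in ranked[:k]]

    def reinforce(self, source, liked=True):
        """Shift weight toward (or away from) the algorithm that produced
        a recommendation, then renormalize so the weights sum to 1."""
        factor = 1 + self.learning_rate if liked else 1 - self.learning_rate
        self.weights[source] *= factor
        total = sum(self.weights.values())
        for name in self.weights:
            self.weights[name] /= total


# Two toy feed algorithms, invented for this example.
def chronological(user):
    return ["post_a", "post_b", "post_c"]


def engagement(user):
    return ["post_x", "post_a", "post_y"]


meta = MetaRecommender({"chronological": chronological, "engagement": engagement})
feed = meta.recommend("alice")
# The user approves of an item attributed to the chronological feed.
meta.reinforce("chronological", liked=True)
```

After the call to `reinforce`, the chronological algorithm carries more weight than the engagement one, so later calls to `recommend` rank its items higher. A multiplicative update is only one choice here; any rule that moves weight toward the reinforced algorithm would fit the description above.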