But under these assumptions, combining evidence always gives the right answer. Compare with the example in the post: "vote on a, vote on b, vote on a^b" which just seems strange. Shouldn't we try to use methods that give right answers to simple questions?
a) "Everyone does Bayesian updating according to the same hypothesis set, model, and measurement methods" strikes me as an extremely strong assumption, especially since we do not have strong theory that tells us the "right" way to select these hypothesis sets, models, and measurement instruments. I would argue that this makes Aumann agreement essentially useless in "open world" scenarios.
b) Why should uniquely consistent aggregation methods exist at all? A long line of folks including Condorcet, Arrow, Sen and Parfit have pointed out that when you start aggregating beliefs, utility, or preferences, there do not exist methods that always give unambiguously "correct" answers.
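To make the Condorcet point concrete, here is a minimal sketch in Python. The three ballots are made up for illustration, and `majority_prefers` is a hypothetical helper, not any standard library function. Each voter has a perfectly consistent ranking, yet pairwise majority voting produces a cycle, so there is no unambiguous "group preference" to recover.

```python
# Three hypothetical voters, each with a consistent ranking (best first).
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]
candidates = ["A", "B", "C"]


def majority_prefers(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    votes_for_x = sum(1 for b in ballots if b.index(x) < b.index(y))
    return votes_for_x > len(ballots) / 2


# Pairwise majority result for every ordered pair of candidates.
pairwise = {
    (x, y): majority_prefers(x, y, ballots)
    for x in candidates
    for y in candidates
    if x != y
}

# A beats B, B beats C, and C beats A: the majority preference is cyclic.
has_cycle = pairwise[("A", "B")] and pairwise[("B", "C")] and pairwise[("C", "A")]
print(pairwise)
print("cycle:", has_cycle)
```

Any rule that picks a winner here has to break the cycle somehow, and different tie-breaking rules favor different voters.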
I think if you have a set of coefficients for comparing different people's utilities (maybe derived by looking into their brains and measuring how much fun they feel), then that linear combination of utilities is almost tautologically the right solution.
Sure, but finding the set of coefficients for comparing different people's utilities is a hard problem in AI alignment, or political economy generally. Not only are there tremendous normative uncertainties here ("how much inequality is too much?") but the problem of combining utilities is a minefield of paradoxes even if you are just summing or averaging.
You can peek into everyone's heads, gather all the evidence, remove double-counting, and perform a joint update. That's basically what Aumann agreement does - it doesn't vote on beliefs, but instead tries to reach an end state that's updated on all the evidence behind these beliefs.
Right, this is where strong Bayesianism is required. You have to assume, for example, that everyone agrees on the set of hypotheses under consideration and the exact models to be used. This is not just an abstract plan for slicing the universe into manageable events, but the actual structure and properties of the measurement instruments that generate "evidence." If we wish to act as well we also have to specify the set of possible interventions and their expected outcomes. These choices are well outside the scope of a Bayesian update (see e.g. Gelman and Shalizi or John Norton).
Also, I do not have a super-intelligent AI. I'm working on narrow AI alignment, and many of these systems have social choice problems too, for example recommender systems.
Imagine that after doing the joint update, the agents agree to cooperate instead of fighting, and have a set of possible joint policies. Each joint policy leads to a tuple of expected utilities for all agents. The resulting set of points in N-dimensional space has a Pareto frontier.
The Pareto frontier is a very weak constraint, and lots of points on it are bad. For a self-driving car that wants to drive both quickly and safely, both not moving at all and driving as fast as possible are on the frontier. For a distribution of wealth problem, "one person gets everything" is on the frontier. The hard problem is choosing between points on the frontier, that is, trading off one person's utility against another. There is a long tradition of work within political economy which considers this problem in detail. It is, of course, partly a normative question, which is why norm-generation processes like voting are relevant.
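A small sketch of how weak the constraint is, assuming a toy two-person wealth split with invented numbers (`pareto_frontier` is a hypothetical helper written for this example). The extreme allocations survive the filter just as well as the even one.

```python
def pareto_frontier(points):
    """Return the points that no other point dominates."""

    def dominates(p, q):
        # p dominates q if p is at least as good for everyone
        # and strictly better for at least one person.
        at_least_as_good = all(a >= b for a, b in zip(p, q))
        strictly_better = any(a > b for a, b in zip(p, q))
        return at_least_as_good and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (utility of person 1, utility of person 2) outcomes.
allocations = [(10, 0), (7, 3), (5, 5), (3, 7), (0, 10), (4, 4), (2, 6)]
frontier = pareto_frontier(allocations)
print(frontier)
```

Only the wasteful outcomes (4, 4) and (2, 6) are ruled out. Choosing among the remaining five is the actual problem, and the frontier says nothing about it.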
Aumann agreement isn't an answer here, unless you assume strong Bayesianism, which I would advise against.
I have to say I don't know why a linear combination of utility functions could be considered ideal. There are some pretty classic arguments against it, such as Rawls' maximin principle, and more consequentialist arguments against allowing inequality in practice.
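As a toy illustration of the disagreement, assuming two invented policies and equal weights (all names and numbers here are hypothetical), a weighted sum and a Rawlsian maximin rule pick different winners from the same utilities:

```python
# Hypothetical policies mapped to (utility of person 1, utility of person 2).
policies = {"unequal": (9, 1), "equal": (4, 4)}

# Assumed interpersonal comparison coefficients; equal weights for simplicity.
weights = (1.0, 1.0)


def weighted_sum(utilities):
    """Linear combination of individual utilities."""
    return sum(w * u for w, u in zip(weights, utilities))


# Linear aggregation maximizes the weighted total.
utilitarian_choice = max(policies, key=lambda name: weighted_sum(policies[name]))

# Maximin maximizes the utility of the worst-off person.
maximin_choice = max(policies, key=lambda name: min(policies[name]))

print("weighted sum picks:", utilitarian_choice)
print("maximin picks:", maximin_choice)
```

Both rules are principled, and both outcomes are Pareto-optimal relative to each other, so the choice between them is a normative one rather than something the utilities settle.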
If you liked this post, you will love Amartya Sen's Collective Choice and Social Welfare. Originally written in 1970 and expanded in 2017, this is a thorough development of the many paradoxes in collective choice algorithms (voting schemes, ways to aggregate individual utility, and so on.)
My sense is the AI alignment community has not taken these sorts of results seriously. Preference aggregation is non-trivial, so "aligning" an AI to individual preferences means something much different than "aligning" an AI to societal preferences. Different equally-principled ways of aggregating preferences will give different results, which means that someone somewhere will not get what they want. Hence an AI agent will always have some type of politics if only by virtue of its preference aggregation method, and we should be investigating which types we prefer.
I thought Incomplete Contracting and AI Alignment addressed this situation nicely:
Building AI that can reliably learn, predict, and respond to a human community’s normative structure is a distinct research program to building AI that can learn human preferences. Preferences are a formalization of a human’s subjective evaluation of two alternative actions or objects. The unit of analysis is the individual. The unit of analysis for predicting the content of a community’s normative structure is an aggregate: the equilibrium or otherwise durable patterns of behavior in a group. [...] The object of interest is what emerges from the interaction of multiple individuals, and will not be reducible to preferences. Indeed, to the extent that preferences merely capture the valuation an agent places on different courses of action with normative salience to a group, preferences are the outcome of the process of evaluating likely community responses and choosing actions on that basis, not a primitive of choice.
How do we make a choice about the "right" politics/preference aggregation method for an AI? I don't think there is or can be an a-priori answer here, so we need something else to break the tie. One strategy is to ask what the consequences of each type of political system will be in the actual world, rather than an abstract behind-the-veil scenario. But more fundamentally I don't know that we can do better than what humans have always done, which is group discussions with the intention of coming to a workable agreement. Perhaps an AI agent can and should participate in such discussions. It's not just the formal process that makes voting systems work, but the perceived legitimacy and therefore good-faith participation of the people who will be governed by it, and this is what such discussion creates.
So I was very surprised when I learned that a single general method in deep learning (training an artificial neural network on massive amounts of data using gradient descent) led to performance comparable or superior to humans’ in tasks as disparate as image classification, speech synthesis, and playing Go. I found superhuman Go performance particularly surprising—intuitive judgments of Go boards encode distillations of high-level strategic reasoning, and are highly sensitive to small changes in input.
I think it may be important to recognize that AlphaGo (and AlphaZero) use more than deep learning to solve Go. They also use tree search, which is ideally suited to strategic reasoning. Neural networks, on the other hand, are famously bad at symbolic reasoning tasks, which may ultimately have some basis in the fact that probability does not extend logic.
We could look at donors' public materials, for example evaluation requirements listed in grant applications. We could examine the programs of conferences or workshops on philanthropy and see how often this topic is discussed. We could investigate the reports and research literature on this topic. But I don't know how to define enough concern.
While Bayesian statistics are obviously a useful method, I am dissatisfied with the way "Bayesianism" has become a stand-in for rationality in certain communities. There are well-developed, deep objections to this. Some of my favorite references on this topic:
A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory.
Finally, I'd note that explicitly Bayesian calculation is rarely used as the top level inference framework in practical decision-making, even when the stakes are high. I worked for a decade as a data journalist, and you'd think that if Bayesianism is useful anywhere then data journalists would use it to infer the truth of situations. But it is very rarely useful in practice. Nor is Bayesianism the primary method used in e.g. forecasting and policy evaluation. I think it's quite instructive to ask why not, and I wish there was more serious work on this topic.
In short: Bayesianism is certainly foundational, but it is not a suitable basis for a general theory of rational action. It fails on both theoretical and practical levels.
My sense is that donors do care about evaluation, on the whole. It's not just GiveWell / Open Philanthropy / EA who think about this :P
See for example https://www.rockpa.org/guide/assessing-impact/
Well said. And this middle ground is exactly what I am worried about losing as companies add more AI to their operations -- human managers can and do make many subtle choices that trade profit against other values, but naive algorithmic profit maximization will not. This is why my research is on metrics that may help align commercial AI to pro-social outcomes.
Because central planning is so out of fashion, we have mostly forgotten how to do it well. Yet there are little known historical methods that could be applicable in the current crisis, such as input-output analysis, as Steve Keen writes:
One key tool that has fallen out of use in economics is input-output analysis. First developed by the non-orthodox economist Wassily Leontief (Leontief 1949; Leontief 1974), it used matrix mathematics to quantify the dependence of the production of one commodity on inputs of other commodities. Given its superficial similarity to Walras’ model of tatonnement, it was rapidly adopted by Neoclassical economists, and it became an integral part of mainstream macroeconomics back when “general equilibrium” meant “equilibrium of all the markets in a multi-market economy at one point in time”. It was known as CGE: “Computable General Equilibrium”. When I did my PhD on modelling Minsky’s “Financial Instability Hypothesis” (Keen 1995), virtually every other Economics PhD student at the University of New South Wales was building a CGE model of his or her national economy.
If economists were still skilled today in CGE analysis, then they could easily have answered some vital questions for policymakers during the Coronavirus crisis. The most pressing economic question is, “if we identify which products are critical, and the level of output needed to sustain the population during 4-8 weeks of lockdown, can you tell us which other products are critical to maintaining that output, and how many workers are needed?”
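For readers who have not seen it, the basic Leontief quantity calculation behind that question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a two-sector economy, with invented input coefficients, final demand, and labor requirements, and it is not a full CGE model. Gross output x must satisfy x = A x + d, where A[i][j] is the amount of good i used up in producing one unit of good j and d is the final demand to be sustained.

```python
# Hypothetical 2-sector economy. A[i][j] = units of good i consumed
# to produce one unit of good j (made-up coefficients).
A = [[0.2, 0.3],
     [0.4, 0.1]]

# Assumed final demand needed to sustain the population, per sector.
final_demand = [10.0, 20.0]

# Assumed workers needed per unit of gross output, per sector.
labor_per_unit = [0.5, 0.2]


def gross_output(A, d, iterations=200):
    """Solve x = A x + d by fixed-point iteration.

    This converges when the economy is productive, i.e. each round of
    intermediate demand is smaller than the last.
    """
    n = len(d)
    x = list(d)
    for _ in range(iterations):
        x = [d[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


x = gross_output(A, final_demand)
workers = sum(l * xi for l, xi in zip(labor_per_unit, x))
print("gross output per sector:", x)
print("total workers needed:", workers)
```

With these numbers, meeting a final demand of 10 and 20 units requires gross output of roughly 25 and 33.3 units once intermediate inputs are counted, which is the sense in which a sector can be "critical" even if nobody consumes its product directly.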