Value theory is the study of what people care about. It’s the study of our goals, our tastes, our pleasures and pains, our fears and our ambitions.
That includes conventional morality. Value theory subsumes things we wish we cared about, or would care about if we were wiser and better people—not just things we already do care about.
Value theory also subsumes mundane, everyday values: art, food, sex, friendship, and everything else that gives life its affective valence. Going to the movies with your friend Sam can be something you value even if it’s not a moral value.
We find it useful to reflect upon and debate our values because how we act is not always how we wish we’d act. Our preferences can conflict with each other. We can desire to have a different set of desires. We can lack the will, the attention, or the insight needed to act the way we’d like to.
Humans do care about their actions’ consequences, but not consistently enough to formally qualify as agents with utility functions. That humans don’t act the way they wish they would is what we mean when we say “humans aren’t instrumentally rational.”
Theory and Practice
Adding to the difficulty, there exists a gulf between how we think we wish we’d act, and how we actually wish we’d act.
Philosophers disagree wildly about what we want—as do psychologists, and as do politicians—and about what we ought to want. They disagree even about what it means to “ought” to want something. The history of moral theory, and the history of human efforts at coordination, is piled high with the corpses of failed Guiding Principles to True Ultimate No-Really-This-Time-I-Mean-It Normativity.
If you’re trying to come up with a reliable and pragmatically useful specification of your goals—not just for winning philosophy debates, but (say) for designing safe autonomous adaptive AI, or for building functional institutions and organizations, or for making it easier to decide which charity to donate to, or for figuring out what virtues you should be cultivating—humanity’s track record with value theory does not bode well for you.
Mere Goodness collects three sequences of blog posts on human value: “Fake Preferences” (on failed attempts at theories of value), “Value Theory” (on obstacles to developing a new theory, and some intuitively desirable features of such a theory), and “Quantified Humanism” (on the tricky question of how we should apply such theories to our ordinary moral intuitions and decision-making).
The last of these topics is the most important. The cash value of a normative theory is how well it translates into normative practice. Acquiring a deeper and fuller understanding of your values should make you better at actually fulfilling them. At a bare minimum, your theory shouldn’t get in the way of your practice. What good would it be, then, to know what’s good?
Reconciling this art of applied ethics (and applied aesthetics, and applied economics, and applied psychology) with our best available data and theories often comes down to the question of when we should trust our snap judgments, and when we should ditch them.
In many cases, our explicit models of what we care about are so flimsy or impractical that we’re better off trusting our vague initial impulses. In many other cases, we can do better with a more informed and systematic approach. There is no catch-all answer. We will just have to scrutinize examples and try to notice the different warning signs for “sophisticated theories tend to fail here” and “naive feelings tend to fail here.”
Journey and Destination
A recurring theme in the pages to come will be the question: Where shall we go? What outcomes are actually valuable?
To address this question, Yudkowsky coined the term “fun theory.” Fun theory is the attempt to figure out what our ideal vision of the future would look like—not just the system of government or moral code we’d ideally live under, but the kinds of adventures we’d ideally go on, the kinds of music we’d ideally compose, and everything else we ultimately want out of life.
Stretched into the future, questions of fun theory intersect with questions of transhumanism, the view that we can radically improve the human condition if we make enough scientific and social progress. Transhumanism occasions a number of debates in moral philosophy, such as whether the best long-term outcomes for sentient life would be based on hedonism (the pursuit of pleasure) or on more complex notions of eudaimonia (general well-being). Other futurist ideas discussed at various points in Rationality: From AI to Zombies include cryonics (storing your body in a frozen state after death, in case future medical technology finds a way to revive you), mind uploading (implementing human minds in synthetic hardware), and large-scale space colonization.
Perhaps surprisingly, fun theory is one of the more neglected applications of value theory. Utopia-planning has become rather passé, partly because it smacks of naiveté and partly because we’re empirically terrible at translating utopias into realities. Even the word utopia reflects this cynicism; it is derived from the Greek for “non-place.”
Yet if we give up on the quest for a true, feasible utopia (or eutopia, “good place”), it’s not obvious that the cumulative effect of our short-term pursuit of goals will be a future we find valuable over the long term. Value is not an inevitable feature of the world. Creating it takes work. Preserving it takes work.
This invites a second question: How shall we get there? What is the relationship between good ends and good means?
When we play a game, we want to enjoy the process. We don’t generally want to just skip ahead to being declared the winner. Sometimes, the journey matters more than the destination. Sometimes, the journey is all that matters.
Yet there are other cases where the reverse is true. Sometimes the end-state is just too important for “the journey” to factor into our decisions. If you’re trying to save a family member’s life, it’s not necessarily a bad thing to get some enjoyment out of the process; but if you can increase your odds of success in a big way by picking a less enjoyable strategy . . .
In many cases, our values are concentrated in the outcomes of our actions, and in our future. We care about the way the world will end up looking, especially those parts of the world that can love and hurt and want.
How do detached, abstract theories stack up against vivid, affect-laden feelings in those cases? More generally: What is the moral relationship between actions and consequences?
Those are hard questions, but perhaps we can at least make progress on determining what we mean by them. What are we building into our concept of what’s “valuable” at the very start of our inquiry?