Deminatalist Total Utilitarianism

by Vanessa Kosoy, 16th Apr 2020

TLDR: I propose a system of population ethics that arguably solves all major paradoxes, and formalizes some intuitions that prior systems have not. AFAIK this is original but I'm not an expert so maybe it's not.

This idea was inspired by a discussion in the "EA Corner" discord server, and specifically by a different proposal by discord user LeoTal.

Foreword: The Role of Utilitarianism

I don't believe utilitarianism is literally the correct system of ethics (I do endorse consequentialism). Human values are complex and no simple mathematical formula can capture them. Moreover, ethics is subjective and might differ between cultures and between individuals.

However, I think it is useful to find simple approximate models of ethics, for two reasons.

First, my perspective is that ethics is just another name for someone's preferences, or a certain subset of someone's preferences. The source of preferences is ultimately intuition. However, intuition only applies to the familiar. You know that you prefer strawberries to lemons, just because. This preference is obvious enough to require no analysis. But, when you encounter the unfamiliar, intuition can fail. Is it better to cure a village of malaria or build a new school where there is none? Is it better to save one human or 1000 dogs? Can a computer simulation be worthy of moral consideration? What if it's homomorphically encrypted? Who knows?

In order to extrapolate your intuition from the familiar to the unfamiliar, you need models. You need to find an explicit, verbal representation that matches your intuition in the familiar cases, and that can be unambiguously applied to the unfamiliar case. And here you're justified in applying Occam's razor, despite the complexity of values, as long as you don't shave away too much.

Second, in order to cooperate and coordinate effectively we need to make our preferences explicit to each other and find a common ground we can agree on. I can make private choices based on intuition alone, but if I want to convince someone or we want to decide together which charity to support, we need something that can be communicated, analyzed and debated.

This is why I think questions like population ethics are important: not as a quest to find the One True Formula of morality, but as a tool for decision making in situations that are unintuitive and/or require cooperation.

Motivation

The system I propose, deminatalist total utilitarianism (DNT), has the following attractive properties:

  • It avoids the repugnant conclusion to which regular total utilitarianism falls prey, at least the way it is usually pictured.
  • It avoids the many problems of average utilitarianism: the incentive to kill people of below-average happiness, the incentive to create people of negative happiness (that want to die) when the average happiness is negative, the sadistic conclusion and the non-locality (good and evil here depends on moral patients in the Andromeda galaxy).
  • It avoids the problem with both totalism and averagism that killing a person and creating a different person with equal happiness is morally neutral.
  • It captures the intuition many people have that the bar for when it's good to create a person is higher than the bar for when it's good not to kill one.
  • It captures the intuition some people have that they don't want to die but they would rather not have been born.
  • It captures the intuition some people have that sometimes living too long is bad (my dear transhumanist comrades, please wait before going for rotten tomatoes).

Formalism

I am going to ignore issues of time discounting and spatial horizons. In an infinite universe, you need some such cutoff, or your utilitarian formulas make no sense. However, this is, to first approximation, orthogonal to population ethics (i.e. the proper way to aggregate between individuals). If you wish, you can imagine everything constrained to your future light-cone with exponential time discount.

I will say "people" when I actually mean "moral patients". This can include animals (and does include some animals, in my opinion).

The total utility of a universe is a sum over all people that ever lived or will live, like in vanilla totalism. In vanilla totalism, the contribution of each person is

$$U_{\text{vanilla}} = \int_{t_{\text{birth}}}^{t_{\text{death}}} h(t)\,dt$$

where $t_{\text{birth}}$ is the time of birth, $t_{\text{death}}$ is the time of death, and $h(t)$ is happiness at time $t$ (for now we think of it as hedonistic utilitarianism, but I propose a preference utilitarianism interpretation later).

On the other hand, in DNT the contribution of each person is

$$U_{\text{DNT}} = -u_0 + \int_{t_{\text{birth}}}^{t_{\text{death}}} \left( h(t) - h_0 \left( 1 - e^{-\frac{t - t_{\text{birth}}}{\tau_0}} \right) \right) dt$$

  • $\tau_0$ is a constant with dimensions of time that should probably be around typical natural lifespan (at least in the case of humans).
  • $h_0$ is a constant with dimensions of happiness, roughly corresponding to the minimal happiness of a person glad to have been born (presumably a higher bar than not wanting to die).
  • $u_0$ is a constant with dimensions of utility that it's natural (but not obviously necessary) to let equal $h_0 \tau_0$.

Of course the function $1 - e^{-t/\tau_0}$ was chosen merely for the sake of simplicity; we can use a different function instead as long as it is monotonically increasing from $0$ at $t = 0$ to $1$ at $t = +\infty$ on a timescale of order $\tau_0$.
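To make the formula concrete, here is a minimal Python sketch that evaluates one person's contribution numerically. It assumes the DNT contribution has the form -u0 + ∫ (h(t) - h0·(1 - exp(-(t - t_birth)/τ0))) dt and measures time as age, so that t_birth = 0. The function names, the midpoint-rule integration and the default values (h0 = 1, τ0 = 80 years, u0 = h0·τ0) are illustrative assumptions, not part of the proposal.

```python
import math
from typing import Callable, Optional


def vanilla_contribution(h: Callable[[float], float], lifespan: float,
                         steps: int = 10_000) -> float:
    """Vanilla totalism: integral of happiness h(age) over the whole life."""
    dt = lifespan / steps
    # Midpoint rule over [0, lifespan].
    return sum(h((i + 0.5) * dt) for i in range(steps)) * dt


def dnt_contribution(h: Callable[[float], float], lifespan: float,
                     h0: float = 1.0, tau0: float = 80.0,
                     u0: Optional[float] = None, steps: int = 10_000) -> float:
    """DNT: birth penalty -u0 plus happiness minus an age-dependent penalty."""
    if u0 is None:
        u0 = h0 * tau0  # the "natural" choice mentioned in the text
    dt = lifespan / steps
    total = 0.0
    for i in range(steps):
        age = (i + 0.5) * dt
        # The penalty rises from 0 at birth towards h0 on a timescale tau0.
        age_penalty = h0 * (1.0 - math.exp(-age / tau0))
        total += (h(age) - age_penalty) * dt
    return -u0 + total


if __name__ == "__main__":
    steady = lambda age: 1.5  # hypothetical constant happiness, above h0
    print("vanilla:", vanilla_contribution(steady, 80.0))
    print("DNT:    ", dnt_contribution(steady, 80.0))
```

Passing a non-constant `h` (for example one that declines with age) works the same way, since the integrand is evaluated pointwise.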

Analysis

For a person of constant happiness $h$ and lifespan $\tau$, we have

$$U_{\text{DNT}} = -u_0 + (h - h_0)\tau + h_0 \tau_0 \left( 1 - e^{-\frac{\tau}{\tau_0}} \right)$$

It is best to live forever when $h \geq h_0$, it is best to die immediately when $h < 0$, and in between it is best to live a lifespan of

$$\tau_{\text{opt}} = \tau_0 \ln \frac{1}{1 - \frac{h}{h_0}}$$

We can imagine the person in the intermediate case becoming "tired of life". Eir life is not good. It is not so bad as to warrant an earlier suicide, but there is only so much of it ey can take. One could argue that this should already be factored into "happiness", but well, it's not like I actually defined what happiness is. More seriously, perhaps rather than happiness it is better to think of $h$ as the "quality of life". Under this interpretation, the meaning of the second correction in DNT is making explicit a relationship between quality of life and happiness.
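The three regimes can be checked with a short sketch. It assumes the constant-happiness closed form U = -u0 + (h - h0)·τ + h0·τ0·(1 - exp(-τ/τ0)), whose derivative in τ vanishes at τ = τ0·ln(1 / (1 - h/h0)). Function names and default constants are illustrative assumptions.

```python
import math


def dnt_constant(h: float, tau: float, h0: float = 1.0,
                 tau0: float = 80.0, u0: float = 80.0) -> float:
    """Closed-form DNT contribution for constant happiness h and finite lifespan tau."""
    return -u0 + (h - h0) * tau + h0 * tau0 * (1.0 - math.exp(-tau / tau0))


def optimal_lifespan(h: float, h0: float = 1.0, tau0: float = 80.0) -> float:
    """Lifespan that maximizes the contribution for constant happiness h."""
    if h <= 0.0:
        return 0.0       # every extra moment lowers the total
    if h >= h0:
        return math.inf  # the integrand never turns negative
    # Intermediate case: stop when h equals the age-dependent penalty.
    return tau0 * math.log(1.0 / (1.0 - h / h0))


if __name__ == "__main__":
    for h in (-0.5, 0.25, 0.5, 0.9, 1.2):
        print(f"h = {h:+.2f}  optimal lifespan = {optimal_lifespan(h)}")
```

Note that the optimal lifespan grows without bound as h approaches h0 from below, so the "tired of life" regime joins continuously onto the "live forever" regime.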

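Creating a new person is good if and only if $U_{\text{DNT}} > 0$, that is

$$(h - h_0)\tau + h_0 \tau_0 \left( 1 - e^{-\frac{\tau}{\tau_0}} \right) > u_0$$

Creating a new immortal person is good when $h > h_0$ and bad when $h < h_0$. Assuming $u_0 \geq h_0 \tau_0$, creating a person of happiness below $h_0$ is bad even if ey have optimal lifespan. Lower values of $u_0$ produce lower thresholds (there is no closed formula).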
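Since there is no closed formula for the threshold when u0 is below h0·τ0, it can be found numerically. The sketch below is one way to do that, assuming constant happiness, the closed form U = -u0 + (h - h0)·τ + h0·τ0·(1 - exp(-τ/τ0)), and a person who lives the lifespan that maximizes it. The bisection approach, the function names and the sample constants are illustrative assumptions.

```python
import math
from typing import Optional


def best_utility(h: float, h0: float, tau0: float, u0: float) -> float:
    """DNT contribution for constant happiness h at the utility-maximizing lifespan."""
    if h <= 0.0:
        return -u0  # best to die immediately: only the birth penalty remains
    if h >= h0:
        # Best to live forever; the total diverges unless h equals h0 exactly.
        return math.inf if h > h0 else h0 * tau0 - u0
    tau = tau0 * math.log(h0 / (h0 - h))
    return -u0 + (h - h0) * tau + h0 * tau0 * (1.0 - math.exp(-tau / tau0))


def creation_threshold(h0: float = 1.0, tau0: float = 80.0,
                       u0: Optional[float] = None, iterations: int = 100) -> float:
    """Smallest constant happiness at which creating a person is not bad."""
    if u0 is None:
        u0 = h0 * tau0
    if u0 >= h0 * tau0:
        return h0  # below h0 creation is bad even with optimal lifespan
    lo, hi = 0.0, h0
    # best_utility is increasing in h, so bisection on its sign is enough.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if best_utility(mid, h0, tau0, u0) > 0.0:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    for u0 in (80.0, 40.0, 20.0):
        print(f"u0 = {u0:5.1f}  threshold h = {creation_threshold(u0=u0):.4f}")
```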

DNT is a form of total utilitarianism, so we also get a form of the repugnant conclusion. For vanilla utilitarianism the repugnant conclusion is: for any given population, there is a better population in which every individual only barely prefers life over death. On the other hand, for DNT, the "repugnant" conclusion takes the form: for any given population, there is a better population in which every individual is only barely glad to have been born (but prefers life over death by a lot). This seems to me much more palatable.

Finally, again assuming $u_0 \geq h_0 \tau_0$, killing a person and replacing em by a person of equal happiness is always bad, regardless of the person's happiness. If $u_0 = h_0 \tau_0$ exactly, then the badness of it decreases to zero as the age of the victim during the switch goes to infinity. For larger $u_0$ it retains badness even in the limit.
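The replacement claim can be checked with a small calculation. Suppose a person of age a would live D more years, and compare that with killing em and creating a newborn of the same constant happiness who lives D years. Assuming the DNT form -u0 + ∫ (h - h0·(1 - exp(-age/τ0))) d(age), the happiness terms cancel and the change in total utility is -u0 + h0·τ0·(1 - exp(-a/τ0))·(1 - exp(-D/τ0)). The equal-horizon setup, the function name and the default constants are assumptions made for this sketch.

```python
import math
from typing import Optional


def replacement_gain(age: float, horizon: float, h0: float = 1.0,
                     tau0: float = 80.0, u0: Optional[float] = None) -> float:
    """Change in total utility from replacing a person of the given age.

    Both the victim (had ey lived) and the replacement are assumed to have the
    same constant happiness and to live `horizon` more years, so happiness
    itself cancels out of the comparison.
    """
    if u0 is None:
        u0 = h0 * tau0
    # The newborn pays the birth penalty u0 but avoids part of the age penalty.
    saved_age_penalty = (h0 * tau0
                         * (1.0 - math.exp(-age / tau0))
                         * (1.0 - math.exp(-horizon / tau0)))
    return -u0 + saved_age_penalty


if __name__ == "__main__":
    for age in (20.0, 80.0, 400.0):
        print(f"age {age:5.0f}: gain = {replacement_gain(age, math.inf):.3f}")
    print("larger u0:", replacement_gain(400.0, math.inf, u0=100.0))
```

With u0 = h0·τ0 the gain is negative for every finite age and approaches zero only as both the age and the horizon go to infinity; with a larger u0 it stays bounded away from zero.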

From Happiness to Preferences

I believe that preference utilitarianism is often a better model than hedonistic utilitarianism, when it comes to adults and "adult-like" moral patients (i.e. moral patients that can understand and explain eir own preferences). What about DNT? We can take the perspective that it corresponds to "vanilla" total preference utilitarianism, plus a particular model of human preferences.

Some Applications

So far, DNT made me somewhat more entrenched in my beliefs that

  • Astronomical waste is indeed astronomically bad, because of the size of future supercivilization. Of course, in averagism the argument still has weight because of the high quality and long lifespan of future civilization.

  • Factory farming is very bad. Although some may argue factory farmed animals have $h > 0$, it is much harder to argue they have $h > h_0$.

DNT made me somewhat update away from

  • The traditional transhumanist perspective that living forever is good unless life quality is extremely bad. Of course, I still believe living forever is good when life quality is genuinely good. (Forever, or at least very long: I don't think we can fully comprehend the consequences of immortality from our present perspective.)

  • The belief that the value of humanity so far has been net positive in terms of terminal values. I think a random person in the past had a rather miserable life, and "but ey didn't commit suicide" is no longer so convincing. However, I'm still pretty sure it is instrumentally net positive because of the potential of future utopia.

DNT might also be useful for thinking about abortions, although it leaves open the thorny question of when a fetus becomes a moral patient. It does confirm that abortions are usually good when performed before this moment.

49 comments

Can you give a narrative summary of this theory? I'm having trouble understanding what those constants actually mean WRT ethical decisions about creating and terminating lives, and especially in comparing lives (when is it better to destroy one life in order to create two different ones, and/or when is it better to reduce h for some time in a life in order to increase h in another (or to bring another into existence)).

What are you actually trying to model with the birth penalty (-u0 before the sum over lifetime), and with the weird (to me) assumption of decreasing individual utility over time? Something about joy in anticipation or reduction in fear of impending termination? Why isn't that already included in h(t)?

I'm also very unsure about the assertion that "happy to exist" and "prefer not to die now" is an important difference, unless you're trying to bring the past in. Someone who leads an unhappy life for N years, but expects a small number of happy years at the end strongly prefers not to die at the inflection point, but at the beginning may have preferred not to exist or to die then. At any point in time, "prefer to continue to live from this point" is equal to "happy to come into existence at this point", right?

The birth penalty fixes a lot of unintuitive products of the classic total uti. For example, if you treat every "new" person as catching up to the penalty (which can only be achieved if you at least live with minimal acceptable happiness for your entire life, aka h0), then killing a person and replacing him with someone of equal happiness is bad. This is because the penalty that was not yet caught up with in the killed person remains as a negative quantity in the total utility, a debt, if you will. In total uti, this doesn't apply and it logically follows that there's nothing wrong with killing a person and replacing him with a new person of equal happiness, which is unintuitive.

"I'm also very unsure about the assertion that "happy to exist" and "prefer not to die now" is an important difference [...]" - this is important because there are people who feel they are not happy with existence, and would rather not have been born at all, but don't want to die now that they do in fact exist. If you don't have this difference you can't capture that intuition. I'm not sure how the N unhappy years argument is relevant to this or how it renders the difference moot. In particular:

" "prefer to continue to live from this point" is equal to "happy to come into existence at this point" "

is in fact false, for a significant number of people.

Huh. I guess my intuitions are different enough that we're just going to disagree on this.

I don't think it's problematic to replace a being with an equally-happy one (presuming painless and not happiness-reducing in those around them). And I don't understand how one can prefer not to die _AND_ not be happier to exist than not.

If you don't think killing is in itself bad then you are not on par with the intuition of almost everybody. Legit.

I personally would rather never have been born but don't want to commit suicide. There are numerous reasons. Hurting the people who care about me (and wouldn't have if I was not born in the first place), fearing pain or the act of suicide itself, fearing death (both are emotional axioms that a lot of people have, there's no point in debating them rationally) and many others.

killing is in itself bad

To be clear, I didn't say anything about killing. I said "replace". This isn't possible with humans, but picture the emulation world, where an entity can be erased with no warning or sensation, and a fully-developed one can be created at will. Even then, practically it would be impermissible to do a same-value replacement, both due to uncertainty and for negative effects on other lives.

In the human world, OF COURSE killing (and more generally, dying) is bad. My point is that the badness is fully encoded in the reduction in h of the victim, and the reduced levels of h of those who survive the victim. It doesn't need to be double-counted with another term.

I personally would rather never have been born but don't want to commit suicide.

I'm extremely saddened to know this. And it makes me feel mean to stick to my theme of "already included in h, no need for another term". The fear of death, expectation of pain, and impact on others are _all_ differences in h which should not be double-counted.

Also, I very much hope that in a few years or decades, you'll look back and realize you were mistaken in wishing you hadn't been born, and are glad you persevered, and are overall glad you experienced life.

The "replace" in the original problem is ending one human and creating (in whatever way) another one. I don't think you understand the scenario.

In total uti (in the human world), it is okay to:

kill someone, provided that by doing so you bring into the world another human with the same happiness. For the sake of argument, let's assume happiness potential is genetically encoded. So if you kill someone, you can always say "that's ok guys, my wife just got pregnant with a fetus bearing the same genetic code as the guy I just murdered". In a model where all you do is sum up the happiness of every individual in the population, this is ok. In Vanessa's model it isn't, and what makes sure it isn't is the penalty.

" I'm extremely saddened to know this. And it makes me feel mean to stick to my theme of "already included in h, no need for another term". The fear of death, expectation of pain, and impact on others are _all_ differences in h which should not be double-counted."

It might be double counted, that's not what I was talking about when I said the model captures this intuition. The existence of h0 does that, it might be that other parts of the model do that as well (I don't think so though). Also, I'm always up for an intelligent discussion and you were not being mean :)

" Also, I very much hope that in a few years or decades, you'll look back and realize you were mistaken in wishing you hadn't been born, and are glad you persevered, and are overall glad you experienced life."

My prior for this is low, since I've been feeling this way for my entire adult life, but one can always hope. Plus, I've actually met and talked to many like-minded individuals so I wouldn't discount this intuition as "not worth capturing since it's just some small anomaly".

I just want to note here for readers that the following isn't correct (but you've already made a clarifying comment, so I realise you know this):

In total uti (in the human world), it is okay to:
kill someone, provided that by doing so you bring into the world another human with the same happiness.

Total uti only says this is ok if you leave everything else equal (in terms of total utility). In almost all natural situations you don't: killing someone influences the happiness of others too, generally negatively.

Is the intuition about killing someone and replacing them with someone who will experience equal total happiness assuming that killing someone directly causes a large drop in total happiness, but that the replacement only has total happiness equal to what the killed moral patient would have had without the killing?

Because my intuition is that if the first entity had expected future happiness of 100, but being killed changed that to -1000, their replacement, in order for them to result in 'equal happiness' must have expected future happiness of 1100, not 100. Intuitively, the more it sucks to be killed, the more benefit is required for it to be not wrong to kill someone.

Being killed doesn't change your expected happiness, knowing you will be killed does. That's different. If you want to separate variables properly think about someone being gunned down randomly with no earlier indication. Being killed just means ending you prematurely, and denying you the happiness you would have had were you alive. A good model will reflect why that's bad even if you replace the killed person with someone that would compensate for future loss in happiness.

Pragmatically speaking, killing people causes unhappiness because it hurts the people who lost them, but that is reflected in the happiness values of those individuals, and a good model will reflect that killing someone is bad even if no one knows about it.

Being killed changes your actual happiness, compared to not being killed. I should not have used 'expected happiness' to refer to h|"not killed".


I'm counting 'the act of being gunned down' as worth -1000 utility in itself, in addition to cancelling all happiness that would accumulate afterwards, and assuming that the replacement person would compensate all of the negative happiness that the killing caused.


Basically, I'm saying that I expect bleeding out after a gunshot wound to suck, a lot. The replacement compensating for loss in happiness starts from a hole the size of the killing.


I'm assuming that whatever heuristic you're using survives the transporter paradox; killing Captain Kirk twice a day and replacing him with an absolutely identical copy (just in a different location) is not bad.

It doesn't change your actual happiness, just the future one. If you are literally shot with a sniper rifle while walking in the street with no warning, there is no time in which you are saddened by your death. You just are, and then aren't. What is lost is all the happiness that you would have otherwise experienced. Assume the guy is shot in the head, so there's no bleeding out part.

I'm not sure where the -1000 number comes from. There is no point in which the shot down person feels 1000 less happiness than before. Saying "the act itself is worth -1000" is adding a rule to the model. A hard coded rule that killing someone is -1000. First of all, such a rule doesn't exist in the total uti, and this model fixes it. Second of all, not all killings are equally bad, so you have to come up with a model for that now. Instead, in this model, when someone is killed the total moral utility of the population is reduced by an amount equal to, at least, minimal "life worth living happiness" for every year the killed man had left. That is pretty intuitive and solves things without hard coded rules.

Plus, nobody said "an absolutely identical copy", the problem in total uti is that it follows it is ok to murder someone and replace him with someone of EQUAL HAPPINESS, not equal everything. The same heuristic won't work (because it deals with identity issues like "how do we define who is captain kirk"). In this model, this problem doesn't occur anymore.

So it IS okay to kill someone and replace them with an absolutely identical copy, as long as the deceased feels no pain and nobody notices?

Is it okay for someone to change their mind about what they were going to do, and produce equal happiness doing something else?

Is it okay to kill someone and replace them with an absolutely identical copy, where nobody notices including the deceased, if the new person changes their mind about what they were going to do and ends up producing equal happiness doing something else?

" So it IS okay to kill someone and replace them with an absolutely identical copy, as long as the deceased feels no pain and nobody notices? "

In total uti it is ok. This is counter-intuitive, so this model fixes it, and it's no longer ok. Again, that's the reason the penalty is there.

The absolute identical copy trick might be ok, and might not be ok, but this is beside the point. If a completely identical copy is defined as being the same person, then you didn't replace anybody and the entire question is moot. If it's not, then you killed someone, which is bad, and it ought to be reflected in the model (which it is, as of now).

In order to penalize something that probably shouldn't be explicitly punished, you're requiring that identity be well-defined.

There's still the open question of "how bad?". Personally, I share the intuition that such replacement is undesirable, but I'm far from clear on how I'd want it quantified.

The key situation here isn't "kill and replace with person of equal happiness", but rather "kill and replace with person with more happiness".

DNT is saying there's a threshold of "more happiness" above which it's morally permissible to make this replacement, and below which it is not. That seems plausible, but I don't have a clear intuition where I'd want to set that threshold.

Can you give a narrative summary of this theory?

I did not create this theory from a particular narrative, I just looked for a mathematical model that fits certain special cases, which is a fine method in my book. But, if you wish, we can probably think of the $-u_0$ term as an extra penalty for death and the age-dependent term as "being tired of life".

I'm having trouble understanding what those constants actually mean WRT ethical decisions about creating and terminating lives, and especially in comparing lives (when is it better to destroy one life in order to create two different ones, and/or when is it better to reduce h for some time in a life in order to increase h in another (or to bring another into existence)).

I'm not sure I understand the question. For any case you can do the math and see what the model says. I already gave some examples in which you can see what the constants do.

...Why isn't that already included in h(t)?

Like I said in the text, the age-dependent penalty can be included in h(t) if you wish. Then we get a model in which there is no age-dependent penalty but there is still the death penalty (no pun intended). Looking from this angle, we get a repugnant conclusion with many very long-lived people who only barely prefer life to death. But, the separation of "effective happiness" into "quality of life" and "age-dependent penalty" paints a new picture of what such people look like. The reason they only barely prefer life to death is not because they are suffering so much. It is because they lived for very long and are sort of very sated with life.

At any point in time, "prefer to continue to live from this point" is equal to "happy to come into existence at this point", right?

No. Many people have the opposite intuition, especially people whose life is actually bad.

I think I understand the desire for a death penalty (or an early-termination penalty). However, for me, it should be more an acknowledgement that death depresses h in the subject prior to death and in many others both prior to and following death, (and that a new life will take time to increase h around them) than a penalty inherent in the individual value.

And the "prefer to continue but wish not to have been created" case really seems like an error in intuition to me. Evolutionarily useful, but evolution has different goals than thinking individuals do. I understand the purpose, though, so thanks for the explanations!

This seems to capture contemporary intuitions, which is good for an ethical theory. But I wonder about those constants, τ0, h0, and u0. Each is set to a baseline that we are used to: ordinary lifespan, ordinary happiness, and ordinary total utility of a life. Already, τ0 is over twice what it would have been hundreds of years ago. Likewise, I can't judge whether the happiness or total utility of a peasant of hundreds of years ago was much different than today's average, but that is at least plausible.

In short, this builds utility on human rather than transhuman foundations. Maybe that is a good thing.

I'm not sure that $h_0$ should be interpreted as "ordinary happiness", at least that's not how I described it. Regarding $\tau_0$, human preferences are the product of evolution plus maybe to some extent culture. So, it should not be surprising if some of the parameters come either from the ancestral environment in which evolution happened or from the "memetically ancestral" environment in which the culture evolved. And, I am talking about a model of (modern?) human preferences, not about some kind of objective morality (which doesn't exist).

I'm curious - would you say DNT is a good approximate model of what we ought to do (assuming we were ideally virtuous), or of what you would actually want done? Where 'should' selfishness come into things?

For instance, let's say we're in a universe with a finite limit on computation, and plan (a) involves setting up an optimal reachable-universe-wide utopia as fast as possible, with the side effect of killing all current humans. Plan (b) involves ensuring that all current humans have utopian futures, at the cost of a one second delay to spreading utopia out into the universe.

From the point of view of DNT or standard total utilitarianism, plan (a) seems superior here. My intuition says it's preferable too: that's an extra second for upwards of 10^35 patients. Next to that, the deaths (and optimised replacement) of a mere 10^10 patients hardly registers.

However, most people would pick plan (b); I would too. This amounts to buying my survival at the cost of 10^17 years of others' extreme happiness. It's a waste of one second, and it's astronomically selfish.

It's hard to see how we could preserve or convert current human lives without being astronomically selfish moral monsters. If saving current humans costs even one nanosecond, then I'm buying my survival for 10^8 years of others' extreme happiness; still morally monstrous.

Is there a reasonable argument for plan (b) beyond, "Humans are selfish"?

Of course time discounting can make things look different, but I see no moral justification to discount based on time. At best that seems to amount to "I'm more uncertain about X, so I'm going to pretend X doesn't matter much" (am I wrong on this?). (Even in the infinite case, which I'm not considering above, time discounting doesn't seem morally justified - just a helpful simplification.)

Excellent question! I think that my actual preferences are some combination of selfish and altruistic (and the same is probably true of most people), and DNT only tries to capture the altruistic part. It is therefore interesting to try writing down a model of how selfish utility aggregates with altruistic utility. A simple "agnostic" formula such as a linear combination with fixed coefficients works poorly, because for any given coefficients it's easy to come up with a hypothetical where it's either way too selfish or way too altruistic.

I think that it's more reasonable to model this aggregation as bargaining between two imaginary agents: a selfish agent that only values you and people close to you, and an altruistic agent with impartial (DNT-ish) preferences. This bargaining can work, for example, according to the Kalai-Smorodinsky solution, with the disagreement point being "purely selfish optimization with probability $p$ and purely altruistic optimization with probability $1 - p$", where $p$ is a parameter reflecting your personal level of altruism. Of course, the result of bargaining can be expressed as a single "effective" utility function, which is just a linear combination between the two, but the coefficients depend on the prior and strategy space.

It's interesting to speculate about the relation between this model and multiagent models of the mind.

Something of the same nature should apply when a group of people act cooperatively. In this case we can imagine bargaining between an agent that only cares about this group and an impartial agent. Even if the group includes all living people, the two agents will be different since the second assigns value to animals and future people as well.

Of course time discounting can make things look different, but I see no moral justification to discount based on time.

Actually I think time discount is justified and necessary. Without time discount, you get a divergent integral over time and utility is undefined. Another question is, what kind of time discount exactly. One possibility I find alluring is using the minimax-regret decision rule for exponential time discount with a half-life that is allowed to vary between something of the order of $\tau_0$ to $+\infty$.

That bargaining approach is indeed interesting, thanks.

On discounting, I need to read more. I'm currently looking through Pareto Principles in Infinite Ethics (other useful suggestions welcome). While I can see that a naive approach gives you divergent integrals and undefined utility, it's not yet clear to me that there's no approach which doesn't (without discounting).

If time discounting truly is necessary, then of course no moral justification is required. But to the extent that that's an open question (which in my mind, it currently is - perhaps because I lack understanding), I don't see any purely moral justification to time discount. From an altruistic view with a veil of ignorance, it seems to arbitrarily favour some patients over others.

That lack of a moral justification motivates me to double-check that it really is necessary on purely logical/mathematical grounds.

If I'm running a simulation of a bunch of happy humans, it's entirely possible for me to completely avoid your penalty term just by turning the simulation off and on again every so often to reset all of the penalty terms. And if that doesn't count because they're the same exact human, I can just make tiny modifications to each person that negate whatever procedure you're using to uniquely identify individual humans. That seems like a really weird thing to morally mandate that people do, so I'm inclined to reject this theory.

Furthermore, I think the above case generalizes to imply that killing someone and then creating an entirely different person with equal happiness is morally positive under this framework, which goes against a lot of the things you say in the post. Specifically:

It avoids the problem with both totalism and averagism that killing a person and creating a different person with equal happiness is morally neutral.

It seems to do so in the opposite direction that I think you want it to.

It captures the intuition many people have that the bar for when it's good to create a person is higher than the bar for when it's good not to kill one.

I think this is just wrong, since, like I said, it incentivizes killing people and replacing them with other people to reset their penalty terms.


I do agree that whatever measure of happiness you use should include the extent to which somebody is bored, or tired of life, or whatnot. That being said, I'm personally of the opinion that killing someone and creating a new person with equal happiness is morally neutral. I think one of the strongest arguments in favor of that position is that turning a simulation off and then on again is the only case I can think of where you can actually do that without any other consequences, and that seems quite morally neutral to me. Thus, personally, I continue to favor Solomonoff-measure-weighted total hedonic utilitarianism.

> If I'm running a simulation of a bunch of happy humans, it's entirely possible for me to completely avoid your penalty term just by turning the simulation off and on again every so often to reset all of the penalty terms.

No. First of all, if each new instance is considered a new person, then the result of turning the simulation off and back on would be negative because of the -u_0 term. Assuming u_0 ≥ h_0 τ_0 (as I suggest in the text) means that the loss from -u_0 is always greater than the gain from avoiding the age-dependent penalty.

Second, like I said in the text, I'm talking about an approximate model, not the One True Formula of morality. This model has limited scope, and so far I haven't included any treatment of personal identity shenanigans in it. However, now that you got me thinking about it, one way to extend it that seems attractive is:

  • Consider the -u_0 term as associated with the death of a person. There can be partial death which gives a partial penalty if the person is not entirely lost. If the person is of age at the time of death, and ey have a surviving clone that split off when the person was of age , then it only counts as of a death so the penalty is only . If the person dies but is resurrected in the future, then we can think of death as producing a -u_0 penalty and resurrection as producing a +u_0 reward. This is important if we have time discount and there is a large time difference. Imperfect resurrection will produce only a partial resurrection reward. You cannot fully resurrect the same person twice, but a good resurrection following a bad resurrection awards you the difference. No sequence of resurrections can sum to more than 1, and a finite sequence will sum to strictly less than 1 unless at least one of them is perfect. Having amnesia can be counted as dying with a living clone or as dying fully with a simultaneous partial resurrection, which amounts to the same.

  • Consider the age-dependent penalty as referring to the subjective age of a person. If you clone a person, the age counter of each continues from the same point. This is consistent with interpreting it as a relation between "true happiness" and "quality of life".

I think that this extension avoids the repugnant conclusion as well as the original, but it would be nice to have a formal proof of this.
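One way to read the resurrection rule in the first bullet is as a running-maximum account. The sketch below is only an illustration under assumptions that are not in the comment itself: each resurrection gets a hypothetical fidelity score in [0, 1] (1 meaning perfect), and each one is credited only with its improvement over the best earlier resurrection, expressed as a fraction of the full resurrection reward.

```python
def resurrection_credits(fidelities):
    """Credit each resurrection with its improvement over the best earlier one.

    `fidelities` is a chronological list of hypothetical fidelity scores in
    [0, 1]; the scoring itself is an assumption made for illustration.
    Returns the fraction of a full resurrection reward credited to each.
    """
    best_so_far = 0.0
    credits = []
    for fidelity in fidelities:
        if not 0.0 <= fidelity <= 1.0:
            raise ValueError("fidelity must lie in [0, 1]")
        # A worse resurrection after a better one earns nothing;
        # a better one earns only the difference.
        credits.append(max(0.0, fidelity - best_so_far))
        best_so_far = max(best_so_far, fidelity)
    return credits


# A bad resurrection, then a good one, then a mediocre one.
print(resurrection_credits([0.25, 0.75, 0.5]))  # [0.25, 0.5, 0.0]
```

Under this reading the credits always sum to the best fidelity reached so far, so they can never exceed 1 and stay strictly below 1 unless some resurrection is perfect.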

Ah, I see. I missed the term out in front; that makes more sense. In that case, my normal reaction would be that you're penalizing simulation pausing, though if you use subjective age and gradually identify unique personhood, then I agree that you can get around that. That still seems to me like a bit of a hack, though: I feel like the underlying thing that you really want there is variety of happy experience, so you should just be rewarding variety of experience directly rather than trying to use some sort of continuous uniqueness measure.

I don't understand why the underlying thing I want is "variety of happy experience" (only)? How does "variety of happy experience" imply killing a person and replacing em by a different person is bad? How does it solve the repugnant conclusion? How does it explain the asymmetry between killing and not-creating? If your answer is "it shouldn't explain these things because they are wrong" then, sorry, I don't think that's what I really want. The utility function is not up for grabs.

Say you are in a position to run lots of simulations of people, and you want to allocate resources so as to maximize the utility generated. Of course, you will design your simulations so that h >> h0. Because all the simulations are very happy, u0 is now presumably smaller than hτ0 (perhaps much smaller). Your simulations quickly overcome the u0 penalty and start rapidly generating net utility, but the rate at which they generate it immediately begins to fade. Under your system it is optimal to terminate these happy people long before their lifespan reaches the natural lifespan τ, and reallocate the resources to new happy simulations.

The counterintuitive result occurs because this system assigns most of the marginal utility to occur early in a person's life.

[This comment is no longer endorsed by its author]

No. It is sufficient that u_0 ≥ h_0 τ_0 (notice it is h_0 there, not h) for killing + re-creating to be net bad.

The penalty doesn't reset when you create a new human. You are left with the negative value that the killed human left behind, and the new one starts off with a fresh -u0[new person] to compensate for. If the original human had been left alive, he would have compensated for his own, original -u0[original person], and the entire system would have produced a higher value.
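A rough numeric check of this comparison, offered as a sketch and not as the post's exact model. It assumes each person contributes -u0 plus the integral of h - h0 * (1 - exp(-age / tau0)) over their life, with constant h; that form of the age factor and all parameter values below are assumptions made for illustration.

```python
import math


def lifetime_utility(h, lifespan, u0, h0, tau0):
    """-u0 + integral over [0, lifespan] of h - h0 * (1 - exp(-t / tau0)).

    Assumes constant happiness h and the exponential age factor named in
    the lead-in; both are illustrative assumptions.
    """
    age_penalty = h0 * (lifespan - tau0 * (1.0 - math.exp(-lifespan / tau0)))
    return -u0 + h * lifespan - age_penalty


def keep_alive(h, horizon, u0, h0, tau0):
    """One person lives through the whole horizon."""
    return lifetime_utility(h, horizon, u0, h0, tau0)


def kill_and_replace(h, horizon, kill_time, u0, h0, tau0):
    """The first person dies at kill_time; a new one lives out the rest."""
    first = lifetime_utility(h, kill_time, u0, h0, tau0)
    second = lifetime_utility(h, horizon - kill_time, u0, h0, tau0)
    return first + second


# Illustrative values only, chosen so that u0 == h0 * tau0.
H0, TAU0 = 1.0, 40.0
U0 = H0 * TAU0

keep = keep_alive(10.0, 200.0, U0, H0, TAU0)
swap = kill_and_replace(10.0, 200.0, 100.0, U0, H0, TAU0)
print("keep minus replace:", keep - swap)
```

Under these assumptions the gap works out to u0 - h0 * tau0 * (1 - exp(-s / tau0)) * (1 - exp(-(T - s) / tau0)), where s is the kill time and T the horizon. That is positive whenever u0 ≥ h0 * tau0, whatever the value of h. With u0 set well below h0 * tau0 the comparison can flip in favor of replacement.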

On the contrary: turning the simulation on adds up all the -u_0 terms for all of the moral patients in the simulation, meaning that the first tick of the simulation is hugely negative.

I don't share the opinion that numbers are moral patients within a context where they are visible as numbers, because I don't think it's supererogatory or required to run DFEDIT and set every dwarf's happiness value to MAXINT, and in any context where "I" am visible to an entity running a simulation of me as a number or analogous concept, changing the value which corresponds with what "I" "call" "my happiness" within that simulation is not a thing that can be understood in any context that I have access to.

I am thinking that a bounded reasoner trying to decide whether to kill and replace might need a method of factoring in that they can't understand the system completely. Ontological inertia means that a preserved system also retains value that you do not understand, whereas recreating it imports only the kinds of value you already know to value. Thus a policy of assuming that there is unknowable value in keeping a system intact balances against making known improvements, and that balance determines how big an improvement must be before it overrides the need to keep fences standing. An agent that recreates a system for a negligible improvement in effect assumes it has near-infallible knowledge.

This is interesting – thanks!

There seems to be some struggle here with defining u0 in a way that intuitively represents morality, so I'll take a stab at it. As I see it, u0 can be viewed as "the utility debt of creating a life", that is to say the amount of utility a person has to generate over the course of their lifetime in order to morally justify their existence (with all of its effects on the UDNTs of others already accounted for in their calculation), reasonably assuming they get to live a full t0 life.

An intuitive (albeit extreme) proof of concept: Imagine a fetus (i.e. a pre-person at a stage where hypothetically everyone agrees they are not a moral patient) which is reliably predicted not to develop limbs (Tetra-Amelia Syndrome) and age rapidly (Progeria). It is widely morally agreed that this person should be pre-terminated, as their life would be short and full of misery. The suggested model reflects this intuition: this person's expected UDNT is significantly lower than u0, considering their expected Tdeath is low and their expected average h is mostly below h0.

As such, subtracting u0 from the formula makes sure that every life added to the population is worth living. Does this click with people's intuitions?


P.S.

I slightly worry that a development of this concept dabbles in eugenics in a way that may defeat the purpose of modeling ethics. The balance that should prevent eugenics is essentially another one of the many aspects reflected in h0, which is where we tend to dump most of the unsolved part of this model.

On the practical side, figuring out the -u0 penalty for non-humans is extremely important for those adopting this sort of ethical system. Animals that produce lots of offspring that rarely survive to adulthood would rack up -u0 penalties extremely quickly while barely living long enough to offset those penalties with hedonic utility. This happens at a large enough scale that, if -u0 is non-negligible, wild animal reproduction might be the most dominant source of disutility by many orders of magnitude.

When I try to think about how to define -u0 for non-humans, I get really confused, more so than I do when I reason about how animals suffer. The panpsychist approach would probably be something like "define death/life or non-existence/existence as a spectrum and make species-specific u0s proportional to where species fall on that spectrum." Metrics of sapience/self-awareness/cognition/other-things-in-that-cluster might be serviceable for this though.

Objection:

h(t) already accounts for boredom and 'tired of life' effects.

You've anticipated this:

> One could argue that this should already be factored into "happiness", but well, it's not like I actually defined what happiness is. More seriously, perhaps rather than happiness it is better to think of h as the "quality of life".

But previously:

> h(t) is happiness at time t (for now we think of it as hedonistic utilitarianism, but I propose a preference utilitarianism interpretation later).

I can't find any place where you did define h(t), but I think the way to follow the intention that you set is to set h(t) such that h(t) equals zero when the moral patient is ambivalent about whether they prefer not being born to a life that is always at that level of happiness. In order to set h(t) equal to quality of life, happiness would then have to equal quality of life minus the 'tired of life' correction.


Objection:

The "−u_0" term and the " −h_0([age factor])" term double-count the concept that there are some lives that are worth living but not worth causing to exist. u_0 is the net lifetime utility necessary for a person to be worth creating, and h_0 is the instantaneous quality of life required for existing to be better than not existing; for someone with h(t)-h_0([age factor]) of positive epsilon for their entire life, they are constantly slightly happier to have been born than otherwise, but their net utility over their lifetime is approximately -u_0.
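Spelled out as a worked equation, with a(t) standing in for the age factor and T for the person's lifespan (both symbols are shorthand introduced here, not notation from the post):

```latex
\[
U_{\text{person}} = -u_0 + \int_{0}^{T} \bigl( h(t) - h_0\, a(t) \bigr)\, dt,
\qquad
h(t) - h_0\, a(t) = \varepsilon \ \text{for all } t
\;\Longrightarrow\;
U_{\text{person}} = -u_0 + \varepsilon T .
\]
```

So such a life contributes negatively whenever εT < u_0, even though the integrand is positive at every moment.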


Note that 'dying sucks' is already included in h(t), because h(dying) is expected to be very negative. h(suicide) is often even lower than h(death), as revealed by a dive into psychological literature of behavior among people for whom h(suicide)<h(now)<h(death).


Objection: There is no protection against utility monsters, people who have h(t) so high that their existence and increasing their h dominates every other consideration. If that is patched to cap the amount of utility that any one individual can have, there is no protection against utility monster colonies, such that each member of the colony has maximum h(t) and the colony is so numerous that its collective utility dominates every other consideration. (this is a weak objection, since it applies equally well to competing theories of utilitarianism)


A better utilitarianism model might include a weight factor for 'how much' of a moral patient an entity is. A rock would have a weight of 0, the scale would be normalized such that the intended audience had a weight of 1. That does allow individual utility to have an upper cap without creating the utility colony problem, since the total moral patient weight of such a colony would have a finite value based on how much its welfare actually dominated the total utility of all moral patients.

> h(t) already accounts for boredom and 'tired of life' effects.

See my reply to Dagon.

> Note that 'dying sucks' is already included in h(t), because h(dying) is expected to be very negative.

If you really want, you can reinterpret -u_0 as some negative spike in h at the time of death, one that occurs even if the person died instantly without any awareness of death. I think that maybe my use of the word "happiness" was confusing due to the nebulous nature of this concept, and I should instead have talked about something like "quality of life" or "quality of experience" (still nebulous but a little less so, maybe).

> There is no protection against utility monsters, people who have h(t) so high that their existence and increasing their h dominates every other consideration. If that is patched to cap the amount of utility that any one individual can have, there is no protection against utility monster colonies, such that each member of the colony has maximum h(t) and the colony is so numerous that its collective utility dominates every other consideration.

I only intended to address particular issues, not give a full theory of ethics (something that is completely infeasible anyway). I think h is already bounded (ofc we cannot verify this without having a definition of h in terms of something else, which is entirely out of scope here). Regarding the "utility monster colony", I don't see it as a problem at all. It's just saying "the concerns of a large population dominate the concerns of a small population", which is fine and standard in utilitarianism. The words "utility monster" are not doing any work here.

> A better utilitarianism model might include a weight factor for 'how much' of a moral patient an entity is.

I agree something like this should be the case; as I said, I had no intention of addressing everything.

Does it matter whether or not, when I go to sleep, that which makes me a moral patient ends, and a new moral patient exists when I wake up?

If a mad scientist ground my body into paste each night and replaced me with a copy that nobody (including me) could tell was different in the morning, how much would that actually suck over the long term?

Having a term outside the integral means caring about uncertain continuity questions, and while the assumed answer to those questions is clear, assuming an answer to the pure question just shifts the question to the new edge case: does a concussion with momentary blackout count as death? Is there no level of temporary brain damage that counts as death? Is deleting a transhumanist brain scan as bad as killing a meat body? (No, it can't be, unless it's the last copy. But does the existence of a copy make killing the meat body less bad?)

> Does it matter whether or not, when I go to sleep, that which makes me a moral patient ends, and a new moral patient exists when I wake up?

Yes, it matters (btw the answer is "no"). Sure, it is highly non-trivial to pinpoint exactly what is death in reductionist terms. The same is true of "happiness". But, nobody promised you a rose garden. The utility function is not up for grabs.

Btw, I think there is such a thing as "partial death" and it should be incorporated into the theory.

> If a mad scientist ground my body into paste each night and replaced me with a copy that nobody (including me) could tell was different in the morning, how much would that actually suck over the long term?


Why can the utility differ with no upper bound between two worlds with epsilon observed difference? If I did die every time I slept, there would be certain physical differences that could in theory be measured with a sleep EEG... but I've never had a sleep EEG done. In a less convenient world there would be humans whose person died every night to be replaced by a new person, and unless they are happier every day than some people are in their entire life, those humans should be shot permanently dead right away, because they are so damaging to total utility.

If a mad scientist ground you into paste and replaced you by a literally indistinguishable copy, then it doesn't suck, the copy is still you in the relevant sense. The more different is the copy from the original, the more it sucks, until some point of maximal suckiness where it's clearly a different person and the old you is clearly dead (it might be asymptotic convergence rather than an actual boundary).

I'm more completely confused: based on that, it would not matter if I died every time I fell asleep and was replaced by someone identical to me. And if I did decide to make a major personality change, that would be intrinsically bad.

If you "died" in your sleep and were replaced by someone identical to you, then indeed it wouldn't matter: it doesn't count as dying in the relevant sense. Regarding a major personality change, I'm not sure what you have in mind. If you decide to take on a new hobby, that's not dying. If you take some kind of superdrug that reprograms your entire personality then, yes, that's pretty close to dying.

Why have a strong preference against a world with a few humans who take such a drug repeatedly?


If someone is in a rut and could either commit suicide or take the reprogramming drug (and expects to have to take it four times before randomizing to a personality that is better than rerolling a new one), why is that worse than killing them and allowing a new human to be created?

Ethics is subjective. I'm not sure what argument I could make that would persuade you, if any, and vice versa. Unless you have some new angle to approach this, it seems pointless to continue the debate.

Interesting. One issue DNT doesn't seem to fix is the worst part of the very repugnant conclusion.

Specifically, while in the preferred world the huge population is glad to have been born, you're still left with a horribly suffering population.

Considering that world to be an improvement likely still runs counter to most people's intuition. Does it run counter to yours? I prefer DNT to standard total utilitarianism here, but I don't endorse either in these conclusions.

My take is that repugnant conclusions as usually stated aren't too important, since in practice we're generally dealing with some fixed budget (of energy, computation or similar), so we'll only need to make practical decisions between such equivalents.

I'm only really worried by worlds that are counter-intuitively preferred after we fix the available resources.

With fixed, limited energy, killing-and-replacing-by-an-equivalent is already going to be a slight negative: you've wasted energy to accomplish an otherwise morally neutral act (ETA: I'm wrong here; a kill-and-replace operation could save energy). It's not clear to me that it needs to be more negative than that (maybe).

> Specifically, while in the preferred world the huge population is glad to have been born, you're still left with a horribly suffering population.

This conclusion seems absolutely fine to me. The above-h_0 population has positive value that is greater than the negative value of the horribly suffering population. If someone's intuition is against that, I suppose it's a situation similar to torture vs. dust specks: failure to accept that a very bad thing can be compensated by a lot of small good things. I know that, purely selfishly, I would prefer a small improvement with high probability over something terrible with sufficiently tiny probability. Scaling that to a population, we go from probabilities to quantities.

> With fixed, limited energy, killing-and-replacing-by-an-equivalent is already going to be a slight negative: you've wasted energy to accomplish an otherwise morally neutral act. It's not clear to me that it needs to be more negative than that (maybe).

I strongly disagree (it is not morally neutral at all), but I'm not sure how to convince you if you don't already have this intuition.

Oh sure - agreed on both counts. If you're fine with the very repugnant conclusion after raising the bar on h a little, then it's no real problem. Similar to dust specks, as you say.

On killing-and-replacement I meant it's morally neutral in standard total utilitarianism's terms.

I had been thinking that this wouldn't be an issue in practice, since there'd be an energy opportunity cost... but of course this isn't true in general: there'd be scenarios where a kill-and-replace action saved energy. Something like DNT would be helpful in such cases.

Can I have a link to the discord server you're talking about?