Many of us are familiar with the marshmallow test. If you are not, the gist: a child is offered one marshmallow now, or two if they can wait alone until the experimenter returns.
It is predictive of success, income, level of education, and several other correlated measures.
I'm here to play devil's advocate for the marshmallow eaters; contra Ainslie, for instance. I do it out of genuine curiosity, real suspicion, and maybe in the hope that smart people will argue me back to my original, pro-long-term position.
There is also the e-marshmallow test, in which children face the tough choice between surfing an open, connected computer (games, internet, etc.) and waiting patiently for the experimenter to get back. Upon the experimenter's return, they get a pile of marshmallows. I presume it also correlates with interesting things, though I haven't found much on it.
I have noticed that rationalists, LessWrongers, Effective Altruists, Singularitarians, Immortalists, X-risk-worried folk, and transhumanists are all in favor of taking the long view. Nick Bostrom starts his TED talk by saying: "I've been asked to take the long view."
I haven't read most of Less Wrong, but I did read the Sequences, the 50 top-scoring posts, and assorted random posts. The overwhelming majority view is that the long view is the most rational one: the long-term perspective is the rational way for agents to act.
Lukeprog, for instance, commented:
"[B]ut imagine what one of them could do if such a thing existed: a real agent with the power to reliably do things it believed would fulfill its desires. It could change its diet, work out each morning, and maximize its health and physical attractiveness."
To which I responded:
I fear that in phrases like this lies one of the big issues I have with the rationalist people I've met thus far. Why would there be "one" agent, with "its" desires to be fulfilled? Agents are composed of different time-spans. Some time-spans do not desire to diet. Others do (all above some length of time). Who is to say that the "agent" is the set that would benefit from those acts, and not the set that would be harmed by them?
My view is that picoeconomics is just half the story.
In this video, I talk about picoeconomics from 7:00 to 13:20. I'd suggest taking a look at what I say at 13:20-18:00 and 20:35-23:55: a pyramidal structure of selves, or agents.
So that you don't have to watch the video, let us design a structure of selfhood here.
First there is intertemporal conflict: conflict between desires that can be fulfilled at different moments in time. Those reliably fall under a hyperbolic characterization, and the theory that describes this is called picoeconomics, mostly developed by George Ainslie in Breakdown of Will and elsewhere.
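The signature of hyperbolic discounting is preference reversal: seen from far away, the larger-later reward wins; seen up close, the smaller-sooner one does. A minimal sketch, where the curve V = A / (1 + kD) and the parameter k = 1.0 are illustrative assumptions rather than fitted values:

```python
# Toy sketch of hyperbolic discounting (Ainslie-style) and the
# preference reversal it produces. All numbers are illustrative.

def hyperbolic_value(amount, delay, k=1.0):
    """Present value of a reward of `amount` arriving after `delay`."""
    return amount / (1 + k * delay)

# One marshmallow soon vs. two marshmallows a bit later.
small_delay, large_delay = 1, 5   # arbitrary time units

# Viewed from far away (both rewards distant), the larger-later reward wins:
far = 10
assert hyperbolic_value(2, far + large_delay) > hyperbolic_value(1, far + small_delay)

# Viewed up close, preference reverses toward the smaller-sooner reward:
assert hyperbolic_value(1, small_delay) > hyperbolic_value(2, large_delay)
```

The same two rewards, evaluated at two distances, rank in opposite orders; that crossing of the curves is the intertemporal conflict described above.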
But there is also time-length, or time-span, conflict: the conflict that arises from the fact that you are, at the same time, the entity that will last 200 milliseconds, the entity that will last one second, and the entity that will last a year, or maybe a thousand years.
What do we (humanity) know about personal identity at this point in history? If mainstream anglophone philosophical thought is to be trusted, we have to look to Derek Parfit's Reasons and Persons, and subsequent related work, to find out.
I'll sum it up very briefly: As far as we are concerned, there are facts about continuity of different mental classes. There is continuity of memory, continuity of conscious experience, continuity of psychological traits and tendencies, continuity of character, and continuity of inferential structure (the structure that we use to infer things from beliefs we acquire or access).
For each of these traits, you can take an individual at two points in time and measure how related I_t1 and I_t2 are with respect to that psychological characteristic. This is how much the I at T2 is like the I at T1.
Assign weights to the traits according to how much you care (or how important each is in the problem at hand) and you get a composite individual, for whom you can do the same exercise using all of them at once, getting a number between 0 and 1, or a percentage. I'll call this number Self-Relatedness, following in the footsteps of David Lewis.
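The computation described above can be sketched in a few lines. The trait names follow the list in the text; the per-trait scores and weights are made-up illustrative values, and the only claim is the form of the calculation: a weighted average of per-trait relatedness, landing between 0 and 1.

```python
# Toy sketch of the Self-Relatedness number: a weighted average of
# per-trait relatedness between I_t1 and I_t2. Scores and weights
# here are invented for illustration.

def self_relatedness(trait_scores, weights):
    """Weighted average of per-trait relatedness; result lies in [0, 1]."""
    total_weight = sum(weights.values())
    return sum(trait_scores[t] * weights[t] for t in trait_scores) / total_weight

# Relatedness between me-now and me-four-months-ago on each continuity:
scores = {"memory": 0.8, "conscious_experience": 0.3,
          "psychological_traits": 0.9, "character": 0.85,
          "inferential_structure": 0.95}
# How much I happen to care about each trait:
weights = {"memory": 2.0, "conscious_experience": 1.0,
           "psychological_traits": 1.5, "character": 1.5,
           "inferential_structure": 1.0}

print(round(self_relatedness(scores, weights), 3))  # a single number in [0, 1]
```

Change the weights and the composite shifts accordingly, which is the point: Self-Relatedness is relative to which continuities you care about in the problem at hand.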
This is our current state of knowledge on Personal Identity: there is Trait-Relatedness, and there is Self-Relatedness. Once you know all about those two, there is no extra fact about personal identity. Personal Identity is a confused concept, and when we decompose it into these less confused, but more useful, subconcepts, there is nothing left over to be the meta-thing "Personal Identity".
Back to the time-length issue: consider how much more me the shorter-term selves are (that is, how much more Self-Relatedness there is between any two moments within them).
Sure, if you go all the way down to 10 milliseconds this stops being true, because there are not even traits to be found. Yet it seems straightforward that I am more like me 10 seconds ago than like me 4 months ago; not always, but in the vast majority of cases.
So when we speak of maximizing my utility function, if we overlook what this "me" is made of, we might end up stretching ourselves to as long a term as we possibly can, letting go of the most instantaneous parts, which are de facto more ourselves than the long-term ones.
One person I met from the LessWrong/Singinst cluster claimed: "I see most of my expected utility after the singularity, thus I spend my willpower entirely on increasing the likelihood of a positive singularity, and care little about my current pre-singularity emotions."
Is this an amazing feat of self-control, a proof that we can hope to live according to ideal utility functions after all? Or is it a defunct conception of what a Self is?
I'm not here to suggest a canonical curve of time-lengths of which the Self is composed. Different people are different in this regard. Some time-lengths are stretchable, some can be shortened. Different people will also value the time-lengths differently.
It would be unreasonable for me to expect that people would, from now on, put a disclaimer on their writings: "I'm assuming 'rational' to mean 'rational for time-lengths above the X threshold' in this piece." It does, however, seem reasonable to keep an internal reminder, when we reason about life choices, decisions, and writings, that there are not only the selves praised by the Rationalist cluster, the long-term ones, but also the short-term ones.
A decision to eat the marshmallow can, after all, be described as a rational decision; it all depends on how you frame the agent, the child.
So if a superintelligence arises that, despite being Friendly and having the correct goals, does the AGI equivalent of scrolling 9gag, eating Pringles, and drinking booze all day long, tell the programmers that the concept of Self, Personal Identity, Agent, or Me-ness was not sufficiently well described, and that vit cares too much for vits short-term selves. If they tell you "Too late, vit is a Singleton already," you just say: "Don't worry, just make sure the change is ve-e-e-ery slow..."
This is improper anthropomorphising of something we could presumably write down the math for.
Think, for instance, of Omohundro's classic AI drives. Would they arise the same way if the AGI 1) had a self-model and 2) had the same layered structure of selves of different lengths that we do?
You may argue that vit would not have a self-like structure at all, which is indeed possible. It may be that its utility function is sufficiently different from ours that this kind of problem doesn't arise at all. The one worrisome thing is that if you ask that function to consult people's functions, to extract some form of extrapolation, then you don't have the problem at the meta level, but you still have it at the level of what the AGI thinks people are (say, because it scrutinized their short-term selves).
I'm also fine with this not applying for different reasons, in which case take the text to be only about humans, and ignore the last paragraph.
That's a good point. One could imagine a method of getting utility functions from human values that, maybe due to improper specification, returned some parts from short-term desires and some other parts from long-term desires, maybe even inconsistently. Though that still wouldn't result in the AI acting like a human; it would do weirder things.
Future solidarity is a preference. Different strengths of that preference lead to different time preferences.
Having said that, I think it's true that people generally have a strong future solidarity preference with themselves, at least in theory, so that this is not a valid argument against the long view as the best strategy for winning, given the preferences that people have.
A better argument comes from uncertainty about events and preferences in the future.
In the marshmallow case, how is the child supposed to know that he can trust the Marshmallow Man? As the seconds tick by, is he sure he heard Marshmallow Man right? Maybe he misunderstood. Is he sure he remembers what the Marshmallow Man said?
Have they factored trust out of the marshmallow experiment? Analyzed correlations between trust and holding off on eating the marshmallow?
For grown-up problems, we don't know what the future holds. Remember when a college education was supposed to be your key to a secure future? Ha! The world is changing faster than ever. And how good it will be is largely out of your hands. It's unlikely that you, personally, will be the one to defeat Death. You don't know what the world will be like, and you don't even know what you will be like. How have your preferences for money, experience, leisure, people, and places changed in the past? How much do you think they'll change in the future? How will this interact with changes in the state of the world?
In the face of such uncertainty, "do what you love" is likely the most self-negating you should be, while "Live Now" looks more and more reasonable.
A final thought. It occurs to me that backward solidarity is likely as important as future solidarity for long term thinking. Maybe more so. If you feel free to create yourself anew each day, is it likely you'll have the commitment to follow through on a project?
Nice, thanks. It's what I'd expect.
I'm thinking further, considering the general priors for trust that the kids walk into the room with. I can somewhat see the problem in my own behavior: lacking trust, you take the one marshmallow in hand instead of the two in a contingent, trustworthy future.
It reminds me of some "children of alcoholics" book I read once. It said that alcoholics often make promises to their children that they later do not fulfill, and that those children then have long-term problems with trust and self-discipline, even decades later.
If that is true, I would expect such children to do worse in the Marshmallow test and to have worse outcomes later in life.
That was the kind of thing I was getting at.
The flip side of the unreliable parent is the French Parent. I bought some book about French parenting, as compared to US parenting, to give to a friend. The basic thrust was that the parent, when it came to discipline, should be something like a force of nature: sure, swift, serene, no bargaining, no upset. Caring, but imperturbable. This is the way it is. You do this, this happens. Gravity, with a hug.
Which leads me to a very disturbing thought. At first, I thought a test to differentiate trust in people versus self discipline was possible. Factor the person out of the marshmallow scenario, and see how the kids do.
But that doesn't really prove anything. What if trust is learned as a whole, when young? Your parents are a force of nature. They are the universe, when you're a baby. If they're capricious, unpredictable, and worse, malevolent, then that's your emotional estimate of the universe. It's not that you don't have self discipline, it's that you live in a malevolent, unpredictable universe that you rightly don't trust. Or so it seems to you.
It would be interesting to correlate parental behavior and one's picture of God. Maybe that feeling of God some people have is the psychic afterimage of how the universe, mainly through mom and dad, appeared to them when young.
Yes. People bring many aliefs from their childhood; predictability of the universe is probably one of them.
If your model says that one marshmallow is sure, but two marshmallows have probability smaller than 50%, then choosing one is better. If your model says that you cannot trust anything, including yourself, then following short-term pleasures is better than following long-term goals.
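The break-even point implied above is easy to spell out: with one marshmallow certain, waiting for two only pays off if your trust in the promise exceeds 50%. A sketch of that expected-value arithmetic (the function name and the probabilities are mine, purely for illustration):

```python
# Expected marshmallows under each choice, as a function of how much
# the child trusts the experimenter's promise. Illustrative numbers.

def expected_marshmallows(p_experimenter_returns):
    wait = 2 * p_experimenter_returns   # two marshmallows, if the promise holds
    eat_now = 1 * 1.0                   # one marshmallow, for sure
    return wait, eat_now

wait, now = expected_marshmallows(0.4)   # a distrustful child
assert now > wait                        # eating immediately is the better bet

wait, now = expected_marshmallows(0.8)   # a trusting child
assert wait > now                        # waiting wins
```

On this framing, the child who eats is not failing at self-control; they are correctly maximizing under a low-trust model of the world.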
How can this model be fixed? It would probably require a long-term exposure to some undeniable regularity. Either living in a strict environment (school? prison?) or maintaining long-term records about something important.
I think an extended period of working with your hands helps. Do some projects where you're interacting with agentless reality. Garden. Build a fence. Fix your car. The fewer words involved, the better.
With regard to the singularity, and given that we haven't solved 'morality' yet, one might just value "human well-being" or "human flourishing" without referring to a long-term self-concept. I.e., you might care about a future 'you' even if that person is actually a different person. As a side effect, you might also equally care about everyone else in the future too.
I'm bothered by the apparent assumption that morality is something that can be "solved".
What about "decided on"?
If we haven't decided what morality to use yet, then how are we making moral decisions now, and how are we going to decide this later? I think that what you might call "the function that we'll use to decide our morality later on" is what I call "my morality both now and later".
Or you might simply mean our morality will keep changing over time (because we will change, and the environment and its moral challenges will also change). That's certainly true.
Very interesting post.
Our selves post-singularity may well be to us as we are to our ten years of age version. Even now, looking back a few years can feel like having been implanted with the memories of a stranger.
But at least there's physical continuity, right? Well, talk of brain uploads reminds us that our molecules get switched out all the time anyway; we seem to care only about the pattern, which is fine. Physical continuity is an illusion, and perceptual continuity is broken every time you go through a deep-sleep cycle. So if we solely value the pattern, should we identify more with our peers in a similar situation than with our future selves? (Taken to the extreme, discounting short-term benefits may lead to continually sacrificing in the present so that the last present moment we experience, on our deathbed perhaps, is one of healthy bliss?)
Well, hopefully the superintelligence can sort out this mess... deus ex machina indeed.
Your term "piconomics" is non-standard and generates largely irrelevant hits in search. Please stick with the somewhat more standard term "picoeconomics".
Additionally, I'm not sure a new term is useful, given that the issue is pretty much covered in hyperbolic discounting.
Fixed, it was just a confusion with the term.
I think the interesting question is why we care for our future selves at all.
As kids, we tend not to. It's almost standard that when a child has a holiday, and a bit of homework to do during that holiday, they will decide not to do the work at the beginning of the break. The reason is that they care about their current self, not their future self. Of course, in due time the future becomes the present, and that same child has to spend the end of their holiday working furiously on everything that's been left to the last minute. At that point, they wish their past self had chosen an alternative plan. This is still not really wisdom, as they don't much care about their past self either; they care about their present self, who now has to do the homework.
Summarising - if your utility function changes over time, then you will, as you mentioned, have conflict between your current and future self. This prevents your plans for the future from being stable - a plan that maximises utility when considered at one point no longer maximises it when considered again later. You cannot plan properly - and this undermines the very point of planning. (You may plan to diet tomorrow, but when tomorrow comes, dieting no longer seems the right answer....)
I think this is why the long view becomes the rational view - if you weight future benefits equally to your present ones, assuming (as you should) that your reward function is stable, then a plan you make now will still be valid in the future.
In fact the mathematical form that works is any kind of exponential - it's OK to have the past be more important than the future, or the future more important than the past as long as this happens as an exponential function of time. Then as you pass through time, the actual sizes of the allocated rewards change, but the relative sizes remain the same, and planning should be stable. In practice an exponential rise pushes all the importance of reward far out into the indefinite future, and is useless for planning. Exponential decays push all the important rewards into your past, but since you can't actually change that, it's almost workable. But the effect of it is that you plan to maximise your immediate reward to the neglect of the future, and since when you reach the future you don't actually think that it was worthwhile that your past self enjoyed these benefits at the expense of your present self, this doesn't really work either as a means of having coherent plans.
That leaves the flat case. But this is a learned fact, not an instinctive one.
If you actually implemented an exponential decay, you would think that it was worthwhile that your past self enjoyed these benefits at the expense of your present self. The inconsistency is if you implement an exponential decay for the future but flat for the past.
The worse I remember something, the less I care about it, so it is something like a reversed exponential curve for the past too.
You don't need to go that far for an example. When a child is assigned homework for tomorrow, they often won't do it (unless forced to by their parents), because they care more about not doing it now than they do about not having done it tomorrow.
It seems to me that you said something close to "assume planning is good or desirable", and from there showed that long-term -> rational.
To which I say: true. But planning is only good or desirable if you need to be an agent of commitment, long-term thinking, trustworthiness, etc., which a more short-term person might not be.
Say a post-enlightenment ("illuminated") Buddhist monk, for instance.
What about kids who value the internet more than marshmallows? A better test would be to promise them more internet time if they waited.