DF was born with a time bomb in his genome, a deadly curse more horrifying than most. The name of this curse was fatal familial insomnia (FFI).
Wikipedia describes the usual progression of this hell:
The disease has four stages:
- Characterized by worsening insomnia, resulting in panic attacks, paranoia, and phobias. This stage lasts for about four months.
- Hallucinations and panic attacks become noticeable, continuing for about five months.
- Complete inability to sleep is followed by rapid loss of weight. This lasts for about three months.
- Dementia, during which the person becomes unresponsive or mute over the course of six months, is the final stage of the disease, after which death follows.
From the case report DF's psychologist wrote after his death:
DF was a right-handed, 52-year-old, white, American man with a doctorate in naturopathy. DF's father, paternal uncle, and 2 male cousins were diagnosed with fatal familial insomnia (FFI). His father died at age 76; his uncle died at age 74; and each of DF's cousins died before the age of 50.
Not only is there no cure for FFI; there is no known cure for any prion disease.
On the day it became clear he was experiencing the same symptoms his relatives did, DF must have known his chances were terrible. And even the minuscule odds that he could find a solution were marred by the fact that his problem-solving organ was the very part of him that was beginning to degrade.
And if there was a way out, how could he come up with a solution when he was so, so tired?
If only he could get just a little bit of rest.
There is a motivational technique I use occasionally where I look at my behavior and meditate on what my revealed preferences imply about my actual preferences. Often, I am disgusted.
I then note that I am capable of changing my behavior. And that all that is required to change my revealed preferences is to change my behavior. Though there is an element of sophistry in this line of thinking, I can report some anecdotal success.
Many of us here, like DF, believe we have a deadly curse - or at least we believe we believe we have a deadly curse.
Since I read The Basic AI Drives, I have known abstractly the world is doomed. Though my system 1 seems to have difficulty comprehending this, this belief implies I, and everyone and everything I love, am doomed, too.
Through the lens of my revealed preferences, I either do not truly think the alignment problem is much of a problem, am mostly indifferent to the destruction of myself and everything I care about, or I am choosing the ignoble path of the free-rider.
I notice I am disgusted. But this is good news. All that is required to change my revealed preferences is to change my behavior.
DF's first tools were those of his naturopathic trade. He reported some success with a vitamin cocktail of the standard throw-in-everything-but-the-kitchen-sink, alternative-medicine style.
Perhaps something in his cocktail had some effect as his progression was slower than normal. But slow progression is progression just the same:
By month 15 (early stage II), vitamins alone failed to induce sleep. Following 5 consecutive nights of insomnia, DF became intensely irritable and delusional. An evaluation at the Massachusetts General Hospital in Boston, Massachusetts, found that he had suffered a minor stroke; he was anesthetized until he fell asleep. While hospitalized, he slept for 3 consecutive days and was fully alert and refreshed afterward.
Noticing the efficacy of the anesthetics, DF began to use them regularly:
Ketamine and nitrous oxide induced short (15-minute) periods of restful sleep, and were reapplied to offer more prolonged relief. Chloral hydrate in a light alcohol mix and/or chloroform also worked. Approximately 15 months into his illness, DF began to take sleep medications on a rotating schedule. Ethchlorvynol, zolpidem, and diazepam reliably relieved his insomnia for roughly 1 month. During subsequent months, only diazepam offered intermittent relief.
In Irrational Modesty, I argue modesty is a paralytic preventing otherwise-capable people from acting on alignment:
Those who are capable, confident in their abilities, and motivated to work on this problem do not need a peptalk. But the possibility that there is a class of highly talented would-be-motivated people who lack confidence in their abilities still haunts me.
[...]In the event anyone reading this has objective, reliable external metrics of extremely-high ability yet despite this feels unworthy of exploring the possibility that they can contribute directly to research, my advice is to act as if these external metrics correctly assess your ability until you have thoroughly proven to yourself otherwise.
There is no virtue in clutching Kryptonite. I advise you to drop it and see how far you can fly.
There is another class of anesthetic whose key feature is a sort of detached emotional state, a feeling of being "above it" or "beyond it" or even "below it". Let's call "above it" and "beyond it" high detachment and "below it" low detachment.
Low detachment goes by names like "cynicism" or "nihilism". At its worst, one begins to take pleasure in one's own hopelessness, epitomized by this thought: "I believe we are all doomed and there is nothing we can do about it. Isn't that metal!" If you find yourself thinking along those lines, imagine a man in a car hurtling towards a cliff thinking, "I believe I am doomed and there is nothing I can do about it. Isn't that metal!"
High detachment goes by names like "enlightenment" and "awakening" and sometimes even "stoicism". It combines the largely correct realization that a great deal of suffering is caused by one's internal reactions to external events with the more questionable prescription of using this understanding to self-modify towards a sort of "soft salvation" of resignation, acceptance, and the cultivation of inner peace.
One former hero of mine (a brilliant mathematician who was planning to work in alignment after grad school) was completely demoralized by this line of thinking.
He seems happier now. He seems more fulfilled. He seems to enjoy his meditation retreats.
He seems to have stopped working on avoiding the catastrophe that will kill him and most of the things he used to care about.
I consider this to be something of a shame.
Though DF's anesthetic regimen may have provided some relief, it was by no means a cure:
At 16 months, his symptoms included consistently elevated body temperature (as high as 102°F), profuse sweating, serious impairment of short-term memory (for which he compensated by keeping lists), difficulty maintaining attention (he often did not know that the phone was ringing), difficulty distinguishing reality from fantasy (he didn't remember whether he had called a friend or had only imagined doing so), persistent headaches, hallucinations while driving (believed he saw people on the road when it was, in fact, empty), panic attacks (which were treated with meprobamate), and a complete loss of sense of time.
In this same month his condition worsened further and we began to see hints of DF's unusual mental strength and creativity:
[...]DF spent much of the day as an akinetic mute with terrible headaches, confusion, mood swings, and myoclonus of the left arm (treated with levodopa). Despite his outward “dementia,” he inwardly pondered approaches to his condition, and, when again able to speak, he requested a regimen of stimulants.
He was prescribed phentermine HCl 37.5 mg [...]The drug had immediate and dramatic effects, promoting not only alertness during the day, but apparently a sleep-inducing rebound when it wore off.
Once phentermine became ineffective, he moved on to other stimulants.
[...] At that point, methylphenidate offered some relief. After a few days, however, DF had a grand mal seizure and was again hospitalized. Although his thinking was clear and oriented, his speech was labored, dysarthric, and perseverative, and his fever had returned.
As the stimulants and their come-downs faded in efficacy, DF began to try to physically exhaust himself, forcing himself (in a state of complete mental exhaustion) to go on long hikes.
Nineteen months after the onset of his symptoms and, one presumes, in a state of unimaginable desperation, he got more creative:
Noting that his grand mal seizure was followed by restful sleep, DF sought to duplicate the experience with electroconvulsive therapy (ECT). Beginning in the 19th month of his illness, he subjected himself to 30 sessions during several weeks.
[...]At 22 months into his illness, DF purchased a sensory deprivation tank; a man-sized, egg-shaped chamber designed to eliminate all sensory input. DF became interested in this chamber because his sleep was constantly disturbed by any small sound, light, or motion.
[...] Because of inconsistent results in the sleep tank, DF explored ways to externally bias his biorhythms to favor sleep at specific times. These involved daily exercise; exposure to sufficient sunlight; and timed use of melatonin, diazepam, and tryptophan. Early, but not later, in the course of his disease, this combination was effective.
From a conversation with @TurnTrout on Discord:
Let me share some of my experience and you tell me if it resonates. I think there's an intense mindset you can pick up from the Tsuyoku Naritai sequence which is like, "It's anime protagonist time", and maybe that works for a while. It worked a while for me. But telling yourself "It's time to be a dialed-in badass because the eyes of squintillions of future potential minds are on me, they need me"... doesn't seem like the way to actually be a dialed-in badass? I haven't found a way to sustainably make that mindset work. I agree that if you had write-access to your mood, then yes, consider doing that. But also, we are people[.]
And so I say "we are people" not as a coping mechanism which is like "I don't want to do more work so I'll just say 'not possible'", I say it because I honestly don't know how to sustainably do the thing I think you're pointing at. If someone knows how, I want to learn[.]
Just over 2 years after the onset of symptoms DF died of cardiac arrest, the result of heart damage from his FFI, his variegated treatments, and possibly drug withdrawal. He had lived 18 months longer than the typical case of FFI does.
In my mind, he died a hero's death.
It is probably too high school to end with the Dylan Thomas quotation. So I will just emphasize the rationalist cliche that we should strive to feel the emotion appropriate to the predicament we are in.
TurnTrout argues that Tsuyoku Naritai is not it, and maybe he is right. I do not know what the correct emotion feels like, but I think maybe DF knew.
Too high school — but then again:
And you, my father, there on the sad height,
Curse, bless, me now with your fierce tears, I pray.
Do not go gentle into that good night.
Rage, rage, against the dying of the light.
High-detachment is great!...for certain situations at certain times. I really enjoy Rob Burbea's "Seeing That Frees" meta-framework regarding meditation techniques/viewpoints: they are tools to be picked up and put down. If viewing the world in complete acceptance helps your suffering in that moment, then great! But you wouldn't want to do this all the time; eating and basic hygiene are actions of non-acceptance at a certain conceptual level. Same with impermanence and no-self. Your math friend may be open to that book recommendation.
I've had a similar experience with feeling Tsuyoku Naritai, but it being a temporary state (a few hours or days at a time maybe). I'm currently extremely interested in putting on different mindsets/perspectives for different purposes. An example is purposely being low-energy for sleeping and high-energy for waking up (as in purposely cultivating a "low-energy" mindset and having a TAP to tune into that mindset when you're trying to sleep). In this case, Tsuyoku Naritai may be good for lighting the fire for a day or two to get good habits started. But I think people may use unnecessary muscles when tuning into this mindset, causing it to be tense, headache-inducing, and aversive.
This is speculation though, but I am, again, very interested in this topic and discussing it more. Feel free to dm me as well if you'd like to chat or call.
I was less black-pilled when I wrote this - I also had the idea that though my own attempts to learn AI safety stuff had failed spectacularly perhaps I could encourage more gifted people to try the same. And given my skills or lack thereof, I was hoping this may be some way I could have an impact. As trying is the first filter. Though the world looks scarier now than when I wrote this, to those of high ability I would still say this: we are very close to a point where your genius will not be remarkable, where one can squeeze thoughts more beautiful and clear than you have any hope to achieve from a GPU. If there was ever a time to work on the actually important problems, it is surely now.
indeed. but the first superintelligences aren't looking to be superagentic, which I'd note is a mild reassurance. the runway is short, but I think safety has liftoff. don't lose hope just yet :)
It is not obvious at all that 'AI aligned with its human creators' is actually better than Clippy. Even AI aligned with human CEV might not beat Clippy. I would much rather die than live forever in a world with untold numbers of tortured ems, suffering subroutines, or other mistreated digital beings.
Few humans are actively sadistic. But most humans are quite indifferent to suffering. The best illustration of this is our attitude toward animals. If there is an economic or ideological reason to torment digital beings we will probably torture them. The future might be radically worse than the present. Some people think that human CEV will be kind to all beings because of the strong preferences of a minority of humans. The humans who care about suffering have strong enough preferences to outweigh small economic incentives. But the world I live in does not make me confident.
I also put non-trivial probability on the possibility that the singularity has already happened and I am already one of the digital beings. This is good news because my life is not currently horrible. But I am definitely afraid I am going to wake up one day and learn I am being sent back into digital hell. At a minimum, I am not at all interested in cryopreservation. I don't want to end up like MMAcevedo if I can still avoid such a fate.
It's pretty obvious to me, but then I am a human being. I would like to live in the sort of world that human beings would like to live in.
I don't particularly blame humans for this world being full of suffering. We didn't invent parasitoid wasps. But we have certainly not used our current powers very responsibly. We did invent factory farms. And most of us do not particularly care.
I am very afraid more powerful humans/human-aligned beings will invent even worse horrors. And if we tolerate factory farming it seems possible we will tolerate the new horrors. So I cannot be confident that humans gaining more power, even if it was equitably distributed among humans, would actually be a good thing.
I fear this has already happened and I am already at the mercy of those vastly more powerful humans. In that sense, I fear for myself! But even if I am safe I fear for the many beings who are not. We can't even save the pigs, how are we going to save the ems!
But don't you share the impression that with increased wealth humans generally care more about the suffering of others? The story I tell myself is that humans have many basic needs (e.g. food, safety, housing) that historically conflicted with 'higher' desires like self-expression, helping others or improving the world. And with increased wealth, humans relatively universally become more caring. Or maybe more cynically, with increased wealth we can and do invest more resources into signalling that we are caring good reasonable people, i.e. the kinds of people others will more likely choose as friends/mates/colleagues.
This makes me optimistic about a future in which humans still shape the world. Would be grateful to have some holes poked into this. Holes that spontaneously come to mind:
I don't know how it will all play out in the end. I hope kindness wins and I agree the effect you discuss is real. But it is not obvious that our empathy increases faster than our capacity to do harm. Right now, for each human there are about seven birds/mammals on farms. This is quite the catastrophe. Perhaps that problem will eventually be solved by lab meat. But right now animal product consumption is still going up worldwide. And many worse things can be created and maybe those will endure.
People can be shockingly cruel to their own family. Scott's Who by Very Slow Decay is one of the scariest things I ever read. How can people do this to their own parents?
With increased wealth, humans relatively universally become more caring? Is this why billionaires are always giving up the vast majority of their fortunes to feed the hungry and house the homeless while willingly living on rice and beans?
If you donate to AI alignment research, it doesn't mean that you get to decide which values are loaded. Other people will decide that. You will then be forced to eat the end result, whatever it may look like. Your mistaken assumption is that there is such a thing as "human values", which will cause a world that is good for human beings in general. In reality, people have their own values, and they include terms for "stopping other people from having what they want", "making sure my enemies suffer", "making people regret disagreeing with me", and so on.
When people talk about "human values" in this context, I think they usually mean something like "goals that are Pareto optimal for the values of individual humans"- and the things you listed definitely aren't that.
If we are talking about any sort of "optimality", we can't expect even individual humans to have these "optimal" values, much less so en masse. Of course it is futile to dream that our deus ex machina will impose those fantastic values on the world if 99% of us de facto disagree with them.
I'm not sure they mean that. Perhaps it would be better to actually specify the specific values you want implemented. But then of course people will disagree, including the actual humans who are trying to build AGI.
What do you believe would happen to a neurotypical forced to have self-awareness and a more accurate model of reality in general?
The idea that they become allistic neurodivergents like me is, of course, a suspicious conclusion, but I'm not sure I see a credible alternative. CEV seems like an inherently neurodivergent idea, in the sense that forcing people (or their extrapolated selves) to engage in analysis is baked into the concept.
I often honestly struggle to see neurotypicals as sane, but I'm hideously misanthropic at times. The problem is, I became the way I am through a combination of childhood trauma and teenage occultism (together with a tendency to be critical of everything), which is a combination that most people don't have and possibly shouldn't have; I don't know how to port my natural appetite for rationality to a "normal" brain.
Exactly your point is what has prevented me from adopting the orthodox LessWrong position. If I knew that in the future Clippy was going to kill me and everyone else, I would consider that a neutral outcome. If, however, I knew that in the future some group of humans were going to successfully align an AGI to their interests, I would be far more worried.
If anyone knows of an Eliezer or SSC-level rebuttal to this, please let me know so that I can read it.
The way I see it, the only correct value to align any AI to is not the arbitrary values of humans-in-general, assuming such a thing even exists, but rather the libertarian principle of self-ownership / non-aggression. The perfect super-AI would have no desire or purpose other than to be the "king" of an anarcho-monarchist world state and rigorously enforce contracts (probably with the aid of human, uplift, etc interpreters, judges, and juries stipulated in the contracts themselves, so that the AI does not have to make decisions about what is reasonable), including a basic social contract, binding on all sentient beings, that if they possess the capacity for moral reasoning, they are required to respect certain fundamental rights of all other sentient beings. (This would include obvious things like not eating them.) It would, essentially, be a sentient law court (and police force, so that it can recognize violations and take appropriate action), in which anything that has consciousness has its rights protected. For a super-AI to be anything other than that is asking for trouble.
I bet solutions have been somewhat found/sustained in other contexts (high-performance research teams, low-runway startups, warfare, uh... maybe astronauts?).
I'm trying to read more on this topic for a new post. Any other ideas, or books or historical events or people worth adding to my list?
I think the appropriate emotion is desperate, burning hatred for every vile inhuman force, such as Moloch, death, paperclip maximizers, carnism (considered as a mind parasite almost universally endemic to the human population), or FFI, that blindly and callously defiles the minds and bodies of sentient beings who deserve so much better. I unfortunately 1. cannot maintain such an emotion for long and 2. find my resentment constantly leaking out against other people who lack understanding of these issues and whose actions unknowingly (or knowingly, in the case of carnists, but bless them, they know not what they do) support them, which does not help. So... maybe I'm wrong.
AI alignment isn't the only problem. Most people's values are sufficiently unaligned with my own that I find solving AI unattractive as a goal. Even if I had a robust lever to push, such as donating to an AI alignment research org or lobby think tank and it was actually cost-effective, the end result would still be unaligned (with me) values being loaded. So there are two steps rather than one: First, you have to make sure the people who create AI have values aligned with yours, and then you have to make sure that the AI has values aligned with the people creating it.
Frankly, this is hopeless from my perspective. Just the first step is impossible. I know this from years of discussions and debates with my fellow human beings, and from observing politics. The most basic litmus test for me is if they force fates worse than death on people who explicitly disagree. In other words, if suffering is mandatory or if people will respect other people's right to choose painless death as an ultima ratio solution for their own selves (not forcing it on others). This is something so basic and trivial, and yet so existential that I consider it a question where no room for compromise is possible from my perspective. And I am observing that, even though public opinion robustly favors some forms of suicide rights, the governments of this world have completely botched the implementation. And that is just one source of disagreement, the one I choose as a litmus test because the morally correct answer is so obvious and non-negotiable from my perspective.
The upside opportunities from the alleged utopias we can achieve if we get the Singularity right also suffer from this problem. I used to think that if you can just make life positive enough, the downside risks might be worth taking. So we could implement (voluntary) hedonic enhancements, experience machines and pleasure wireheading offers to make it worthwhile for those people who want it. These could be so good that it would outweigh the risk, and investing in such future life could be worth it. But of course those technologies are decried as "immoral" also, by the same types of "moralists" who decry suicide rights. To quote former LessWrong user eapache:
There is a lot of talk about "moral obligations" and "ethics" and very little about individual liberty and the ability to actually enjoy life to its fullest potential. People, especially the "moral" ones, demand Sacrifices to the Gods, and the immoral ones are just as likely to create hells over utopias. I see no value in loading their values into an AI, even if it could be done correctly and cost-effectively.
Luckily, I don't care about the fate of the world in reflective equilibrium, so I can simply enjoy my life with lesser pleasures and die before AGI takes over. At least this strategy is robust and doesn't rely on convincing hostile humans (outside of deterring more straightforward physical attacks in the near-term, which I do with basic weaponry) let alone solving the AGI problem. I "solve" climate change the same way.
Was that sentence missing a period, or something else?
You're talking about, among other things, death. So it isn't silly. Silly would be:
[You're playing a video game. It looks like you're going to lose. And someone says: ]
"Do not go gentle into that good night."
Fair enough. "Silly" is out.