I went to the NYC Secular Solstice this year. It was wondrous, but something stood out as I was compiling my thoughts about it on the way home. In one of the conversations after, someone expressed that up until now they had been assuming Solstice messages were directed at heroes but that they might have to update towards that no longer being true. Solstice spoke of heroism, and we cheered heroism, and then most people went back to the mundane lives they came from. That seemed unbearably sad to me.

I've been rereading HP:MoR recently (partially as a test drive of sequence reading on LessWrong 2.0 and partially due to a habit of rereading works that I find foundational or inspirational in my thinking) and I came back from NYC to find that I was on the tail end of the Azkaban section, specifically the refusal of the phoenix. The realization that this moment has passed, that it will not be offered again, is one of the two moments in HP:MoR that never fail to wrench a sob from me.

I think I might have wanted to be a hero, long ago, when I was a child and idolized Luke Skywalker and Mathias of Redwall and Nita Callahan. If that was true, it hasn't been true for a long time; that impulse was lost somewhere in my process of adapting to the world. The role that I wanted to fit instead was villain. I drove myself forward for years by looking to Grendel, Darth Maul, Peter Wiggin and other figures willing to ignore costs and convention to achieve what they wanted.

If Solstice is supposed to speak to heroes, then why does it speak to me?

Well, what do I mean when I call myself a villain?

I want the world to be different for my being in it. I want enough stocked power that when something in the world makes me sad or angry or upset I can reach out and make it be different. I would enjoy the thought that mankind as a whole or any given member of it owes me their lives. I like hurting people, exerting power over the world, and the feeling of winning. I’m more comfortable on the offense, causing things to happen and watching others react to me rather than waiting and reacting to what others do. I tend to act without default deference to social role- social norms form a sort of soft power that I've grown more conscious of as I've grown older, but I still can't bring myself to find "what will people think" at all a reason by itself to do or not do something. 

It's only recently that I've found my first mental urge in response to idiocy is no longer wanting to cause them pain until they stop, that “how can I help them” is my true reaction to others instead of a mask that was useful to wear.

This is a change that took place when I wasn't looking. Several years of mostly successfully suppressing anger and fury, more years before that of not needing them on a weekly or daily basis, the slow dawning that I actually can be safe and warm and fed if I want to, and perhaps most of all pretending to be a mind much like my own but slowly bettering itself- these have let me relax my deathgrip on trying to maintain the ability to hurt anyone I need to. Empathy is still strange magic to me, but compassion isn't anymore. I find myself wanting to ease the pain of others as a part of my own utility function, not just for signalling reasons.

I am a strong proponent of words having meanings, and I have to admit I am not sure how well "villain" fits anymore. Using "hero" still feels strange when I think about it- perhaps I'm the sort of villain who shows up to Endbringer fights and keeps the peace while I build resources? I Am Samwise suggests the TV Tropes title of Dragon for the strong right hand of a larger figure, which I could very circumstantially find to be an ideal relationship, but there is a time for labels and a time for recognizing that “what you have, Mr. Potter, is freedom.” I am free. What do I want to be?

I just signed a lease on my current apartment for the next year. At the end of 2018, I'll have paid off my loans from the government and from family. I had been planning to study information security, since it seemed fun and well paid, and then settle down somewhere to garden my own little piece of the world and make fantastic wads of money and then retire early and do whatever the blazes I wanted. But when I thought about terrible circumstances that could happen, slippery slopes that the world around me could slide down, I drew lines around what could constitute a call to action for me. Then AlphaZero crossed one of those lines, demanding a response.

And this also connects to something I learned at Solstice. That someone else drew a line around an uncertain danger, a threshold to demand a response, and when that threshold was passed and travel into and out of the country that they lived in was limited, they found they did not actually take exceptional action to address the danger. They did not leave, they did not revolt, they did not even (I assume, though the world would look the same if they secretly did) make contact with underground elements to secure an escape route should the need arise. When they talked about this, they talked about learning that they wouldn't actually act even when they had previously decided they would and how that was a thing worth knowing about themselves. That reminded me of the subjects in the Milgram Experiment who wrote afterward to thank Milgram for the self-knowledge, and ten-year-old!me sitting on the library floor reading about that and growing cold and declaring that I would never let my morality be set by an authority, that I would never do what I was told simply because I was told to. It made school into an even deeper level of hell, but I am glad of it now.

I need time to study, and I need the financial security the next year will bring if this is to be a marathon, but I am going to start attacking the FAI problem. I wish I could say this was due to rationally considering the expected cost and benefit, and the line was drawn this way, but I must confess it is an emotionally driven decision made by two thoughts. 

I will be strong enough to act when I decide I will act

And to hear a phoenix song go unanswered is too sad to bear.

I’ll be studying machine learning and AI alignment this coming year. My goal is to be useful enough at the former that I could do it for a job, and useful enough at the latter that any new progress would be comprehensible to me. Recommendations for study approaches or materials are appreciated. (I have read Bostrom's Superintelligence and other non-technical things, I am a decent programmer in general and in other areas of focus but haven't done any work in this one, I'm currently reading Functional Decision Theory.)



I found the last six paragraphs of this piece extremely inspiring, to the extent that I think it non-negligibly raised the likelihood that I'll be taking "exceptional action" myself. I didn't personally connect much with the first part, though it was interesting. Did you used to want to want your reaction to idiocy to be “how can I help”, even when it wasn't?

I used to get mad when people didn't know things I thought of as basic knowledge or notice things I thought of as obvious, and thought that expressing that would make them remember or pay more attention as well as being personally satisfying. Expressing that as anger got me in trouble a few times and also didn't get them to notice related but non-identical things, so I tried different reactions before settling on sounding kind and concerned and trying to explain the problem. This slightly raised the likelihood of most people doing better next time, and eventually got me a reputation for being kind, which was useful.

I'd say I've been maintaining that reaction as a deliberate mask a majority of the time for the last ~8 years. I never bothered trying to change my internal reaction, but I acted (in the sense of an actor on a stage) the way that got me the best results until it became an automatic reflex. Now I find when this happens my thoughts don't flow "What an idiot, crap can't say that out loud, right okay fake being a kind person, what would a kind person say?" They flow "Hrm, that wasn't what I wish they'd done. I wonder what I can do to make this less likely to happen again, possibly by making them feel good about doing what I'd rather they'd done instead?" I don't think I could pinpoint the moment things changed over, but it's very different when I pay attention to it.

I wrote paragraphs three to nine ("I think I might have wanted. . ." through "What do I want to be?") partially because explaining what I think now is made easier by explaining what I thought then and what changed, largely because I feel very strange responding to a call for heroes and was more comfortable responding with the caveat that I am a highly noncentral example, and lastly because it's the kind of thing I wish someone else had written and I had read when I was younger. It may be that this section could have been written better- I'm working without an editor here as none of this would make sense to the people I'd usually ask- but I am glad to have written it.

Let me know if you decide to take exceptional action, and in what domain you decide to apply yourself. Being inspiring is a source of warm fuzzies, and it's possible we could help each other.