I've been reading the hardcover SSC collection in the mornings, as a way of avoiding getting caught up in internet distractions first thing when I get up. I'd read many of Scott Alexander's posts before, but nowhere near everything posted; and I hadn't before made any attempt to dive into the archives to "catch up" to the seeming majority of rationalists who have read everything Scott Alexander has ever written.

(The hardcover SSC collection is nowhere near everything on SSC, not to mention Scott's earlier squid314 blog on LiveJournal. I'm curious how much shelf space a more complete anthology would occupy.)

Anyway, this has gotten me thinking about the character of Scott Alexander's writing. I once remarked (at a LessWrong meetup) that Scott Alexander "could never be a cult leader". I intended this as a sort of criticism. Scott Alexander doesn't write with conviction in the same way some other prominent rationalist authors do. He usually has the attitude of a bemused bystander who is merely curious about a bunch of things. Some others in the group agreed with me, but took it as praise: compared to some other rationalist authors, Scott Alexander isn't an ideologue.

(now I fear 90% of the comments are going to be some variation of "cults are bad")

What I didn't realize (at the time) was how obsessed Scott Alexander himself is with this distinction. Many of his posts grapple with variations on the question of just how seriously we can take our ideas without going insane, contrasting the holy madman in the desert (who takes ideas 100% seriously) with the detached academic (who takes an intellectual interest in philosophy without applying it to life).

  • Beware Isolated Demands for Rigor is the post which introduces and seriously fleshes out this distinction. Scott says the holy madman and the detached academic are two valid extremes, because both of them are consistent in how they call for principles to be applied (the first always applies their intellectual standards to everything; the second never does). What's invalid is when you use intellectual standards as a tool to get whatever you want, by applying the standards selectively.
  • Infinite Debt forges a middle path, praising Giving What We Can for telling people that you can just give 10% to charity and be an "Officially Recognized Good Person" -- you don't need to follow your principles all the way to giving away everything, or alternately, ignore your principles entirely. By following a simple collectively-chosen rule, you can avoid applying principles selectively in a self-serving (or overly not-self-serving) way.
  • Bottomless Pits Of Suffering talks about the cases where utilitarianism becomes uncomfortable and it's tempting to ignore it.

But related ideas are in many other posts. It's a thread which runs throughout Scott's writing. (IMHO.)

This conflict is central to the human condition, or at least the WASP/WEIRD condition. I imagine most of Scott's readers felt similar conflicts around applying their philosophies in practice.

But this is really weird from a decision-theoretic perspective. An agent should be unsure of principles, not sure of principles but unsure about applying them. (Related.)

It's almost like Scott implicitly believes maximizing his own values would be bad somehow.

Some of this makes sense from a Goodhart perspective. Any values you explicitly articulate are probably not your values. But I don't get the sense that this is what's going on in Scott's writing. For example, when he describes altruists selling all their worldly possessions, it doesn't sound like he intends it as an example of Goodhart; it sounds like he intends it as a legit example of altruists maximizing altruist values.

In contrast, blogs like Minding our way to the heavens give me more of a sense of pushing the envelope on everything; I associate it with ideas like:

  • If you aren't putting forth your full effort, it probably means this isn't your priority. Figure out whether it's worth doing at all, and if so, what the minimal level of effort to get what you want is. (Or, if it really is important, figure out what's stopping you from giving it your full effort.) You can always put forth your full effort at the meta-level of figuring out how much effort to put into which things.
  • If you repeatedly don't do things in line with your "values", you're probably wrong about what your values are; figure out what values you really care about, so that you can figure out how best to optimize those.
  • If you find that you're fighting yourself, figure out what the fight is about, and find a way to best satisfy the values that are in conflict.

In more SSC-like terms, it's like, if you're not a holy madman, you're not trying.

I'm not really pushing a particular side, here, I just think the dichotomy is interesting.

21 comments

It's not clear that people should be agents. Agents are means of setting up content of the world to accord with values, they are not optimized for being the valuable content of the world. So a holy madman has a work-life balance problem, they are an instrument of their values rather than an incarnation of them.

they are an instrument of their values rather than an incarnation of them.

This is a very striking statement, and I want to flag it as excellent.

It seems to me that the agents you are considering don't have as complex a utility function as people, who seem to at least in part consider their own well-being as part of their utility function. Additionally, people usually don't have a clear idea of what their actual utility function is, so if they want to go all-in on it, they let some values fall by the wayside. AFAIK this limitation is not a requirement for an agent.


If you had your utility function fully specified, I don't think you could be considered both rational and also not a "holy madman". (This borders on my answer to the question of free will, which so far as I can tell, is a question that should not explicitly be answered, so as to not spoil it for anyone who wants to figure it out for themselves.)

Suffice it to say that optimized/optimal function should be a convergent instrumental goal, similar to self-preservation, and a rational agent should thereby have it as a goal. If I am not mistaken, this means that a problem in work-life balance, as you put it, is not something that an actual rational agent would tolerate, provided there are options to choose from that don't include this problem and have a similar return otherwise.

Or did I misinterpret what you wrote? I can be dense sometimes...^^

No, sounds right to me, at least approximately. It would be interesting to have theorems.

My position on free will is pretty developed, so I don't think you'd be spoiling anything if you DMed me with that part of the thought.

I think there are a couple of responses the holy-madman type can give:

  • The holy-madman aesthetic is actually pretty nice. Human values include truth, which requires coherent thought. And in fiction, we especially enjoy heroes who go after coherent goals. So in practice in our current world, the tails don't come apart much. At worst, people who manage to be more agentic aren't making too big of a sacrifice in the incarnation department. And perhaps they're actually better-off in that respect.
  • A coherent agent is basically what happens when you can split up the problem of deciding what to do and doing it, because most of the expected utility is in the rest of the world. An effective altruist who cares about cosmic waste probably thinks your argument is referring to something pretty negligible in comparison. Even if you argue functional decision theory means you're controlling all similar agents, not just yourself, that could still be pretty negligible.

The nice things are skills and virtues, parts of designs that might get washed away by stronger optimization. If people or truths or playing chess are not useful/valuable, agents get rid of them, while people might have a different attitude.

(Part of the motivation here is in making sense of corrigibility. Also, I guess simulacrum level 4 is agency, but humans can't function without a design, so attempts to take advantage of the absence of a design devolve into incoherence.)

Here's a model for you:

Assume that Value Alignment is a single variable. We want to maximize it by optimizing our behaviors. But we have a limited budget for object-level action and any given meta-level of strategizing. For that reason, we iteratively strategize, act, and evaluate in sprints. During the sprint, we fully commit to the strategy. After the sprint, we demonstrate and evaluate the results, and plan the next sprint.

We assume that conditions and needs will be constantly changing in unpredictable ways, which we can only discover through this sort of iterated effort. A plan/sprint/review-like approach allows us to balance the need for adaptability with the need for forward motion.

From this point of view, the Holy Madman and the Detached Academic are both failing to implement an effective strategy. The Holy Madman has left out the part where you review and adapt; the Detached Academic has left out the part where you sprint. A third archetype, perhaps the Effective Altruist (?), brings the whole strategy together.

Some blogs seem to be for directing the short-term sprints. Others seem to be about long-term, tentative strategizing and noticing confusion. When you read Eliezer Yudkowsky's writing, it makes you feel like doing something right now, but you're also going to need to revisit it in a couple weeks to see if it's holding up. Eliezer is not your pal - he's your boss.

Reading Scott, there's no sense of urgency. You don't need to do anything. But his best writing sticks with you in the way that an easy friendship does.

I think that the approaches based on being a holy madman greatly underestimate the difficulty of being a value maximiser running on corrupted, basic human hardware.

I'd be extremely skeptical of anyone who claims to have found a way to truly maximise their utility function, even if they claim to have avoided all the obvious pitfalls of burning out and so on.

It would be extremely hard to reconcile "put forth your full effort" with staying rational enough to notice you're burning yourself out, or that you're getting stuck in some suboptimal route because you're not leaving yourself enough slack to notice better opportunities.

 

The detached academic seems to me an odd way to describe Scott Alexander, who seems to make a really effective effort to spread his values and live his life rationally. For him, most of the issues he talks about seem to be pretty practical and relevant, even if he often takes an interest in whatever makes him curious and isn't dropping everything to work on AI, or to maximise the number of competent people who would work on AI.

 

I'm currently in a now-nine-months-long attempt to move from detached-lazy-academic to making an extraordinary effort.

So far every attempt to accurately predict how much of a full effort I can make without getting backlash that makes me worse at it in the next period has failed.

Lots of my plans have failed, so if I had gone along with plans that required me to make sacrifices, as taking an idea Seriously would require you to do, it would have left me at a serious loss.

What worked best and produced the most results was keeping a curious attitude toward plans and subjects that are related to my goal, studying to increase my competence in related areas even if I don't see any immediate way it could be of help, and monitoring how much "weight" I'm putting on the activities that produce the results I need.

I feel I started out being unbelievably bad at working seriously at something, but in nine months I got more results than in a lifetime (in a broad sense, not just related to my goal) and I feel like I went up a couple levels.

I try to avoid going toward any state that resembles a "holy madman" for fear of crashing hard, and I notice that what I'm doing already makes me pass as one even to my most informed friends on related subjects, when I don't censor myself to look normally modest and uninterested.

 

I might be just at such a low level in the skill of "actually working" that anything that would work great for a functional adult with a good work ethic is deadly to me.

But I'd strongly advise anyone trying the holy madman path to actively pump for as much "anti-holy-madmannes" as they can, since making the full effort to maximise for something seems to me the best way to make sure your ambition burns through any defence your naive, optimistic plans think you have put in place to protect your rationality and your mental health.

 

Cults are bad; becoming a one-man cult is entirely possible and slightly worse.

It would be extremely hard to reconcile "put forth your full effort" with staying rational enough to notice you're burning yourself out, or that you're getting stuck in some suboptimal route because you're not leaving yourself enough slack to notice better opportunities.

I bet if you said this to Nate he'd have a pretty convincing counter. Even though Nate works some ridiculous number of hours a week (in contrast to me; I'm closer to the standard 40 hours), I suspect he has enough slack, and thinks of this as part of the optimization problem.

Part of the skill of optimizing without shooting yourself in the foot is explicitly counting slack as part of the optimization problem.

Part of the meta-skill of learning to do this is always asking yourself whether you're falling into some kind of trap (mostly, forms of Goodhart), and prioritizing steps which avoid traps. EG if you were a self-modifying AGI, you would do well to self-modify in a cautious way, rather than as soon as something looks positive-EV.

However, I'm not sure whether this caution eventually cashes out to "don't be a holy madman" vs "here's how to be the right kind of holy madman", in the terms of the post.

Lots of my plans have failed, so if I had gone along with plans that required me to make sacrifices, as taking an idea Seriously would require you to do, it would have left me at a serious loss.

Yeah, I feel that I can similarly look back at my history and say that in several cases, it either has been better or would have been much better to be more the detached academic.

Mh... I guess "holy madman" is too vague a definition to have a rational debate about? I had interpreted it as "sacrifice everything that won't negatively affect your utility function later on". So the interpretation I imagined would be someone who won't leave himself an inch of comfort more than what's needed to keep the quality of his work constant.

I see slack as leaving yourself enough comfort that you'd be ready to use your free energy in ways you can't see at the moment, so I guess I was automatically assuming a "holy madman" would optimise for outputting the current best effort he can in the long term, rather than sacrificing some current effort to bet on future chances to improve the future output.

I'd define someone who's leaving this level of slack as someone who's making a serious or full effort, but not a holy madman, but I guess this doesn't mean much.

If I were to try to summarise my thoughts on what would happen in reality if someone were to try these options... I think the slack one would work better in general, both by managing to avoid pitfalls and to better exploit your potential for growth.

 

I still feel there's a lot of danger to oneself in trying to take ideas seriously though. If you start trying to act like it's your responsibility to solve a problem that's killing people, the moment you lose your grip on your thoughts is the moment you cut yourself badly, at least in my experience.

These days I've managed to reduce the harm that some recurrent thoughts were doing by focusing on distinguishing between 1) me legitimately wanting A and planning/acting to achieve A and 2) my worries related to not being able to get A or distress at things currently being not A, telling myself that 2) doesn't help me get what I want in the least, and that I can still make a full effort for 1), likely a better one, without paying 2) much attention.

(I'm afraid I've started to slightly rant from this point. I'm leaving it because I still feel it might be useful)


This strategy worked for my gender transition. 
I'm not sure how I'd react if I were to try telling myself I shouldn't care/feel bad/worry if people die because I'm not managing to fix the problem. Even if I KNOW that worrying about people dying hinders my effort to fix the problem (feeling sick and worried and tired wouldn't help in any way toward actually working on it), I still don't trust my corrupted hardware not to start running some guilt trip against me for trying to be callous, in a sense that's not utilitarian at all, by trying not to care/feel bad/worry about something like that.


Also, as a personal anecdote of possible pitfalls, trying to take personal responsibility for a global problem had drained my resources in ways I couldn't easily foresee. When I got jumped by an unrelated problem about my gender, I found myself without the emotional resources to deal with both stresses at once, so some recurrent thoughts started blaming me because I was letting a personal problem, one that was in no way as bad as being dead and didn't blip at all on any screen compared to a large number of deaths, screw up my attempt to work on something that was actually relevant. I realised immediately that this was a stupid thing to think and in no way healthy, but that didn't do much to stop it, and climbing out of that pit of stress and guilt took a while.

In short, my emotional hardware is stupid and bugged and it irritates me to no end how it can just go ahead and ignore my attempts at thinking sanely about stuff.

I'm not sure if I'm just particularly bad at this, or if I just have expectations that are too high. An external view would likely tell me that it's ridiculous for me to expect to be able to go from "lazy and detached" to "saving the world (read: reducing X-risk), while effortlessly holding at bay emotional problems that would trip most people". I'd surely tell anyone that. On the other hand, it just feels like a stupid thing to not manage to do.

(end of the rant)

 

 (in contrast to me; I'm closer to the standard 40 hours)

Can I ask if you have some sort of external force that makes you do these hours? If not, any advice on how to do that?

I'm coming from a really long tradition of not doing any work whatsoever, and so far I'm struggling to meet my current goal of 24 hours (also because the only deadlines are the ones I manage to give myself... and for reasons I guess I have explained above).

Getting to this was a massive improvement, but again, I feel like I'm exceptionally bad at working hard.

For example, when he describes altruists selling all their worldly possessions, it doesn't sound like he intends it as an example of Goodhart; it sounds like he intends it as a legit example of altruists maximizing altruist values.

Goodharting is one thing; another is short-term (first-order) consequences vs long-term (second-order) consequences.

Imagine that you are the only altruist ever existing in the universe. You cannot reproduce or make your copy or spread your values. Furthermore, you are terminally ill and you know for sure that you will die in a week.

From that perspective, it would make sense to sell all your worldly possessions, spend the money to create as much good as you can, and die knowing you created the most good possible, and while it is sad that you couldn't do more, it cannot be helped.

(Note that this thought experiment does not require you to be perfectly altruistic. Not only are you allowed to care about yourself, you are even allowed to care about yourself more than about the others. Suppose you value yourself as much as the rest of universe together. That still makes it simple: spend 50% of your money to make the remaining week as pleasurable for yourself as possible, and the remaining 50% to improve the world as much as possible.)

We do not live in such a situation though. There are many people who feel altruistic to a smaller or greater degree, and what any specific one of them does is most likely just a drop in the ocean. The drop may be even smaller than the waves it creates. Maybe instead of becoming e.g. a lawyer and donating your entire salary to charity, you could become e.g. a teacher or a writer, and influence many other people, so that they become lawyers and donate their salaries to charity... thus indirectly contributing to charities much more than you could do alone.

Of course this approach contains its own risk of going too meta -- if literally everyone who ever feels altruistic becomes a teacher or a writer, and spends their whole salary on flyers promoting effective altruism, that would mean that the charity actually gets nothing at all. (Especially if it becomes common belief that being a meta-altruist is much better -- i.e. higher status -- than being a mere object-level altruist.)

The effect Scott probably worries about is the following: Should it become known that altruists generally live happy lives, or should it become known that altruists generally suffer a lot in order to maximize the global good? In the short term, the latter creates more good -- optimizing for charity gives more to charity than optimizing for a combination of charity and self-preservation. But in the long term, don't be surprised if people who are generally willing to help others, but have a strong self-preservation instinct, decide that this altruism thing is not for them. A suffering altruist is an anti-advertisement for altruism. Therefore, in the name of maximizing the global good (as opposed to maximizing the good created personally by themselves) an effective altruist should strive to live a happy life! Because that attracts more people to become effective altruists, and more altruists can together create more good. But you should still donate some money, otherwise you are not an altruist.

So we have a collective problem of finding a function f such that if we make it a social norm that each altruist x should donate f(x), the total amount donated to charities is maximized. It should be sufficiently high so that money actually is donated, and sufficiently low so that people are not discouraged from becoming altruists. And it seems like "donate 10% of your income" is a very good rule from this perspective.
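
One way to make this concrete, as a rough sketch rather than anything the comment itself spells out: let p_x(d) be the probability that person x stays an altruist when the norm asks them to donate d. Both p_x and the total T are notation assumed here for illustration; the only symbol taken from the comment is f.

```latex
% Assumed notation (not from the original comment):
%   p_x(d) \in [0, 1] is non-increasing in the requested donation d.
T(f) = \sum_{x} p_x\bigl(f(x)\bigr)\, f(x),
\qquad
f^{*} = \arg\max_{f} T(f)
```

Under these assumptions T is zero when f asks for nothing, and it is zero again if f asks for so much that every p_x drops to zero, so any maximizer sits somewhere in between. Whether 10% is near that maximum depends on the shape of p_x, which this sketch does not pin down.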

Right, I agree with your distinction. I was thinking of this as something Scott was ignoring, when he wrote about selling all your possessions. I don't want to read into it too much, since it was an offhand example of what it would look like to go all the way in the taking-altruism-seriously direction. But it does seem like Scott (at the time) implicitly believed that going too far would include things of this sort. (That's the point of his example!) So when you say:

The effect Scott probably worries about is the following:

I'm like, no, I don't think Scott was explicitly reasoning this way. Infinite Debt was not about how altruists need to think long-term about what does the most good. It was a post about how it's OK not to do that all the time, and how principles like altruism shouldn't be allowed to ask arbitrarily much from us. Yes, you can make an argument "thinking about the long-term good all the time isn't the best way to produce the most long-term good" and "asking people to be as good as possible isn't the best way to get them to be as good as possible" and things along those lines. But for better or worse, that's not the argument in the post.

IMO the source of this apparent conflict is that we pretend that our values and beliefs are something different from our actual (unconscious) values and beliefs. The "conflict" is either just play-acting about how we take those pretense values seriously, or an attempt to justify the contradiction between stated and revealed preferences without giving up on the pretense.

Right, I think this is a pretty plausible hypothesis. 

Here's another perspective: Scott is writing from the perspective of (something like) the memes, who exert some control but don't have root access. The memes have a lot of control over when we feel good or bad about ourselves (this is a primary control mechanism they have). But the underlying biological organism has more control over what we actually do or don't do.

The memes also don't have a great deal of self-awareness of this split agency. They see themselves as the biological organism. So they're actually a bit puzzled about why the organism doesn't maximize the memetic values all the time.

One strategy which the memes use, in response to this situation, is to crank up the guilt-o-meter whenever actions don't reflect explicitly endorsed values.

Scott and Nate are both arguing against this strategy. Scott's SSC perspective is something like: "Don't feel guilty all the time. You don't have to go all the way with your principles. It's OK to apply those principles selectively, so long as you make sure you're not doing it in a biased way to get what you want."

This is basically sympathetic to the "you should feel guilty if you do bad things" idea, but arguing about how to set the threshold.

Nate's Minding Our Way perspective is instead: "Guilt isn't an emotion that a unified agent would feel. So you must be a fractured agent. You're at war with yourself; what you need is a peace treaty. Work to recognize your fractured architecture, and negotiate better and better treaties. After a while you'll be acting like a unified agent."

I’ve been reading the hardcover SSC collection in the mornings, as a way of avoiding getting caught up in internet distractions first thing when I get up. I’d read many of Scott Alexander’s posts before, but nowhere near everything posted; and I hadn’t before made any attempt to dive into the archives to “catch up” to the seeming majority of rationalists who have read everything Scott Alexander has ever written.

Just a note that these are based on the SlateStarCodexAbridged edition of SSC:

https://www.slatestarcodexabridged.com/

And just to clarify what that means, from their website:

(The “abridged” in this site’s title doesn’t, by the way, mean that any of the posts have been shortened—only that the collection as a whole is a selected subset of Scott Alexander’s writing. Each individual post comes to you in its full and glorious length; not a word has been omitted from any of these essays.)

I agree this is a common thread in Scott's writing (though I bet I've read less than you have). As Tim Urban remarked recently, Scott is a master at conveying his confidence level in his writing. He knows both how to write with conviction when he's very confident, and how to convey his uncertainty when he's uncertain. It may come from confidence in his calibration about a claim instead of in the claim itself. It sounds much harder to write a post arguing that we should believe X with 80% confidence than just a post arguing that it's true. And these are exactly the sort of posts Scott is exceptionally good at.

P.S: Cults are bad :)

It may come from confidence in his calibration about a claim instead of in the claim itself.

Minding Our Way addresses this very phenomenon in Confidence All The Way Up. To my eye, Scott Alexander articulates his uncertainty with an air of meta-uncertainty; even when he sounds certain, he sounds tentatively uncertain. For example, his posts sometimes proceed in sections where each tells a strong story, but the next section contradicts the story, telling a new story from an opposite perspective. This gives a sense that no matter how strong an argument is, it could be knocked down by an even stronger argument which blindsides you. This kind of thing is actually another obsession of Scott's (by my estimation).

In contrast, Nate Soares articulates his uncertainty with an air of meta-confidence; he's uncertain, but he knows a lot about where that uncertainty comes from and what would change his mind. He can put numbers to it. If he's not sure about what would change his mind, he can tell you about how he would figure it out. And so on.

if you're not a holy madman, you're not trying

Another Insanity Wolf meme!

Thanks for writing this out. I'm more sympathetic to Nate Soares's view and wish more rationalists would take action on their beliefs. This is useful for pointing to the distinction that exists.