"If You're Not a Holy Madman, You're Not Trying"

by abramdemski, 2 min read, 28th Feb 2021, 14 comments


Taking Ideas Seriously, Distinctions, Rationality

I've been reading the hardcover SSC collection in the mornings, as a way of avoiding getting caught up in internet distractions first thing when I get up. I'd read many of Scott Alexander's posts before, but nowhere near everything posted; and I hadn't before made any attempt to dive into the archives to "catch up" to the seeming majority of rationalists who have read everything Scott Alexander has ever written.

(The hardcover SSC collection is nowhere near everything on SSC, not to mention Scott's earlier squid314 blog on LiveJournal. I'm curious how much shelf space a more complete anthology would occupy.)

Anyway, this has gotten me thinking about the character of Scott Alexander's writing. I once remarked (at a LessWrong meetup) that Scott Alexander "could never be a cult leader". I intended this as a sort of criticism. Scott Alexander doesn't write with conviction in the same way some other prominent rationalist authors do. He usually has the attitude of a bemused bystander who is merely curious about a bunch of things. Some others in the group agreed with me, but took it as praise: compared to some other rationalist authors, Scott Alexander isn't an ideologue.

(now I fear 90% of the comments are going to be some variation of "cults are bad")

What I didn't realize was how obsessed Scott Alexander himself is with this distinction. Many of his posts grapple with variations on the question of just how seriously we can take our ideas without going insane, contrasting the holy madman in the desert (who takes ideas 100% seriously) with the detached academic (who takes an intellectual interest in philosophy without applying it to life).

  • Beware Isolated Demands for Rigor is the post which introduces and seriously fleshes out this distinction. Scott says the holy madman and the detached academic are two valid extremes, because both of them are consistent in how they call for principles to be applied (the first always applies their intellectual standards to everything; the second never does). What's invalid is when you use intellectual standards as a tool to get whatever you want, by applying the standards selectively.
  • Infinite Debt forges a middle path, praising Giving What We Can for telling people that you can just give 10% to charity and be an "Officially Recognized Good Person" -- you don't need to follow your principles all the way to giving away everything, or alternately, ignore your principles entirely. By following a simple collectively-chosen rule, you can avoid applying principles selectively in a self-serving way.
  • Bottomless Pits Of Suffering talks about the cases where utilitarianism becomes uncomfortable and it's tempting to ignore it.

But related ideas are in many other posts. It's a thread which runs throughout Scott's writing. (IMHO.)

This conflict is central to the human condition, or at least the WASP/WEIRD condition. I imagine most of Scott's readers felt similar conflicts around applying their philosophies in practice.

But this is really weird from a decision-theoretic perspective. An agent should be unsure of its principles, rather than sure of its principles but unsure about applying them.

I get the impression that Scott implicitly believes maximizing his own values would be bad somehow.

Some of this makes sense from a Goodhart perspective. Any values you explicitly articulate are probably not your values. But I don't get the sense that this is what's going on in Scott's writing. For example, when he describes altruists selling all their worldly possessions, it doesn't sound like he intends it as an example of Goodhart; it sounds like he intends it as a legit example of altruists maximizing altruist values.

In contrast, blogs like Minding our way to the heavens give me more of a sense of pushing the envelope on everything; I associate it with ideas like:

  • If you aren't putting forth your full effort, it probably means this isn't your priority. Figure out whether it's worth doing at all, and if so, what the minimal level of effort to get what you want is. (Or, if it really is important, figure out what's stopping you from giving it your full effort.) You can always put forth your full effort at the meta-level of figuring out how much effort to put into which things.
  • If you repeatedly don't do things in line with your "values", you're probably wrong about what your values are; figure out what values you really care about, so that you can figure out how best to optimize those.
  • If you find that you're fighting yourself, figure out what the fight is about, and find a way to best satisfy the values that are in conflict.

In more SSC-like terms, it's like, if you're not a holy madman, you're not trying.

I'm not really pushing a particular side, here, I just think the dichotomy is interesting.



It's not clear that people should be agents. Agents are means of setting up content of the world to accord with values; they are not optimized for being the valuable content of the world. So a holy madman has a work-life balance problem: they are an instrument of their values rather than an incarnation of them.

they are an instrument of their values rather than an incarnation of them.

This is a very striking statement, and I want to flag it as excellent.

I think there are a couple of responses the holy-madman type can give:

  • The holy-madman aesthetic is actually pretty nice. Human values include truth, which requires coherent thought. And in fiction, we especially enjoy heroes who go after coherent goals. So in practice in our current world, the tails don't come apart much. At worst, people who manage to be more agentic aren't making too big of a sacrifice in the incarnation department. And perhaps they're actually better-off in that respect.
  • A coherent agent is basically what happens when you can split up the problem of deciding what to do and doing it, because most of the expected utility is in the rest of the world. An effective altruist who cares about cosmic waste probably thinks your argument is referring to something pretty negligible in comparison. Even if you argue functional decision theory means you're controlling all similar agents, not just yourself, that could still be pretty negligible.

The nice things are skills and virtues, parts of designs that might get washed away by stronger optimization. If people or truths or playing chess are not useful/valuable, agents get rid of them, while people might have a different attitude.

(Part of the motivation here is in making sense of corrigibility. Also, I guess simulacrum level 4 is agency, but humans can't function without a design, so attempts to take advantage of the absence of a design devolve into incoherence.)

Here's a model for you:

Assume that Value Alignment is a single variable. We want to maximize it by optimizing our behaviors. But we have a limited budget for object-level action and any given meta-level of strategizing. For that reason, we iteratively strategize, act, and evaluate in sprints. During the sprint, we fully commit to the strategy. After the sprint, we demonstrate and evaluate the results, and plan the next sprint.

We assume that conditions and needs will be constantly changing in unpredictable ways, which we can only discover through this sort of iterated effort. A plan/sprint/review-like approach allows us to balance the need for adaptability with the need for forward motion.

From this point of view, the Holy Madman and the Detached Academic are both failing to implement an effective strategy. The Holy Madman has left out the part where you review and adapt; the Detached Academic has left out the part where you sprint. A third archetype, perhaps the Effective Altruist (?), brings the whole strategy together.

Some blogs seem to be for directing the short-term sprints. Others seem to be about long-term, tentative strategizing and noticing confusion. When you read Eliezer Yudkowsky's writing, it makes you feel like doing something right now, but you're also going to need to revisit it in a couple weeks to see if it's holding up. Eliezer is not your pal - he's your boss.

Reading Scott, there's no sense of urgency. You don't need to do anything. But his best writing sticks with you in the way that an easy friendship does.

IMO the source of this apparent conflict is that we pretend that our values and beliefs are something different from our actual (unconscious) values and beliefs. The "conflict" is either just play-acting about how we take those pretense values seriously, or an attempt to justify the contradiction between stated and revealed preferences without giving up on the pretense.

Right, I think this is a pretty plausible hypothesis. 

Here's another perspective: Scott is writing from the perspective of (something like) the memes, who exert some control but don't have root access. The memes have a lot of control over when we feel good or bad about ourselves (this is a primary control mechanism they have). But the underlying biological organism has more control over what we actually do or don't do.

The memes also don't have a great deal of self-awareness of this split agency. They see themselves as the biological organism. So they're actually a bit puzzled about why the organism doesn't maximize the memetic values all the time.

One strategy which the memes use, in response to this situation, is to crank up the guilt-o-meter whenever actions don't reflect explicitly endorsed values.

Scott and Nate are both arguing against this strategy. Scott's SSC perspective is something like: "Don't feel guilty all the time. You don't have to go all the way with your principles. It's OK to apply those principles selectively, so long as you make sure you're not doing it in a biased way to get what you want."

This is basically sympathetic to the "you should feel guilty if you do bad things" idea, but arguing about how to set the threshold.

Nate's Minding Our Way perspective is instead: "Guilt isn't an emotion that a unified agent would feel. So you must be a fractured agent. You're at war with yourself; what you need is a peace treaty. Work to recognize your fractured architecture, and negotiate better and better treaties. After a while you'll be acting like a unified agent."

I’ve been reading the hardcover SSC collection in the mornings, as a way of avoiding getting caught up in internet distractions first thing when I get up. I’d read many of Scott Alexander’s posts before, but nowhere near everything posted; and I hadn’t before made any attempt to dive the archives to “catch up” to the seeming majority of rationalists who have read everything Scott Alexander has ever written.

Just a note that these are based on the SlateStarCodexAbridged edition of SSC:


And just to clarify what that means, from their website:

(The “abridged” in this site’s title doesn’t, by the way, mean that any of the posts have been shortened—only that the collection as a whole is a selected subset of Scott Alexander’s writing. Each individual post comes to you in its full and glorious length; not a word has been omitted from any of these essays.)

For example, when he describes altruists selling all their worldly possessions, it doesn't sound like he intends it as an example of Goodhart; it sounds like he intends it as a legit example of altruists maximizing altruist values.

Goodharting is one thing, another thing is short-term (first-order) consequences vs long-term (second-order) consequences.

Imagine that you are the only altruist ever existing in the universe. You cannot reproduce or make your copy or spread your values. Furthermore, you are terminally ill and you know for sure that you will die in a week.

From that perspective, it would make sense to sell all your worldly possessions, spend the money to create as much good as you can, and die knowing you created the most good possible, and while it is sad that you couldn't do more, it cannot be helped.

(Note that this thought experiment does not require you to be perfectly altruistic. Not only are you allowed to care about yourself, you are even allowed to care about yourself more than about the others. Suppose you value yourself as much as the rest of the universe together. That still makes it simple: spend 50% of your money to make the remaining week as pleasurable for yourself as possible, and the remaining 50% to improve the world as much as possible.)

We do not live in such a situation though. There are many people who feel altruistic to a smaller or greater degree, and what any specific one of them does is most likely just a drop in the ocean. The drop may be even smaller than the waves it creates. Maybe instead of becoming e.g. a lawyer and donating your entire salary to charity, you could become e.g. a teacher or a writer, and influence many other people, so that they become lawyers and donate their salaries to charity... thus indirectly contributing to charities much more than you could do alone.

Of course this approach contains its own risk of going too meta -- if literally everyone who ever feels altruistic becomes a teacher or a writer, and spends their whole salary on flyers promoting effective altruism, that would mean that the charity actually gets nothing at all. (Especially if it becomes common belief that being a meta-altruist is much better -- i.e. higher status -- than being a mere object-level altruist.)

The effect Scott probably worries about is the following: Should it become known that altruists generally live happy lives, or should it become known that altruists generally suffer a lot in order to maximize the global good? In the short term, the latter creates more good -- optimizing for charity gives more to charity than optimizing for a combination of charity and self-preservation. But in the long term, don't be surprised if people who are generally willing to help others, but have a strong self-preservation instinct, decide that this altruism thing is not for them. A suffering altruist is an anti-advertisement for altruism. Therefore, in the name of maximizing the global good (as opposed to maximizing the good created personally by themselves) an effective altruist should strive to live a happy life! Because that attracts more people to become effective altruists, and more altruists can together create more good. But you should still donate some money, otherwise you are not an altruist.

So we have a collective problem of finding a function f such that if we make it a social norm that each altruist x should donate f(x), the total amount donated to charities is maximized. It should be sufficiently high so that money actually is donated, and sufficiently low so that people are not discouraged from becoming altruists. And it seems like "donate 10% of your income" is a very good rule from this perspective.
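
To make the trade-off concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the norm is a flat fraction of income, and the `participation` curve (how many would-be altruists stay on board as the requested fraction rises) is a hypothetical shape, not data. It only shows the structure of the problem: a higher rate collects more per person but from fewer people.

```python
def participation(rate: float, steepness: float = 4.0) -> float:
    """Hypothetical fraction of would-be altruists who adopt the norm.

    Assumed shape only: everyone joins when nothing is asked (rate 0),
    nobody joins when everything is asked (rate 1).
    """
    return (1.0 - rate) ** steepness


def total_donated(rate: float, steepness: float = 4.0,
                  population: int = 1000, income: float = 50_000.0) -> float:
    """Total given under the norm 'donate `rate` of your income'."""
    donors = population * participation(rate, steepness)
    return donors * income * rate


def best_rate(steepness: float = 4.0) -> float:
    """Grid search over whole-percent norms for the one that raises the most."""
    candidates = [i / 100 for i in range(101)]
    return max(candidates, key=lambda r: total_donated(r, steepness))


if __name__ == "__main__":
    for k in (4.0, 9.0):
        print(f"steepness={k}: best norm is {best_rate(k):.0%} of income")
```

Under this assumed curve the optimum sits at 1 / (steepness + 1), so a steepness of 4 gives 20% and a steepness of 9 gives 10%. In other words, the "right" rule depends entirely on how quickly people are put off, which is the empirical question the toy model cannot answer.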

Right, I agree with your distinction. I was thinking of this as something Scott was ignoring, when he wrote about selling all your possessions. I don't want to read into it too much, since it was an offhand example of what it would look like to go all the way in the taking-altruism-seriously direction. But it does seem like Scott (at the time) implicitly believed that going too far would include things of this sort. (That's the point of his example!) So when you say:

The effect Scott probably worries about is the following:

I'm like, no, I don't think Scott was explicitly reasoning this way. Infinite Debt was not about how altruists need to think long-term about what does the most good. It was a post about how it's OK not to do that all the time, and principles like altruism shouldn't be allowed to ask arbitrarily much from us. Yes, you can make an argument "thinking about the long-term good all the time isn't the best way to produce the most long-term good" and "asking people to be as good as possible isn't the best way to get them to be as good as possible" and things along those lines. But for better or worse, that's not the argument in the post.

I agree this is a common thread in Scott's writing (though I bet I've read less than you did). As Tim Urban remarked recently, Scott is a master at conveying his confidence level in his writing. He knows both how to write with conviction when he's very confident, and how to convey his uncertainty when he's uncertain. It may come from confidence in his calibration about a claim instead of in the claim itself. It sounds much harder to write a post arguing that we should believe X with 80% confidence than just a post arguing that it's true. And these are exactly the sort of posts Scott is exceptionally good at.

P.S: Cults are bad :)

It may come from confidence in his calibration about a claim instead of in the claim itself.

Minding Our Way addresses this very phenomenon in Confidence All The Way Up. To my eye, Scott Alexander articulates his uncertainty with an air of meta-uncertainty; even when he sounds certain, he sounds tentatively uncertain. For example, his posts sometimes proceed in sections where each tells a strong story, but the next section contradicts the story, telling a new story from an opposite perspective. This gives a sense that no matter how strong an argument is, it could be knocked down by an even stronger argument which blindsides you. This kind of thing is actually another obsession of Scott's (by my estimation).

In contrast, Nate Soares articulates his uncertainty with an air of meta-confidence; he's uncertain, but he knows a lot about where that uncertainty comes from and what would change his mind. He can put numbers to it. If he's not sure about what would change his mind, he can tell you about how he would figure it out. And so on.

Thanks for writing this out. I'm more sympathetic to Nate Soares's view and wish more rationalists would take action on their beliefs. This post is a useful thing to point to for the distinction.