Max H

Most of my posts and comments are about AI and alignment. Posts I'm most proud of, which also provide a good introduction to my worldview:

  • Without a trajectory change, the development of AGI is likely to go badly
  • Steering systems, and a follow up on corrigibility.
  • "Aligned" foundation models don't imply aligned systems
  • LLM cognition is probably not human-like
  • Gradient hacking via actual hacking
  • Concrete positive visions for a future without AGI

I also created Forum Karma, and wrote a longer self-introduction here.

PMs and private feedback are always welcome.

NOTE: I am not Max Harms, author of Crystal Society. I'd prefer for now that my LW postings not be attached to my full name when people Google me for other reasons, but you can PM me here or on Discord (m4xed) if you want to know who I am.

Comments

Considerations around career costs of political donations
Max H · 5d

I agree / believe you that it's common for Republican staffers to have refrained from ever donating to a Democratic cause, and that this is often more of a strategic decision than a completely uniform / unwavering opposition to every Democrat everywhere.

I still think that the precise kind of optics considerations described and recommended in this post (and in other EA-ish circles) are subtly but importantly different from what those staffers are doing. And that this difference is viscerally perceptible to some "red tribe"-coded people, but something of a blind spot for traditionally blue-tribe-coded people, including many EAs.

I'm not really making any strong claims about what the distribution / level of caring about all this is likely to be among people with hiring authority in a red tribe administration. Hanania was probably a bad example for me to pick for that kind of question, but I do think he is an exemplar of some aspects of "red tribe" culture that are at a zenith right now, and understanding that is important if you actually want to have a realistic chance at succeeding in a high-profile / appointee position in a red tribe administration. But none of this is really in tension with also just not donating to Democrats if that's your aspiration, so I'm not really strongly dis-recommending the advice in this post or anything.

Another way of putting things: I suspect that "refrained from donating to a Democrat I would have otherwise supported because I read a LW / EAF post about optics" is anti-correlated with a person's chances of actually working in a Republican administration in a high-profile capacity. But I'm not particularly confident that that's actually true in real life [edit: and not confident that the effect is causal rather than evidential], and especially not confident that the effect is large vs. the first-order effect of just quietly taking the advice in the post. I am more confident that being blind to the red-tribe cultural things I gestured at is going to be pretty strongly anti-correlated, though.

Considerations around career costs of political donations
Max H · 5d

Is the idea that Hanania is evidence that being very public about your contrarian opinions is helpful for policy influence?
 


No. I'm more saying that the act of carefully weighing up career capital / PR considerations, and then not donating to a Democrat based on a cost-benefit analysis of those considerations, feels to me like very stereotypical Democrat / blue-tribe behavior.

And further, that some people could have a visceral negative reaction to that kind of PR sensitivity more so than to the donations themselves. The Hanania post is an example of the flavor of that kind of negative reaction (though it's not exactly the same thing, I admit).

Separately, I'm not advising people to follow in Hanania's footsteps in terms of deliberately being contrarian and courting controversy, but he is a good example of "not caring about PR / self-censoring at all" and still doing well. 
 

I would rather guess that this pivot has been really costly to his influence on the right, and if he had self-censored, he'd be more influential.
 

Sure, but if he were the kind of person who would do that, he probably would not have gotten as popular as he is in the first place.

Considerations around career costs of political donations
Max H · 5d

I appreciate this analysis, especially as someone who is considering donating, falls into the target audience in some ways, and is at an opportune / time-sensitive moment.


That said, my gut reaction is that reading this analysis and then holding off on donating to a candidate you like because of these considerations feels... kinda Democrat-coded, in a negative way.


It reminded me of this post by Richard Hanania. Of course, Hanania himself is a pretty controversial figure, and could probably not get an appointment in an administration of any political stripe at this point. But he has an influence and reach on the right that is the envy of many, and which has translated into direct impact on policy. Many of his takes are also well-regarded by more left-leaning / centrist public intellectuals and writers (though probably not so much among mainstream elected Democrats), especially lately, as he has become more anti-Trump.

Anyway, donating to a political candidate is much more tame / low-stakes than anything Hanania posts on Twitter or Substack. So, if you're interested in politics or policy work (even in a narrow / relatively non-partisan way) and are impressed by what Hanania has accomplished, consider reversing the advice in this post - make whatever donations you want, lean into any controversy / trouble it brings, and don't let PR / career considerations scare you out of wearing and defending your honestly-held views.

Or, turning it around: if you find that one day you're an elected official (or staffer / advisor in the PPO) tasked with screening and vetting potential political appointees or otherwise making these kinds of hiring decisions, consider whether taking someone's past political donations into account is giving in to a culture of lameness and cowardice and femininity, at least in the eyes of Richard Hanania and his fans.

 

[edit: Not sure if it's the source of the downvotes / soldier mindset react, but to clarify, the last paragraph is the advice I would give to a Trump staffer or hypothetical Vance staffer in the PPO who is considering whether to filter out someone for a political appointment because of past political donations, couched in terms and language (from the Hanania post) that might appeal to them.]

Max H's Shortform
Max H · 6d

Rationality should not be painful.

Putting the lessons of the Sequences into practice, reflecting on and mentally rehearsing the core ideas, making them your own and weaving them into your everyday habits of thought and action until they become a part of you - at no point should any of this cause an increase in mental anguish, emotional vulnerability, depression, psychosis, mania etc., even temporarily. The worst-case consequences of absorbing these lessons should be that you regret some of your past life choices or perhaps come to realize that you're stuck in a bad situation that you can't easily change. But rationality should also leave you strictly better-equipped to deal with that situation, if you find yourself in it.

Also, the feeling of successfully becoming more rational should not feel like a sudden, tectonic shift in your mental processes or beliefs (in contrast to actually changing your mind about something concrete, which can sometimes feel like that). Rationality should feel natural and gradual and obvious in retrospect, like it was always a part of you, waiting to be discovered and adopted.

I am using "should" in the paragraphs above both descriptively and normatively. It is partly a factual claim: if you're not better off, you're probably missing something or "doing it wrong", in some concrete, identifiable way. But I am also making a normative / imperative statement that can serve as advice or a self-fulfilling prophecy of sorts - if your experience is different or you disagree, consider whether there's a mental motion you can take to make it true. 

I am also not claiming that the Valley of Bad Rationality is entirely fake. But I am saying it's not that big of a deal, and in any case the best way out is through. And also that "through" should feel natural / good / easy.


I am not very interested in meditation or jhanas or taking psychoactive drugs or various other forms of "woo". I believe that the beneficial effects that many people derive from these things are real and good, but I suspect they wouldn't work on me. Not because I don't believe in them, but because I already get basically all the plausible benefits from such things by virtue of being a relatively happy, high-energy, mentally stable person with a healthy, well-organized mind.

Some of these qualities are a lucky consequence of genetics, having a nice childhood, a nice life, being generally smart, etc. But there's definitely a chunk of it that I attribute directly to having read and internalized the Sequences in my early teens, and then applied them to thousands of tiny and sometimes not-so-tiny tribulations of everyday life over the years.


The thoughts above are partially / vaguely in response to this post and its comment section about CFAR workshops, but also to some other hazy ideas that I've seen floating around lately.

I have never been to a CFAR workshop and don't actually have a strong opinion on whether attending one is a good idea or not - if you're considering going, I'd advise you to read the warnings / caveats in the post and comments, and if you feel like (a) they don't apply to you and (b) a CFAR workshop sounds like your thing, it's probably worth going. You'll probably meet some interesting people, have fun, and learn some useful skills. But I suspect that attending such a workshop is not a necessary or even all that helpful ingredient for actually becoming more rational.


A while ago, Eliezer wrote in the preface for the published version of the Sequences:

It ties in to the first-largest mistake in my writing, which was that I didn’t realize that the big problem in learning this valuable way of thinking was figuring out how to practice it, not knowing the theory. I didn’t realize that part was the priority; and regarding this I can only say “Oops” and “Duh.”

Yes, sometimes those big issues really are big and really are important; but that doesn’t change the basic truth that to master skills you need to practice them and it’s harder to practice on things that are further away. (Today the Center for Applied Rationality is working on repairing this huge mistake of mine in a more systematic fashion.)

And has also written:

Jeffreyssai inwardly winced at the thought of trying to pick up rationality by watching other people talk about it—

Maybe I'm just typical-minding / generalizing from one example here, but in my case, simply reading a bunch of blog posts and quietly reflecting on them on my own did work. In retrospect it feels like the only thing that could have worked, or at least that attending a workshop, practicing a bunch of rationality exercises from a handbook, discussing in a group setting, etc. would not have been particularly effective on its own, and might even have been a distraction or a detriment.

And, regardless of whether the caveats / warnings / dis-recommendations in the CFAR post and comments are worth heeding, I suspect they're pointing at issues that are just not that closely related to (what I think of as) the actual core of learning rationality.

1a3orn's Shortform
Max H · 18d

(By "coherent" I (vaguely) understand an entity (AI, human, etc) that does not have 'conflicting drives' within themself, that does not want 'many' things with unclear connections between those things, one that always acts for the same purposes across all time-slices, one that has rationalized their drives and made them legible like a state makes economic transactions legible.)

 

Coherence is mostly about not stepping on your own toes; i.e. not taking actions that get you strictly less of all the different things that you want, vs. some other available action. "What you want" is allowed to be complicated and diverse and include fuzzy time-dependent things like "enough leisure time along the way that I don't burn out". 

This is kind of fuzzy / qualitative, but on my view, most high-agency humans act mostly coherently most of the time, especially but not only when they're pursuing normal / well-defined goals like "make money". Of course they make mistakes, including meta ones (e.g. misjudging how much time they should spend thinking / evaluating potential options vs. executing a chosen one), but not usually in ways that someone else in their shoes (with similar experience and g) could have easily / predictably done better without the benefit of hindsight.

Here are some things a human might stereotypically do in the pursuit of high ability-to-act in the world, as it happens in humans:

  • Try to get money through some means
  • Try to become close friends with powerful people
  • Take courses or read books about subject-matters relevant to their actions
  • Etc

Lots of people try to make money, befriend powerful / high-status people around them, upskill, etc. I would only categorize these actions as pursuing "high ability-to-act" if they actually work, on a time scale and to a degree such that the doer actually ends up with the result they wanted or the leverage to make it happen. And then the actual high ability-to-act actions are the more specific underlying actions and mental motions that actually worked. E.g. a lot of people try starting AGI research labs or seek venture capital funding for their startup or whatever, but few of them actually succeed in creating multi-billion dollar enterprises (real or not). The top-level actions might look sort of similar, but the underlying mental motions and actions will look very different depending on whether the company is (successful and real), (successful and fraud), or a failure. The actual pursuing-high-ability-to-act actions are mostly found in the (successful and real, successful and fraud) buckets.

And here are some things a human might stereotypically do while pursuing coherence.

  • Go on a long walk or vacation reflecting on what they've really wanted over time
  • Do a bucketload of shrooms
  • Try just some very different things to see if they like them
  • Etc

Taking shrooms in particular seems like a pretty good example of an action that is almost certainly not coherent, unless there is some insight that you can only have (or reach the most quickly) by taking hallucinogenic drugs. Maybe there are some insights like that, but I kind of doubt it, and trying shrooms before you've exhausted other ideas, in some vague pursuit of a misunderstood concept of coherence, is not the kind of thing I would expect to be common in the most successful humans or AIs. There are of course exceptions (very successful humans who have taken drugs and attribute some of their success to it), but my guess is that their success is mostly in spite of the drug use, or at least that the drug use was not actually critical.

The other examples are maybe stereotypes of what some people think of as pursuing coherent behavior, but I would guess they're also not particularly strongly correlated with actual coherence.

Notes on fatalities from AI takeover
Max H · 1mo

It's unclear what fraction of people die due to takeover because this is expedient for the AI, but it seems like it could be the majority of people and could also be almost no one. If AIs are less powerful, this is more likely (because AIs would have a harder time securing a very high chance of takeover without killing more humans).

Yeah, this (extinction to facilitate takeover) seems like the most plausible pathway to total or near-total extinction by far. An AI that is only a little bit smarter than humanity collectively has to worry about humans making a counter-move - launching missiles, building a competing AI, or various kinds of sabotage. If you're a rogue AI, engineering a killer virus (something that smart humans can already do or almost do, if they wanted to) as soon as you or humanity has built out sufficient robotics infrastructure makes all the subsequent parts of your takeover / expansion plan much less contingent and more straightforward to reason about. (And I think the analogy to historical, relatively bloodless coups here is a pretty weak counter / faint hope - for one, because human coup instigators generally still need humans to rule over, whereas AIs wouldn't.)
 

If there are a large number of different rogue AIs, it becomes more likely that one of them would benefit from massive fatalities (e.g. due to a pandemic) making this substantially more likely.

I don't see how the number of AIs makes a big difference here, rather than the absolute power level of the leading AI? An extinction or near-extinction event seems beneficial to just about any unaligned AI that is not all-powerful enough to not have to worry about humanity at all.

Put another way, takeover without extinction only feels plausible in scenarios where a single AI fooms so fast and so hard that it can leave humanity alive without really sweating it. But if I understand the landscape of the discourse / disagreement here, these fast and discontinuous takeoff scenarios are exactly the ones that you and some others find the least plausible.

Max Harms's Shortform
Max H · 1mo

I think the linked tweet is possibly just misinterpreting what the authors meant by "transistor operations"? My reading is that "1000" binds to "operations"; the actual number of transistors in each operation is unspecified. That's how they get the 10,000x number - if a CPU runs at 1 GHz and neurons run at 100 Hz, then even if it takes 1000 clock cycles to do the work of a neuron, the CPU can still do it 10,000x faster.
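To spell that arithmetic out (using the illustrative 1 GHz / 100 Hz / 1000-cycle numbers from my reading above, not figures from the paper itself):

```python
# Back-of-the-envelope check of the ~10,000x figure under the reading above.
cpu_clock_hz = 1e9           # illustrative 1 GHz CPU clock
neuron_rate_hz = 100         # ~100 Hz neuron firing rate
cycles_per_neuron_op = 1000  # assumed clock cycles to do the work of one neuron "operation"

# How many neuron-equivalent operations the CPU can emulate per second, serially:
cpu_neuron_ops_per_sec = cpu_clock_hz / cycles_per_neuron_op  # 1e6

speedup = cpu_neuron_ops_per_sec / neuron_rate_hz
print(f"~{speedup:,.0f}x faster than a neuron")  # ~10,000x
```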

(IDK what the rationale was in the editorial process for using "transistor operations" instead of a more standard term like "clock cycles", but a priori it seems defensible. Speculating, "transistors" was already introduced in the sentence immediately prior, so maybe the thinking was that the meaning and binding of "transistor operations" would be self-evident in context. Whereas if you use "clock cycles" you have to spend a sentence explaining what that means. So using "transistor operations" reduces the total number of new jargon-y / technical terms in the paragraph by one, and also saves a sentence of explanation.)

Anyway, depending on the architecture, precision, etc., a single floating point multiplication can take around 8 clock cycles. So even if a single neuron spike is doing something complicated that requires several high-precision multiply + accumulate operations in serial to replicate, that can easily fit into 1000 clock cycles on a normal CPU, and far fewer if you use specialized hardware.

As for the actual number of transistors themselves needed to do the work of a neuron spike, it again depends on exactly what the neuron spike is doing and how much precision etc. you need to capture the actual work, but "billions" seems too high by a few OOM at least. Some reference points: a single NAND gate is 4 transistors, and a general-purpose 16-bit floating point multiplier unit is ~5k NAND gates.
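Multiplying those reference points out - the per-gate and per-multiplier figures are the ones just mentioned, and the "ten dedicated multipliers per spike" assumption is mine, chosen to be generous:

```python
# Rough upper-bound transistor count for emulating one neuron spike,
# using the reference points above plus a generous assumption of mine.
transistors_per_nand = 4            # a single CMOS NAND gate
nand_gates_per_fp16_mul = 5_000     # ~5k NAND gates for a 16-bit FP multiplier
dedicated_muls_per_spike = 10       # assumption: ten multiply+accumulate steps, no hardware reuse

transistors_per_spike = (transistors_per_nand
                         * nand_gates_per_fp16_mul
                         * dedicated_muls_per_spike)
print(f"~{transistors_per_spike:,} transistors")  # ~200,000 -- several OOM short of "billions"
```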

Max Harms's Shortform
Max H · 1mo

The passage seems fine to me; I commented on Erdil's post and other brain efficiency discussions at the time, and I still think that power consumption is a more objective way of comparing performance characteristics of the brain vs. silicon, and that various kinds of FLOP/s comparisons favored by critics of the clock speed argument in the IABIED passage are much more fraught ([1], [2]).

It's true that neither clock speed nor neuron firing speed is straightforwardly / directly translatable to "speed of thought", but both are direct proxies for energy consumption and power density. And a very rough BOTEC shows that ~10,000x is a reasonable estimate for the difference in power density between the brain and silicon.
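One version of that BOTEC, where every number is a rough assumption of mine for illustration (brain at ~20 W in ~1.3 L, an accelerator at ~700 W over a few cm³ of silicon and packaging), and where the exact ratio is sensitive to how much of the package and cooling you count on the silicon side:

```python
# Illustrative power-density comparison; all figures are rough assumptions,
# not numbers from the IABIED passage or the linked discussions.
brain_power_w = 20.0        # commonly cited ~20 W for the human brain
brain_volume_cm3 = 1300.0   # ~1.3 liters

chip_power_w = 700.0        # board power of a high-end accelerator (assumption)
chip_volume_cm3 = 4.0       # die + immediate packaging, excluding heatsink (assumption)

brain_density = brain_power_w / brain_volume_cm3   # ~0.015 W/cm^3
chip_density = chip_power_w / chip_volume_cm3      # ~175 W/cm^3

print(f"power density ratio: ~{chip_density / brain_density:,.0f}x")  # order of 10^4
```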

Essentially, the brain is massively underclocked because of design-space restrictions imposed by biology and evolution, whereas silicon-based processing has been running up against fundamental physical limits on component size, clock speed, and power density for a while now. So once AIs can run whatever cognitive algorithms the brain implements (or algorithms that match the brain in terms of the high-level quality of the actual thoughts) at any speed, the already-existing power density difference implies they'll immediately have a much higher performance ceiling in terms of the throughput and latency at which they can run those algorithms. It's not a coincidence that making this argument via clock speeds leads to basically the same conclusion as making the same argument via power density.

I enjoyed most of IABIED
Max H · 1mo
  • Tricky hypothesis 1: ASI will in fact be developed in a world that looks very similar to today's (e.g. because sub-ASI AIs will have negligible effect on the world; this could also be because ASI will be developed very soon).

 

  • Tricky hypothesis 2: But the differences between the world of today and the world where ASI will be developed don't matter for the prognosis.

 

Both of these hypotheses look relatively more plausible than they did 4y ago, don't they? Looking back at this section from the 2021 takeoff speed conversation gives a sense of how people were thinking about this kind of thing at the time.

AI-related investment and market caps are exploding, but not really due to actual revenue being "in the trillions" - it's mostly speculation and investment in compute and research.

Deployed AI systems can already provide a noticeable speed-up to software engineering and other white-collar work broadly, but it's not clear that this is having much of an impact on AI research (and especially a differential impact on alignment research) specifically.

Maybe we will still get widely deployed / transformative robotics, biotech, research tools etc. due to AI that could make a difference in some way prior to ASI, but SoTA AIs of today are routinely blowing through tougher and tougher benchmarks before they have widespread economic effects due to actual deployment.

I think most people in 2021 would have been pretty surprised that in 2025 we have widely available LLMs with gold medal-level performance on the IMO, but which aren't yet having much larger economic effects. But in relative terms it seems like you and Christiano should be more surprised than Yudkowsky and Soares.

Shortform
Max H · 1mo

The "you're sued" part is part of what ensures that the forms get filled out honestly and comprehensively.

Depending on the kind of audit you do, the actual deliverable you give your auditor may just be a spreadsheet with a bunch of Y/N answers to hundreds of questions like "Do all workstations have endpoint protection software", "Do all servers have intrusion detection software", etc. with screenshots of dashboards as supporting evidence for some of them.
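To make the shape of that deliverable concrete, here's a minimal, entirely hypothetical sketch - the first two questions are the ones above, and the third question, the evidence file names, and the helper function are made up for illustration:

```python
# Hypothetical shape of the audit deliverable described above: a flat list of
# control questions with Y/N answers and pointers to supporting evidence.
audit_responses = [
    {"control": "Do all workstations have endpoint protection software?",
     "answer": "Y", "evidence": "edr_dashboard_2025-06.png"},
    {"control": "Do all servers have intrusion detection software?",
     "answer": "Y", "evidence": "ids_coverage_report.xlsx"},
    {"control": "Have default credentials been changed on all network devices?",
     "answer": "N", "evidence": "remediation_ticket_4812"},   # an open exception
]

def open_items(responses):
    """The auditor mostly sees the Y/N column; anything not 'Y' needs a writeup."""
    return [r["control"] for r in responses if r["answer"] != "Y"]

print(open_items(audit_responses))
```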

But regardless of how much evidence an external auditor asks for, at large companies doing important audits, every single thing you say to the auditor will be backed internally with supporting evidence and justification for each answer you give. 

At a bank you might have an "internal audit" department that has lots of meetings and back-and-forth with your IT department; at an airline it might be a consulting firm that you bring in to modernize your IT and help you handle the audit, or, depending on your relationship with your auditor and the nature of the audit, it might be someone from the audit firm itself that is advising you. In each case, their purpose is to make sure that every machine across your firm really does have correctly configured EDR, fully up-to-date security patches, a properly configured firewall, etc. before you claim that officially to an auditor.

Maybe you have some random boxes used to show news headlines on TVs in the hallways - it turns out these are technically in scope for having EDR and all sorts of other endpoint controls, but they're not compatible with, or not correctly configured to run, Microsoft Defender, or something. Your IT department will say that there are various compensating / mitigating controls or justifications for why they're out of scope, e.g. the firewall blocks all network access except the one website they need to show the news, the hardware itself is in a locked IT closet, they don't even have a mouse / keyboard plugged in, etc. These justifications will usually be accepted unless you get a real stickler (or have an obstinate "internal auditor"). But it's a lot easier to just say "they all run CrowdStrike" than it is to keep track of all these rationales and compensating controls, and indeed ease-of-deployment is literally the first bullet in CrowdStrike's marketing vs. Microsoft Defender:
 

CrowdStrike: Deploy instantly with a single, lightweight agent — no OS prerequisites, complex configuration, or fine tuning required.
Microsoft: Complicated deployment hinders security. All endpoints require the premium edition of the latest version of Windows, requiring upfront OS and hardware upgrades for full security functionality.

You wrote in a sibling reply:

Further, the larger implication of the above tweet is that companies use Crowdstrike because of regulatory failure, and this is also simply untrue. There are lots of reasons people sort of unthinkingly go with the name brand option in security, but that's a normal enterprise software thing and not anything specific to compliance.
 

I agree that this has little to do with "regulatory failure" and don't know / don't have an opinion on whether that's what the original tweet author was actually trying to communicate. But my point is that firms absolutely do make purchasing decisions about security software for compliance reasons, and a selling point of CrowdStrike (and Carbon Black, and SentinelOne) is that they make 100% compliance easier to achieve and demonstrate vs. alternative solutions. That's not a regulatory failure or even necessarily problematic, but it does result in somewhat different outcomes compared to a decision process of "unthinkingly going with the name brand option" or of "carefully evaluating which solutions provide the best actual security vs. which are theater".

Posts

  • Support for bedrock liberal principles seems to be in pretty bad shape these days (4mo)
  • Bayesian updating in real life is mostly about understanding your hypotheses (2y)
  • Emmett Shear to be interim CEO of OpenAI (2y)
  • Concrete positive visions for a future without AGI (2y)
  • Trying to deconfuse some core AI x-risk problems (2y)
  • An explanation for every token: using an LLM to sample another LLM (2y)
  • Actually, "personal attacks after object-level arguments" is a pretty good rule of epistemic conduct (2y)
  • Forum Karma: view stats and find highly-rated comments for any LW user (2y)
  • 10 quick takes about AGI (2y)
  • Four levels of understanding decision theory (2y)