All of Grant Demaree's Comments + Replies

Why so little AI risk on rationalist-adjacent blogs?

I buy that… so many of the folks funded by Emergent Ventures are EAs, so directly arguing against AI risk might alienate his audience

Still, this Straussian approach is a terrible way to have a productive argument

3Bill Benzon13d
FWIW, Cowen rarely has arguments. He'll state strong positions on any number of things in MR, but he (almost) never engages with comments at MR. If you want an actual back-and-forth discussion, the most likely way to get it is in conversation in some forum.
Why so little AI risk on rationalist-adjacent blogs?

Many thanks for the update… and if it’s true that you could write the very best primer, that sounds like a high value activity

I don’t understand the asteroid analogy though. Does this assume the impact is inevitable? If so, I agree with taking no action. But in any other case, doing everything you can to prevent it seems like the single most important way to spend your days

The asteroid case - it wouldn't be inevitable; it's just the knowledge that there are people out there substantially more motivated than me (and better positioned) to deal with it. For some activities where I'm really good (like... writing blogposts), and where I expect my actions to make more of an impact relative to what others would be doing, I could end up writing a blogpost about 'what you guys should do' and emailing it to some other relevant people. Also, you can edit your post accordingly to reflect my update!
Are there English-speaking meetups in Frankfurt/Munich/Zurich?

Many thanks! It looks like EA was the right angle... found some very active English-speaking EA groups right next to where I'll be

Why so little AI risk on rationalist-adjacent blogs?

I bet you're right that a perceived lack of policy options is a key reason people don't write about this to mainstream audiences

Still, I think policy options exist

The easiest one is adding the right types of AI capabilities research to the US Munitions List, so they're covered under ITAR laws. These are mind-bogglingly burdensome to comply with (so it's effectively a tax on capabilities research). They also make it illegal to share certain parts of your research publicly

It's not quite the secrecy regime that Eliezer is looking for, but it's a big step in that direction

Why so little AI risk on rationalist-adjacent blogs?

I think 2, 3, and 8 are true but pretty easy to overcome. Just get someone knowledgeable to help you

4 (low demand for these essays) seems like a calibration question. Most writers would probably lose their audience if they wrote about it as often as Holden. But more than zero is probably ok. Scott Alexander seems to be following that rule: he said he was summarizing the 2021 MIRI conversations at a steady drip so as not to alienate the part of his audience that doesn’t want to see that

I think 6 (look weird) used to be true, but it’s not any more. It’s hard to know for sure without talking to Kelsey Piper or Ezra Klein, but I suspect they didn’t lose any status for their Vox/NYT statements

I think that you're grossly underestimating the difficulty of developing and communicating a useful understanding, and the value and scarcity of expert time. I'm sure Kelsey or someone similar can get a couple of hours of time from one of the leading researchers to ensure they understand and aren't miscommunicating, if they really wanted to call in a favor - but they can't do it often, and most bloggers can't do it at all. Holden has the advantage of deep engagement in the issues as part of his job, working directly with tons of people who are involved in the research, and getting to have conversations as a funder - none of which are true for most writers.
MIRI announces new "Death With Dignity" strategy

I agree that it's hard, but there are all sorts of possible moves (like LessWrong folks choosing to work at this future regulatory agency, or putting massive amounts of lobbying funds into making sure the rules are strict)

If the alternative (solving alignment) seems impossible given 30 years and massive amounts of money, then even a really hard policy seems easy by comparison

Given the lack of available moves that are promising, attempting to influence policy is a reasonable move. It's part of the 80,000 Hours career suggestions. On the other hand, it's a long shot, and I see no reason to expect a high likelihood of success.

How about if you solve a ban on gain-of-function research first, and then move on to much harder problems like AGI?  A victory on this relatively easy case would result in a lot of valuable gained experience, or, alternatively, allow foolish optimists to have their dangerous optimism broken over shorter time horizons.

MIRI announces new "Death With Dignity" strategy

Eliezer gives alignment a 0% chance of succeeding. I think policy, if tried seriously, has >50%. So it's a giant opportunity that's gotten way too little attention

I'm optimistic about policy for big companies in particular. They have a lot to lose from breaking the law, they're easy to inspect (because there's so few), and there's lots of precedent (ITAR already covers some software). Right now, serious AI capabilities research just isn't profitable outside of the big tech companies

Voluntary compliance is also a very real thing. Lots of AI researchers a... (read more)

This is exactly what we have piloted at the Existential Risk Observatory, a Dutch nonprofit founded last year. I'd say we're fairly successful so far. Our aim is to reduce human extinction risk (especially from AGI) by informing the public debate. Concretely, what we've done in the past year in the Netherlands is (I'm including the detailed description so others can copy our approach - I think they should):

1. We have set up a good-looking website, found a board, set up a legal entity.
2. Asked and obtained endorsement from academics already familiar with existential risk.
3. Found a freelance, well-known ex-journalist and ex-parliamentarian to work with us as a media strategist.
4. Wrote op-eds warning about AGI existential risk, as explicitly as possible, but heeding the media strategist's advice. Sometimes we used academic co-authors. Four out of six of our op-eds were published in leading newspapers in print.
5. Organized drinks, networked with journalists, introduced them to others who are into AGI existential risk (e.g. EAs).

Our most recent result (last weekend) is that a prominent columnist who is agenda-setting on tech and privacy issues in NRC Handelsblad, the Dutch equivalent of the New York Times, wrote a piece where he talked about AGI existential risk as an actual thing. We've also had a meeting with the chairwoman of the Dutch parliamentary committee on di

Look at gain-of-function research to see what comes of a government moratorium on research. At first Baric feared that the moratorium would end his research. Then the NIH declared that his research isn't officially gain of function and continued funding him.

Regulating gain of function research away is essentially easy mode compared to AI.

A real Butlerian jihad would be much harder.

MIRI announces new "Death With Dignity" strategy

It sounds like Eliezer is confident that alignment will fail. If so, the way out is to make sure AGI isn’t built. I think that’s more realistic than it sounds

1. LessWrong is influential enough to achieve policy goals

Right now, the Yann LeCun view of AI is probably more mainstream, but that can change fast.

LessWrong is upstream of influential thinkers. For example:
- Zvi and Scott Alexander read LessWrong. Let’s call folks like them Filter #1
- Tyler Cowen reads Zvi and Scott Alexander. (Filter #2)
- Malcolm Gladwell, a mainstream influencer, reads Tyler Cowen... (read more)

I think you have to specify which policy you mean. First, let's for now focus on regulation that's really aiming to stop AGI, at least until safety is proven (if possible), not on regulation that's only focused on slowing down incremental progress. I see roughly three options: software/research, hardware, and data. All of these options would likely need to be global to be effective (that's complicating things, but perhaps a few powerful states can enforce regulation on others - not necessarily unrealistic).

Most people who talk about AGI regulation seem to mean software or research regulation. An example is the national review board proposed by Musk. A large downside of this method is that, if it turns out that scaling up current approaches is mostly all that's needed, Yudkowsky's argument applies: a few years later, anyone can build AGI in their basement (unregulatable) because of hardware progress. That seems like a real risk.

A second option, which doesn't suffer from this issue, is hardware regulation. Yudkowsky's thought experiment that an AGI might destroy all CPUs in order to block competitors is perhaps its most extreme form. One notch less extreme, chip capability could be forcibly held at today's level, or even at the level of some safe point in the past. This could be regulated at the fabs, which are few and not easy to hide. Regulating compute has also been proposed by Jaan Tallinn in a Politico newsletter, where he proposes regulating flops/km2.

Finally, an option could be to regulate data access. I can't recall a concrete proposal, but it should be possible in principle.

I think a paper should urgently be written about which options we have, and especially about which regulation method is the least economically damaging while still reliable and enforceable. I think we should move beyond the position that no regulation could do this - there are clearly options with >0% chance (depending strongly on coordination and communication) and we can't afford to w

I tend to agree that Eliezer (among others) underestimates the potential value of US federal policy. But on the other hand, note No Fire Alarm, which I mostly disagree with but which has some great points and is good for understanding Eliezer's perspective. Also note (among other reasons) that policy preventing AGI is hard because it needs to stop every potentially feasible AGI project but: (1) defining 'AGI research' in a sufficient manner is hard, especially when (2) at least some companies naturally want to get around such regulations, and (3) at least ... (read more)

Omicron: My Current Model

Is there a good write up of the case against rapid tests? I see Tom Frieden’s statement that rapid tests don’t correlate with infectivity, but I can’t imagine what that’s based on

In other words, there’s got to be a good reason why so many smart people oppose using rapid tests to make isolation decisions

Considerations on interaction between AI and expected value of the future

Could you spell out your objection? It’s a big ask, having to read a book just to find out what you mean!

Biology-Inspired AGI Timelines: The Trick That Never Works

Short summary: Biological anchors are a bad way to predict AGI. It’s a case of “argument from comparable resource consumption.” Analogy: human brains use 20 watts. Therefore, when we have computers with 20 watts, we’ll have AGI! The 2020 OpenPhil estimate of 2050 is based on a biological anchor, so we should ignore it.

Longer summary:

Lots of folks made bad AGI predictions by asking: 

  1. How much compute is needed for AGI?
  2. When will that compute be available?

To find (1), they use a “biological anchor,” like the computing power of the human brain, or the tota... (read more)
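To make the method under criticism concrete, here's a toy version of an anchor-based forecast. This is a minimal sketch with made-up inputs: the anchor value, the affordable compute, and the doubling time are all illustrative assumptions, not the figures from the actual OpenPhil report.

```python
import math

# Toy "biological anchor" forecast. All numbers are illustrative
# assumptions, NOT the actual figures from the OpenPhil report.
brain_flops = 1e15           # assumed compute of a human brain, FLOP/s
affordable_flops_2020 = 1e9  # assumed affordable compute in 2020, FLOP/s
doubling_time_years = 1.5    # assumed doubling time of affordable compute

# The method: AGI arrives when affordable compute reaches the anchor.
doublings_needed = math.log2(brain_flops / affordable_flops_2020)
forecast_year = 2020 + doublings_needed * doubling_time_years
print(round(forecast_year))  # → 2050 with these made-up inputs
```

Note how the output is driven entirely by the assumed inputs: that's the shape of Eliezer's objection, the same way "computers with 20 watts" does no predictive work in the analogy above.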

6Sammy Martin7mo
Holden also mentions something a bit like Eliezer's criticism in his own write-up. When Holden talks about 'ingenuity' methods, that seems consistent with Eliezer's view. I.e. if you wanted to fold this consideration into OpenAI's estimate, you'd have to do it by having a giant, incredibly uncertain, free-floating variable for 'speedup factor', because you'd be nonsensically trying to estimate the 'speed-up' to brain processing applied from using some completely non-Deep-Learning or non-brainlike algorithm for intelligence. All your uncertainty just gets moved into that one factor, and you're back where you started. It's possible that Eliezer is confident in this objection partly because of his 'core of generality' model of intelligence - i.e. he's implicitly imagining enormous numbers of varied paths to improvement that end up practically in the same place, while 'stack more layers in a brainlike DL model' is just one of those paths (and one that probably won't even work), so he naturally thinks estimating the difficulty of this one path we definitely won't take (and which probably wouldn't work even if we did try it), out of the huge numbers of varied paths to generality, is useless. However, if you don't have this model, then perhaps you can be more confident that what we're likely to build will look at least somewhat like a compute-limited DL system, and that these other paths will have to share some properties of this path. Relatedly, it's an implication of the model that there's some imaginable (and not e.g. galaxy-sized) model we could build right now that would be an AGI, which I think Eliezer disputes?
Attempted Gears Analysis of AGI Intervention Discussion With Eliezer

What particular counterproductive actions by the public are we hoping to avoid?

What would we do if alignment were futile?

I should’ve been more clear…export controls don’t just apply to physical items. Depending on the specific controls, it can be illegal to publicly share technical data, including source code, drawings, and sometimes even technical concepts

This makes it really hard to publish papers, and it stops you from putting source code or instructions online

What would we do if alignment were futile?

Why isn’t there a persuasive write-up of the “current alignment research efforts are doomed” theory?

EY wrote hundreds of thousands of words to show that alignment is a hard and important problem. And it worked! Lots of people listened and started researching this

But that discussion now claims these efforts are no good. And I can’t find good evidence, other than folks talking past each other

I agree with everything in your comment except the value of showing EY’s claim to be wrong:

  • Believing a problem is harder than it is can stop you from finding creative
... (read more)
9Martin Randall8mo
I think by impending doom you mean AI doom after a few years or decades, so "impending" from a civilizational perspective, not from an individual human perspective. If I misinterpret you, please disregard this post.

I disagree on your mental health point. Main lines of argument: people who lose belief in heaven seem to be fine, cultures that believe in oblivion seem to be fine, old people seem to be fine, etc. Also, we evolved to be mortal, so we should be surprised if evolution has left us mentally ill-prepared for our mortality.

However, I discovered/remembered that depression is a common side-effect of terminal illness. See Living with a Terminal Illness. Perhaps that is where you are coming from? There is also Death row phenomenon, but that seems to be more about extended solitary confinement than impending doom.

I don't think this is closely analogous to AI doom. A terminal illness might mean a life expectancy measured in months, whereas we probably have a few years or decades. Also, our lives will probably continue to improve in the lead-up to AI doom, whereas terminal illnesses come with a side order of pain and disability. On the other hand, a terminal illness doesn't include the destruction of everything we value. Overall, I think that belief in AI doom is a closer match to belief in oblivion than belief in cancer, and I don't expect it to cause mental health issues until it is much closer.

On a personal note, I've placed >50% probability on AI doom for a few years now, and my mental health has been fine as far as I can tell. However, belief in your impending doom, when combined with belief that "Belief in your impending doom is terrible for your mental health", is probably terrible for your mental health. Also, belief that "Belief in your impending doom is terrible for your mental health" could cause motivated reasoning that makes it harder to
5Grant Demaree8mo
Zvi just posted EY's model
What would we do if alignment were futile?

I agree. This wasn’t meant as an object-level discussion of whether the “alignment is doomed” claim is true. What I’d hoped to convey is that, even if the research is on the wrong track, we can still massively increase the chances of a good outcome, using some of the options I described

That said, I don’t think Starship is a good analogy. We already knew that such a rocket can work in theory, so it was a matter of engineering, experimentation, and making a big organization work. What if a closer analogy to seeing alignment solved was seeing a proof of P=NP this year?

Transcript for Geoff Anders and Anna Salamon's Oct. 23 conversation

In fact, what I’d really like to see from this is Leverage and CFAR’s actual research, including negative results

What experiments did they try? Is there anything true and surprising that came out of this? What dead ends did they discover (plus the evidence that these are truly dead ends)?

It’d be especially interesting if someone annotated Geoff’s giant agenda flowchart with what they were thinking at the time and what, if anything, they actually tried

Also interested in the root causes of the harms that came to Zoe et al. Is this an inevitable consequence of Leverage’s beliefs? Or do the particular beliefs not really matter, and it’s really about the social dynamics in their group house?

Probably not what you wanted, but you can read CFAR's handbook and updates (where they also reflect on some screwups). I am not aware of Leverage having anything equivalent publicly available.
Transcript for Geoff Anders and Anna Salamon's Oct. 23 conversation

I don’t agree with the characterization of this topic as self-obsessed community gossip. For context, I’m quite new and don’t have a dog in the fight. But I drew memorable conclusions from this that I couldn’t have gotten from more traditional posts

First, experimenting with our own psychology is tempting and really dangerous. Next time, I’d turn up the caution dial way higher than Leverage did

Second, a lot of us (probably including me) have an exploitable weakness brought on by high scrupulosity combined with openness to crazy-sounding ideas. Next time, I’d b... (read more)

Is Functional Decision Theory still an active area of research?

So is this an accurate summary of your thinking?

  1. You agree with FDT on some issues. The goal of decision theory is to determine what kind of agent you should be. The kind of agent you are (your "source code") affects other agents' decisions
  2. FDT requires you to construct counterfactual worlds. For example, if I'm faced with Newcomb's problem, I have to imagine a counterfactual world in which I'm a two-boxer
  3. We don't know how to construct counterfactual worlds. Imagining a consistent world in which I'm a two-boxer is just as hard as imagining one where object
... (read more)
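For readers who want point 2 concrete, here is the standard expected-value arithmetic for Newcomb's problem, a sketch using the conventional illustrative payoffs and an assumed 99% predictor accuracy (not anyone's official formulation):

```python
# Newcomb's problem: expected payoff of each disposition, assuming the
# predictor's accuracy applies to the kind of agent you are.
# Payoffs are the conventional illustrative values; 99% accuracy is an assumption.
ACCURACY = 0.99
BIG, SMALL = 1_000_000, 1_000

# One-boxer: the predictor most likely foresaw this and filled the opaque box.
ev_one_box = ACCURACY * BIG + (1 - ACCURACY) * 0

# Two-boxer: the predictor most likely foresaw this and left the opaque box empty.
ev_two_box = ACCURACY * SMALL + (1 - ACCURACY) * (BIG + SMALL)

print(ev_one_box > ev_two_box)  # one-boxing wins by a wide margin
```

The disputed step (point 3) is whether the "two-boxer" counterfactual used to compute `ev_two_box` is even well-defined, since it asks you to imagine a world where your own source code differs.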
"The goal of decision theory is to determine what kind of agent you should be"

I'll answer this with a stream of thought: I guess my position on this is slightly complex. I did say that the reason for preferring one notion of counterfactual over another must be rooted in the fact that agents adopting these counterfactuals do better over a particular set of worlds. And maybe that reduces to what you said, although maybe it isn't quite as straightforward as that, because I contend "possible" is not in the territory. This opens the door to there being multiple notions of possible, and hence counterfactuals being formed by merging lessons from the various notions. And it seems that we could merge these lessons either at the individual decision level, or at the level of properties about agents, or at the level of agents. Or at least that's how I would like my claims in this post to be understood. That said, the lesson from my post The Counterfactual Prisoner's Dilemma is that merging at the decision level doesn't seem viable.

"FDT requires you to construct counterfactual worlds"

I highly doubt that Eliezer embraces David Lewis' view of counterfactuals, especially given his post Probability is in the Mind. However, the way FDT is framed sometimes gives the impression that there's a true definition we're just looking for. Admittedly, if you're just looking for something that works, as in Newcomb's and Regret of Rationality, then that avoids this mistake. And I guess if you look at how MIRI has investigated this, which is much more mathematical than philosophical, they do seem to be following this pragmatism principle. I would like to suggest, though, that this can only get you so far.

"We don't know how to
A Small Vacation

Really enjoyed this. I’m skeptical, because (1) a huge number of things have to go right, and (2) some of them depend on the goodwill of people who are disincentivized to help

Most likely: the Vacated Territory flounders, much like Birobidzhan (which is a really fun story, by the way. In the 1930s, the Soviet Union created a mostly-autonomous colony for its Jews in Siberia. Masha Gessen tells the story here)

Best case:

In September 2021, the first 10,000 Siuslaw Syrians touched down in Siuslaw National Forest, land that was previously part of Oregon.

It was a... (read more)

Thanks for the positive feedback and interesting scenario. I'd never heard of Birobidzhan.