Previously "Lanrian" on here. Research analyst at Redwood Research. Views are my own.
Feel free to DM me, email me at [my last name].[my first name]@gmail.com or send something anonymously to https://www.admonymous.co/lukas-finnveden
I thought a potential issue with wild-caught fish is that other consumers would simply substitute away from wild to farmed fish, since most people don’t care much and wild-caught fish supply isn’t very elastic.
But anchovies and sardines (as suggested in the post) seem like they avoid that issue since apparently there’s basically no farming of them.
I also think it’s just super reasonable to eat animal products and offset with donations, which can easily net reduce animal suffering given how good the available donation opportunities are.
IMO, a big appeal of controlled takeoff is that, if successful, it slows down all of takeoff.
Whereas a global shutdown, which might happen at a time before we have great automated alignment research, and which might incidentally ban a lot of safety research as well… might just end some number of years later, whereupon we might quickly go through the remainder of takeoff and incur about as much risk as without the shutdown.
(Things that can cause a shutdown to end: elections or deaths swapping out who rules countries, geopolitical power shifts, verification becoming harder as it becomes more plausible that people could invest a lot to develop and hide compute and data centers where they can’t be seen, and maybe as AI software efficiency advances using smaller-scale experiments that were hard to ban.)
Successful controlled takeoff definitely seems more likely to me than “shutdown so long that intelligence augmented humans have time to grow up”, and also more likely than “shutdown so long that we can solve superintelligence alignment up front without having very smart models to help us or to experiment with”.
Short shutdown to do some prep before controlled takeoff seems reasonable.
Edit: I guess technically, some very mildly intelligence augmented humans (via embryo selection) are already being born, and they have a decent chance to grow up before superintelligence even without shutdown. I was thinking about intelligence augmentation that was good enough to significantly reduce x-risk. (Though I'm not sure how long people expect that to take.)
Lots of plausible mechanisms by which something could be "a little off" suggested in this Rohin comment.
This is the most compelling version of "trapped priors" I've seen. I agreed with Anna's comment on the original post, but the mechanisms here make sense to me as something that would mess a lot with updating. (Though it seems different enough from the very bayes-focused analysis in the original post that I'm not sure it's referring to the same thing.)
I think that's true in how they refer to it.
But it's also a bit confusing, because I don't think they have a definition of superintelligence in the book other than “exceeds every human at almost every mental task”, so AIs that are broadly moderately superhuman ought to count.
Edit: No wait, correction:
A few pages later they say:
> We will describe it using the term “superintelligence,” meaning a mind much more capable than any human at almost every sort of steering and prediction problem — at least, those problems where there is room to substantially improve over human performance.*
Hm, you seem more pessimistic than I feel about the situation. E.g. I would've bet that "Where I agree and disagree with Eliezer" added significant value and changed some minds. Maybe you disagree, maybe you just have a higher bar for "meaningful change".
(Where, tbc, I think your opportunity cost is very high so you should have a high bar for spending significant time writing lesswrong content — but I'm interpreting your comments as being more pessimistic than just "not worth the opportunity cost".)
> This is roughly what seems to have happened in DC, where the internal influence approach was swept away by a big Overton window shift after ChatGPT.
In what sense was the internal influence approach "swept away"?
Also, it feels pretty salient to me that the ChatGPT shift was triggered by public, accessible empirical demonstrations of capabilities being high (and social impacts of that). So in my mind that provides evidence for "groups change their mind in response to certain kinds of empirical evidence" and doesn't really provide evidence for "groups change their mind in response to a few brave people saying what they believe and changing the Overton window".
If the conversation changed a lot causally downstream of the CAIS extinction letter or FLI pause letter, that would be better evidence for your position (though also consistent with a model that puts less weight on preference cascades and models the impact more like "policymakers weren't aware that lots of experts were concerned, this letter communicated that experts were concerned"). I don't know to what extent this was true. (Though I liked the CAIS extinction letter a lot and certainly believe it had a good amount of impact; I just don't know how much.)
> As such, I disagree with the various actions you recommend lab employees to take, and do not intend to take them myself.
It's not clear that you disagree that much? You say you agree with Leo's statement, which seems to be getting lots of upvotes and "thanks" emojis, suggesting that people are going "yes, this is great and what we asked for".
I'm not sure what other actions there are to disagree with. There's "advocate internally to ensure that the lab lets its employees speak out publicly, as mentioned above, without any official retaliation" — but I don't really expect any official retaliation for statements like these so I don't expect this to be a big fight where it's costly to take a position.
I think the discussion wouldn't have to be like "here's a crazy plan".
I think there could have been something more like: "Important fact to understand about the situation: Even if superintelligence comes within the next 10 years, it's pretty likely that sub-ASI systems will have had a huge impact on the world by then — changing the world in a few-year period more than any technology ever has changed the world in a few-year period. It's hard to predict what this would look like [easy calls, hard calls, etc]. Some possible implications could be: [long list: ..., automated alignment research, AI-enabled coordination, people being a lot more awake to the risks of ASI, lots of people being in relationships with AIs and being supportive of AI rights, not-egregiously-misaligned AIs that are almost as good at bio/cyber/etc as the superintelligences...]. Some of these things could be helpful, some could be harmful. Through making us more uncertain about the situation, this lowers our confidence that everyone will die. In particular, some chance that X, Y, Z turns out really helpful. But obviously, if we see humanity as an agent, it would be a dumb plan for humanity to just assume that this crazy, hard-to-predict mess will save the whole situation."
I.e. it could be presented as an important thing to understand about the strategic situation rather than as a proposed plan.
I'm somewhat sympathetic to this reasoning. But I think it proves too much.
For example: If you're very hungry and walk past someone's fruit tree, I think there's a reasonable ethical case that it's ok to take some fruit if you leave them some payment, if you're justified in believing that they'd strongly prefer the payment to having the fruit. Even in cases where you shouldn't have taken the fruit absent being able to repay them, and where you shouldn't have paid them absent being able to take the fruit.
I think the reason for this is related to how it's nice to have norms along the lines of "don't leave people on-net worse-off" (and that such norms are way easier to enforce than e.g. "behave like an optimal utilitarian, harming people when optimal and benefitting people when optimal"). And then lots of people also have some internalized ethical intuitions or ethics-adjacent desires that work along similar lines.
And in the animal welfare case, instead of trying to avoid leaving a specific person worse-off, it's about making a class of beings on-net better-off, or making a "cause area" on-net better-off. I have some ethical intuitions (or at least ethics-adjacent desires) along these lines and think it's reasonable to indulge them.