Basically totally agreed, except I'd also argue that we're not doing mostly okay together, and that we should work quite hard to make sure that we're cooperating better before we advance any more down the mind tech tree.
Personally, I was never a fan of the FAE term because it seems to privilege environmental causes over dispositional ones without justification.
I get where you're coming from, but I think most people need a nudge in the environmental direction rather than the dispositional one. I guess FAE is just a 'reminder' that environmental causes are even possible.
The fundamental attribution error applies to arguments as well. We often think of people who are quick to anger, standoffish, unwilling to accept hypotheticals, or who deny "obvious" claims as intrinsically close-minded, foolish, or otherwise unreasonable. Instead, those behaviours often stem from other sources: persistent stress in their lives, feeling vulnerable or exposed in the moment, or even just a shitty mood.
Just to be clear, while I "vibe very hard" with what the author says on a conceptual level, I'm not directly calling for you to shut down those projects. I'm trying to explain what I think the author sees as a problem within the AI safety movement. Because I am talking to you specifically, I am using the immediate context of your work, but only as a frame, not as a target. I found AI 2027 engaging, a good representation of one model of how takeoff could happen, and I thought it was designed and written well (tbh my biggest quibble is "why isn't it called AI 2028?"). The author is very, very light on actual positive "what we should do" policy recommendations, so if I talked about that I would be filling in with my own takes, which probably differ from the author's in several places. I am happy to do that if you want, though probably not publicly in a LW thread.
@Daniel Kokotajlo Addendum:
Finally, my interpretation of "Chapter 18: What Is to Be Done?" (and the closest I will come to answering your question based on the author's theory/frame) is something like "the AGI-birthing dynamic is not a rational dynamic, therefore it cannot be defeated by policies or strategies focused on rational action". Furthermore, since each actor wants to believe that their contribution to the dynamic is locally rational (if I don't do it someone else will / I'm counterfactually helping / this intervention will be net positive / I can use my influence for good at a pivotal moment [...] pick your argument), further arguments about optimally rational policies only encourage the delusion that everyone is acting rationally, making them dig in their heels further.
The core emotions the author points to as motivating the AGI dynamic are: the thrill of novelty/innovation/discovery; paranoia and fear about "others" (other labs/other countries/other people) achieving immense power; distrust of the institutions, philosophies, and systems that underpin the world; and a sense of self-importance/destiny. All of these can be justified with intellectual arguments, but they are often the bottom line that comes before such arguments are written. On the other hand, the author also shows how poor emotional understanding and estrangement from one's emotions and intuitions lead people to get trapped by faulty but extremely sophisticated logic. Basically, emotions and intuitions offer first-order heuristics in the massively high-dimensional space of possible actions/policies, and when you cut off the heuristic system you are vulnerable to high-dimensional traps/false leads that your logic or deductive abilities are insufficient to extract you from.
Therefore, the answer the author is pointing at is something like an emotional or frame-realignment challenge. You don't start arguing with a suicidal person about why the logical reasons they have offered for jumping don't make sense (at least, you don't do this if you want them to stay alive); you try to point them to a different emotional frame or state (i.e. calming them down and showing them there is a way out). Though he leaves it very vague, he seems to believe the world will also need such a fundamental frame shift or belief-reinterpretation to actually exit this destructive dynamic, the magnitude of which he likens to a religious revelation and compares to the redemptive power of love. Beyond this point I would be filling in my own interpretation, so I will stop there, but I have a lot more thoughts about this (especially the idea of love/coordination/ends to Moloch).
To be honest, I wasn't really pointing at you when I made the comment, more at the practice of hedging and qualifying. I want to emphasise that (from the evidence available to me publicly) I think you have internalised your beliefs far more than those the author collects into the "uniparty". I think you have acted with courage in support of your convictions, especially in the face of the NDA situation, for which I hold immense respect. It could not have been easy to leave when you did.
However, my interpretation of what the author is saying is that beliefs like "I think what these people are doing might seriously end the world" are in a sense fundamentally difficult to square with measured reasoning and careful qualifiers. The end of the world and existential risk are by their nature such totalising and awful ideas that any "sane" interaction with them (as in, trying to set measured bounds and build sensible models) is extremely epistemically unsound: the equivalent of arguing over whether 1e8 + 14 people or 1e8 + 17 people (3 extra lives!) will be the true number of casualties in some kind of planetary extinction event, when the error bars are themselves ±1e5 or ±1e6. (We are, after all, dealing with never-before-seen black swan events.)
In this sense, detailed debates about which metrics to include in a takeoff model, the precise slope of the METR exponential curve, and which combination of chip trade and export policies increases tail risk the most/least are themselves a kind of deception. Arguing over the details implies that our world and risk models have more accuracy and precision than they actually do, and in turn that we have more control over events than we actually do. "Directionally correct" is in fact the most accuracy we're going to get, because (per the author) Silicon Valley isn't actually executing some kind of carefully calculated, compute-optimal RSI takeoff launch sequence with a well-understood theory of learning. The AGI "industry" is more like a group of people pulling the lever of a slot machine over and over and over again, egged on by a crowd of eager onlookers, spending down the world's collective savings accounts until one of them wins big. By "win big", of course, I mean "unleashes a fundamentally new kind of intelligence into the world". And each of them may do it for different reasons, and some of them may in their heads actually have some kind of master plan, but all it looks like from the outside is ka-ching, ka-ching, ka-ching, ka-ching...
The ideal market-making move is to introduce a new necessity for continued existence, like water.
Well, with nuance. Like, it's not my ideal policy package; I think if I were in charge of the whole world we'd stop AI development temporarily and then figure out a new, safer, less power-concentrating way to proceed with it. But it's significantly better by my lights than what most people in the industry, on Twitter, and in DC are advocating for. I guess I should say I approximately believe all those things, and/or I think they are all directionally correct.
With all due respect, I'm pretty sure that the existence of this very long string of qualifiers and very carefully reasoned hedges is precisely what the author means when he talks about intellectualised but not internalised beliefs.
Information warfare and psychological warfare are well-known terms. However, I would suggest that any well-intentioned outsider trying to figure out "what's going on with AI right now" (especially in a governance context) is effectively being subjected to the equivalent of an information state of nature (a la Hobbes). There are masses of opinions being shouted furiously, most of the public experts have giant glowing signs marked "I have serious conflicts of interest", and the din of self-proclaimed insiders trying to get power/influence/money/a job at a lab/a slice of the lightcone by marketing insider takes is kind of deafening. And of course the companies are running targeted influence/lobbying campaigns on top of all this, trying to position themselves as the sole reliable actors.
My best argument as to why coarse-graining and "going up a layer" when describing complex systems are necessary:
Often we hear a reductionist case against ideas like emergence which goes something like this: "If we could simply track all the particles in e.g. a human body, we'd be able to predict what they did perfectly with no need for larger-scale simplified models of organs, cells, minds, personalities etc.". However, this kind of total knowledge is actually impossible given the bounds of the computational power available to us.
First of all, when we attempt to track billions of particle interactions, we very quickly end up with a chaotic system, such that tiny errors in measuring and setting up initial states quickly compound into massive prediction errors. (A metaphor I like is that you're "using up" the decimal points in your measurement: in a three-body system, the first timestep depends mostly on the non-decimal portions of the starting velocity measurements; a few timesteps later, changing .15 to .16 makes a big difference; and by the 10,000th timestep, the difference between a starting velocity of .15983849549 and .15983849548 is noticeable.) This is the classic problem with weather prediction.
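To make the "used up decimal points" point concrete, here's a minimal sketch of my own; it uses the logistic map as a toy stand-in for the three-body problem (a real gravitational simulation wouldn't fit in a comment), and shows how a difference in the 11th decimal place of the starting value blows up:

```python
# A toy demonstration of sensitivity to initial conditions, using the
# logistic map (r = 4, fully chaotic) as a stand-in for the three-body
# problem. The two starting values differ only in the 11th decimal place.

def logistic_map(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the full trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_map(0.15983849549, 60)
b = logistic_map(0.15983849548, 60)

for t in (0, 10, 20, 30, 40, 50, 60):
    print(f"step {t:2d}: |a - b| = {abs(a[t] - b[t]):.3e}")

# The gap grows roughly exponentially: invisible at first, then on the
# order of the state itself after a few dozen steps -- the measurement's
# decimal places have been "used up".
```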
Second of all, tracking "every particle" means that the set of particles you need to track explodes outward from the system you're trying to monitor into the particles that system interacts with, then the neighbours of those neighbours, and so on and so forth. In the human case, you need to track every particle in the body, but also every particle the body touches or ingests (it could be a virus), and then the particles that those particles touch... This continues until you reach the point where "to understand the baking process of an apple pie you must first track the position of every particle in the universe".
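A rough back-of-envelope sketch of that scope explosion (all the specific numbers here are my own order-of-magnitude assumptions, just to show the shape of the growth): if influences propagate outward at some speed v, then after time t you have to track everything within radius v·t of the system.

```python
# Back-of-envelope: how many particles fall inside the sphere of things
# that could have influenced the system after time t, assuming influence
# spreads at speed v. The constants below (air density, molecular speed)
# are rough order-of-magnitude figures, not precise values.

import math

MOLECULES_PER_M3 = 2.5e25   # approximate number density of air at sea level
SPEED = 500.0               # m/s, rough thermal speed of air molecules

def particles_in_scope(seconds, v=SPEED, density=MOLECULES_PER_M3):
    """Count molecules inside a sphere of radius v * t around the system."""
    radius = v * seconds
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return density * volume

for t in (1e-3, 1.0, 60.0, 3600.0):
    print(f"after {t:>8g} s: ~{particles_in_scope(t):.1e} molecules in scope")

# After an hour the "relevant" sphere already contains on the order of
# 1e45 air molecules -- before counting the body itself, food, viruses, etc.
```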
The emergence/systems solution to both problems is essentially to go up a level. Instead of tracking particles, you track cells, organs, individual humans, systems, etc. At each level (following Erik Hoel's causal emergence framework) you trade microscale precision for predictive power, i.e. the size of the system you can predict for a given amount of computational power. Often this means collapsing large amounts of microscale interaction into random noise: a slot machine could in theory be deterministically predicted by tracking every element in the randomiser mechanism/chip, but in practice it's easier to model as a machine with an output distribution set by the operating company. Similarly, we trade Feynman diagrams for Brownian motion and Langevin dynamics.
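Here's a minimal sketch of that two-level trade, using the slot machine example; the toy linear congruential generator standing in for the machine's internals is my own assumption, not a claim about how real machines work:

```python
# A toy version of the coarse-graining trade described above. The
# microscale model tracks a deterministic internal state (a linear
# congruential generator here -- my stand-in, not a real mechanism);
# the macroscale model discards that detail and keeps only the payout
# distribution the operator set.

import random

class MicroSlotMachine:
    """Microscale: fully deterministic, but only useful with the exact state."""
    def __init__(self, seed):
        self.state = seed

    def pull(self):
        # Toy LCG update: completely determined by the previous state.
        self.state = (6364136223846793005 * self.state + 1442695040888963407) % 2**64
        reel = self.state % 100                 # which symbol the reel lands on
        return reel, (10 if reel < 5 else 0)    # ~5% chance of a 10-credit payout

class MacroSlotMachine:
    """Macroscale: just an output distribution, no mechanism."""
    def pull(self):
        return random.choices([0, 10], weights=[0.95, 0.05])[0]

# Being off by 1 in your estimate of the internal state makes the
# microscale model's predictions worthless:
true_machine, your_model = MicroSlotMachine(seed=12345), MicroSlotMachine(seed=12346)
print([true_machine.pull()[0] for _ in range(8)])   # actual reel positions
print([your_model.pull()[0] for _ in range(8)])     # completely different sequence

# The macroscale model can't predict individual pulls at all, but it
# nails the long-run behaviour for almost no computational cost:
macro = MacroSlotMachine()
print(sum(macro.pull() for _ in range(100_000)) / 100_000)  # ~0.5 credits per pull
```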