The issue of unified AI parties is discussed but not resolved in section 2.2. There, I discuss some of the paths AIs may take to begin engaging in collective decision making. In addition, I flag that the key assumption is that one AI or multiple AIs acting collectively accumulate enough power to engage in strategic competition with human states.
I think there's a steady stream of philosophers getting interested in various questions in metaphilosophy; metaethics is just the most salient to me. One example is the recent trend towards conceptual engineering (https://philpapers.org/browse/conceptual-engineering). Metametaphysics has also gotten a lot of attention in the last 10-20 years (https://www.oxfordbibliographies.com/display/document/obo-9780195396577/obo-9780195396577-0217.xml). There is also some recent work in metaepistemology, though perhaps less, because the debates tend to recapitulate earlier work in metaethics (https://plato.stanford.edu/entries/metaepistemology/).
Sorry for being unclear: I meant that calling for a pause seems useless because it won't happen. I also think calling for a pause has opportunity costs, because attention and signalling value are limited and reputation can only be spent so many times; better to channel pressure towards asks that could plausibly get done.
Great questions. Sadly, I don't have any really good answers for you.
I think most academic philosophers take the difficulty of philosophy quite seriously. Metaphilosophy is a flourishing subfield of philosophy; you can find recent papers on the topic here: https://philpapers.org/browse/metaphilosophy. There is also a growing group of academic philosophers working on AI safety and alignment; you can find some recent work here: https://link.springer.com/collections/cadgidecih. I think the tone of specific papers sometimes sounds confident, but that is more a stylistic convention than a reflection of the underlying credences. Finally, uncertainty and decision theory are persistent themes in recent philosophical work on AI safety and other issues in philosophy of AI; see for example this paper, which is quite sensitive to issues about chances of welfare: https://link.springer.com/article/10.1007/s43681-023-00379-1.
Good question, Seth. We begin to analyse this question in section II.b.i of the paper, 'Human labor in an AGI world', where we consider whether AGIs will have a long-term interest in trading with humans. We suggest that the key questions will be whether humans can retain either an absolute or a comparative advantage in the production of some goods. We also point to some recent economics papers that address this question. One relevant factor, for example, is cost disease: as manufacturing became more productive in the 20th century, the share of GDP devoted to manufacturing fell. Non-automatable tasks can counterintuitively make up a larger share of GDP as automatable tasks become more productive, because the price of automatable goods will fall.
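To make the cost-disease point concrete, here is a toy two-sector calculation (all numbers are invented for illustration and are not from the paper; it assumes demand for the automatable good is roughly saturated, so quantities stay fixed while its price falls with productivity gains):

```python
# Toy illustration of cost disease. All numbers are made up.
# Assumption: demand for the automatable good is roughly saturated,
# so quantities stay fixed while its price falls as productivity rises.

def gdp_shares(qty_auto, price_auto, qty_manual, price_manual):
    """Return (automatable, non-automatable) shares of nominal GDP."""
    value_auto = qty_auto * price_auto
    value_manual = qty_manual * price_manual
    total = value_auto + value_manual
    return value_auto / total, value_manual / total

# Before automation: both sectors are the same size.
print(gdp_shares(100, 1.0, 100, 1.0))   # (0.5, 0.5)

# After a 10x productivity gain in the automatable sector, its price
# falls to roughly a tenth while quantities demanded stay the same.
print(gdp_shares(100, 0.1, 100, 1.0))   # (~0.09, ~0.91)
```

Nothing about the non-automatable sector changes, yet its share of nominal GDP rises from 50% to about 91% simply because the automatable good gets cheaper.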
Thanks Brendon, I agree with a lot of this! I do think there's a big open question about how capable AutoGPT-like systems will end up being compared to more straightforward RL approaches. It could turn out that systems with a clear cognitive architecture just don't work that well, even though they are safer.
Thanks for the thoughtful post; lots of important points here. For what it’s worth, here is a recent post where I’ve argued in detail (along with Cameron Domenico Kirk-Giannini) that language model agents are a particularly safe route to AGI: https://www.alignmentforum.org/posts/8hf5hNksjn78CouKR/language-agents-reduce-the-risk-of-existential-catastrophe
I really liked your post! I linked to it elsewhere in the comment thread.
I think one key point you're making is that if AI products have a radically different architecture from human agents, it could be very hard to align them / make them safe. Fortunately, recent research on language agents suggests that it may be possible to design AI products that have a cognitive architecture similar to humans', with belief/desire folk psychology and a concept of self. In that case, it will make sense to think about what desires to give them, and I think shutdown goals could be quite useful during development to lower the chance of bad outcomes. If the resulting AIs have a psychology similar to our own, then I expect them to worry about the same safety/alignment problems that we worry about when deciding whether to create a successor. This article explains in detail why we should expect AIs to avoid self-improvement and unchecked successors.
1. In my opinion, one of the likeliest motivations for deliberately creating debris would be as part of an escalation ladder in the early stages of WW3. Whichever player has weaker satellite intelligence and capabilities would have an incentive to trigger a cascade in order to destroy their opponent's advantage. The point, in effect, is that space conflict is very strongly offense-dominant because of debris cascades, and we know that offense-dominant dynamics tend in general to be very unstable.
2. Related to your discussion of totipotence, another dynamic I could imagine in the future is a MAD dynamic between a moon colony and Earth, where each side has the capacity to create a debris cascade for the other. One concern is that there would be no second-strike capability, so the dynamic could be unusually unstable.
3. One concern is that space colonization is extremely trajectory-dependent, so that initial forays could have massive impacts on the far future. If so, there may be good reasons to delay space colonization as long as possible, as a "long reflection." A debris cascade would effectively force a long reflection by pausing space colonization until new technologies for escape are invented. On the other hand, space colonization is also very important as a hedge against catastrophic risk. So the disvalue of debris cascades may turn on the relative prioritization of existential risk versus better future dynamics.