I ended up writing a long rant about agency in my review of Joe Carlsmith’s report on x-risk from power-seeking AI. I’ve polished the rant a bit and posted it into this sequence. The central analogy between APS-AI and self-propelled machines (“Auto-mobiles”) is a fun one, and I suspect the analogy runs much deeper than I’ve explored so far.
For context, the question being discussed is whether we should “expect incentives to push relevant actors to build agentic planning and strategically aware systems [APS systems] in particular, once doing so is possible and financially feasible.”
Joe says 80% yes, 20% no:
“The 20% on false, here, comes centrally from the possibility that the combination of agentic planning and strategic awareness isn’t actually that useful or necessary for many tasks -- including tasks that intuitively seem like they would require it (I’m wary, here, of relying too heavily on my “of course task X requires Y” intuitions). For example, perhaps such tasks will mostly be performed using collections of modular/highly specialized systems that don’t together constitute an APS system; and/or using neural networks that aren’t, in the predictively relevant sense sketched in 2.1.2-3, agentic planning and strategically aware. (To be clear: I expect non-APS systems to play a key role in the economy regardless; in the scenarios where (2) is false, though, they’re basically the only game in town.)”
I agree that “of course X requires Y” intuitions have been wrong in the past and also that evidence from how nature solved the problem in humans and nonhuman animals will not necessarily generalize to artificial intelligence. However:
1. Beware isolated demands for rigor. Imagine someone in 1950 saying “Some people thought battleships would beat carriers. Others thought that the entire war would be won from the air. Predicting the future is hard; we shouldn’t be confident. Therefore, we shouldn’t assign more than 90% credence to the claim that powerful, portable computers (assuming we figure out how to build them) will be militarily useful, e.g. in weapon guidance systems or submarine sensor suites. Maybe it’ll turn out that it’s cheaper and more effective to just use humans, or bio-engineered dogs, or whatever. Or maybe there’ll be anti-computer weapons that render them useless. Who knows. The future is hard to predict.” This is what Joe-with-skeptic-hat-on sounds like to me. Battleships vs. carriers was a relatively hard prediction problem; whether computers would be militarily useful was an easy one. I claim it is obvious that APS systems will be powerful and useful for some important niches, just like how it was obvious in 1950 that computers would have at least a few important military applications.
2. To drive this point home, let me take the reasons Joe gave for skepticism and line-by-line mimic them with a historical analogy to self-propelled machines, i.e. “automotives.” Left-hand column is entirely quotes from the report.
|Skeptic about APS systems||Skeptic about self-propelled machines|
|Many tasks -- for example, translating languages, classifying proteins, predicting human responses, and so forth -- don’t seem to require agentic planning and strategic awareness, at least at current levels of performance. Perhaps all or most of the tasks involved in automating advanced capabilities will be this way.||Many tasks — for example, raising and lowering people within a building, or transporting slag from the mine to the deposit, or transporting water from the source to the home — don’t seem to require automotives. Instead, an engine can be fixed in one location to power a system of pulleys or conveyor belts, or pump liquid through pipes. Perhaps all or most of the tasks involved in automating our transportation system will be this way.|
|In many contexts (for example, factory workers), there are benefits to specialization; and highly specialized systems may have less need for agentic planning and strategic awareness (though there’s still a question of the planning and strategic awareness that specialized systems in combination might exhibit).||In many contexts, there are benefits to specialization. An engine which is fixed to one place (a) does not waste energy moving its own bulk around, (b) can be specialized in power output, duration, etc. to the task at hand, (c) need not be designed with weight as a constraint, and thus can have more reliability and power at less expense.|
|Current AI systems are, I think, some combination of non-agentic-planning and strategically unaware. Some of this is clearly a function of what we are currently able to build, but it may also be a clue as to what type of systems will be most economically important in future.||Current engines are not automotives. Some of this is clearly a function of what we are currently able to build (our steam engines are too heavy and weak to move themselves) but it may also be a clue as to what type of systems will be most economically important in the future.|
|To the extent that agentic planning and strategic awareness create risks of the type I discuss below, this might incentivize focus on other types of systems.||To the extent that self-propelled machines may create risks of “crashes,” this might incentivize focus on other types of systems (and I would add that a fixed-in-place engine seems inherently safer than a careening monstrosity of iron and coal!) To the extent that self-propelled machines may enable some countries to invade other countries more easily, e.g. by letting them mobilize their armies and deploy to the border within days by riding “trains,” and perhaps even to cross trench lines with bulletproof “tanks,” this threat to world peace and the delicate balance of power that maintains it might incentivise focus on other types of transportation systems. [Historical note: The existence of trains was one of the contributing causes of World War One. See e.g. Railways and the mobilisation for war in 1914 | The National Archives.]|
|Plan-based agency and strategic awareness may constitute or correlate with properties that ground moral concern for the AI system itself (though not all actors will treat concerns about the moral status of AI systems with equal weight; and considerations of this type could be ignored on a widespread scale).||OK, I admit that it is much more plausible that people will care for the welfare of APS-AI than for the welfare of cars/trains. However I don’t think this matters very much so I won’t linger on this point.|
3. There are plenty of cases where human “of course task X requires Y” intuitions turned out to be basically correct. (e.g. self-driving cars need to be able to pathfind and recognize images, image-recognizers have circuits that seem to be doing line detection, tree search works great for board game AIs, automating warehouses turned out to involve robots that move around rather than a system of conveyor belts, automating cruise missiles turned out to not involve having humans in the loop steering them… I could go on like this forever. I’m deliberately picking “hard cases” where a smart skeptic could plausibly have persuaded the author to doubt their intuitions that X requires Y, as opposed to cases where such a skeptic would have been laughed out of the room.)
4. There’s a selection effect that biases us towards thinking our intuition about these things is worse than it is:
- Cases where our intuition is incorrect are cases where it turns out there is an easier way, a shortcut. For example, chess AI ended up just doing loads of really fast tree search instead of the more flexible, open-ended strategic reasoning some people maybe thought chess would require.
- If the history of AIs-surpassing-humans-at-tasks looks like this:
- Then we should expect the left tail to contain a disproportionate percentage of the cases where there is a shortcut. Cases where there is no shortcut will be clumped over on the right.
5. More important than all of the above: As Gwern pointed out, it sure does seem like some of the tasks some of us will want to automate are agency tasks, tasks such that anything which performs them is by definition an agent. Tasks like “gather data, use it to learn a general-purpose model of the world, use that to make a plan for how to achieve X, carry out the plan.”
6. Finally, and perhaps most importantly: We don’t have to go just on intuition and historical analogy. We have models of agency, planning, strategic awareness, etc. that tell us how it works and why it is so useful for so many things. [This sequence is my attempt to articulate my model.]
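To make the selection-effect point above a bit more concrete, here is a toy simulation. Everything in it is an assumption made up for illustration, not something taken from the report or from real data: the function name `simulate`, the 20% base rate of tasks that admit a shortcut, the uniform spread of "hard way" automation dates, and the fixed 30-year speedup a shortcut provides. It only shows the shape of the argument: if shortcuts bring automation dates forward, then the tasks automated so far over-represent shortcut cases.

```python
import random


def simulate(n_tasks=10_000, shortcut_rate=0.2, speedup=30.0, now=50.0, seed=0):
    """Toy model (all numbers hypothetical).

    Returns (share of all tasks that admit a shortcut,
             share of tasks automated by `now` that used a shortcut).
    """
    rng = random.Random(seed)
    n_shortcut = 0          # tasks that admit a shortcut at all
    n_solved = 0            # tasks automated by year `now`
    n_solved_shortcut = 0   # ...of which were automated via a shortcut
    for _ in range(n_tasks):
        has_shortcut = rng.random() < shortcut_rate
        # Hypothetical year the task would be automated "the hard way".
        hard_way_year = rng.uniform(40.0, 140.0)
        # A shortcut, if one exists, brings that date forward.
        year = hard_way_year - speedup if has_shortcut else hard_way_year
        n_shortcut += has_shortcut
        if year <= now:
            n_solved += 1
            n_solved_shortcut += has_shortcut
    return n_shortcut / n_tasks, n_solved_shortcut / n_solved


if __name__ == "__main__":
    overall, observed = simulate()
    print(f"share of all tasks with a shortcut:   {overall:.2f}")
    print(f"share among tasks automated so far:   {observed:.2f}")
```

Under these made-up parameters, a shortcut task is automated by `now` with probability 0.4 and a no-shortcut task with probability 0.1, so in expectation the share of shortcut cases among tasks automated so far is 0.2·0.4 / (0.2·0.4 + 0.8·0.1) = 0.5, even though only 20% of all tasks have a shortcut. An observer looking only at the early record would overestimate how often "of course X requires Y" intuitions fail.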
Many thanks to Joe Carlsmith for his excellent report and for conversing with me at length about it.