In Recursive Middle Manager Hell, Raemon argues that organisations get increasingly warped (immoral/maze-like/misaligned/...) as they gain more levels of hierarchy. I conjecture that analogous dynamics also arise in the context of AI automation.
To be frank, I think 95% of the value of my post comes just from the two-sentence tl;dr above. If you haven't done so yet, I recommend reading Recursive Middle Manager Hell and then asking "how does this change in the context of automating jobs by AI". But let me give some thoughts anyway.
[low confidence, informal] It seems to me that automation has so far mostly worked like this:
I think this is starting to change. On the abstract level, I would say that (3) has changed with the introduction of ML, and (1) & (2) will now be changing because LLMs are growing more general and capable.
But perhaps it is better to instead give more specific examples of "recursive middle automaton hell":
In a sane world with lots of time to investigate this, this sounds like an exciting research topic for game theory, mechanism design, and a whole lot of other fields! In the actual world, I am not sure what the implications are. Some questions that seem important are:
The proposed mechanism for this is that the only pieces of the organisation that are "in touch with reality" are the very top and bottom of the hierarchy (e.g., the CEO and the people creating the product), while everybody else is optimising for [looking good in front of their direct supervisor]. Because of various feedback cycles, the claim goes, this results in misalignment that is worse than just "linear in the depth of the hierarchy".
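As a toy illustration (my own sketch, not the post's model; the per-layer fidelity number is made up), you can model each layer as passing the goal along with some random loss, so that distortion compounds multiplicatively with depth rather than adding up:

```python
import random

def simulate(depth, fidelity=0.9, trials=10_000):
    """Toy 'telephone game': each management layer passes the top-level goal
    downward, but keeps only a random fraction of it (optimising for 'looking
    good' to the layer above rather than for the goal itself)."""
    total = 0.0
    for _ in range(trials):
        goal = 1.0
        for _ in range(depth):
            # each layer retains between `fidelity` and 100% of the goal,
            # so the expected surviving fraction is ((1 + fidelity) / 2) ** depth
            goal *= random.uniform(fidelity, 1.0)
        total += goal
    # average fraction of the original goal that survives all the layers
    return total / trials

for depth in (1, 3, 6, 10):
    print(depth, round(simulate(depth), 3))
```

This particular toy gives geometric (not superlinear) decay; capturing the post's stronger "worse than linear" claim would need the feedback cycles it describes, which the sketch deliberately leaves out.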
For a better description of this, see Raemon's Recursive Middle Manager Hell or Zvi's (longer and more general) Immoral Mazes.
As a manager (and sometimes middle manager) I've been thinking about how LLMs are going to change management. Not exactly your topic but close enough. Here are my raw thoughts so far:
Very interesting points; if I were still in middle management, these things would be keeping me up at night!
One point I query is "this is a totally new thing no manager has done before, but we're going to have to figure it out" -- is it that different from the various types of tool introduction & distribution / training / coaching that managers already do? I've spent a good amount of my career coaching my teams on how to be more productive using tools, running team show-and-tells from productive team members on why they're productive, sending team members on paid training courses, designing rules around use of internal tools like Slack/Git/issue trackers/intranets etc... and it doesn't seem that different to figuring out how to deploy LLM tools to a team. But I'm rusty as a manager, and I don't know what future LLM-style tools will look like, so I could be thinking about this incorrectly. Certainly if I had a software team right now, I'd be encouraging them to use existing tools like LLM code completion, automated test writing, proof-reading etc., and encouraging early adopters to share their successes & failures with such tools.
Does "no manager has done before" refer to specific LLM tools, and is there something fundamentally different about them compared to past new technologies/languages/IDEs etc?
I think the general vibe of "this hasn't been done before" might have been referring to fully automating the manager job, which possibly comes with very different scaling of human vs AI managers. (You possibly remove the time bottleneck, allowing an unlimited number of meetings. So if you didn't need to coordinate the low-level workers, you could have a single manager for infinitely many workers. Ofc, in practice, you do need to coordinate somewhat, so there will be other bottlenecks. But still, removing a bottleneck could change things dramatically.)
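To make the span-of-control point concrete (a back-of-the-envelope sketch with made-up numbers, not anything from the thread): for a fixed span, the number of management layers grows roughly logarithmically in headcount, and collapses to a single layer if an AI manager's span is effectively unbounded:

```python
import math

def hierarchy_depth(workers, span):
    """Number of management layers needed if each manager can handle
    at most `span` direct reports (a simple span-of-control estimate)."""
    depth = 0
    nodes = workers
    while nodes > 1:
        # each layer of managers supervises the layer below it
        nodes = math.ceil(nodes / span)
        depth += 1
    return depth

# A human manager might handle ~7 reports; an AI manager freed from the
# meeting-time bottleneck could plausibly handle thousands.
print(hierarchy_depth(10_000, span=7))       # several layers for humans
print(hierarchy_depth(10_000, span=10_000))  # a single layer for the AI
```

The interesting regime is in between: once coordination (rather than meeting time) is the binding constraint, the effective span is whatever that coordination cost allows.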
My first guess is that the sort of problems I expect here are sufficiently weirder/more different from the classic human middle-manager-culture problems that I think it's probably better to just Original See on the topic rather than try to check whether the Recursive Middle Manager Hell hypothesis in particular holds here.
(I do think it's an interesting/important question though)
This sounds really interesting! Generally it seems that most people believe either that AI will get power by directly being ordered to organize the entire world, or that it will be some kind of paper-clip factory robot going rogue and hacking other computers. I am starting to think it will more likely be: companies switch to AI middle managers to save $$$, then things just happen from there.
Now one way this could go really mazy is like this: All AI models, even unique ones custom-made for a particular company, are based on some earlier model. Let's say Walmart buys a model that is based on a general Middle Manager Model.
This model now has the power to hire and fire low-level workers, so it will be very much in their interest to find out what makes the model tick. They can't analyse the exact model (which is being run from a well-guarded server park). But at some point somebody online will get hold of a general Middle Manager Model and let people play with it. Perhaps the open-source people will do all sorts of funny experiments with it and find bugs that could have been inherited by the Walmart model.
Now the workers at Walmart all start playing with the online model in their spare time, looking around AI forums for possible exploits. Nobody knows if these also work on the real model, but people will still share them, hoping to be able to hack the system: "Hey, if you report sick on days when you had time off anyway, the Manager will give you extra credits!" "Listen, if you scan these canned tomatoes fifty times, it triggers a bug in the system and you will get a higher raise!" Etc.
The workers have no way to know which of the exploits work, but everybody will be too afraid of losing their job if they are the only one NOT hacking the AI. Wait a few years and you will see the dark tech-cult turning up.
Evolution gives us many organically designed systems which offer potential solutions:
A team of leucocytes (white blood cells):
This is a system that could be implemented in a company and fix most of the recursive middle manager hell.
Many humans would not like that (real accountability is hard, especially for middlemen who benefit from the status quo), but AI won't mind.
So the AI "head" could send its leucocytes to check the whole system routinely.
I agree that the general point (biology needs to address similar issues, so we can use it for inspiration) is interesting. (Seems related to https://www.pibbss.ai/ .)
That said, I am somewhat doubtful about the implied conclusion (that this is likely to help with AI, because it won't mind): (1) there are already many workplace practices that people don't like, so "people liking things" doesn't seem like a relevant parameter of design; (2) (this is totally vague, handwavy, and possibly untrue, but:) biological processes might also not "like" being surveilled, replaced, etc., so the argument proves too much.
(1) « people liking things doesn't seem like a relevant parameter of design ».
This is quite a bold statement. I personally believe the mainstream theory according to which it’s easier to have designs adopted when they are liked by the adopters.
(2) Nice objection, and the observation of complex life forms gives a potential answer:
Given that all your cells welcome even a literal kill-switch, and replacement, I firmly believe that they don't mind surveillance either!
In complex multicellular life, the cells that refuse surveillance, replacement, or apoptosis are the cancerous cells, and they don't seem able to create any complex life form (only parasitic life forms, feeding off their host, and sometimes spreading and infecting others, like HeLa).
(1) « people liking things doesn't seem like a relevant parameter of design ». This is quite a bold statement. I personally believe the mainstream theory according to which it's easier to have designs adopted when they are liked by the adopters.
Fair point. I guess "not relevant" is too strong a phrasing. It would have been more accurate to say something like "people liking things might be neither sufficient nor necessary to get designs adopted, and it is not clear (at least to me) how much it matters compared to other aspects".
Re (2): Interesting. I would be curious to know to what extent this is just a surface-level metaphor, or unjustified anthropomorphisation of cells, vs actually having implications for AI design. (But I don't understand biology at all, so I don't really have a clue :( .)
(1) « Liking », or « desire », can be defined as « all other things equal, agents will go for what they desire/like most, whenever given a choice ». Individual desires/likings/tastes vary.
(2) In evolutionary game theory, in a game where a mitochondria-like agent offers you a choice between:
then that agent is likely to win. To a rational agent, it's a winning wager. My last publication expands on this.
Some thoughts (I don't have any background in related areas, but this seemed interesting).
I think it would be interesting to see what you found if you looked into the state of existing research on AI coordination / delegation / systemic interactions and if any of it feels related. I'd be mildly surprised if people have studied exactly this but expect many relevant posts/papers.
In terms of related stuff on LessWrong, I can't find it now but Paul Christiano has a post on worlds where things go badly slowly, and I think this would be kinda in that genre. I think this is an interesting thing to consider, and it feels somewhat related to Dan Hendrycks' "Natural Selection Favors AIs over Humans" https://arxiv.org/abs/2303.16200. The connection in my head is "what does an AI ecosystem look like", "what does it mean to discuss alignment in this context", "what outcomes will this system tend towards", etc. The same way middle managers get selected for, so more generally AI systems with certain properties get selected for.
You might want to read about Ought's agenda of supervising processes, not outcomes, which feels relevant. Recursive middle manager hell feels somewhat related to inner misalignment / misaligned mesa-optimizers, where instead of being a subset of the processing of an LLM (how I normally think about it, but maybe not how others do), you have your AI system made of many layers, and it's plausible that intermediate layers end up optimizing proxies for inputs to what you care about rather than the thing itself. In this view, the misalignment of middle managers, which usually makes companies less effective, might just lead to selection against such systems as compared to systems with fewer of these properties.
There might be some strategically valuable research to be done here, but it's not super clear to me what the theory of change would be. Maybe there's something to do with bandwidth/scalability tradeoffs that affect how tightly coupled vs diffuse/distributed useful/popular AI systems will be in the future.