This is the third and final post in a sequence on control theory. In the first post I introduced the subject of control theory and stepped through some basics. In the second post I outlined Powers's model, as presented in Behavior: The Control of Perception. This post is a collection of comments on the subject that are only somewhat related, and so I'll use section headings to separate them. I'll also explicitly note the absence of a section on the design of control systems, which is where most of the effort used in talking about them in industrial settings goes, and is probably relevant to philosophical discussions surrounding them.
From Wikipedia's Cybernetics page:
Artificial intelligence (AI) was founded as a distinct discipline at a 1956 conference. After some uneasy coexistence, AI gained funding and prominence. Consequently, cybernetic sciences such as the study of neural networks were downplayed; the discipline shifted into the world of social sciences and therapy.
I'm no historian of science, and it's not clear to me why this split happened. It seems likely that control theory was simply not a useful approach for many of the early problems researchers associated with AI, like natural language processing: Powers has a description of how neural circuits as he models them could solve the phoneme parsing problem (which seems very compatible with sophisticated approaches that use Hidden Markov Models), but how one would go from parsing sounds to make words to parsing words to make concepts is not quite clear. It seems like there might be some difference in kind between the required circuitry, but perhaps not: one of the recent advances in machine learning is "deep learning," whose ultra-simplified explanation is "neural nets, just dialed up to 11." It seems possible (certain, if you count NNs as a 'cybernetic' thing) that AI is moving back in the direction of cybernetics/control theory/etc., but possibly without much intellectual continuity. Did backpropagation spread from controls to AI, or was it independently invented? As mentioned before, I'm not a historian of science. People working in robotics, as my limited understanding goes, have always maintained a connection to engineering and cybernetics and so on, but the 'hardware' and 'software' fields diverged: the roboticists sought to move from the first level up, and the AI researchers sometimes sought to move from the top down, perhaps without the hierarchical view.
This article on Walter Pitts (an important early figure in cybernetics) describes the split thus:
Von Neumann was the first to see the problem. He expressed his concern to Wiener in a letter that anticipated the coming split between artificial intelligence on one side and neuroscience on the other. “After the great positive contribution of Turing-cum-Pitts-and-McCulloch is assimilated,” he wrote, “the situation is rather worse than better than before. Indeed these authors have demonstrated in absolute and hopeless generality that anything and everything … can be done by an appropriate mechanism, and specifically by a neural mechanism—and that even one, definite mechanism can be ‘universal.’ Inverting the argument: Nothing that we may know or learn about the functioning of the organism can give, without ‘microscopic,’ cytological work any clues regarding the further details of the neural mechanism.”
Utility theory is the mathematically correct way to behave in an uncertain world if you have preferences over consequences that can be collapsed onto the real line and can solve the maximization problem. So long as your values follow four desirable rules, that describes your preferences. If we express those preferences as a probabilistically relevant score, then we can entirely separate our module that expresses preferences over consequences and our module that predicts probabilities of consequences once we take a particular action, and this is a huge boon to mathematical decision-making.
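To make the separation of the two modules concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the umbrella scenario, the names `utility`, `predict`, `expected_utility` and `best_action`, and all of the numbers. The point is only that the preference module and the prediction module never need to know about each other.

```python
from typing import Dict, Iterable

# Hypothetical preference module: scores consequences on the real line.
def utility(consequence: str) -> float:
    scores = {"dry": 1.0, "dry_but_encumbered": 0.8, "wet": 0.0}
    return scores[consequence]

# Hypothetical prediction module: P(consequence | action).
def predict(action: str) -> Dict[str, float]:
    chance_of_rain = 0.3  # made-up number
    if action == "take_umbrella":
        return {"dry_but_encumbered": 1.0}
    return {"dry": 1.0 - chance_of_rain, "wet": chance_of_rain}

def expected_utility(action: str) -> float:
    # The only place where the two modules meet.
    return sum(p * utility(c) for c, p in predict(action).items())

def best_action(actions: Iterable[str]) -> str:
    return max(actions, key=expected_utility)

choice = best_action(["take_umbrella", "leave_umbrella"])
```

Swapping in a different `utility` (someone who doesn't mind getting wet) or a different `predict` (a better forecast) changes the choice without touching the other module.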
But it turns out that decision-making under certainty can often be a hard problem for humans. This is a black mark for the descriptive application of utility theory to humans, but is explained by the control theory paradigm as multiple goals (i.e. control systems) conflicting. I don't see this as a challenge to the prescriptive usefulness of utility theory: when presented with a choice, it is often better to make one than not make one--or, if one is delaying until additional information arrives, to know exactly how that information could change the decision, through a value of information (VoI) calculation. Even if you've identified the two terminal goals that are conflicting, it is probably better to explicitly short circuit one of those desires, determine the right tradeoff, or pull in a third direction rather than remain locked in conflict.
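For readers who haven't seen a VoI calculation, here is a small sketch of the simplest case, the expected value of perfect information. The scenario (invest or hold, good or bad market), the payoffs, and the function name are all hypothetical; real VoI calculations usually deal with imperfect information and are messier.

```python
def value_of_perfect_information(prior, payoff):
    """prior: {state: probability}; payoff: {action: {state: utility}}."""
    actions = list(payoff)
    # Best we can do if we must pick one action now, before learning the state.
    eu_now = max(
        sum(prior[s] * payoff[a][s] for s in prior) for a in actions
    )
    # Best we can do if we learn the state first and then pick the action.
    eu_informed = sum(
        prior[s] * max(payoff[a][s] for a in actions) for s in prior
    )
    return eu_informed - eu_now

prior = {"good_market": 0.5, "bad_market": 0.5}
payoff = {
    "invest": {"good_market": 10.0, "bad_market": -5.0},
    "hold": {"good_market": 0.0, "bad_market": 0.0},
}
voi = value_of_perfect_information(prior, payoff)
```

With these made-up numbers, waiting is worth it only if the delay costs less than `voi`; if no possible answer would change the action, the VoI is zero and there's no reason to wait.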
It also seems that utility maximization is mostly appropriate for an agent in the LW sense--a unitary entity that has a single preference function and plans how to achieve that (potentially complex) preference as well as possible. This requires a potentially immense amount of computing power, and it's not at all obvious that many of the "systems" with "intelligence" that we might be worried about will be described that way. When we look at, say, trading algorithms causing problems in the financial markets, utility maximization doesn't appear to be the right viewpoint for understanding why those algorithms behave the way they do, and game theory doesn't seem like the right approach to try to determine the correct reaction to their algorithms and their use.
It may also be helpful to separate out "strong" and "weak" senses in which an agent maximizes utility. The strong sense is that they actually have a known function that they use to value consequences, and simulate the future to determine the action that gets them the most value. The weak sense is that we can describe any agent as behaving as though it is maximizing some utility function, by observing what it does and calling that the utility-maximizing action. As the names suggest, the strong sense is useful for predicting how an agent will behave, and the weak sense isn't.
As mentioned earlier, I don't think it's easy (or desirable) to dethrone utility as the premier prescriptive decision-making approach, if you have the self-editing ability to change your decision-making style and the computing power to solve the maximization problems it poses. But we may need to figure out where we're coming from to figure out how to get there. (In some sense, that's the premise of the Heuristics and Biases literature.)
Previous Discussion on LW
It's not quite fair or reasonable to respond to comments and posts made years ago (and before I even found LW), especially in light of Yvain's roundup that had PCT listed with ideas that seemed outlandish before being partly absorbed into the LW consensus. One of the reasons why I bring the subject up again, with a longer treatment, is because I think I see a specific hole in the LW consensus that I might be well-suited to fill. So let's look at the list of links from the first post again: this book and Powers's Perceptual Control Theory have been discussed on LW here, here, and here, as well as mentioned in Yvain's roundup of 5 years (and a week) of LW.
I feel like the primary criticisms (see SilasBarta's comment as an example) were about the presentation and the unexplained enthusiasm, rather than the invalidity or inappropriateness of the model, and the primary defenses were enthusiasm (as I recall, this comment by pjeby prompted me to buy and read the book, but I'm only familiar with one of the six things that he says it explains, which impairs my ability to understand why he thinks it's impressive!). I don't mean to fault people involved in that conversation on the PCT side for not explaining; even I see my two thousand words in the last post as an argument to read Powers's three-hundred-page book rather than a full explanation (just like my two thousand words spent on the basics of controls wouldn't get you through an undergraduate-level class on the subject, and are more of an argument to take that class).
Since you can make an arbitrary function out of enough control loops, saying that human minds run on control loops doesn't constrain the possible behavior of humans much by itself, just like saying that a program is written in a high-level language doesn't constrain the possible behavior of that program much. I view PCT as describing the inherent modularity of the code, rather than what's possible to code, which helps quite a bit in figuring out how the code functions and where bugs might be hiding or how to edit it. Any model built in the controls framework will have to be very complicated to be fully functional--I feel like it's easier to understand the mechanics of a person's arm than the mechanics of their personality, but if we want to explain the arm at any meaningful level of detail we need a meaningful number of details!
And, in terms of models, I think the way to think about PCT is as a competitor for utility. I don't know many LWers who model themselves or other humans as utility maximizers, but it seems like that remains the default model for describing intelligent agents whenever we step up a level of abstraction (like when, say, we start talking about ethics or meta-ethics). As part of writing this post, I reread Yvain's sequence on the Blue-Minimizing Robot. At parts, it seems to me to present a dilemma between either modeling intelligence as utility-optimization or arbitrary code, where the former can't be implemented and the latter can't be generalized. A control system framework seems like it finds a middle ground that can be both feasibly implemented and generalized. (It pays for that, of course, by not being easily implemented or easily generalized. Roboticists are finding that walking is hard, and that's only level 5 in the hierarchy! On the input side, computer vision folks don't seem to be doing all that much better.)
Ideally, this is where I would exhibit some example that demonstrates the utility of thinking this way: an ethical problem that utilitarianism can't answer well but a control theory approach can, or a self-help or educational problem that other methods couldn't resolve and this method can. But I don't have such an example ready to go, I'm not convinced that such an example even exists, and even if one exists and I have it, it's not clear to me that it would be convincing to others. Perhaps the closest thing I have to an example is my experience training in the Alexander Technique, which I see as being easy to describe from a control theory perspective, but is highly related to my internal experience and methodology of moving through the world, both of which are difficult to describe through a text-based medium. Further, even if it does become obvious that positive change is taking place, determining how much that positive change validates a control system-based explanation of what's going on underneath is its own difficult task!
A model fits the situation when easy problems are easy in the model and hard problems are hard in the model. A thermostat is simple, and a person is complex. The utilitarian approach says "find the thing defined as the thing to be maximized, and then maximize it," and in some sense the utilitarian model for a thermostat is 'as simple' as the utilitarian model for a person: it has swept all the hard bits into the utility function, and gives little guidance on how to actually go about finding that function. The control system approach says "find the negative feedback loops, and edit them or their reference levels so they do what you want them to do," and the number of feedback loops involved in the thermostat is rightly far lower than the number of feedback loops involved in the person. If I could exhibit a simple model that solves a complex problem, then it seems to me that my model doesn't quite fit.1
Intelligent Action without Environmental Simulation
This is mostly covered by RichardKennaway's post here, but is important enough to repeat. Typically, when we think about optimization, we have some solution space (say, possible actions to take) and some objective function (over solutions, i.e. actions), and go through the solution space applying the objective function to points until we're satisfied that we have a point that's good enough. (If the relationship between the solution space and objective function is kind enough, we'll actually search for a proof that no better solutions exist than the one we picked.)
A common mathematical approach is to model the world as having states, with the transition probability from state to state depending on the actions the robot takes (see Markov Decision Processes). Typically, we want to find an optimal policy, i.e. a mapping from states of the world to actions that lead to the maximum possible accrual of value.
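As a rough illustration of what "finding an optimal policy" involves, here is value iteration on a toy MDP. The four-cell corridor, the 0.8 chance that a move succeeds, the reward of 1 for reaching the last cell, and the discount factor are all made up for the example; value iteration is one standard way to solve small MDPs, not the only one.

```python
GOAL = 3
STATES = [0, 1, 2, 3]
ACTIONS = ["left", "right"]

def transition(state, action):
    """Return {next_state: probability}. Moves succeed 80% of the time."""
    if state == GOAL:
        return {GOAL: 1.0}  # the goal is absorbing
    target = max(state - 1, 0) if action == "left" else state + 1
    probs = {}
    probs[target] = probs.get(target, 0.0) + 0.8
    probs[state] = probs.get(state, 0.0) + 0.2
    return probs

def reward(state, action, next_state):
    return 1.0 if state != GOAL and next_state == GOAL else 0.0

def q_value(state, action, values, gamma):
    return sum(
        p * (reward(state, action, s2) + gamma * values[s2])
        for s2, p in transition(state, action).items()
    )

def value_iteration(gamma=0.9, tol=1e-8):
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {
            s: max(q_value(s, a, values, gamma) for a in ACTIONS)
            for s in STATES
        }
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            break
    # The policy maps each state to the action with the highest value.
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(s, a, values, gamma))
        for s in STATES
    }
    return values, policy

values, policy = value_iteration()
```

Note what the algorithm needs: the full transition model, a sweep over every state and every action, repeated until the values stop changing. That's trivial for four cells and hopeless for anything like an arm, let alone the world.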
But the computational cost of modeling reality in that level of depth may not be worth it. To give a concrete example, there's a human movement neuroscience community that studies the questions of how muscles and joints and brains work (i.e. fleshing out the first five levels of the model we talked about in the last post), and one of the basic ideas in that field is that there's a well-behaved function that maps from the position of the joints in the arm to where the tip of the finger is. Suppose you want to press a button with your finger. You now have to solve the inverse problem, where I give you a position for the tip of the finger (where the button is) and you figure out what position to put the joints in so that you touch the button. Even harder, you want to find the change in joint positions that represents the least amount of effort. This is a computationally hard problem, and one of the questions the community is debating is how the human nervous system solves this hard problem.
My favorite answer is "it doesn't solve the hard problem." (Why would it? Effort spent in the muscles is measured in calories, and so is effort spent in the nerves.) Instead of actually inverting the function and picking the best possible action out of all actions, there might be either stored approaches in memory or the brain might do some sort of gradient descent (easily implemented in control systems using the structure described in the last post), where the brain knows the difference between where the finger is and where it should be, moves each joint in a way that'll bring the finger closer to where it should be, and then corrects its approach as it gets closer. This path is not guaranteed to be globally optimal, i.e. it does not solve the hard problem, but is locally optimal in muscular effort and probably optimal in combined muscular and nervous calorie expenditure.
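Here is a sketch of the "don't invert, just descend" idea for a two-joint planar arm. This is my own toy, not a claim about what the nervous system computes: the link lengths, starting angles, step size, and the names `fingertip` and `reach` are assumptions, and it only reduces the distance to the target (it ignores effort entirely). Each step moves each joint in proportion to how much that joint's motion would shrink the error, which is gradient descent on half the squared fingertip-to-target distance.

```python
import math

def fingertip(theta1, theta2, l1=1.0, l2=1.0):
    """Forward map: joint angles -> fingertip position (the easy direction)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

def reach(target, theta1=0.3, theta2=0.3, step=0.1,
          tol=1e-6, max_steps=5000, l1=1.0, l2=1.0):
    """Nudge the joints downhill on squared error; never solve the inverse."""
    for _ in range(max_steps):
        x, y = fingertip(theta1, theta2, l1, l2)
        err_x, err_y = target[0] - x, target[1] - y
        if math.hypot(err_x, err_y) < tol:
            break
        # How the fingertip moves when each joint rotates a little.
        dx1, dy1 = -y, x
        dx2 = -l2 * math.sin(theta1 + theta2)
        dy2 = l2 * math.cos(theta1 + theta2)
        # Rotate each joint in the direction that shrinks the error.
        theta1 += step * (dx1 * err_x + dy1 * err_y)
        theta2 += step * (dx2 * err_x + dy2 * err_y)
    return theta1, theta2

angles = reach((1.0, 1.0))
tip = fingertip(*angles)
```

The loop only ever uses the current error and the local effect of each joint, so it lands on whichever arm configuration happens to be downhill from the starting pose rather than the best one overall.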
Preference Modeling and Conflicts
I'm not familiar with enough Internal Family Systems Therapy to speak to how closely it aligns with the control systems view, but I get the impression that the two share many deep similarities.
But it seems to me that if one hopes to preserve human values, it would help to work in the value space that humans have, and we can easily imagine control systems that compare the relative position or rate of change of various factors to some references. I recall a conversation with another rationalist about the end of Mass Effect 3, where (if I recall their position correctly, and it's been years so I'm not very confident that I do) they preferred a galactic restart to a stagnant 'maximally happy' galaxy, because the former offered opportunities for growth and the latter did not, and they saw life without growth as not worth living. From a utility maximization or efficiency point of view, this seems strange: why want the present to be worse than it could be? But this is a common preference (one that shows up in the Heuristics and Biases literature): people often prefer an increasing series of payments to a decreasing series of payments, even though by exponential discounting they should prefer the decreasing series (where you get the largest payment first) if the amounts are the same with only the order reversed.
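The discounting claim is easy to check with arithmetic. In this sketch the payment amounts and the 0.9 per-period discount factor are made up; any discount factor below 1 gives the same ordering.

```python
def present_value(payments, discount=0.9):
    """Exponentially discounted sum; the first payment arrives at t = 0."""
    return sum(p * discount ** t for t, p in enumerate(payments))

increasing = [100, 200, 300]
decreasing = [300, 200, 100]

pv_increasing = present_value(increasing)  # 100 + 180 + 243
pv_decreasing = present_value(decreasing)  # 300 + 180 + 81
```

The decreasing series comes out ahead (561 against 523 here), so someone who prefers the increasing series is tracking something other than discounted totals, such as the rate of change.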
Reasoning About AI
I don't see all that much relevance to solving the general problem of value preservation (it seems way easier to prove results about utility functions), but as mentioned in the conflicts section it does seem relevant to human value preservation if it's a good description of human values. There is the obvious caveat that we might not want to preserve the fragmentary shattering of human values; a potential future person who wants the same things at the same strength as we do, but has their desires unified into a single introspectively accessible function with known tradeoffs between all values, will definitely be more efficient than current humans--potentially more human than the humans! But when I ask myself for a fictional example of immediate, unswerving confidence in one's values, the best example that comes to mind is the Randian hero (which is perhaps an argument for keeping the fragmentation around). As Roark says to Peter (emphasis mine):
If you want my advice, Peter, you've made a mistake already. By asking me. By asking anyone. Never ask people. Not about your work. Don't you know what you want? How can you stand it, not to know?
But leaving aside values, there's the question of predicting behavior. It seems to me that there are two facets--what sort of intensity of change we would expect, and how we should organize predictions of the future. It seems likely that local controllers will have, generally, local effects. I recall a conversation, many years ago, where someone else suggested that an AI in charge of illuminating the streets might decide to destroy the sun in order to prevent the street from being too bright. Or, I suggested, it might invest in some umbrellas, since that's a much more proportionate response. I emphasize proportionate because under a linear negative feedback controller that would be literally true: the more lumens between the sensor and the reference, the more control effort would be expended, in a one-to-one fashion. Controllers are a third class of intentional agents, different from both satisficers and maximizers, and a potentially useful one to have in one's mental toolkit.
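A sketch of what "proportionate" means for a linear negative feedback controller. The street model (brightness relaxing toward ambient light plus whatever the controller adds or shades away), the gain, and all of the numbers are assumptions for illustration, not a model of any real lighting system.

```python
def control_effort(reference, measured, gain):
    """Linear negative feedback: effort is proportional to the error."""
    return gain * (reference - measured)

def simulate_street(ambient, reference, gain, steps=200, dt=0.1):
    """Toy plant: brightness relaxes toward ambient light plus control effort."""
    brightness = ambient
    for _ in range(steps):
        effort = control_effort(reference, brightness, gain)
        brightness += dt * (ambient + effort - brightness)
    return brightness

# Too-bright street: ambient 100 lumens, reference 50, gain 4.
settled = simulate_street(ambient=100.0, reference=50.0, gain=4.0)
small_push = control_effort(50.0, 70.0, 4.0)  # error of 20 lumens
large_push = control_effort(50.0, 90.0, 4.0)  # error of 40 lumens
```

Doubling the error doubles the effort and nothing more, so there is no error size at which this controller jumps to drastic measures. With these numbers the street settles at 60 rather than 50, which is the usual steady-state offset of a purely proportional controller.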
If we know a system is an intelligent optimizer, and we have a range of possible futures that we're trying to estimate the probability of, we can expect that futures higher in the preference ordering of the optimizer are more likely. But if we also have an idea of what actuators the system has, we might expect futures where those actuators are used, or the direct effect of those actuators leads to a higher preference ordering, to be more likely, and this might be a better way for reasoning about those problems. I'm not sure how far I would take this argument, though; many examples abound of genetic algorithms and other metaheuristic optimization methods being clever in surprising ways, using their ability to simulate the future to find areas of the solution space that did not look especially promising to their human creators but turned out to hold proverbial gold. It seems likely that superhuman intelligence will rely heavily on numerical optimization, and even if the objective function is determined by control systems,2 as soon as optimizers are in the mix (perhaps as determining what control to apply to reduce an error) it makes sense to break out the conservative assumptions on their power. And actuators that might seem simple, like sending plaintext through a cable to be read by people, are often in fact very complex.
1. gwern recently posted an Asimov essay called Forget It!, which discusses how an arithmetic textbook from 1797 managed to require over 500 pages to teach the subject. One might compare today's simple arithmetic model to their complex arithmetic model, apply my argument in this paragraph, and say "but if you've managed to explain a long subject in a short amount of time, clearly you've lost a lot of the inherent complexity of the subject!" I'd counter with Asimov's counter, that the arithmetic of today really is simpler than the arithmetic they were doing then, and that the difference is not so much that the models of today are better, but that the reality is simpler today and thus simpler models suffice for simpler problems. But perhaps this is a dodge because it depends on the definitions of "model" and "reality" that I'm using.
2. A few years ago, I saw an optimization algorithm that designed the targeting of an ion beam (I believe?) used to deliver maximum radiation to a tumor while delivering minimum radiation to the surrounding tissue. The human-readable output was a dose probability curve, basically showing the radiation distribution that the tumor received and that the surrounding tissue received. The doctor would look at the curve, decide whether or not they liked it, and play with the meta-parameters of the optimization until the optimizer spat out dosage distributions that they were happy with. I thought this was terribly inefficient: even if the doctors thought they were optimizing a complex function of the distribution, they were probably doing something simple and easy to learn like area under the curve in particular regions or a simple integration, and then that could be optimized directly. The presenter disagreed, though I suspect they might have been disagreeing on the practicality of getting doctors to accept such a system rather than on the practicality of building one. As the fable goes, "instant" cake mix requires that the customer break an egg because customers prefer to do at least one thing as part of making the cake.