
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

[Metadata: crossposted from https://tsvibt.blogspot.com/2022/08/control.html. First completed 3 July 2022.]

I don't know how to define control or even point at it except as a word-cloud, so it's probably wanting to be refactored. The point of talking about control is to lay part of the groundwork for understanding what determines what directions a mind ends up pushing the world in. Control is something like what's happening when values or drives are making themselves felt as values or drives. ("Influence" = "in-flow" might be a better term than "control".)

Previous: Structure, creativity, and novelty

Definitions of control

  • Control is when an element makes another element do something. This relies on elements "doing stuff".
  • Control is when an element {counterfactually, evidentially, causally, logically...} determines {the behavior, the outcome of the behavior} of an assembly of elements.
  • Control is when an element modifies the state of an element. This relies on elements having a state. Alternatively, control is when an element replaces an element with a similar element.
  • Control is when an element selects something according to a criterion.

These definitions aren't satisfying in part because they rely on the pre-theoretic ideas of "makes", "determines", "modifies", "selects". Those ideas could be defined precisely in terms of causality, but doing that would narrow their scope and elide some of the sense of "control". To say, pre-theoretically, "My desire for ice cream is controlling where I'm walking.", is sometimes to say "The explanation for why I'm walking along such-and-such a path, is that I'm selecting actions based on whether they'll get me ice cream, and that such-and-such a path leads to ice cream.", and explanation in general doesn't have to be about causality. Control is whatever lies behind the explanations given in answer to questions like "What's controlling X?" and "How does Y control Z?" and "How can I control W?".

Another way the above definitions are unsatisfactory is that they aren't specific enough; some of them would say that if I receive a message and then update my beliefs according to an epistemic rule, that message controls me. That might be right, but it's a little counterintuitive to me.

There's a tension between describing the dynamics of a mind--how the parts interact over time--vs. describing the outcomes of a mind, which is more easily grasped with gemini modeling of "desires". (I.e. by having your own copy of the "desire" and your own machinery for playing out the same meaning of the "desire" analogously to the original "desire" in the original mind.) I'm focusing on dynamical concepts because they seem more agnostic as discussed above, but it might be promising to instead start with presumptively unified agency and then distort / modify / differentiate / deform / vary the [agency used to gemini model a desire] to allow for modeling less-presumptively-coherent control. (For discussion of the general form of this "whole->wholes" approach, distinct from the "parts->wholes" approach, see Non-directed conceptual founding.) Another definition of control in that vein, a variation on a formula from Sam Eisenstat:

Control is an R-stable relationship between an R-stable element and R-unstable prior/posterior elements (which therefore play overlapping roles). "R-stable" means stable under ontological Revolutions. That is, we have C(X,Y) and C(X,Z), where X and C are somehow the same before and after an ontological revolution, and Y and Z aren't the same.
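A compact way to write that out (the notation here is mine, not Eisenstat's or the post's; R-stability is left as an unanalyzed predicate):

```latex
% Sketch only; S_R ("stable under ontological Revolutions") and the
% control relation C are taken as primitive. Y is the prior element,
% Z the posterior element, playing overlapping roles.
\[
  \mathrm{Control}(C, X) \;:\equiv\;
    S_R(C) \,\wedge\, S_R(X) \,\wedge\,
    \exists\, Y, Z.\;\; C(X,Y) \,\wedge\, C(X,Z)
      \,\wedge\, \neg S_R(Y) \,\wedge\, \neg S_R(Z) \,\wedge\, Y \neq Z
\]
```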

Control vs. values

I'm talking about control rather than "values" because I don't want to assume:

  • that there are terminal values,
  • that there's a clear distinction between terminal values and non-terminal values,
  • that there are values stable across time and mental life (e.g. self-modification, ontological revolutions),
  • that there's a fixed world over which values could be defined,
  • that there's a clear distinction/unentanglement between values and other elements,
  • that there aren't fundamental conflicts between values within a mind,
  • that if a mind pushes the world in a direction, that direction must be "represented" in the mind's values or in any of the mind's elements,
  • that the relevant questions are about stable features of the mind (such as terminal values after reaching reflective stability) rather than about transient features,
  • that there is, or isn't, or possibly is, or possibly isn't, a "wrapper-mind" with fixed goals or "loci of control" (elements of the mind that determine effects of the mind to an extent disproportionate to the size of the elements, e.g. a limbic system),
  • that the mind already incorporates convergent instrumental drives and tools, such as being non-Dutch-book-able,
  • that control is about external outcomes, as opposed to being about internal / intermediate outcomes or something else (e.g. behavior rather than "outcomes").
    • Expanding on this point: note that the definitions of control given above mostly avoid talking about outcomes. That's because I want to also talk about the control that's exerted by [an agent A minus its utility function]. You could (for some sorts of agents, maybe) slot in a totally different utility function, and the resulting agent A' would have a totally different outcome. But A and A' have something in common: the decision-making machinery is organized in analogous ways, although it will go down many non-overlapping lines of thought in A and A' because of the different priorities held by A and A'. The sense in which the shared decision-making machinery controls the thoughts and actions of A and A' should be included in the concept of control. In particular, this decision-making machinery includes some way of interfacing with the novelty required for the agent to become highly capable, and that task may be very non-trivial.
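To make that last point concrete, here is a toy sketch in code (entirely my construction, not from the post; the search space is deliberately trivial) of agents A and A′ sharing decision-making machinery while slotting in different utility functions:

```python
# Toy sketch (my construction): A and A' share the same decision-making
# machinery, a depth-limited exhaustive search, and differ only in the
# utility function slotted in. The shared machinery organizes *how*
# deciding proceeds; the utility function sets the direction pushed in.
from itertools import product

def decide(utility, state=0, actions=(-1, +1), depth=3):
    """Shared machinery: enumerate action sequences, pick the best one.

    A and A' run this same code, though they would go down
    non-overlapping lines of thought in a less trivial search."""
    return max(product(actions, repeat=depth),
               key=lambda seq: utility(state + sum(seq)))

utility_A = lambda s: s      # A pushes the state up
utility_A2 = lambda s: -s    # A' pushes the state down

print(decide(utility_A))     # (1, 1, 1)
print(decide(utility_A2))    # (-1, -1, -1)
```

The sense in which `decide` controls the thoughts and actions of both agents, independently of which utility is slotted in, is part of what the concept of control should cover.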

Word-cloud related to control

  • Want. Cognate with "vacuum", as in "wanting" ("having an emptiness, lacking something"). Suggests homeostatic pressure and satisficing.
  • Try, attempt. "Try" from Old French "trier" ("to choose, test, verify"). "Attempt" = "ad-tent" ("towards-test") (analogous to "attend"; cognate with "tentative", "tense", "-tend", "-tain"). Suggests experimenting to see what works, trial and error.
  • Desire. Latin "de-sidus" ("from the stars"), cognate with "sidereal" ("of the stars"). Suggests transcendence, universality, wide scope; hope, things out of reach.
  • Care. From Proto-Germanic *karō ("care, sorrow, cry"), from Proto-Indo-European *ǵeh₂r- ("voice, exclamation"); distantly cognate with "garrulous" ("talkative"). Suggests depth, relations to other agents; negative reinforcement, turning homeostatic pressure into strategic preservation by projecting negative reinforcement with imagination.
  • Control. "Contra-rotulus" ("against a little wheel"; "a register used to verify accounts"). Suggests tracking, registration, feedback cycles.
  • Strategy. From στρατός (stratós, "army"; from Proto-Indo-European *ster- ("to spread, stretch out, extend"), whence also "strew", "structure", "-struct") + ἄγω (ágō, "I lead, I conduct"; cognate with "act", "agent"). So, something like "what is done by what conducts an extension". Suggests organization, orchestration, integration; initiation, agitation, without-which-not.
  • Direct. "Dis-rego" ("apart-straighten", cognate "right"), I think as in telling something where to go. Suggests making things handier by putting them into more specific legible contexts.
  • Select. "Se-lect", "se" ("away", as in "seduce" ("lead away"), "seclude" ("shut away"), "secede" ("go apart")) and "lect" from PIE *leǵ- ("gather", cognate with "logos" and "-lect" like "dialect"). Suggests taking something from one context and then putting it into another context by naming it and gathering it with other things.

(Some of the other etymons of the following words are also interesting.)

Choose, constrain; sway, pursue, force, pressure, pull, push; effect, cause, make, determine, modify; power, influence, reign, rule, manage, regulate, lead, obey, prescribe, hegemony, preside, principal, authority, govern, cybernetic, order, command; steer, pilot, compass, rudder, reins, helm, drive; organize, orchestrate, design, manufacture; manipulate, craft, use, tool; supervise, guide, instruct, wield, ambition; wish, will, aim, target, value, utility function, objective function, criterion.

Aspects of control

  • Control transmission or non-transmission. If X controls Z by controlling how Y controls Z, that's transmission (through a line of control). Examples: a general giving orders to a commander giving orders to privates; a hunger calling on a route finder to call on spatial memory of where restaurants are; a mathematician engineering a concept to compactly describe something, so that future thoughts using that concept will proceed smoothly; a programmer rewriting a function so that it has different functional behavior when applied in some context. Non-example: an optimizer speeding up a piece of code. The optimized code, when applied, still does all the same stuff as the unoptimized code; the code optimizer hasn't controlled the application of the optimized code. (This isn't entirely right: you could use faster code in new ways because it's faster, and being faster overall is some effect. But those are weak effects of a specific kind, and don't show up in the "internal topology" of the computation. In general, function extensionality implies a kind of control non-transmission, as do APIs, Markov blankets, and any kind of screening off.) (For a toy illustration of transmission through a line of control, see the simulation sketched after this list.)

  • Non-fungibility, non-conservation. Unless shown otherwise, there's no conservation or fungibility of control. For example, when two people simultaneously each throw a rock at a window, both cause the window to break. An agent's decision-making machinery and its outcome-target both determine the agent's effect on the world, but not interchangeably (the outcome-target determines the direction and the decision-making determines the magnitude). The parts of a machine all have to work for the machine to work.

  • World-robustness. Control that is exerted in many possible worlds.

  • Control distance / depth. Through how many elements is control serially transmitted? Through how many "levels" or "domains" or "realms" is control serially transmitted? Through how much time and space? Is new understanding about a domain "far" from a controlling element recruited to have "near" effects?

  • Control breadth. Across how many different domains (weighted by control distance) does one element exert control?

  • Co-control. What's happening with an element that's being controlled.

  • Co-control context-independence. I.e., being generally useful, predictable, manipulable, programmable, applicable; possibilizing.

  • Control stability. Is the control still exerted after an ontological revolution? E.g. you keep your house warm by putting in your fireplace materials that are listed in your alchemical handbook as "high in phlogiston", then you learn about oxidization, and then you still put those materials in your fireplace (now thinking of them as "high in rapidly oxidizable stuff").

  • Control amplitude. The force of the control. Optimization power is an example. A distinct example: if you turn your thermostat to 90F and turn your window AC unit on, the AC unit is going to lose and the room is going to get hot, but the more powerful the AC unit, the harder the furnace has to work. The AC unit has very little optimization power (over the temperature) in this context, since it can only barely change the actual temperature, but it has nonnegligible control amplitude (over the temperature), since it can force the furnace to work noticeably harder. (A toy simulation of this example appears after this list.)

  • Explicitness. Some control is explicit: supergoals, terminal values, back-chaining agency. In contrast, some control routes through not-yet-realized creativity: reinforcement learning. (This is an important concept for comparing novelty with control: implicit control gives up control to the external control exercised by the novelty manifested by the creativity it calls on. This roughly corresponds to saying that inner optimizers happen.)

  • Internal / external. All elements control the inside of themselves, e.g. the idea of a group is a structure of control in that it's constituted in part by controlling the combination of two distinct reflections to be a non-trivial rotation. Some elements don't control anything else (e.g. a mental picture of a rock doesn't control anything else without itself being controlled to control), while others do.

  • Ambition. How far would this control push the world if unchecked, unconstrained, unopposed, unblocked?

  • Yearning vs. pursuing. Yearning is waiting passively and seizing opportunities when they present themselves on their own; following lines of control that are already known, handy, interfaced with, incorporated, integrated. Pursuing is seeking and creating new lines of control; calling on creativity; routing around a given stubborn failure by recursing on trying new things, by seeking knowledge, by expanding into new domains, by instigating ontological revolutions, by exploring possible epistemic stances. (The line between yearning and pursuing is blurred when there are lines of control, already integrated, that include seeking and creating new lines of control.)

  • Locality. I haven't analyzed this concept. There's in-degree of control / sensitivity of the controlled thing to the controller; there's out-degree weighted by in-degree of the target; there's integration (combining information from different domains, making whole-world hypotheses, making comparisons); there's orchestration / organization / coordination / planning / combination / assembly / arrangement / recruitment; there's bottlenecks through which control flows; and in contrast, there's participating in an assembly that's controlling something.

  • Criterial delegation. A type of transmission. Controlling an element E by setting a criterion which E will apply when E controls other elements. (Requires that the delegate has and applies "criteria", e.g. agents with goals or search processes with success criteria.)

  • Goal delegation. A type of criterial delegation where the criterion is a goal. Controlling an outcome by setting the outcome as a target of another element's control. (Requires that the delegate can have "targets", e.g. agents with outcomes as goals; implies the controlled element has some control breadth (so that "goal" has meaning beyond "search criterion").)

    • Not all criterial delegation is goal delegation: setting the expected-utility threshold applied by a quantilizer is criterial delegation because it's changing the criterion applied by the quantilizer, but it's not changing the direction of the outcome selected by the criterion. Other examples: setting a success criterion for a domain-narrow search, setting a homeostatic target for a simple feedback system (as in the simulation sketched after this list). (Neither of those systems has goals, so they can't be goal delegates.)
  • Superiority. E₁ is superior to E₂ when E₁'s control is higher than E₂'s in breadth, depth, creativity, externality, pursuantness. (Note that amplitude, locality, ambitiousness, and stability aren't listed.)

  • Domination. E₁ controlling E₂ to make / keep it the case that E₁ is superior to E₂. Done e.g. by directly cutting off E₂'s access to domains, by punishing or threatening to punish E₂ for increasing its control, by weakening E₂, and generally by keeping E₂ within bounds so that E₁ can't be overpowered by E₂ (overpowering E₁ becomes convergently instrumental for E₂, though E₂ may not be the type of element that picks up on convergently instrumental things). Satisficers are more easily dominable than optimizers. The point is to make E₂ more predictable, understandable, and reliable (because it's not pursuing other things), and less of a threat.

  • Cybernetic control. A specific flavor of control that's empirically common: E₁ criterially delegates to E₂, and E₁ is (mostly) superior to E₂.

    • "Cybernetic" = steering, cognate with "govern" and possibly "whirl" via PIE *kʷerb- ("to turn").

    • Examples: setting the target of a homeostat / control system, setting the success criterion of a search, setting subgoals of subagents, giving orders, subsidizing and regulating an industry.

    • Non-examples: getting clear on how some element X works is control, but it's not cybernetic control; the idea of X might later be involved in controlling other elements, but not with criteria set by the element that orchestrated getting clear on X. Designing a car is mainly non-cybernetic control because the car doesn't control anything. But making a detailed design for a car has a significant admixture of cybernetic control, whenever the designer makes decisions with the manufacturing process in mind, because parts of the design will control parts of the process of manufacturing the car, e.g. the decision about axle thickness provides a target-point for the lathe (or whatever). Making a request of a free person isn't cybernetic control because they can refuse your request and because you aren't superior to them (these two things are related of course). (I haven't fully disentangled superiority and domination from an element actually exerting its capacity to threaten / extort another to accept delegation or other control, which seems to require conflict and communication.)

    • Note that not all cybernetic control is goal delegation because there's criterial delegation that's not goal delegation.

    • E₁ is only mostly superior to E₂; otherwise there'd be no point in delegating to E₂. Centrally, E₁ is superior to E₂ except that E₂'s control has higher amplitude than E₁'s for some narrow set of kinds of control.

    • Cybernetic control is common because if E₁ is superior to E₂, that makes it easier and more effective for E₁ to criterially delegate to E₂ (and for this reason sometimes E₁ will dominate E₂).

    • Since E₁ is superior to E₂, often E₂ needs constant feedback from E₁, i.e. new information and new settings of criteria. E.g. constantly adjusting one's posture, or issuing new orders, or opening / closing a throttle. Thus cybernetic control correlates with active/ongoing oversight and directive feedback.

    • Ambitiousness makes E₂ less amenable to being cybernetically controlled, because it implies that escaping domination by E₁ is more likely to be a convergent instrumental goal of E₂.

    • Control stability seems like a plus for cybernetic control because it implies a kind of reliability, though it also implies breadth which is harder to dominate.
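To make a few of these aspects concrete, here is a toy simulation (entirely my own construction; the first-order heat model and every number in it are assumptions) of the thermostat-and-AC example from the "control amplitude" bullet. It also exhibits criterial delegation and transmission: setting `target` changes the criterion the furnace loop applies, and control of the room temperature is transmitted through that loop.

```python
# Toy model (assumptions throughout): a furnace run by a proportional
# thermostat loop, a window AC unit as a fixed disturbance, and simple
# first-order room dynamics with leakage to the outside.

def run(ac_cooling, target=90.0, outside=60.0, steps=2000):
    temp = outside
    furnace_total = 0.0
    for _ in range(steps):
        # Thermostat loop: furnace output proportional to error, capped.
        furnace = min(max(20.0 * (target - temp), 0.0), 50.0)
        furnace_total += furnace
        # Room dynamics: heating minus AC minus leakage to the outside.
        temp += 0.01 * (furnace - ac_cooling - (temp - outside))
    return temp, furnace_total / steps

for ac in (0.0, 20.0):
    temp, furnace_avg = run(ac)
    print(f"AC power {ac:4.1f}: temp {temp:5.1f}F, "
          f"avg furnace output {furnace_avg:5.1f}")
```

Turning the AC on barely moves the equilibrium temperature (about one degree here), so the AC has little optimization power over the temperature; but it forces the furnace's average output up by roughly the AC's own power, which is nonnegligible control amplitude.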

Comments (14)

pre-theoretic not-quite-true notes:

  • it has something to do with hyperstition, because hyperstition is the informational direction of time.
  • garrabrant inductors are an important component of understanding, after all. I had wondered if I'd ever need them; thanks, garrabrant, y'all were right all along.
  • garrabrant inductors may be the precursor of something simpler that has to do with information-theoretic relativity.
  • Kennaway is probably right in the other comment as long as information only flows backward through time by learning the shape of the mechanisms that defined the past of a particular local region, or so. It is the flow of information through a mechanism that allows predicting the future, thereby illuminating the shape of the emergent mechanism that is the network of reductionist mechanisms of each spot in a time-synchronous behavior.
  • Pondering this makes me wonder ... trailed off, see next comment

Hyperstition in the accelerationist sense? I view hyperstitions (self-fulfilling prophecies) as a sort of extreme or degenerate form of control. It's definitely real, but in the same sense that chaos magic (https://en.wikipedia.org/wiki/Chaos_magic#Belief_as_a_tool) is real, or like how "which rising star will be the next CEO" is a real question in a Moral Maze. I could be convinced otherwise but my sense is that this is a non-answer to the question of how real agents (with some partial coherence, and with reflective efforts to become more coherent) have their ultimate effects on the world determined.

Hmm. yeah, I meant self-fulfilling prophecy. I'll have to ponder this response, I had been thinking of self-fulfilling prophecy as encompassing all control, including thermostats. Surprised you disagree.

Some version of that could make sense. But there's something that seems saliently different about a person deciding to build a house and then building it, vs., say, Mr. Trumpet going around telling Mr. Gold "oh yeah, Mr. Silver and Mr. Platinum are big investors on this project" and telling Mr. Silver and Mr. Platinum the conjugate thing, until the project is pulled out of the ether as sufficiently invested in. In the latter case, from the investors' perspective, the project is kind of a hyperstition. But actually they could decide to make choices about that, and Mr. Trumpet is in fact making choices about that.

Got distracted before finishing edits. Makes me wonder whether chatgpt's commentary here, when asking it to come up with its own [parable of predict-o-matic](https://www.lesswrong.com/posts/SwcyMEgLyd4C3Dern), might have been more insightful than I realized at the time. It might actually be competitive as a short parable of predict-o-matic, in a descriptive sense, if competing for the name.

Posting to shortform because there are many of these. While searching for posts to share as prior work, I found the parable of predict-o-matic, and found it to be a very good post about self-fulfilling prophecies. I thought it would be interesting to see what ChatGPT had to say when prompted with a reference to the post. It mostly didn't succeed. I highlighted key differences between each result. The prompt:

> Describe the parable of predict-o-matic from memory.

samples (I hit retry several times):

> The parable of predict-o-matic is a story about a powerful machine that can predict the future with incredible accuracy. It is a valuable tool for those who can afford it, but it has a dark side.
>
> The machine is used by a wealthy and powerful man to gain even more wealth and power. He uses it to predict the stock market, elections, and other events, and he always comes out ahead.
>
> However, one day, the man's son begins to question the ethics of using the machine. He realizes that by predicting the future, the machine is taking away free will and choice from those who are not privy to its predictions.
>
> The son eventually destroys the machine, but not before it predicts its own destruction. The parable ends with the message that although knowledge of the future may be useful, it also comes with great responsibility and the potential for harm.

[[Post continues](https://www.lesswrong.com/posts/rKmojEZ9qKwApjCfX/the-gears-to-ascenscion-s-shortform?commentId=yZXG2EAdY4gtiiHr2) with more samples]

Of course, there's prior art it was pulling from, including the post on less wrong which is much longer and gets into the point more and explores all the interesting and varied ways it could go in reality at a company building such a thing, but I've never seen any presented in a distilled format like this. Very nicely done, IMO.

> I don't know how to define control

I do.

Control is what control systems do. What a control system does is hold some measurable property of a thing at or close to some reference value, in spite of other influences that would change it.

A thermostat controls the temperature of something: a room, a fridge, a hot water tank.

A person standing up is controlling their standing posture.

The driver of a vehicle is controlling many things: the vehicle’s lateral position on the road, its speed, not hitting anything, following an intended route to an intended destination.

Control systems keep your heart beating and lungs breathing at a suitable rate to supply oxygen sufficient for the body’s needs, however those needs change.

Measuring this against your initial putative definitions:

> Control is when an element makes another element do something. This relies on elements "doing stuff".

Ice freezing in cracks will split a rock. Is the ice “controlling” the rock? No, this is not an example of control.

Part of the traditional process of manufacturing slate tiles is to expose quarried slate slabs to water in low winter temperatures in order that the freezing water will split it into suitably thin sheets. The slateworkers doing this are engaging in control: they act as necessary to get the slate into the desired form. The water that they are using is not controlling anything.

> Control is when an element {counterfactually, evidentially, causally, logically...} determines {the behavior, the outcome of the behavior} of an assembly of elements.

That is only causation, not control. An avalanche will determine the behaviour of the village it is thundering towards: it will obliterate it. This is not control. The avalanche would act no differently if there were no village there, nor if the buildings were strong enough not to be swept away. If the village happens only to lie near the path of the avalanche, the avalanche will not notice that it is going to miss and veer towards it.

> Control is when an element modifies the state of an element. This relies on elements having a state. Alternatively, control is when an element replaces an element with a similar element.

Again, many things modify the state of many other things. This also does not capture the idea of control: the one thing bringing the other thing into a specific state and keeping it there, the thing varying its actions so as to produce that result.

> Control is when an element selects something according to a criterion.

This contains only a fragment of the idea of control. To have a control system, the thing selected must be an action and the criterion is that that action must bring the controlled variable closer to its reference value.

For control to be present, these things must be present.

  1. The variable to be controlled.

  2. A means of sensing that variable.

  3. A reference value for the variable.

  4. A range of available actions that will influence that variable.

  5. A rule to determine an action, given the reference value and the sensed value, such that the selected action always tends to reduce the difference between them.

Identifying all of these things in the case of the thermostat is left as an exercise. In this example, all five are visible, physical things that you can point at.
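Taking up that exercise in code (a minimal sketch of my own, with made-up classes and numbers, not something from the original comment):

```python
# Minimal thermostat sketch labeling the five components listed above.

class Room:
    def __init__(self, temperature=15.0, outside=5.0):
        self.temperature = temperature  # 1. the variable to be controlled
        self.outside = outside
        self.heating = False
    def heat_on(self): self.heating = True
    def heat_off(self): self.heating = False
    def step(self):
        # Crude physics: heating adds warmth; heat leaks to the outside.
        gain = 3.0 if self.heating else 0.0
        self.temperature += gain - 0.1 * (self.temperature - self.outside)

class Thermostat:
    def __init__(self, reference=20.0):
        self.reference = reference      # 3. the reference value
    def sense(self, room):
        return room.temperature         # 2. a means of sensing the variable
    def act(self, room):
        error = self.reference - self.sense(room)
        if error > 0.5:
            room.heat_on()              # 4. actions that influence the variable,
        else:                           # 5. chosen by a rule that always tends
            room.heat_off()             #    to reduce the difference

room, stat = Room(), Thermostat(reference=20.0)
for _ in range(100):
    stat.act(room)
    room.step()
print(round(room.temperature, 1))  # oscillates near the 20.0 reference
```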

I suggest that this is precisely the concept that you are searching for.

> I suggest that this is precisely the concept that you are searching for.

No.

> The point of talking about control is to lay part of the groundwork for understanding what determines what directions a mind ends up pushing the world in.

This notion of control is fine but doesn't answer the question.

> what determines what directions a mind ends up pushing the world in.

What determines what directions a mind ends up pushing the world in is that mind's purposes — the reference values of its control systems. These determine the narrow class of states towards which it herds the world state, at all levels from seeking water to quench thirst, to striving for grand political ideals.

Sure, if the mind really is some control systems and you know how it's some control systems and where their reference points are and how they combine. You don't know that so it's not a good enough answer, and I don't see a good reason to restrict to this one model-piece, in trying to find ideas that could provide a good enough answer.

There are many theories of what the mind is and how it works, and nobody knows which if any of them are right. I have presented one idea (not originating with me, BTW), and I do not see a reason in what you have said to exclude it from the ideas that could provide a good enough answer.

But the question, "what determines what directions a mind ends up pushing the world in" is somewhat vague, and perhaps you are looking for a completely different sort of thing than a theory of how the mind works.

The idea of control system shouldn't be excluded, but it's far from answering the question. Even if the answer does look like "it's a control system", we still don't know HOW minds (e.g. human minds) are control systems--we'd want to be able to look at a mind and then say "oh, see, it's a control system, here's the action-selector and here's the reference value and here's the comparator" and so on. And we would still want to understand whence comes the reference point, and whence comes the language in which the reference point is stated. Though I'm skeptical that a satisfying answer could look like that.

Perhaps I should have given more of the background. Hang on, I did. The theory is not that the mind "is a control system", but that it and the body are made of control systems, very large numbers of them, arranged in a particular hierarchical way, the output of higher-level systems providing the references for those in the layer below. At the foot of the hierarchy are muscular control and biochemical control loops. The lower levels of this are standard physiology and biochemistry. The upper levels are more speculative, and there are as yet not even conjectures about the morphogenesis of the whole hierarchy, i.e. how it is built during development. (For that matter, not much is known about physical morphogenesis either.)
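A toy sketch of that hierarchical arrangement (my own illustration, with made-up gains and dynamics; not code from any control-theory source): the upper system's output is not an action on the world but the reference value handed to the lower system, which is the only loop that acts directly.

```python
# Two-level control hierarchy (all numbers are assumptions). The upper
# loop cares about a slow variable; its output sets the reference of
# the lower loop, which acts on a fast variable that in turn drags the
# slow variable along.

TOP_REFERENCE = 10.0  # the "purpose" held by the upper system

fast_ref = 0.0                     # reference the upper loop hands down
world_fast, world_slow = 0.0, 0.0  # made-up world state

for _ in range(500):
    # Upper loop: integrates its error into the lower loop's reference.
    fast_ref += 0.1 * (TOP_REFERENCE - world_slow)
    # Lower loop: proportional action against its own error.
    action = 2.0 * (fast_ref - world_fast)
    # World dynamics: the fast variable is driven by the action; the
    # slow variable is dragged along by the fast one.
    world_fast += 0.2 * action
    world_slow += 0.05 * (world_fast - world_slow)

print(round(world_slow, 2))  # settles at the top-level reference, 10.0
```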

So there it is: consider it, or not.

> It seems to me that TsviBT is trying to figure out for good which of them is definitely right in terms of the concept of what control is in all cases and contexts.

Yet he objects that I do not know if what I described is definitely right. No-one knows what is definitely right. If he wants to find out, all that other people can contribute is possible ideas for him to measure against whatever the idea in his mind is. Well, I have given one. What he does with it is up to him.