Epistemic status: Presentation of an established technique and history. I learned most of my NLP knowledge from Chris Mulzer, who’s one of Bandler’s top students. The Origins of Neuro-Linguistic Programming by John Grinder and Frank Pucelik is my main source for the history.

What's NLP? In 1971 Frank Pucelik and Richard Bandler started teaching Fritz Perls’s Gestalt therapy in a group at the University of California, Santa Cruz, where the two were enrolled in a bachelor's program in psychology. They were joined by John Grinder, an assistant professor of linguistics who had just finished writing his PhD thesis on the topic of deletions. As a linguist he had worked on projects like modeling the language of the Tanzanian Wagogo tribe in order to communicate with them. He had the idea that if he created a model of how Fritz Perls used language to get the results he got in his Gestalt therapy work, he should be able to achieve the same results.

As with modeling the customs of the Wagogo tribe, the goal was to copy the linguistic patterns present in Perls’s work in order to achieve the same results. As a side job Bandler was transcribing lectures of the late Fritz Perls, so they had plenty of video material to study. In addition to the videos, Grinder could also study Bandler and Pucelik as they did their Gestalt work.

Modeling in NLP

Later they described the modeling process they had followed as a five-step process:

1. Identification of and obtaining access to a model in the context where he or she is performing as a genius.

2. Unconscious uptake of model’s patterns without any attempt to understand them consciously.

3. Practice in a parallel context to replicate the pattern. The intention is to achieve a performance of the model’s patterns which is equal to the model him/herself.

4. Once the modeler can consistently reproduce the pattern in an applied fashion with equal results, the modeler begins the coding process.

5. Testing to determine if the pattern as coded can be transferred successfully to others who will in turn be able to get equally effective results from the coded results … and then, ultimately to teach those processes to others

Every Monday Bandler and Pucelik ran their Gestalt group, and the following Thursday Grinder tried to do the same thing with another group of students. They also recruited additional students to form further study groups. By their own account they spent around 30 hours per week modeling and experimenting.

The three led the group. In addition to Fritz Perls, they modeled other famous therapists such as Milton Erickson and Virginia Satir to learn their ways of interacting with clients. They also modeled people who they believed had changed themselves, such as people who had overcome their own phobias.

Ideologically, the NLP developers didn't like trusting authorities. They were also skeptical of developing elaborate theoretical models intended to fully reflect reality. At the time the cybernetics community provided a skeptical, constructivist framework from which the NLP developers took ideas about how to deal with knowledge. They added the P in Neuro-Linguistic Programming to refer to programming in the sense in which it was understood at the Biological Computer Laboratory. The Biological Computer Laboratory was led by Heinz von Förster, who saw cybernetics as an alternative to science as a way of gathering knowledge.

Test-Operate-Test-Exit (TOTE)

From cybernetics (via George Miller) they borrowed the concept of Test-Operate-Test-Exit (TOTE). In the classic cybernetics example of the thermostat, the thermostat first measures the temperature and checks whether it's under the desired level (test). If it is, it pumps warm water (operate). The thermostat then measures the temperature again (test). If the temperature is at the desired level it shuts off (exit); otherwise it goes back to the previous step.
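The thermostat loop can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `tote_thermostat`, the fixed `heat_step` per cycle and the `max_cycles` safety limit are made up for the example and don't come from the cybernetics literature.

```python
def tote_thermostat(room_temp, target, heat_step=0.5, max_cycles=100):
    """Run a Test-Operate-Test-Exit loop until the room reaches the target.

    heat_step and max_cycles are illustrative assumptions, not part of TOTE.
    """
    cycles = 0
    # Test: is the room still below the desired temperature?
    while room_temp < target and cycles < max_cycles:
        # Operate: pump warm water, which raises the temperature a little.
        room_temp += heat_step
        cycles += 1
        # The loop condition repeats the test after every operation.
    # Exit: the test passed, or the safety limit was reached.
    return room_temp, cycles


final_temp, cycles = tote_thermostat(room_temp=18.0, target=22.0)
print(final_temp, cycles)  # 22.0 8
```

The important feature is that the same test is run before and after each operation, so the loop only exits once the test itself reports that the goal has been reached.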

In NLP the TOTE model is used to see whether a technique such as the Fast Phobia Cure works on a patient. Before doing the Fast Phobia Cure, the NLP practitioner is supposed to calibrate a test. If a client has a spider phobia and is told to imagine a spider, their body language will show fear.

Once the NLP practitioner has a test that confirms the phobia is there, they do the Fast Phobia Cure. Afterwards they test again with the test they calibrated earlier. If the test still shows the phobia, they know that the Fast Phobia Cure hasn't done the job for the client yet and they can try again with a slightly different approach. If the test no longer shows a fear response, that indicates the Fast Phobia Cure succeeded.
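As a rough sketch, the same procedure can be written as a generic loop: calibrate a test, apply an intervention, re-run the test, and move on to a variation if the test still fails. Everything in the example is hypothetical: the `tote` helper, the numeric `fear` score standing in for observed body language, and the two interventions are toy stand-ins, not a description of how the Fast Phobia Cure works.

```python
from typing import Callable, Iterable, Optional


def tote(test: Callable[[], bool],
         operations: Iterable[Callable[[], None]]) -> Optional[int]:
    """Return how many operations were needed before the test passed.

    Returns 0 if the test passes immediately and None if it never passes.
    """
    if test():                      # Test before doing anything
        return 0
    for attempt, operate in enumerate(operations, start=1):
        operate()                   # Operate
        if test():                  # Test again with the same calibrated test
            return attempt          # Exit
    return None                     # Ran out of approaches to try


# Hypothetical stand-in for a client: a fear score from 0 to 10.
client = {"fear": 8}


def shows_no_fear() -> bool:
    # Toy version of the calibrated test (assumed threshold).
    return client["fear"] <= 2


def standard_version() -> None:
    client["fear"] -= 3             # Toy effect size, purely illustrative


def slight_variation() -> None:
    client["fear"] -= 4             # Toy effect size, purely illustrative


attempts = tote(shows_no_fear, [standard_version, slight_variation])
print(attempts)  # 2
```

Keeping the test fixed while the operations vary is what lets the practitioner attribute a changed result to the intervention rather than to a changed measurement.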

This way of testing against the perceived body language of a patient has the disadvantage that it depends on the practitioner's ability to read body language. It tests against a subjective measure instead of an objective one. The advantage is that the feedback cycles are very fast. Fast feedback cycles allow a practitioner to develop practical knowledge faster than the feedback cycles of traditional psychological research do.

It's best practice to add further tests and to tell a person who was just treated with the Fast Phobia Cure to face the fear in their own life and report back. It's not an either-or choice, however, between testing whether the phobia shows up in daily life and testing whether it can be triggered verbally. Mixing fast-feedback tests with more reliable tests that have longer feedback cycles allows for more learning to happen.

As rationalists, when we invent a rationality technique, the paradigm of TOTE is useful. If we are clear about the desired outcomes, both those that are directly available and those that are available over longer time-frames, we can more effectively learn about whether our new rationality technique works and how it's done most effectively.

10 comments

Thanks for the straightforward article. I'd like to see more NLP content here, especially related to modelling!

The impression I got from Chris Mulzer is that the core NLP crew never got good at teaching other people modelling. They were very productive at producing various interventions themselves but never felt that they could satisfactorily pass that ability on to other people.

Leslie Cameron-Bandler did write down the concepts of how to model in Emprint Method: A Guide to Reproducing Competence, and then, in the follow-up book Know How: Guided Programs for Inventing Your Own Best Future, how to teach people the models you make so that they can use them.

Emprint Method has a concept density that's more like that of a scientific textbook than a self-help book. It presents a notation for models. Maybe most people who learn NLP simply don't have the energy to undertake the effort that good modeling takes.

Given that I do know that you actually spent a significant amount of effort on modeling, the best I can do is point you to those books for your modelling interests.

That said, I still plan to write more about NLP on LW. 

Interesting perspective. I never learned TOTE loops as an intervention strategy but as a modeling one: i.e., the observation that people behave as if their internal operations are TOTE loops -- a parallel framework to the higher-order control systems in Perceptual Control Theory.

But in retrospect, looking at it from this point of view my Collect-Connect-Correct framework for self-improvement is literally a TOTE loop (because part of Collect is identifying a test, and part of Correct is seeing if the test result has changed so you can loop back or exit). I mean, it's kind of a meta-TOTE loop because you're choosing a test, first... I suppose there's the outer TOTE where you decide to use CCC in the first place, and then there's the TOTE where you identify the test to be used in the CCC -- because you may have to try a few different things to identify a suitable test for the Connect/Correct TOTE loop.

On the other hand, TOTEs share the weakness of PCT that they can be used to basically model anything.

But on the other other hand, I can't count the places where explicitly adding tests and gating around existing self-help processes has made them more repeatable, reliable, and teachable. I wasn't thinking specifically in terms of TOTEs when I did those things, but it makes good sense.

TBH, I looked at it more as an application of testing in general, plus outcome frames and well-formedness conditions... the latter two of which I learned from NLP.

The idea that things in the brain have syntax, a sequence of steps required to unlock them if one is applying certain techniques, allows you to use TOTEs as part of a training process.

Concrete example: my "desk cleaning trick" video describes the "mmm" test, without which the trick will not do anything. Having an explicit test condition for exiting a "step" of a mental process makes it vastly more useful than merely having a list of mental steps.

IOW, explicit tests between steps of a mental process, as well-formedness conditions for proceeding to the next step, greatly enhance communicability and reproducibility of a technique, which helps to bypass the Interdict of Merlin with regard to self-help.

On the other hand, TOTEs share the weakness of PCT that they can be used to basically model anything.

I don't see how that's a weakness. If you model something as a test, you can start thinking about whether it's a good test or whether it doesn't provide good information.

A framework that can predict anything is not really a predictive framework; it's just a modeling convention.

In the specific case of PCT, the model treats everything as closed-loop homeostasis occurring within the organism being modeled. However, there are plenty of situations where a significant part of the loop control occurs outside the organism, or where organism behavior is only homeostatic if certain EEA assumptions apply. (e.g. the body's tendencies to hoard certain nutrients and flush others, based on historic availability rather than actual availability)

While this doesn't harm PCT's use as a conceptual model of organism behavior, it limits its use as a predictive framework with regard to what 1) we will find happening in the hardware, and 2) we will find happening in actual behavior.

The extension of this problem to TOTE loops is straightforward, since a TOTE loop is just a description of one possible implementation strategy for PCT control loops and linkages, and one that similarly doesn't always map to the hardware or the location where the tests and operations are taking place (i.e., in-organism or outside-organism).

In the specific case of PCT, the model treats everything as closed-loop homeostasis occurring within the organism being modeled.

That is not the case. Indeed, most of the experimental work on PCT involves creatures controlling perceptions of things outside themselves, e.g. cursor-tracking experiments, or ball catching. Indeed, this is where the important applications are. Homeostatic processes within the organism, such as control of deep body temperature, are well understood to be control processes, and in the case of body temperature, I believe it is known where the temperature sensor is. It is for interactions with the environment that many still think in terms of stimulus-response, or plan-then-execute, or sensing and compensating for disturbances, none of which are control processes, and therefore cannot explain how organisms achieve consistent results in the face of varying environments.

When I say "closed loop within the organism" I mean "having within the organism all the error detection and machinery for reducing the error", not that the subject of perception is also within the organism.

Note, too, that it's possible for people to display apparently-homeostatic processes where no such process is actually occurring.

For example, outside observation might create the impression that, say, a person is afraid of success and is downregulating their ambitions or skill in order to maintain a lower level of success.

However, upon closer observation, it might instead be the case that the person is responding in a stimulus-response based way to something that is perceived as a threat related to success.

While you could reframe that in terms of homeostasis away from anxiety or threat perception, this framing doesn't give you anything new in terms of solving the problem -- especially if the required solution is to remove the conditioned threat perception. If anything, trying to view that problem as homeostatic in nature is a red herring, despite the fact that homeostasis is the result of the process.

This is a practical example of how using PCT as an explanatory theory -- rather than simply a modeling paradigm -- can interfere with actually solving problems.

In my early learning of PCT, I was overly excited by its apparent explanatory power, but later ended up dialing it back significantly as I realized it was mainly a useful tool for communicating certain ideas; the number of high-level psychological phenomena that actually involve homeostasis loops in the brain appears to be quite small, and those loops appear to be relatively short-term in nature.

Indeed, to some extent, looking at things through the PCT lens was a step backwards, as it encouraged me to view things in terms of such higher-order homeostasis loops when those loops were merely emergent properties, rather than reified primitives. (And this especially applies when we're talking about unwanted behavior.)

To put it another way, some people may indeed regulate their perception of "success" in some abstract high-level fashion. But most of the things that one might try to model in such a way, for most people, most of the time, actually involve much tinier, half-open controls like "reduce my anxiety in response to thinking about this problem, in whatever way possible as soon as possible", and not some hypothetical long-term perception of success or status or whatnot.

If I model something as a TOTE, that's modeling. The model, however, implies predictions. If I use the TOTE model, I can predict that if my thermostat is broken in a way that keeps it from detecting heat, it will likely overheat the room.

If I set my room to heat to 22C and find that it is heated to 26C, the TOTE model of the thermostat helps me reason that there's likely a problem with the temperature sensor.
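A toy simulation makes the 22C versus 26C reasoning concrete. It is only a sketch: `run_thermostat`, the `sensor_offset` parameter and the fixed `heat_step` are assumptions invented for the example. A sensor that reads 4 degrees too low keeps the loop operating until the real temperature overshoots the setpoint by exactly that amount.

```python
def run_thermostat(true_temp, setpoint, sensor_offset=0.0,
                   heat_step=0.5, max_cycles=100):
    """Simulate a TOTE thermostat whose test step uses a possibly faulty sensor."""
    for _ in range(max_cycles):
        # Test: the thermostat only sees the sensor reading, not the true value.
        reading = true_temp + sensor_offset
        if reading >= setpoint:
            break                   # Exit: the test passed from the thermostat's view
        true_temp += heat_step      # Operate: heat the room a little
    return true_temp


healthy = run_thermostat(18.0, 22.0)
faulty = run_thermostat(18.0, 22.0, sensor_offset=-4.0)
print(healthy, faulty)  # 22.0 26.0
```

In this toy setup the operate step works identically in both runs; only the test step differs, which is why the overshoot points at the sensor.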

PCT does that too. Except that sometimes, body and brain processes are open-ended, with an important part of the loop existing in the outside world.

The problem with a model that can explain anything is that you can't notice when you're being confused by a fake explanation.

A useful explanatory model needs to be able to rule things out, as well as "in".

I think we are talking about different meanings of "modeling" here. There are plenty of uses for which PCT and TOTEs are apt. But if you're trying to discover something about the physical nature of things involved, being able to explain anything equally well is not actually a benefit. That is, it doesn't provide us with any information we don't already know, absent the model.

So e.g. in your thermostat example, the TOTE model doesn't provide you with any predictions you didn't have without it: a person who lacks understanding of how thermostats work internally can trivially make the prediction that something is wrong with it, since it's supposed to produce the requested temperature.

Conversely, if you know the thermostat contains a sensor, then the idea that "it might be broken if the room temperature is wrong" is trivially derivable from that mere fact, without a detailed control systems model.

IOW, the TOTE model adds nothing to your existing predictions; it doesn't constitute evidence of anything you didn't already know.

This doesn't take away from the many valuable uses of paradigms like PCT or TOTE: it's just that they're one of those things that seems super-valuable because it seems to be a more efficient mental data compressor than whatever you had before. But being a good compressor for whatever data you have is not the same as having any new data!

So paradigmatic models are more about being able to more efficiently think or reason about something, or focus your attention in useful ways, without necessarily changing much about how much one actually knows, from an evidentiary perspective.

But if you're trying to discover something about the physical nature of things involved, being able to explain anything equally well is not actually a benefit.

I do grant that's the case. In the context of NLP, modeling isn't intended to discover things about the physical nature of the things involved, and if you come to NLP with that intention it's easy to get disappointed.