Gears in understanding

by Valentine · 15 min read · 12th May 2017 · 37 comments


Gears-Level · World Modeling

Some (literal, physical) roadmaps are more useful than others. Sometimes this is because of how well the map corresponds to the territory, but sometimes it's because of features of the map that are irrespective of the territory. E.g., maybe the lines are fat and smudged such that you can't tell how far a road is from a river, or maybe it's unclear which road a name is trying to indicate.

In the same way, I want to point at a property of models that isn't about what they're modeling. It interacts with the clarity of what they're modeling, but only in the same way that smudged lines in a roadmap interact with the clarity of the roadmap.

This property is how deterministically interconnected the variables of the model are. There are a few tests I know of to see to what extent a model has this property, though I don't know if this list is exhaustive and would be a little surprised if it were:

  1. Does the model pay rent? If it does, and if it were falsified, how much (and how precisely) could you infer other things from the falsification?
  2. How incoherent is it to imagine that the model is accurate but that a given variable could be different?
  3. If you knew the model were accurate but you were to forget the value of one variable, could you rederive it?

I think this is a really important idea that ties together a lot of different topics that appear here on Less Wrong. It also acts as a prerequisite frame for a bunch of ideas and tools that I'll want to talk about later.

I'll start by giving a bunch of examples. At the end I'll summarize and gesture toward where this is going as I see it.

Example: Gears in a box

Let's look at this collection of gears in an opaque box:

(Drawing courtesy of my colleague, Duncan Sabien.)

If we turn the lefthand gear counterclockwise, it's within our model of the gears on the inside that the righthand gear could turn either way. The model we're able to build for this system of gears does poorly on all three tests I named earlier:

  • The model barely pays rent. If you speculate that the righthand gear turns one way and you discover it turns the other way, you can't really infer very much. All you can meaningfully infer is that if the system of gears is pretty simple (e.g., nothing that makes the righthand gear alternate as the lefthand gear rotates counterclockwise), then the direction that the righthand gear turns determines whether the total number of gears is even or odd.
  • The gear on the righthand side could just as well go either way. Your expectations aren't constrained.
  • Right now you don't know which way the righthand gear turns, and you can't derive it.
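The one inference the closed box does allow, the even/odd one, can be sketched in a few lines. This is only an illustration under the "pretty simple" assumption above: a plain chain in which each gear meshes only with its neighbours, so every mesh reverses the rotation. The function names are hypothetical, chosen for this sketch.

```python
DIRECTIONS = ("clockwise", "counterclockwise")


def output_direction(input_direction: str, num_gears: int) -> str:
    """Direction of the last gear in a simple chain of meshed gears.

    Assumption (not guaranteed by the closed box): each gear meshes only
    with its neighbours, so each of the num_gears - 1 meshes flips the
    direction of rotation once.
    """
    if input_direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {input_direction!r}")
    if num_gears < 1:
        raise ValueError("need at least one gear")
    flips = num_gears - 1
    if flips % 2 == 0:
        return input_direction
    # An odd number of flips reverses the input direction.
    return DIRECTIONS[1 - DIRECTIONS.index(input_direction)]


def gear_count_parity(input_direction: str, observed_output: str) -> str:
    """Under the same assumption, infer whether the gear count is even or odd."""
    for direction in (input_direction, observed_output):
        if direction not in DIRECTIONS:
            raise ValueError(f"unknown direction: {direction!r}")
    return "odd" if input_direction == observed_output else "even"


if __name__ == "__main__":
    print(output_direction("counterclockwise", 4))
    print(gear_count_parity("counterclockwise", "clockwise"))
```

With the box closed, `num_gears` is unknown, so `output_direction` cannot be evaluated; all that is available is `gear_count_parity` after the fact, which is why the model barely pays rent.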

Suppose that Joe peeks inside the box and tells you "Oh, the righthand gear will rotate clockwise." You imagine that Joe is more likely to say this if the righthand gear turns clockwise than if it doesn't, so this seems like relevant evidence that the righthand gear turns clockwise. This gets stronger the more people like Joe who look in the box and report the same thing.

Now let's peek inside the box:

…and now we have to wonder what's up with Joe.

The second test stands out for me especially strongly. There is no way that the obvious model about what's going on here could be right and Joe is right. And it doesn't matter how many people agree with Joe in terms of the logic of this statement: Either all of them are wrong, or your model is wrong. This logic is immune to social pressure. It means that there's a chance that you can accumulate evidence about how well your map matches the territory here, and if that converges on your map being basically correct, then you are on firm epistemic footing to disregard the opinion of lots of other people. Gathering evidence about the map/territory correspondence has higher leverage for seeing the truth than does gathering evidence about what others think.

The first test shows something interesting too. Suppose the gear on the right really does move clockwise when you move the left gear counterclockwise. What does that imply? Well, it means your initial model (if it's what I imagine it is) is wrong — but there's a limited space of possibilities about ways in which it can be wrong. For instance, maybe the second gear from the left is on a vertical track and moves upward instead of rotating. By comparison, something like "Gears work in mysterious ways" just won't cut it.

If we combine the two, we end up staring at Joe and noticing that we can be a lot more precise than just "Joe is wrong". We know that either Joe's model of the gears is wrong (e.g., he thinks some gear is on a vertical track), Joe's model of the gears is vague and isn't constrained the way ours is (e.g., he was just counting gears and made a mistake), or Joe is lying. The first two give us testable predictions: If his model is wrong, then it's wrong in some specific way; and if it's vague, then there should be some place where it does poorly on the three tests of model interconnectedness. If we start zooming in on these two possibilities while talking to Joe and it turns out that neither of them is true, then it becomes a lot more obvious that Joe is just bullshitting (or we failed to think of a fourth option).

Because of this example, in CFAR we talk about how "Gears-like" or how "made of Gears" a model is. (I capitalize "Gears" to emphasize that it's an analogy.) When we notice an interconnection, we talk about "finding Gears". I'll use this language going forward.

Example: Arithmetic

If you add 25+18 using the standard addition algorithm, you have to carry a 1, usually by marking that above the 2 in 25.

Fun fact: it's possible to get that right without having any clue what the 1 represents or why you write it there.

This is actually a pretty major issue in math education. There's an in-practice tension between (a) memorizing and drilling algorithms that let you compute answers quickly, and (b) "really understanding" why those algorithms work.

Unfortunately, there's a kind of philosophical debate that often happens in education when people talk about what "understand" means, and I find it pretty annoying. It goes something like this:

  • Person A: "The student said they carry the 1 because that's what their teacher told them to do. So they don't really understand the addition algorithm."
  • Person B: "What do you mean by 'really understand'? What's wrong with the justification of 'A person who knows this subject really well says this works, and I believe them'?"
  • A: "But that reason isn't about the mathematics. Their justification isn't mathematical. It's social."
  • B: "Mathematical justification is social. The style of proof that topologists use wouldn't be accepted by analysts. What constitutes a 'proof' or a 'justification' in math is socially agreed upon."
  • A: "Oh, come on. We can't just agree that e=3 and make that true. Sure, maybe the way we talk about math is socially constructed, but we're talking about something real."
  • B: "I'm not sure that's true. But even if it were, how could you know whether you're talking about that 'something real' as opposed to one of the social constructs we're using to share perspectives about it?"

Et cetera.

(I would love to see debates like this happen in a milieu of mutual truth-seeking. Unfortunately, that's not what academia rewards, so it probably isn't going to happen there.)

I think Person A is trying to gesture at the claim that the student's model of the addition algorithm isn't made of Gears (and implicitly that it'd be better if it were). I think this clarifies both what A is saying and why it matters. In terms of the tests:

  • The addition algorithm totally pays rent. E.g., if you count out 25 tokens and another 18 tokens and you then count the total number of tokens you get, that number should correspond to what the algorithm outputs. If it turned out that the student does the algorithm but the answer doesn't match the token count, then the student can only conclude that the addition algorithm isn't useful for the tokens. There isn't a lot else they can deduce. (By way of contrast, if I noticed this, then I'd conclude that either I'd made a mistake in running the algorithm or I'd made a mistake in counting, and I'd be very confident that at least one of those two things is true.)
  • The student could probably readily imagine a world in which you aren't supposed to carry the 1 but the algorithm still works. This means their model isn't very constrained, at least as we're imagining it. (Whereas attempting to think about carrying being wrong to do for getting the right answer makes my head explode.)
  • If the student forgot what their teacher said about what to do when a column adds up to more than nine, we imagine they wouldn't spontaneously notice the need to carry the 1. (If I forgot about carrying, though, I'd get confused about what to do with this extra ten and would come up with something mathematically equivalent to "carrying the 1".)

I find this to be a useful tabooing of the word "understand" in this context.
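To make "what the 1 represents" concrete, here is a minimal sketch of the standard column algorithm, written so that the carry is visibly a group of ten passed up to the next column. It also includes the token-counting check from the first test. The function names are hypothetical and the sketch handles only non-negative integers.

```python
def add_by_columns(x: int, y: int) -> int:
    """Add two non-negative integers column by column, as done on paper."""
    if x < 0 or y < 0:
        raise ValueError("this sketch only handles non-negative integers")
    total = 0   # digits of the answer, accumulated so far
    place = 1   # value of the current column: 1, 10, 100, ...
    carry = 0   # groups of ten handed up from the previous column (0 or 1)
    while x > 0 or y > 0 or carry > 0:
        column_sum = x % 10 + y % 10 + carry
        digit = column_sum % 10    # what gets written below this column
        carry = column_sum // 10   # the "1" written above the next column
        total += digit * place
        x //= 10
        y //= 10
        place *= 10
    return total


def count_tokens(x: int, y: int) -> int:
    """Count out x tokens and y tokens, pool them, and count the pile."""
    pile = ["token"] * x + ["token"] * y
    return len(pile)


if __name__ == "__main__":
    print(add_by_columns(25, 18), count_tokens(25, 18))
```

For 25 + 18, the ones column gives 5 + 8 = 13, which is one ten and three ones; the three stays and the ten moves over as the carried 1. If the two functions ever disagreed, the only live hypotheses would be a mistake in running the algorithm or a mistake in counting.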

Example: My mother

My mother really likes learning about history.

Right now, this is probably an unattached random fact in your mind. Maybe a month down the road I could ask you "How does my mother feel about learning history?" and you could try to remember the answer, but you could just as well believe the world works another way.

But for me, that's not true at all. If I forgot her feelings about learning about history, I could make a pretty educated guess based on my overall sense of her. I wouldn't be Earth-shatteringly shocked to learn that she doesn't like reading about history, but I'd be really confused, and it'd throw into question my sense of why she likes working with herbs and why she likes hanging out with her family. It would make me think that I hadn't quite understood what kind of person my mother is.

As you might have noticed, this is an application of tests 1 and 3. In particular, my model of my mom isn't totally made of Gears in the sense that I could tell you what she's feeling right now or whether she defaults to thinking in terms of partitive division or quotative division. But the tests illustrate that my model of my mother is more Gears-like than your model of her.

Part of the point I'm making with this example is that "Gears-ness" isn't a binary property of models. It's more like a spectrum, from "random smattering of unconnected facts" to "clear axiomatic system with well-defined logical deductions". (Or at least that's how I'm currently imagining the spectrum!)

Also, I speculate that this is part of what we mean when we talk about "getting to know" someone: it involves increasing the Gears-ness of our model of them. It's not about just getting some isolated facts about where they work and how many kids they have and what they like doing for hobbies. It's about fleshing out an ability to be surprised if you were to learn some new fact about them that didn't fit your model of them.

(There's also an empirical question in getting to know someone of how well your Gears-ish model actually matches that person, but that's about the map/territory correspondence. I want to be careful to keep talking about properties of maps here.)

This lightly Gears-ish model of people is what I think lets you deduce what Mr. Rogers probably would have thought about, say, people mistreating cats on Halloween even though I don't know if he ever talked about it. As per test #2, you'd probably be pretty shocked and confused if you were given compelling evidence that he had joined in, and I imagine it'd take a lot of evidence. And then you'd have to update a lot about how you view Mr. Rogers (as per test #1). I think a lot of people had this kind of "Who even is this person?" experience when lots of criminal charges came out against Bill Cosby.

Example: Gyroscopes

Most people feel visceral surprise when they watch how gyroscopes behave. Even if they logically know the suspended gyroscope will rotate instead of falling, they usually feel like it's bizarre somehow. Even people who get gyroscopes' behavior into their intuitions probably had to train it for a while first and found them surprising and counterintuitive.

Somehow, for most people, it seems coherent to imagine a world in which physics works exactly the same way except that when you suspend one end of a gyroscope, it falls like a non-spinning object would and just keeps spinning.

If this is true of you, that means your model of the physics around gyroscopes does poorly on test #2 of how Gears-like it is.

The reason gyroscopes do what they do is actually something you can derive from Newton's Laws of Motion. Like the gears example, you can't actually have a coherent model of rotation that allows (a) Newton's Laws and (b) a gyroscope that doesn't rotate instead of falling when suspended on one end in a gravitational field. So if both (a) and (b) seem plausible to you, then your model of rotation isn't coherent. It's missing Gears.
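As a rough sketch of that derivation (the standard textbook argument, not something spelled out in this post), assume a gyroscope of mass m whose axis is horizontal and supported at one end, with its center of mass a distance r from the support, spinning fast with angular speed ω about its axis, and with moment of inertia I about that axis. The symbols and the fast-spin approximation (which ignores nutation) are assumptions of this sketch.

```latex
\begin{align}
\vec{\tau} &= \frac{d\vec{L}}{dt}
  && \text{(Newton's second law for rotation)} \\
\vec{L} &\approx I\omega\,\hat{n}
  && \text{(fast spin about the axis } \hat{n}\text{)} \\
\vec{\tau} &= \vec{r}\times m\vec{g}, \qquad |\vec{\tau}| = mgr
  && \text{(horizontal, and perpendicular to } \vec{L}\text{)} \\
\Omega_p &= \frac{|\vec{\tau}|}{|\vec{L}|} = \frac{mgr}{I\omega}
  && \text{(rate at which } \vec{L} \text{ sweeps around the vertical)}
\end{align}
```

Because the torque is always perpendicular to the angular momentum, it changes the direction of the angular momentum without changing its magnitude, so the axis sweeps around horizontally at rate Ω_p rather than tipping downward. Setting ω = 0 removes the large angular momentum, and the same equations then describe an ordinary falling object.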

This is one of the beautiful (to me) things about physics: everything is made of Gears. Physics is (I think) the system of Gears you get when you stare at any physical object's behavior and ask "What makes you do that?" in a Gears-seeking kind of way. It's a different level of abstraction than the "Gears of people" thing, but we kind of expect that eventually, at least in theory, a sufficient extension of physics will connect the Gears of mechanics to the Gears of what makes a romantic relationship last while feeling good to be in.

I want to rush to clarify that I'm not saying that the world is made of Gears. That's a type error. I'm suggesting that the property of Gears-ness in models is tracking a true thing about the world, which is why making models more Gears-like can be so powerful.

Gears-ness is not the same as goodness

I want to emphasize that, while I think that more Gears are better all else being equal, there are other properties of models that I think are worthwhile.

The obvious one is accuracy. I've been intentionally sidestepping that property throughout most of this post. This is where the rationalist virtue of empiricism becomes critical, and I've basically ignored (but hopefully never defied!) empiricism here.

Another is generativity. Does the model inspire a way of experiencing in ways that are useful (whatever "useful" means)? For instance, many beliefs in God or the divine or similar are too abstract to pay rent, but some people still find them helpful for reframing how they emotionally experience beauty, meaning, and other people. I know of a few ex-atheists who say that having become Christian causes them to be nicer people and has made their relationships better. I think there's reason for epistemic fear here to the extent that those religious frameworks sneak in claims about how the world actually works — but if you're epistemically careful, it seems possibly worthwhile to explore how to tap the power of faith without taking epistemic damage.

I also think that even if you're trying to lean on the Gears-like power of a model, lacking Gears doesn't mean that the activity is worthless. In fact, I think this is all we can do most of the time, because most of our models don't connect all the way down to physics. E.g., I'm thinking of getting my mother a particular book as a gift because I think she'll really like it, but I can also come up with a within-my-model-of-her story about why she might not really care about it. I don't think the fact that my model of her is weakly constrained means that (a) I shouldn't use the model or that (b) it's not worthwhile to explore the "why" behind both my being right and my being wrong. (I think of it as a bit of pre-computation: whichever way the world goes, my model becomes a little more "crisp", which is to say, more Gears-like. It just so happens that I know in what way beforehand.)

I mention this because sometimes in rationalist contexts, I've felt a pressure to not talk about models that are missing Gears. I don't like that. I think that Gears-ness is a really super important thing to track, and I think there's something epistemically dangerous about failing to notice a lack of Gears. Clearly noting, at least in your own mind, where there are and aren't Gears seems really good to me. But I think there are other capacities that are also important when we're trying to get epistemology right.

Gears seem valuable to me for a reason. I'd like us to keep that reason in mind rather than getting too fixated on Gears-ness.

Going forward

I think this frame of Gears-ness of models is super powerful for cutting through confusion. It helps our understanding of the world become immune to social foolishness and demands a kind of rigor to our thinking that I see as unifying lots of ideas in the Sequences.

I'll want to build on this frame as I highlight other ideas. In particular, I haven't spoken to how we know Gears are worth looking for. So while I view this as a powerful weapon to use in our war against sanity drought, I think it's also important to examine the smithy in which it was forged. I suspect that won't be my very next post, but it's one I have in mind upcoming.


37 comments

This is the kind of thing that would deserve a promotion to Main, if we still did that.

Well, we do now.

This is a great post.

I'm still confused about what Gear-ness is. I know it is pointing to something, but it isn't clear whether it is pointing to a single thing, or a combination of things. (I've actually been to a CFAR workshop, but I didn't really get it there either).

Is gear-ness:

a) The extent to which a model allows you to predict a singular outcome given a particular situation? (Ideal situation - fully deterministic like Newtonian physics)

b) The extent to which your model includes each specific step in the causation? (I put my foot on the accelerator -> car goes faster. What are the missing steps? Maybe -> Engine allows more fuel in -> Compressions have greater explosive force -> Axles spin faster -> Wheels spin faster. This could be broken down even further.)

c) The extent to which you understand how the model was abstracted out from reality? (I.e., you may understand the causation chain and have a formula for describing the situation, but still be unable to produce the proof.)

d) The extent to which your understanding of each sub-step has gears-ness?

"I'm still confused about what Gear-ness is."

Honestly, so am I. I think there's work yet to be done in making the idea of Gears become more Gears-like. I think it has quite a few, but I don't have a super precise definition that feels to me like it captures the property exactly.

I thought of this when Eliezer sent me a draft of a chapter from a book he was working on. In short (and possibly misrepresenting what he said since it's been a long time since I've read it), he was arguing about how there's a certain way of seeing what's true that made him immune to the "sensible" outside-view-like arguments against HPMOR being a worthwhile thing to work on. The arguments he would face, if I remember right, sounded something like this:

  • "Most fanfics don't become wildly successful, so yours probably won't."
  • "You haven't been writing Harry Potter fanfic for long enough to build up a reputation such that others will take your writing seriously."
  • "Wait, you haven't read the canon Rowling books?!? There's no way you can write good Harry Potter fanfic!" (Yes, seriously. I understand that he maybe still hasn't read past book 4?)
  • "Come on, it's Harry Potter fanfic. There's no way this matters for x-risk."

And yet.

(I mean, of course it remains to be seen what will have ultimately mattered, and we can't compare with much certainty with the counterfactual. But I think it's totally a reasonable position to think that HPMOR had a meaningful impact on interest in and awareness of x-risk, and I don't think there's much room for debate about whether it became a successful piece of fan fiction.)

If I remember right, Eliezer basically said that he understood enough about what engages audiences in a piece of fiction plus how fiction affects people plus how people who are affected by fiction spread the word and get excited by related material that he could see the pathway by which writing HPMOR could be a meaningful endeavor. He didn't feel terribly affected by people's attempts to do what he called "reference class tennis" where they would pick a reference class to justify their gut-felt intuition that what he was claiming was sort of beyond his social permissions.

So the query is, What kind of perception of the world and of truth (a) gives this kind of immunity to social pressure when social pressure is out of line and yet (b) will not grant this kind of immunity if culture is more right than we are?

Which reminds me of the kind of debate I was used to seeing in math education research about what it meant to "understand" math, and how it really does feel to me like there's a really important difference between (a) justifications based "in the math" versus (b) justification based on (even very trustworthy and knowledgeable) other people or institutions.

So, if I trust my intuition on this and assume there really is some kind of cluster here, I notice that the things that feel like more central examples consistently pass the same few tests (the ones I name early in the OP), and the ones that feel like pretty clear non-examples don't pass those tests very well. I notice that we have something stronger than paying rent from more Gears-like models, and that there's a capacity to be confused by fiction, and that it seems to restate something about what Eliezer was talking about when the model is "truly a part of you".

But I don't really know why. I find that if I start talking about "causal models" or about "how close to physics" the model is or whatever, I end up in philosophical traps that seem to distract from the original generating intuition. E.g., there's totally a causal model of how the student comes to write the 1 in the addition algorithm, and it seems fraught with philosophical gobbledygook and circular reasoning to specify what about "because the teacher said so" it is that isn't as "mathematical" as "because you're summing ones and tens separately".

I shall endeavor. Eventually the meta-model will be made of Gears too, whatever that turns out to mean. But in the meantime I still think the intuition is super-helpful — and it has the nice property of being self-repairing over time, I think. (I plan on detailing that more in a future post. TL;DR: any "good" process for finding more Gears should be able to be pointed at finding the Gears of Gears in general, and also at itself, so we don't necessarily have to get this exactly right at the start in order to converge on something right eventually.)

"Seems fraught with philosophical gobbledygook and circular reasoning to specify what about "because the teacher said so" it is that isn't as "mathematical" as "because you're summing ones and tens separately"."

"Because you're summing ones and tens separately" isn't really a complete gears-level explanation, but a pointer or hint to one. In particular, if you are trying to explain the phenomenon formally, you would begin by defining a "ones and tens representation" of a number n as a tuple (a, b) such that 10a + b = n. We know that at least one such representation exists, with a = 0 and b = n.

Proof (warning, this is massive, you don't need to read the whole thing)

You can then define a "simple ones and tens representation" as such a representation with 0 <= b <= 9. We want to show that each number has at least one such representation. Writing (a, b) for the number 10a + b, it is easy to see that (a, b) = 10a + b = 10a + 10 + b - 10 = 10(a+1) + (b-10) = (a+1, b-10). We can repeat this process x times to get (a+x, b-10x). We know that for some x, b-10x will be negative, e.g. if x = b (with b > 0), then b-10x = -9x. We can decide to look at the last value before it is negative. Let this representation be (m, r). By construction r >= 0. We also know that r can't be >= 10, or else (m+1, r-10) would still have its second element >= 0. So any number can be written in simple form.

Suppose that there are two simple representations of a number, (x, y) and (p, q). Then 10x + y = 10p + q, so 10(x-p) = q-y. Now, since y and q are between 0 and 9 inclusive, q-y is between -9 and 9, and the only multiple of 10 in this range is 0. So 10(x-p) = 0, meaning x = p, and q-y = 0, meaning y = q; i.e., both members of the tuple are the same.

It is then trivial to prove that (a1, b1) + (a2, b2) = (a1+a2, b1+b2). It is similarly easy to show that 0 <= b1+b2 <= 18 when both representations are simple, so that either b1+b2 or b1+b2-10 is between 0 and 9 inclusive. It then directly follows that either (a1+a2, b1+b2) or (a1+a2+1, b1+b2-10) is a simple representation (here we haven't put any restriction on the value of the a's).
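The steps of this argument can be spot-checked mechanically. The following is a minimal sketch, with hypothetical function names, that treats a pair (tens, ones) as the number 10·tens + ones. It checks only small cases, so it illustrates the proof rather than replacing it.

```python
def value(rep):
    """The number represented by a (tens, ones) pair."""
    tens, ones = rep
    return 10 * tens + ones


def simplify(rep):
    """Trade ten ones for one ten until 0 <= ones <= 9 (assumes ones >= 0)."""
    tens, ones = rep
    while ones >= 10:
        tens, ones = tens + 1, ones - 10
    return tens, ones


def add_simple(p, q):
    """Add two simple representations, carrying at most a single ten."""
    tens = p[0] + q[0]
    ones = p[1] + q[1]   # between 0 and 18 because both inputs are simple
    if ones >= 10:
        # Carrying: ten ones become one extra ten, so tens goes up by one.
        tens, ones = tens + 1, ones - 10
    return tens, ones


if __name__ == "__main__":
    print(simplify((0, 43)))
    print(add_simple((2, 5), (1, 8)))
```

Here `simplify((0, 43))` walks from the trivial representation to (4, 3), and `add_simple((2, 5), (1, 8))` is the 25 + 18 example: the ones sum to 13, so one ten is carried.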


So a huge amount is actually going on in something so simple. We can make the following observations:

  • "Because you're summing ones and tens separately" will seem obvious to many people because they've been doing it for so long, but I suspect that the majority of the population would not be able to produce the above proof. In fact, I suspect that the majority of the population would not even realise that it was possible to break down the proof to this level of detail - I believe many of them would see the above sentence as unitary. And even when you tell them that there is an additional level of detail, they probably won't have any idea what it is supposed to look like.

  • Part of the reason why it feels more gear-like is because it provides you the first step of the proof (defining ones and tens tuples). When someone has a high enough level of maths, they are able to get from the "hint" quite quickly to the full proof. Additionally, even if someone does not have the full proof in their head, they can still see that a certain step will be useful towards producing a proof. The hint of "summing the ones and tens separately" allows you to quite quickly construct a formal representation of the problem, which is progress even if you are unable to construct a full proof. Discovering that the sum will be between 0 and 18 lets you know that if you carry, you will only ever have to carry the one. This limits the problem to a more specific case. Any person attempting to solve this will probably have examples in the past where limiting the case in such a way made the proof either easier or possible, so whatever heuristic pattern matching occurs within their brain will suggest that this is progress (though it may of course turn out later that the ability to restrict the situation does not actually make the proof any easier).

  • Another reason why it may feel more gear-like is that it is possible to construct sentences of a similar form and use them as hints for other proofs. So, "Because you're summing ones and tens separately" is linguistically close to "Because you're summing tens and hundreds separately", although I don't want to suggest that people only perform a linguistic comparison. If someone has developed an intuitive understanding of these phenomena, this will also play a role.

I believe that part of the reason why it is so hard to define what is or is not "gears-like" is that this isn't based on any particular statement or model just by itself, but on how it interacts with what a person already knows and can derive in order to produce statements. Further, it isn't just about producing one perfect gears explanation, but the extent to which a person can produce certain segments of the proof (i.e., a formal statement of the problem or restricting the problem to the sub-case as above) or the extent to which it allows the production of various generalisations (i.e., we can generalise to (tens & hundreds, hundreds & thousands, ...) or to (ones & tens & hundreds) or to binary or to abstract algebra). Further, what counts as a useful generalisation is not objective, but relative to the other maths someone knows or the situations that someone knows in which they can apply this maths. For example, imaginary numbers may not seem like a useful generalisation until a person knows the fundamental theorem of algebra or how they can be used to model phases in physics.

I won't claim that I've completely or even almost completely mapped out the space of gears-ness, but I believe that this takes you pretty far towards an understanding of what it might be.

I think this is part of it but the main metaphor is more like "your model has no hand-wavy-ness. There are clear reasons that the parts connect to each other, that you can understand as clearly as you can understand how gears connect to each other."

Tangentially, I thought you might find repair theory interesting, if not useful. Briefly, when students make mistakes while doing arithmetic, these mistakes are rarely the effect of a trembling hand; rather, most such mistakes can be explained via a small set of procedural skills that systematically produce incorrect answers.

Your three criteria are remarkably similar to the three criteria David Deutsch uses to distinguish between good and bad explanations. He argues this via negativa by stating what makes explanations bad.

From The Logic of Experimental Tests:

[A]n explanation is bad (or worse than a rival or variant explanation) to the extent that...
(i) it seems not to account for its explicanda; or
(ii) it seems to conflict with explanations that are otherwise good; or
(iii) it could easily be adapted to account for anything (so it explains nothing).

The first principle would correspond to beliefs paying rent, although unlike BPR it seems implied that you're setting the scope of the phenomena to be explained rather than deriving expectations for your belief. But this ought not to be a problem since a theory once established would also derive what would be further potential explicanda.

The third principle corresponds to having our models constrain on possibility rather than being capable of accounting for any possibility, something Deutsch and Eliezer would actually agree on.

And the second principle would correspond more weakly to what it would mean to make something truly a part of you. A good explanation or gears-level model would not be in conflict with the rest of your knowledge in any meaningful way, and a great test of this would be if one explanation/model were derivable from another.

This suggests to me that Deutsch is heading in a concordant direction to what you're getting at and could be another point of reference for developing this idea. But note that he's not actually Bayesian. However, most of his refutations are relevant to the logical models that are supposed to populate probabilistic terms and have been known by Bayesians for a while; they amount to stating that logical omniscience doesn't actually hold in the real world, and only in toy models like for AI. This is precisely the thing that logical induction is supposed to solve, so I figure that once its kinks are worked out then all will be well.

Additionally, if we adopted Deutsch's terms, using value judgements of "good" and "bad" also gets you gradients for free, as you can set up a partial ordering as a preference ranking amongst your models/explanations. Otherwise I find the notion of "degrees of gears-ness" a bit incoherent without reintroducing probability to something that's supposed to be deterministic.

My interpretation of your Gears-ness tracks well with the degree to which prior beliefs are interrelated.

Interrelatedness of prior beliefs is useful because it allows for rapid updating on limited information. If I visit another world and find that matches don't work, for example, I will begin investigating all sorts of other chemical interactions and question how the hell I'm deriving energy, as clearly oxygen doesn't work the way I think it does any more. I'll re-evaluate every related prior.

An unrelated prior belief acts like a single gear - an update on it may change the direction of itself, but doesn't give me other useful information. A highly interrelated prior belief gives me many avenues of investigation in finding what other prior beliefs were wrong, and helps me make more predictions accurately in other contexts.

A sufficiently interrelated prior has many checks on itself: if you blank a particular belief that is still connected to other beliefs, you should be able to derive a likely result from the beliefs connected to it.

An aside - I'd really like to give you karma for this post, but I can't figure out how to do so. Is there some limitation on giving karma?

My interpretation of your Gears-ness tracks well with the degree to which prior beliefs are interrelated.

Yep, that seems right to me. I'm a bit bugged by (my and maybe your) lack of Gears around what's meant by "interrelated", but yeah, this matches my impressions. I like the explicit connection to priors.

I'm reminded of this from HPMOR chapter 2:

His brain ought to have been flushing its entire current stock of hypotheses about the universe, none of which allowed this to happen. But instead his brain just seemed to be going, All right, I saw the Hogwarts Professor wave her wand and make your father rise into the air, now what?

The witch-lady was smiling benevolently upon them, looking quite amused. "Would you like a further demonstration, Mr. Potter?"

"You don't have to," Harry said. "We've performed a definitive experiment. But..." Harry hesitated. He couldn't help himself. Actually, under the circumstances, he shouldn't be helping himself. It was right and proper to be curious. "What else can you do?"

Professor McGonagall turned into a cat.

Harry scrambled back unthinkingly, backpedalling so fast that he tripped over a stray stack of books and landed hard on his bottom with a thwack. His hands came down to catch himself without quite reaching properly, and there was a warning twinge in his shoulder as the weight came down unbraced.

An aside - I'd really like to give you karma for this post, but I can't figure out how to do so. Is there some limitation on giving karma?

I think there's a minimum amount of karma you need for it.

It seems like the images of the gears have disappeared. Are they still available anywhere? EDIT: They're back!

This article is very much along the same lines:

"Illusion of Explanatory Depth: Rozenblit and Keil have demonstrated that people tend to be overconfident in how well they understand how everyday objects, such as toilets and combination locks, work; asking people to generate a mechanistic explanation shatters this sense of understanding. The attempt to explain makes the complexity of the causal system more apparent, leading to a reduction in judges’ assessments of their own understanding.

... Across three studies, we found that people have unjustified confidence in their understanding of policies. Attempting to generate a mechanistic explanation undermines this illusion of understanding and leads people to endorse more moderate positions. Mechanistic explanation generation also influences political behavior, making people less likely to donate to relevant advocacy groups. These moderation effects on judgment and decision making do not occur when people are asked to enumerate reasons for their position. We propose that generating mechanistic explanations leads people to endorse more moderate positions by forcing them to confront their ignorance. In contrast, reasons can draw on values, hearsay, and general principles that do not require much knowledge.

... More generally, the present results suggest that political debate might be more productive if partisans first engaged in a substantive and mechanistic discussion of policies before engaging in the more customary discussion of preferences and positions."

So, I'm a bit confused. I think I need some translation. Let me try to sort it out in public.

First, models. We're talking in general terms, so let's say we have some observables. In the starting Gears example the top-left gear (the "input") and the bottom-right gear (the "output") are observable. A model needs to specify a relationship between the observables.

In that sense the initial opaque box is not a model at all since it doesn't specify anything. Its answer to all questions about the output observable is "I don't know".

Now Joe says "clockwise". That is a model, albeit a very simple one, because it specifies the relationship and it's even falsifiable.

Next you take off the cover and suddenly you have a lot more observables. Moreover, you think you see a particular causal structure, a chain of causes and effects. Your world got a lot more complicated.

This causal structure is decomposable into smaller chunks: A causes B, then B leads to C, then... ... ... and we get Z in the end. So you have a bunch of smaller and, hopefully, simpler models which describe the causal steps involved. Ideally these sub-models are simple enough to be obvious and trivial in which case you "just know how this works".

Note, by the way, that this is all theorizing -- we haven't done anything empirical, like turning the top-left gear and looking at the bottom-right one.

So this all looks reasonable, but then I get confused about the three criteria.

The "paying rent" thing is basically about whether I care. And sure, I don't care about things I don't care about, but that does not affect the correctness (or adequateness, etc.) of the models in question. I don't see why it's a criterion of the model and not of which things I (subjectively) find important.

The "what if a given variable could be different" I don't understand. Is it an issue of how generic or robust or fragile the model is? Is it whether a model is falsifiable?

And the third criterion looks like redundancy to me. Or consistency? Or is it about how this particular model fits into a wider, more generic model of how the world works?


[…]the initial opaque box is not a model at all since it doesn't specify anything.

Er… I disagree, though I wonder whether we're just using the word "model" differently and interpreting "specify" differently because of that.

We don't expect the righthand gear to, say, turn green and explode. We generally expect it'll rotate clockwise at some constant rate or counterclockwise at some constant rate. And we can deduce whether the number of hidden gears is odd or even based on that. Before Joe speaks, I think we have to put even odds on those two possibilities, but those are our most likely possibilities.

After Joe speaks, we're in the position that we'd want to take some bets at even odds on which way the righthand gear rotates. This suggests we have a falsifiable model of what's going on in the situation. We'd even be willing to take some bets on the number of hidden gears.

And the odds at which we'd accept bets change as the number of people who look and come to agree with Joe grows. So we can become more confident in our model given evidence.
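To put rough numbers on how those betting odds might move, here is a minimal sketch using the odds form of Bayes' rule. It assumes, purely for illustration, that each person who looks is an independent witness who reports the true direction with a fixed probability. Neither the independence nor the 0.8 figure is anything I can justify from the gears themselves, and `posterior_clockwise` is a made-up name.

```python
def posterior_clockwise(n_agree: int, reliability: float = 0.8, prior: float = 0.5) -> float:
    """Probability the righthand gear turns clockwise after n_agree people say so.

    Illustrative assumptions: witnesses are independent, and each one
    reports the true direction with probability `reliability`.
    """
    prior_odds = prior / (1.0 - prior)
    # How much more likely a "clockwise" report is if the gear really turns clockwise.
    likelihood_ratio = reliability / (1.0 - reliability)
    posterior_odds = prior_odds * likelihood_ratio ** n_agree
    return posterior_odds / (1.0 + posterior_odds)


# Even odds before anyone speaks, then growing confidence as people agree with Joe.
for n in range(4):
    print(n, round(posterior_clockwise(n), 3))
```

The independence assumption is the fragile part: if everyone who agrees with Joe is making the same mistake, their reports are worth far less than this arithmetic suggests.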

But something really important changes when we see the inside. Suddenly a bunch of that probability mass from Joe and his friends converges on a specific subset of claims that could be rounded to "My model doesn't match the world". And a bunch of that same probability mass moves onto "Joe and his friends are all wrong." Which means that if you find strong Bayesian evidence in favor of your model being right, it quickly overwhelms the evidence from Joe and his friends even if there are thousands of them. (I mean, you need more evidence about your model being right to justify defying larger numbers of people, but the point about evidential leverage remains.)

So, I don't think it's just about whether you have a model. It looks to me like there's something about the nature of the model that changes when you look inside the box.

Next you take off the cover and suddenly you have a lot more observables. Moreover, you think you see a particular causal structure, a chain of causes and effects. Your world got a lot more complicated.

I worry when people start bringing in the word "causal". I don't really know what that means. I have an intuition, but I think that intuition is less clear than is my intuition about what Gears are.

E.g., the student who thinks you carry the 1 "because the teacher said so" might have a causal model that looks something like "Following social rules from authority figures invokes laws of magic that make things work". In what sense does this causal model not "count"? Well, somehow we don't like the invocation of laws of magic as though the magic is an axiomatic thing in reality. It seems to violate reductionism. But, um, a lot of the laws of physics seem like they come out of the magic aether of math, and we seem to be okay with that? We can dig into the philosophy of how reductionism and causal modeling should interact and so on…

…but it's curious that we don't have to in order to immediately get the intuition that there's something wrong with the type of justification of "because the teacher said so". Which tells me that what we're picking up on isn't likely to be a deep thing about the nature of causality and how it interacts with reductionism. That or we're picking up on a proxy for that, and I happen to be calling the proxy "Gears-ness".

The "paying rent" thing is basically about whether I care. And sure, I don't care about things I don't care about, but that does not affect the correctness (or adequateness, etc.) of the models in question. I don't see why it's a criterion of the model and not of which things I (subjectively) find important.

Hmm. I think you missed something really important about test #1.

It's not enough for a model to pay rent. Gears-ness is a stronger property than paying rent. (Well, with a possible caveat. I suspect it's possible to have a squatter belief that's made of Gears. But I think we really don't care about Gears-like squatter models even if they're possible, so I'm going to ignore that for now.) The claim is that if a Gears-like model makes a prediction and the prediction is falsified, then you can deduce something else from the falsification.

E.g., if I tell you "All blergs are fizzles" and you can somehow go look at blergs, and you find a blerg that isn't a fizzle, then you can confidently say "Nope, you're wrong." But you basically can't deduce anything other than that. The model is just a floating (false) fact.

But if I tell you "All blergs are fizzles because all blergs eat snargles and all things that eat snargles also become fizzles in addition to what they were", and you go find a blerg that isn't a fizzle, then you can additionally deduce that either (a) not all blergs eat snargles or (b) some things that eat snargles don't also become fizzles. You can know this even if you can't observe snargles let alone what eats them or what happens to things that eat them.

Which is to say, the second snargle claim is more Gears-like than the first, even though they both pay just as much rent.
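Here is a small sketch of that deduction, treating the three snargle claims as plain true/false propositions (a simplification I'm making for illustration; `surviving_worlds` is a made-up name). It lists every combination of the two hidden premises that is still possible once we know whether "all blergs are fizzles" holds.

```python
from itertools import product


def surviving_worlds(fizzle_claim_holds: bool):
    """Premise combinations consistent with the Gears-like snargle model.

    p1: all blergs eat snargles
    p2: everything that eats snargles also becomes a fizzle
    c:  all blergs are fizzles
    The only structure assumed is that p1 and p2 together imply c.
    """
    worlds = []
    for p1, p2, c in product([True, False], repeat=3):
        model_is_consistent = (not (p1 and p2)) or c
        if model_is_consistent and c == fizzle_claim_holds:
            worlds.append((p1, p2))
    return worlds


# We find a blerg that isn't a fizzle, so c is False.
after_falsification = surviving_worlds(fizzle_claim_holds=False)
print(after_falsification)
```

Every surviving combination has at least one false premise, even though neither premise was observed directly. The floating claim "All blergs are fizzles" has no premises to enumerate, so its falsification leaves nothing further to deduce.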

The "what if a given variable could be different" I don't understand. Is it an issue of how generic or robust or fragile the model is? Is it whether a model is falsifiable?

Nope, not about falsifiability. I don't know what the words "generic" or "robust" or "fragile" mean in this context, so I can't speak to that.

In the second snargle claim above (the more Gears-like one), it's totally implausible to have both (a) the claim being true and (b) some snargles not also be fizzles. So you can't flip that variable conditioned on the model being right.

I guess you could trivially say the same thing about the first one… but I say "trivially" because it's tautologically true that you can't have both A and not-A being true at the same time. I guess here there aren't variables in the model, so there isn't really a way to run test #2.

(This is me bumping into a place where the concept of Gears is missing some Gears, so I'm resorting back to the intuition for guidance. This is now having me speculate that a necessary condition of a model being Gears-like is that it has variables that aren't just the whole model. But I'm only just now speculating about this.)

And the third criterion looks like redundancy to me. Or consistency? Or is it about how this particular model fits into a wider, more generic model of how the world works?

Well, it might be redundant. I don't know. It looks like it might be redundant. But I think that's roughly equivalent to saying that having an understanding be truly a part of you is redundant with respect to being reliably confused by fiction in the area the understanding applies to.

Also, in practice, I sometimes find test #2 easier to run and sometimes find test #3 easier to run. So from a purely pragmatic standpoint, test #3 still seems useful.

We don't expect the righthand gear to, say, turn green and explode.

Ah, OK, that I would call context. Context is important. If something that looks like a gear sticks from one side of a box that an alien ship dropped off and there is another looks-like-a-gear thing on the other side, my expectations are that it might well turn green and explode. On the other hand, if we are looking at a Victorian cast-iron contraption, turning green is way down on my list of possibilities.

Context, basically, provides boundaries for the hypotheses that we are willing to consider. Sometimes we take a too narrow view and nothing fits inside the context boundaries -- then widening of the context (sometimes explosively) is in order. But some context is necessary, otherwise you'd be utterly lost.

But something really important changes when we see the inside.

Well, you got some evidence that you have a strong tendency to believe (though I think there were some quite discouraging psych experiments about the degree to which people are willing to believe the social consensus over their own lying eyes). And yes, there is a pretty major difference between hearsay and personal experience. But still, I'm not sure where the boundary is that you wish to draw -- see stage magic, optical illusions, convincing conmen, and general trickery.

there's something about the nature of the model that changes when you look inside the box.

There is a traditional division of models into explanatory models and forecasting models. The point of a forecasting model is to provide a forecast -- and that's how it is judged. If it provides good forecasts, it might well be a black box and that's not important. But for explanatory models being a black box is forbidden. The point of an explanatory model is to provide insight and, potentially, show what possible interventions could achieve.

Is that something related to your change of perspective as you open the box?

the word "causal". I don't really know what that means

There is a fair amount of literature on it -- see e.g. Pearl -- but, basically, a causal model makes stronger claims than, say, a correlational model. A correlational model would say things like "any time you see X you should expect to see Y" -- and it might well be a very robust claim, well supported by evidence. A causal model, on the other hand, would say that X causes Y and that, specifically, changing X (an "intervention") would lead to an appropriate change in Y. A correlational model does not make such a claim.

Interpreting correlational models as causal is a very common mistake.
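A toy simulation can show the gap between the two kinds of claim. The setup below is an assumption of mine, not anything from Pearl: a hidden variable Z drives both X and Y, so X and Y are strongly correlated under passive observation, yet setting X by hand (an intervention) does nothing to Y. The function names and noise levels are arbitrary.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def sample(rng, n, intervene_on_x=False):
    """Draw n (x, y) pairs from a world where a hidden z causes both x and y."""
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # hidden common cause
        if intervene_on_x:
            x = rng.gauss(0, 1)  # we set x ourselves, ignoring z
        else:
            x = z + rng.gauss(0, 0.3)
        y = z + rng.gauss(0, 0.3)  # y never depends on x
        xs.append(x)
        ys.append(y)
    return xs, ys


rng = random.Random(0)
observed = pearson(*sample(rng, 5000))
intervened = pearson(*sample(rng, 5000, intervene_on_x=True))
print("correlation under passive observation:", round(observed, 2))
print("correlation under intervention on X:", round(intervened, 2))
```

A correlational model fitted to the first sample would be robust and well supported, and would still predict the wrong thing about what happens when you change X.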

In what sense does this causal model not "count"?

You test causal models by interventions -- does manipulating X lead to the changes you expect in Y? If you are limited to passive observation, establishing causal models is... difficult.

to immediately get the intuition that there's something wrong with the type of justification of "because the teacher said so"

Isn't that just the hearsay vs personal experience difference?

The claim is that if a Gears-like model makes a prediction and the prediction is falsified, then you can deduce something else from the falsification.

Hmmm. OK, let me try to get at it from another side. Let's say that Gearness is the property of being tied into the wider understanding of how the world works.

Generally speaking, you have an interconnected network of various models of how the world is constructed. Some are implied by others, some are explicitly dependent on others, etc. This network is vaguely tree-like in the sense that some models are closer to the roots and changes in them have wide-ranging repercussions (e.g. a religious (de)conversion) and some models are leaves and changes in them affect little if anything else (e.g. learning that whales on dying usually sink to the ocean floor).

Gearness would then be the degree to which a model is implied and constrained by "surrounding" knowledge. Does that make any sense?

Then the second test would be basically about the implications of a particular model / result for the surrounding knowledge. Is it deeply enmeshed or does it stand by itself? And the third test is about the same thing as well -- how well does the model fit into the overall picture.

Perhaps we could say that Gears-like models have low entropy?  (Relative to the amount of territory covered.)

You can communicate the model in a small number of bits.  That's why you can re-derive a missing part (your test #3)--you only need a few key pieces to logically imply the rest.

This also implies you don't have many degrees of freedom; you can't just change one detail without affecting others. This makes it (more likely to be) incoherent to imagine one variable being different while everything else is the same (your test #2).

Because the model itself is compact, you can also specify the current state of the system in a relatively small number of bits, inferring the remaining variables from the structure (your test #1).  (Although the power here is really coming from the "...relative to the amount of territory covered" bit.  That bit seems critical to reward a single model that explains many things versus a swarm of tiny models that collectively explain the same set of things, while being individually lower-entropy but collectively higher-entropy.)

This line of thinking also reminds me of Occam's Razor/Solomonoff Induction.
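As a rough illustration of the "few bits" point, consider a simple chain of meshed gears, where I'm assuming each gear touches only its neighbours (the function names are made up). One bit for the first gear fixes every other direction, and a forgotten direction can be rederived from a neighbour.

```python
from typing import List, Optional


def gear_directions(first_clockwise: bool, n_gears: int) -> List[bool]:
    """Directions (True = clockwise) for a chain of meshed gears.

    Assumes adjacent gears always turn in opposite directions, so the
    whole state is determined by the first gear's direction.
    """
    return [first_clockwise if i % 2 == 0 else not first_clockwise
            for i in range(n_gears)]


def rederive(directions: List[Optional[bool]], missing: int) -> bool:
    """Recover a forgotten direction from a neighbouring gear (test #3).

    Assumes the chain has at least two gears and the neighbour is known.
    """
    neighbour = missing - 1 if missing > 0 else missing + 1
    return not directions[neighbour]


state = gear_directions(first_clockwise=True, n_gears=7)

# Forget the fifth gear's direction, then recover it from the fourth.
forgotten = list(state)
forgotten[4] = None
print(rederive(forgotten, 4) == state[4])
```

Flipping a single entry of `state` while leaving its neighbours alone gives a list that no call to `gear_directions` can produce, which is the test #2 incoherence in miniature.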

Did you possibly mean to link to Godwin's Law instead of Goodhart's Law?

Um… no? I'm pretty sure I mean to talk about optimizing for metrics, not about Hitler. Am I missing something?

I'm just failing at being funny and probably succeeding at being cruel in reinforcing your recent in-person tendency to confuse the two laws (because it was hilarious in context, and I'm terrible).

Thanks for writing down the gears concept in detail.

Haha! I didn't know who you were from your username. No cruelty received!

Have you considered trying to teach factor analysis as a fuzzy model (very useful when used loosely, not just rigorously)? It seems strongly related to this and imports some nice additional connotations about hypothesis search, which I think is a common blind spot.

I'm not familiar with factor analysis, so I have to say no, I haven't considered this. Can you recommend me a good place to start looking to get a flavor of what you mean?

Big five personality traits is likely the factor analysis most people have heard of. Worth reading the blurb here:

Many many models can be thought of as folk factor analyses whereby people try to reduce a complex output variable to a human readable model of a few dominant input variables. Why care?

Additive linear models outperform or tie expert performance in the forecasting literature:

Teaching factor analysis is basically an excuse to load some additional intuitions to make Fermi estimates (linear model generation for approximate answers) more flexible in representing a broader variety of problems. Good sources on Fermi estimates (e.g. the first part of The Art of Insight in Science and Engineering) often explain some of the concepts used in factor analysis in layman's terms. So, for example, instead of sensitivity analysis they'll just talk about how to be scope sensitive as you go so that you drop non-dominant terms.

It's also handy for people to know that many 'tricky' problems are a bit more tractable if you think of them as having more degrees of freedom than the human brain is good at working with. This indicates what sorts of tricks you might want to employ, e.g. finding the upstream constraint or some other method to reduce the search space first, a good example of which is the E-M theory of John Boyd fame.

It also just generally helps in clarifying problems since it forces you to confront your choice of proxy measure for your output variable. Clarifying this generally raises awareness of possible failures (future Goodhart's law problems, selection effects, etc.).

Basically I think it is a fairly powerful unifying model for a lot of stuff. It seems like it might be closer to the metal, so to speak, in that it is something a Bayesian net can implement.

Credit to Jonah Sinick for pointing out that learning this and a few other high-level statistics concepts would cause a bunch of other models to simplify greatly. Jonah Sinick has a post about the benefits he got through the prism of dimensionality reduction.

I suspect IQ (the g factor) is the most well-known application of factor analysis.

Factor analysis is also a specific linear technique that's basically matrix rotation. On a higher, more conceptual level I find talking about dimensionality reduction more useful than specifically about factor analysis.

Response: (Warning: not-exactly-coherent thoughts below)

So in one sense this feels a lot like...causal modeling? Like, this seems to be what people tend to talk about when they mean models, in general, I think? It's about having a lot of little things that interact with one another, and you know the behavior of the little things, so you can predict the big stuff?

At some point, though, doesn't every good model need to bottom out to some causal justification?

(EX: I believe that posting on Facebook at about 8 pm is when I can get the most interaction. I know this because this has happened often in the past and also people tend to be done with dinner and active online. If these two things hold, I can be reasonably certain 8 pm will bring with it many comments.)

Also, the "plausibly either way" is definitely a good sign that something's broken, like when certain adages like "birds of a feather flock together" and "opposites attract" can both be seen as plausible.

(I think Kahneman actually ran a study w/ those two adages and found that people rated the one they were told had scientific backing as more plausible than the other one? But that's straying a little from the point...)

If the two adages both seemed plausible, then that seems to be a statement, not about the world, clearly, but about your models of humans. If you really query your internal model, the question to ask yourself might be, "Do you see two quiet people having just as good a time as one quiet person and one loud person?"

[…]this seems to be what people tend to talk about when they mean models, in general, I think? It's about having a lot of little things that interact with one another, and you know the behavior of the little things, so you can predict the big stuff?

I think I might be missing your meaning here. Both I and the arithmetic student have a model of how the addition algorithm works such that we make all the same predictions. But my model has more Gears than does the student's. The difference is that my sense of what the algorithm could even be is much more constrained than is the student's.

Also, the student has a cause for their belief. It's just not a Gears-like cause.

(EX: I believe that posting on Facebook at about 8 pm is when I can get the most interaction. I know this because this has happened often in the past and also people tend to be done with dinner and active online. If these two things hold, I can be reasonably certain 8 pm will bring with it many comments.)

Well, okay. I want to factor apart two different things here.

First, it happened a lot before, so you expect it to happen again. Test #2: How Earth-shattering would it be if you were to post to Facebook at about 8pm and not get many comments? Test #1: If you don't get many comments, what does this demand about the world? Test #3: If you were to forget that people tend to interact on Facebook around 8pm, how clearly would you rederive that fact sans additional data? I think that on its own, noticing a correlation basically doesn't give you any Gears. You have to add something that connects the two things you're correlating.

…and you do offer a connection, right? "[P]eople tend to be done with dinner and active online [at about 8pm]." Cool. This is a proposed Gear. E.g., as per test #1, if people don't reply much to your 8pm Facebook post, you start to wonder if maybe people aren't done with dinner or aren't active online for some other reason.

Also, the "plausibly either way" is definitely a good sign that something's broken[…]

I agree with what I imagine you to mean here. In the spirit of "hard on work, soft on people", I want to pick at the language.

I think the "plausibly either way" test (#2) is a reasonably accurate test of how Gears-like a model is. And Gears tend to be epistemically very useful.

I worry about equating "Gears-like" with "good" or "missing Gears" with "broken" or "bad". I think perspectives are subject to easy distortion when they aren't made of Gears, and that it's epistemically powerful to track this factor. I want to be careful that this property of models doesn't get conflated with, say, the value of the person who is using the model. (E.g., I worry about thought threads that go something like, "You're an idiot. You don't even notice your explanation doesn't constrain your expectations.")

Otherwise, I agree with you! As long as we're very careful to create common knowledge about what we mean when we say things like "good" and "broken" and "wrong", then I'm fine with statements like "These tests can help you notice when something is going wrong in your thinking."

Meta question:

How do I create links when the URL has close-parentheses in them?

E.g., I can't seem to link properly to the Wikipedia article on common knowledge in logic. I could hack around this by creating a TinyURL for this, but surely there's a nicer way of doing this within Less Wrong?

backslash escape special characters. Test Common knowledge)

done by adding the '\' in logic'\') without the quotes (otherwise it disappears)
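For anyone generating such links programmatically, a minimal sketch of the same backslash trick follows. The helper name `markdown_link` is made up, the Wikipedia URL is the one I assume was meant above, and whether a particular Markdown renderer honours the escapes is something to check rather than take for granted.

```python
def markdown_link(text: str, url: str) -> str:
    """Build a Markdown link, backslash-escaping parentheses in the URL.

    Illustrative helper: it only handles parentheses, not other
    characters a given renderer might treat as special.
    """
    escaped = url.replace("(", "\\(").replace(")", "\\)")
    return f"[{text}]({escaped})"


print(markdown_link("common knowledge",
                    "https://en.wikipedia.org/wiki/Common_knowledge_(logic)"))
```

Escaping the opening parenthesis as well is a choice made here for symmetry; the fix described above only needed the closing one.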

Thanks! Fixed.

In addition to what RomeoStevens said, while comments on LW use markup formatting, the main post uses html formatting.

Yep! I noticed that. I know what to do to avoid this problem in HTML. I just didn't know what the escape character was in the markup.

I actually miss when the main posts were markup too. It made making the posts have the same type of format a lot easier. I also like something about the aesthetic of the types all being the same. C'est la vie!

At some point, though, doesn't every good model need to bottom out to some causal justification?

Is your claim that a model without a justification isn't a model? I don't have any problem with conceptualizing a poorly justified or even not justified model.

Hm, okay. I think it's totally possible for people to have models that aren't actually based on justifications. I do think that good models are based off justifications, though.