This is a framing practicum post. We’ll talk about what general factors are, how to recognize general factors in the wild, and what questions to ask when you find them. Then, we’ll have a challenge to apply the idea.
Today’s challenge: come up with 3 examples of general factors which you have not thought of as general factors before. They don’t need to be good, they don’t need to be useful, they just need to be novel (to you).
Expected time: ~15-30 minutes at most, including the Bonus Exercise.
What are general factors?
Consider the effects of aging on various bodily functions. Aging makes your muscles weaker, makes you more vulnerable to germs, makes your senses and mind worse, increases your risk of cancer, and so on. Aging is a common cause of a huge number of health problems.
Any one of these effects would be relatively important, but the fact that aging has so many effects makes aging supremely important within contexts to do with health.
The importance of aging shows up in a lot of ways. One is a theoretical perspective; medical studies must make sure not to be confounded by aging, as otherwise their results will be completely uninterpretable. Another is an interventionist perspective; if we could control aging, we could cure an enormous number of health problems. And there is also a planning perspective; it is more practical to do intense things when young than when older, as your body can better handle them while you are young.
Aging is not the traditional example of a general factor; a more canonical example would be the g factor of intelligence. g is an ability or set of abilities that people vary in, and it is required to various extents by just about any cognitive task you can come up with. When it comes to IQ tests, the specific skills that tap into g are often called indicators of g, though I will call them outputs to emphasize that they can be causally relevant too.
The cases where general factors become especially important are when their outputs have common effects. For instance, many outputs of aging, like weak immune systems or higher cancer rates, will also increase your mortality. These shared effects add up, and make age the most important determinant of survival, as the likelihood of surviving the next year consistently drops with age. Similarly, in order to perform well at school or at work, you need to solve a broad variety of cognitive tasks, making g particularly important for success.
What to look for
Sometimes, it's fairly obvious that you have a general factor at hand, because you have some variable that you already know is an important influence on many other variables.
In other cases, no such variable is known in advance. When one sees a large set of correlated, related variables, it is common to infer that their correlations are driven by one or more unobserved general factors. Such a set of correlated variables is called a positive manifold.
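To make the idea of a positive manifold concrete, here is a minimal simulation sketch. It assumes a toy model that is not from this post: each observed variable is a hidden factor times a loading, plus independent noise. The names `simulate_outputs` and `pearson`, and the loading values, are made up for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_outputs(n_people, loadings, seed=0):
    """Toy model (an assumption): output = loading * factor + independent noise."""
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n_people)]
    outputs = [[load * f + rng.gauss(0, 1) for f in factor] for load in loadings]
    return factor, outputs


# Four hypothetical outputs that all depend on the same hidden factor.
factor, outputs = simulate_outputs(5000, [0.8, 1.0, 1.2, 0.6])

# Correlation for every pair of distinct outputs.
pairwise = [
    pearson(outputs[i], outputs[j])
    for i in range(len(outputs))
    for j in range(i + 1, len(outputs))
]
print([round(r, 2) for r in pairwise])
```

None of the outputs influences any other in this sketch, yet every pairwise correlation comes out positive, because they all share the hidden factor. Seeing such a pattern in real data is what motivates inferring a factor, though it does not prove that one exists.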
Another good thing to be on the lookout for is when something is active in multiple contexts; if that is the case, then it will likely have similar effects in these multiple contexts, making it act like a general factor.
In general, if there is a variable that seems much more important than would be justified by its effects, and it feels like it is because of its implications or "the bigger picture", it may be that its importance stems from it being reflective of a general factor.
Useful questions to ask
When someone claims to intervene on a domain, it is important not to mistake an intervention on the outputs of a general factor for an intervention on the factor itself. For instance, in the case of aging, some people say that exercise can reverse aging. And it's not entirely wrong that exercise can have widespread positive effects on your health, but at the same time the primary effects will be on the specific systems you exercise, such as the cardiovascular system and muscles, and will not "transfer" to your general health. It is, essentially, treating symptoms, and this makes it much less useful than it would be if it truly reversed aging itself.
Correlation with outputs
Often, a general factor may not be directly or easily observable, while some of its outputs are readily observable. If one then wants to know about the factor, it may be useful to pay attention to its observable outputs. For instance, aging is associated with wrinkly skin and gray hair; these are not very important in and of themselves, but they provide a lot of information about a person's age, and therefore also about their general health.
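One caveat when reading a factor off its visible outputs is that some groups may systematically alter those outputs, for instance by hiding visible signs of aging. The sketch below is a toy illustration under made-up assumptions: apparent age equals true age plus noise, and half of the people shave a fixed `hide_amount` off their apparent age. All names and numbers are hypothetical.

```python
import random


def simulate_person(rng, hides_signs, hide_amount=10.0):
    """Return (true_age, apparent_age) under a toy model (an assumption)."""
    age = rng.uniform(20, 80)
    # Visible signs (wrinkles, gray hair) track true age, with some noise.
    apparent = age + rng.gauss(0, 5)
    if hides_signs:
        # Hypothetical: this group makes itself look `hide_amount` years younger.
        apparent -= hide_amount
    return age, apparent


rng = random.Random(1)
errors = {False: [], True: []}
for i in range(4000):
    hides = i % 2 == 1  # every second simulated person hides their signs
    age, apparent = simulate_person(rng, hides)
    errors[hides].append(apparent - age)

mean_error_open = sum(errors[False]) / len(errors[False])
mean_error_hiding = sum(errors[True]) / len(errors[True])
print(round(mean_error_open, 1), round(mean_error_hiding, 1))
```

For the group that does not hide anything, guessing age from appearance is noisy but roughly unbiased. For the hiding group, the same guess is off by about the hidden amount on average, so the visible output gives a skewed picture of the underlying factor.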
It may also be useful to think about the relationship between the general factor and the average of its outputs. Assuming that the general factor has a lot of independent outputs, one can for many purposes treat it as being identical to this average. However, some of the outputs of the general factor may not be known, or at least, not observed. Also, the general factor is strictly speaking not the same as its outputs; rather, it is the shared processes underlying the outputs. I often find that my understanding of some domain gets improved when I meditate on the distinction between the general factors and the averages of their outputs.
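As a rough illustration of when the average of the outputs can stand in for the factor, the sketch below compares the two in a toy model. The model is an assumption for illustration only: every output is the factor plus independent unit-variance noise, and `factor_vs_average` is a hypothetical helper name.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def factor_vs_average(n_outputs, n_people=2000, seed=0):
    """Correlation between a hidden factor and the average of its outputs."""
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n_people)]
    averages = []
    for f in factor:
        # Toy assumption: each output is the factor plus independent noise.
        outputs = [f + rng.gauss(0, 1) for _ in range(n_outputs)]
        averages.append(sum(outputs) / n_outputs)
    return pearson(factor, averages)


corr_few = factor_vs_average(2)
corr_many = factor_vs_average(50)
print(round(corr_few, 2), round(corr_many, 2))
```

With only two outputs the average is a noticeably noisy proxy, while with fifty it tracks the factor closely, since the independent noise averages out. The sketch does not capture the other caveats above: unobserved outputs, and the fact that the factor is the shared process rather than the average itself.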
If one has inferred the existence of a general factor from a widespread set of correlations between variables in a domain, and so doesn't know the root cause(s) of the factor, it can be enlightening to meditate on the realism of the general factor. Sometimes, there turns out to be a single core variable that mediates the common causes of everything, making the factor fully real. On the other hand, sometimes there may be multiple overlapping wide-ranging root causes; then the general factor can be thought of as the sum of these causes.
It is also common for people to believe that the different outputs of the factor are mutually reinforcing, and that this is what is driving the correlations. I think people often overestimate the relevance of mutual interactions. For instance, even if there are mutual interactions, there will often be a small "core" of interacting variables that drive most of the effect. And there may be factors influencing the general strengths of the mutual interactions, which may drive the overall dynamics. And in the limit of a large number of homogeneous mutually interacting variables, each individual variable would only be able to have a small effect, while factors that influence many of the individual variables would be carried through, generating a true general factor.
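The last point can be sketched with a small mean-field toy model. This is my own simplifying assumption, not a standard result cited in the post: each variable settles at `coupling` times the average of all variables, plus a common input, plus its own idiosyncratic input. The function name `equilibrium` and all numbers are hypothetical.

```python
def equilibrium(common, idiosyncratic, coupling):
    """Solve x_i = coupling * mean(x) + common + e_i for every variable i.

    Averaging both sides gives mean(x) = (common + mean(e)) / (1 - coupling),
    which requires coupling < 1.
    """
    n = len(idiosyncratic)
    mean_level = (common + sum(idiosyncratic) / n) / (1 - coupling)
    return [coupling * mean_level + common + e for e in idiosyncratic]


n = 100
coupling = 0.5

# Case 1: push a single variable by one unit, with no common input.
single_push = [1.0] + [0.0] * (n - 1)
after_single = equilibrium(0.0, single_push, coupling)

# Case 2: push every variable by one unit through the common input.
after_common = equilibrium(1.0, [0.0] * n, coupling)

# Effect on a variable that was not pushed directly (index 1).
effect_of_single = after_single[1]
effect_of_common = after_common[1]
print(effect_of_single, effect_of_common)
```

In this toy setup, a unit push to one variable moves each of the other 99 variables by only about 0.01, while a unit push to the common input moves each variable by about 2. The mutual interactions amplify the shared input rather than replacing it, which is the sense in which a general factor survives.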
Suppose you are studying human behavior. The problem is that behavior is highly chaotic and contextual; it's impossible to classify each individual interaction and model them all. But these chaotic interactions add up, and so things that are held in common across interactions become driving forces, which may be easier to classify and easier to model.
More generally, modelling something requires features, and factors provide a rich and convenient source of features for modelling.
Come up with 3 examples of general factors that you have not thought of as general factors before. They don’t need to be good, they don’t need to be useful, they just need to be novel (to you). You can either take some observable variable that you know the effects of, where you thus know it functions as a general factor, or you can give examples of positive manifolds of correlated variables (which can be modelled as general factors, to various accuracies).
Any answer must include at least 3 to count, and they must be novel to you. That’s the challenge. We’re here to challenge ourselves, not just review examples we already know.
However, they don’t have to be very good answers or even correct answers. Posting wrong things on the internet is scary, but a very fast way to learn, and I will enforce a high bar for kindness in response-comments. I will personally default to upvoting every complete answer, even if parts of it are wrong, and I encourage others to do the same.
Post your answers inside of spoiler tags. (How do I do that?)
Celebrate others’ answers. This is really important, especially for tougher questions. Sharing exercises in public is a scary experience. I don’t want people to leave this having back-chained the experience “If I go outside my comfort zone, people will look down on me”. So be generous with those upvotes. I certainly will be.
If you comment on someone else’s answers, focus on making exciting, novel ideas work — instead of tearing apart worse ideas. Yes, And is encouraged.
Reward people for babbling — don’t punish them for not pruning.
I will remove comments which I deem insufficiently kind, even if I believe they are otherwise valuable comments. I want people to feel encouraged to try and fail here, and that means enforcing nicer norms than usual.
If you get stuck, look for:
- Cases where the same thing is present in multiple places or times
- Positive manifolds of consistently correlated variables
Bonus Exercise: for each of your three examples from the challenge, see if you can say something about one of the questions raised earlier in the post:
- Do people try to intervene on the variables, and if so do these interventions go through the general factor?
- Are there contexts where it seems reasonable to equate the factor with the average of its outputs? Or contexts where that could be misleading?
- Do some of the outputs provide a biased perspective of the underlying general factor value?
- If you've observed a positive manifold and inferred a general factor from this, do we have any knowledge or good guesses about how real the underlying general factor is?
- Are there chaotic interactions that can be approximated and simplified by understanding the underlying general factor?
You can pick and choose which question you want to answer for each of the examples you provide.
Thanks to Justis Mills for proofreading and feedback.
Further, many effects of aging will decrease your agency, which also adds up to make aging one of the most important determinants of agency.
I do not have a formal definition of relatedness; it is somewhat a matter of judgement. But one example is that sometimes you "place" the factor in some physical location in the world; for instance, both aging and the g factor get placed in an individual person. You'd then expect the outputs to come from the same location as where you placed the factor.
A common critique here is that such a pattern of correlations could also be driven by mutual interactions; e.g. heat dissipates throughout objects, making the temperature of one part of an object correlated with temperatures of other parts. This is sometimes an important point, but often there will be various things that make the system act as a general factor.
In some settings, it can also be useful to ask how the visible outputs you choose skew your perception of the factor. If there are certain groups where the outputs you look at act differently, then these groups might confuse you about their general factor. For instance in the case of aging, many people hide visible symptoms of aging to look more attractive. In psychometrics, there is a set of properties called "measurement invariance" which are designed to test for these exact problems.
In the case of aging, John Wentworth's overview points to a small core of aging, mainly related to reactive oxygen species.
For instance, with cognitive abilities, some people propose that being good at one thing provides a foundation, which gives one the resources and knowledge to transfer that skill to other abilities, e.g. via analogy. Well, maybe (as I understand, this notion is not well supported by research, but I am not an expert), but there are innate individual differences in e.g. ability to make analogies, which would become a common root cause driving these mutual interactions.
You can think of this as a separation of scales, well approximated by the large scale influencing the small scale. E.g. consider the temperature of different pieces of an object standing outside. The different pieces are mutually interacting, with heat dissipating around. However, mostly, the temperature is determined by the heat imparted by the sun, either by directly shining light on the object, or by heating up the overall surroundings.
One important context where this may fail is with long-tailed distributions or nonlinear interactions. Here, individual parts of a network can have strong effects in ways that are tightly related to the network shape.