You label this a "standard leftist critique," but you haven't pointed out why this critique is invalid. You imply it doesn't hold water without actually providing an argument against it.
Whether or not the critique is valid, the action of criticizing can be invalid here. It is similar to how yelling, "fire!" in a packed theater is rarely the correct way of transmitting information about flaws in the world around you. I think most people would appreciate an explanation for why you are suppressing potentially true free speech, but the explanation does not necessarily need to come paired with the suppression. So, you can have something like this occur:
Alice: I think we ought to overthrow the government.
Bob: Shut up!
Bob may be worried about others overhearing this and turning them in, even if he agrees with Alice. But he can't say that, at least not in a public forum. I think the issue with "standard leftist critiques" is that these memes are highly virulent, and most people are not inoculated against them. Systems are hard to fix, so people who have been infected with these memes, even if they are true, may take worse actions than if they had never heard the critique.
This is to expand on my earlier comment. It really deserves to be its own post, and in fact I was already in the process of writing that post when I came across Jessica's work here. Her explanation for the antilinear-linear inner product is interesting, and initially seemed like it was doing something right, but it is completely different from my own approach. Well actually, I didn't have an explanation for why the inner product was antilinear-linear; I had only gotten to the point where I knew there was a bilinear form for physical observables, so I was hoping to incorporate some of Jessica's ideas into my own post. However, there was one detail she was missing, and that was, "why is it a bilinear form in the first place?" I had the explanation, but I could not see it materializing from her approach. Ultimately, I've concluded that the reason her approach seems to work is coincidental, and the mirror, conjugate space is not related to the inner product.
So, why a bilinear form? It is because observations are gauge-invariant, and gauge-invariant things are generated by the field strength tensor , where is the contravariant derivative along the spacetime manifold. Why does rather than or ? Because .
The boundary or derivative operator comes from simplicial complexes:
These simplicial complexes are in turn one piece of the tensor product. For example:
We can identify
which matches
It seems rather strange to do this, until you recall the Schur-Weyl duality. Consider the tensor space and group actions
These group actions commute, so they are mutual centralizers and decompose into irreducible representations together. We can decompose tensors into a direct sum of these irreducible representations, i.e. break them up into their "symmetric pieces":
The corresponding group actions for each piece are represented by . For the antisymmetric, or representation, is a one-dimensional vector space—it contains one basis vector, , and multiples of that vector. This means consists of matrices (or scalars), and has lots of nice properties due to being abelian. Rotations will be smooth and transitive, as speck1447 explains here. This is why the boundary operator typically uses this irreducible representation.
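To make the usual alternating-sign boundary operator concrete, here is a minimal sketch in Python. It is not taken from the original comment: the `boundary` function and the dict-based chain representation are hypothetical choices made for illustration. It only shows the standard fact that applying the boundary twice gives zero.

```python
from collections import defaultdict


def boundary(chain):
    """Apply the alternating-sign boundary operator to a chain.

    A chain is a dict mapping oriented simplices (tuples of vertex labels)
    to integer coefficients. This representation is an illustrative
    assumption, not something from the original comment.
    """
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            # Drop the i-th vertex; the sign alternates with its position.
            face = simplex[:i] + simplex[i + 1:]
            result[face] += (-1) ** i * coeff
    # Discard faces whose coefficients cancelled.
    return {face: c for face, c in result.items() if c != 0}


triangle = {(0, 1, 2): 1}
edges = boundary(triangle)
print(edges)            # the three oriented edges of the triangle
print(boundary(edges))  # empty: the boundary of a boundary vanishes
```

The cancellation in the second call relies on the signs alternating, which is the property the antisymmetric representation supplies.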
However, we can define a more general boundary as a map to a symmetric piece of one degree lower
where and . It looks exactly the same, but we use rather than to permute an index to the front and drop it. The reason for the representation is because there are permutations you can drop two indices in, and summing the different permutations gives you
In general, you will get when
The minimal is not always two, so other representations will ultimately lead to multilinear forms of different degrees. An equivalent way to look at it is from the side. Rotations are essentially continuous permutations, so the generalized idea of a rotation is an matrix where
(the kernel of ). Mirroring and other orthogonal matrices are generalized to
We are looking for a multilinear form that is invariant to orthogonal transformations, so for any and
(note: is applied diagonally). It is enough to find a single vector where for all . Then we can set and for non-multiples of , . The number of antisymmetrizers is the number of rows , so there will always be some where until . This gives the minimal .
This equivalent view is closer to what Jessica is doing. You can think of as the original vector space plus several "dual spaces". However, the "mirror images" come from , and
So, for any non-trivial and non-alternating representation,
There are way more mirror images than dual spaces! It is almost a lucky coincidence that
for the alternating representation, and in general, we should not expect to be able to find dual spaces by looking at mirror images.
I disagree. In general, you need a multilinear form of degree where , while the number of mirror spaces is the degree of , You are very lucky that the rotations and mirroring you use come from the alternating representation of where these happen to match, since and .
I am in the process of writing a much longer comment, but I think the primary question your post leaves unanswered is, "why a bilinear form, not any other degree?" and pulling on that thread unravels the understanding this post allegedly gives.
I think your proposal does not lead to branches getting removed. I think
I think having juniors (後輩) or children you value could keep it your problem, enough so that you still bother to remove the branch. However, your effort here would only marginally help millions of people, only one of whom is the person you personally care about. It is individually more rational to focus your efforts on just the people you care about.
Alternatively, you could value removing such obstacles as a terminal goal. I think this manifests in humans as an intolerance to such obstacles. The mental anguish significantly decreases once they are no longer bumping against the obstacle, so I think it is rather rare to find someone with a strong enough aversion to remove the branch from the other side who was nonetheless able to push past a much stronger aversion to make it there.
Now, maybe the solution is to help people push past their aversions. Encourage them to still go to university, even if they believe the world ought to be different. But does this actually help fix the problem? It does not matter if they are 10x more able to fix it, if they care 10% as much. Maybe it relieves some of their suffering, but surely the system is producing much more unnecessary suffering than this proposal would relieve? Plus, it isn't suffering of people you care about. If you are being individually rational, it seems like your best move is to tell your friends and family to work within the system, while letting the deviants do their own thing to fix it.
I agree that the critique is a fairly standard wokist criticism of meritocracy. I think the author's two main failures are in adopting a self-centered moral position ("how I want the world to be is how it ought to be"), and poor training. The first is a common mistake, but leaves me wondering, "why should society cater to your whims? Why would they be better off, according to their desires?" It is not enough to say the system creates perverse incentives, if you cannot design a better system for people to adopt. That is his second issue. He does not have the mathematical training to design a better stable system, and reading more humanities papers will only reinforce that something is wrong, without helping him suggest improvements. This is why he can come across as whining or indulging in some superiority fantasy.
I think what you miss is that people take different attitudes towards unnecessary suffering. I can imagine if you were hiking along a trail and came across a fallen tree, you would calmly duck under it or climb around it. It is the path you are walking on, and if you wish to reach your desired destination with the least effort, you have to pay these small prices. Others would kick the log for getting in their way, stubbing a toe and not solving anything, but then submit to the path laid out for them. The blessed few carry a hatchet, and would clear the way for those that come after them. Then there is the author, who is so indignant about fallen trees unnecessarily being in their way, yet carries no hatchet, that they end up storming off the beaten path and trying to stomp out their own trail. It usually does not work out for them, and even if they succeed, the new trail is longer and worse than the original trail.
I think the first kind of person is defecting in this scenario, unless they put a few dollars towards fixing the trail when they leave. While broader society is not smart or informed enough to punish such defections (and thus defecting really can be your selfishly optimal move), that does not mean it is not defection. Telling people to shut up and climb over the log is also defection. It is one thing to say, "the trail is not good, but there is not much you can do about it, so for your own sake, just climb over it." Jerret Ye (@L.M.Sherlock) has published another of the author's articles arguing essentially this. But it is entirely another thing to say, "yes, the trail is not good, but stop complaining already. Just shut up and climb over it." This is what you are doing, and if everyone did it, the trail would never improve.
I was wrong when I said you had completely missed the author's second narrative. I believe I was frustrated by what I saw as defection, and thought surely you wouldn't be defecting if you knew it was defection. But that is a modelling error, because there are many other reasons you might think your comment was prosocial. Most likely, if you can shut up complainers, it makes the system more stable and helps others not go through the author's pain. I would even agree with an argument that the author has not thought this through well enough to propose good solutions, and so it is better they say nothing at all than risk destroying an almost good thing. I do not know exactly why you wrote such a harsh critique, but I really wonder if there was a better way to achieve your goals.
I do think you and the author are playing a cooperative game here. Both of you have a goal that includes others living better lives due to your writing. In such cooperative games, honesty is usually the best policy. Just put a couple sentences at the top explaining your goals, maybe something like, "I believe these memes are harmful to spread, so I am purposefully being harsh with the author in the hopes that they and others like them put more thought into their writing before publishing." Of course, then all your readers will wonder why you think the memes are harmful to spread, and so you would have to explain that, but I think this kind of process would significantly improve your comment.
I do want to expand on my point about the different kinds of people. Some people find it intolerable to move around an obstacle. I am much more similar to the author in this way, and really tried to avoid attending university. I applied to many tech jobs, but failed to get interviews except at some of the lowest-paying ones. To quote the author's experience, which matches mine,
I’m afraid the company’s HR backend probably didn’t even see my resume. A simple “filter by education,” and sorry, all those outside the selection are automatically deleted... How do you prove yourself to someone who can’t even see you? It’s impossible.
I had significantly higher qualifications on paper (a high school diploma, great standardized test scores, and better competition results), but even I could not be seen by a human. So, although I really hated wasting my time on paperwork, I went to MIT. I mostly tried to take classes I was interested in, and luckily for me, I was interested in a broad education. My largest frustrations were actually in some required science classes: introductory chemistry, probability, algorithms, and whatnot. I did my best to substitute them with more useful classes, but I was told for many of them, "you just have to take it. You cannot test out or substitute them." The whole process was very frustrating, especially because I am so intolerant of unnecessary obstacles that only exist because someone never bothered to design the system a little better. What made it worse is MIT had a better system (at least for me) fifty years ago, and it had evolved to be worse in terms of free learning, but better in terms of their typical student's income after graduation. Anyway, I remember thinking over and over, "why am I doing all this," and I know I would have dropped out if I had not been able to graduate early. Even still, I almost dropped out to join a startup instead of doing my final semester.
I could make this a much longer post, but what I'm trying to highlight is that there are some people who see problems that should not exist, and it causes them a ton of mental anguish. You seem generally unbothered, and can go around the obstacles in stride, but it is really painful for people like me. Logically, the additional effort is merely a lot, but mentally it is intolerable. And the author has much more difficulty in this regard than I do. Part of it is a stricter system, part of it is less natural talent the system would reward them for, and part of it is having an even stronger mental aversion to dysfunction than I do.
I understand the sentiment. It feels like a waste of effort to study for exams that you do not enjoy, that do not make you more capable, and that do not improve your self-image. I did not experience much of this, as the American standardized exams are significantly easier than the Chinese ones and cannot measure what the elite institutions are looking for, which forces a more holistic approach. It also causes other awful downstream effects, so I do not recommend imitating America here.
I also agree that a lottery system is better than assigning ranks and filtering for the top. There are many goals the university could have in mind when selecting candidates, which they abstract into some process that ranks candidates. If this process is well-known, such as with entrance exams, candidates can spend an inordinate amount of time optimizing for this proxy instead of aligning closer to the university's ideal student. An opaque process is worse, as part of the optimization process becomes uncovering the black box, which in America comes in the form of $20k "college admissions consultants". The solution is to make a transparent process for scoring or ranking candidates, but then to randomize the selection.
The author's proposal for the lottery is wrong. Most admissions, especially at elite institutions, follow a Pareto distribution. Most admissions, especially at elite institutions, are somewhat random, due to application reviewers being human and needing to eat lunch. Thus, almost everyone admitted to elite institutions won a thresholded lottery. These universities like to humble their students during orientation week by saying, "we could have chosen the next 1,000 students instead of you," which is somewhat false, but true for the bottom 900 of them. It may be different in China (or Taiwan), and this could be a quirk of the American system. However, the American system has a similar educational arms race, so a thresholded lottery is not the solution.
The solution is an exponential lottery. There are diminishing returns to optimizing a score function. To give an example with language learning, it takes about as long to understand 50% of the words on the screen as the next 25%. Scores are logarithmic with effort. If
candidates would be indifferent between the cost of studying and its expected reward. Only those that have a different, better reward in mind will continue studying, naturally killing the arms race. I believe the right temperature choice should keep the free energy constant across admissions cycles, but I am not sure.
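As a rough illustration of what an exponential lottery could look like, here is a minimal sketch. It assumes that a candidate's admission weight is proportional to exp(score / temperature), which is my reading of the "temperature" language above and not necessarily the exact rule intended. The function names and the sample scores are hypothetical.

```python
import math
import random


def admission_probabilities(scores, temperature):
    """Single-seat admission probabilities, assumed proportional to exp(score / T)."""
    max_score = max(scores.values())  # subtract the max for numerical stability
    weights = {name: math.exp((s - max_score) / temperature)
               for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def exponential_lottery(scores, num_seats, temperature, rng=None):
    """Fill seats one at a time, sampling without replacement from the weights."""
    rng = rng or random.Random()
    remaining = dict(scores)
    admitted = []
    for _ in range(min(num_seats, len(remaining))):
        probs = admission_probabilities(remaining, temperature)
        names = list(probs)
        pick = rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
        admitted.append(pick)
        del remaining[pick]
    return admitted


# Hypothetical scores, for illustration only.
scores = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 0.0}
print(admission_probabilities(scores, temperature=1.0))
print(exponential_lottery(scores, num_seats=2, temperature=1.0))
```

Under this assumption, a very high temperature approaches a uniform lottery and a very low temperature approaches strict ranking, so the temperature controls how much extra score is worth.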
I think there are two things the author is doing here. One of them is whining about not wanting to pay the price the system demands to reap its benefits. The other is arguing that all this pain and suffering is unnecessary for the system to function as it claims it intends to function. This is why they call it a "rat race" (内卷 [1]), bring up the distinction between Pareto and Kaldor-Hicks improvements, talk about how the system is self-reproducing (which means it survives despite dysfunction), and mention how external negative pressure is the natural reaction. You seem to have completely missed this second narrative. Although they may live in a dysfunctional system, the world is a much bigger place. New systems can be created.
[1] I think the term refers to runaway signal inflation.
Fewer rows might not give interpretable/rules-based solutions an advantage. I tried training on only the first 100 or 20 rows, and I got CDEFMW (15.66) and EMOPSV (15.34) as the predicted best meals. Admittedly CDEFMW shows up in the first 100 rows scoring 18 points, but not EMOPSV. Maybe a human with 20 rows could do better by coming up with a lot of hypothetical rules, but it seems tough to beat the black box.
I think your main counterpoint to what I said is that people are doing an optimization process where they look at the data while simultaneously doing a search for a better theory. In fact, you cannot even disentangle their brain from the reality that created and runs it, so even a best attempt at theory first, observation second is doomed to fail.
I think the second, stronger sentence is mostly wrong. You do not need a universe similar enough to our universe to produce reasoning similar to ours, just one that can produce similar reasoning and has an incentive to. That incentive can be as little as, "I wonder what physics looks like in 3+1 dimensions?" just like our physicists wonder what it looks like in more or fewer dimensions, with different fundamental constants, with different laws of motion, with positive spacetime curvature, and so on. Or, we can just shove a bunch of data from our universe into theirs, and reward them for figuring it out (i.e. training LLMs).
As for the first, weaker sentence, yes this is true. Pretty much everyone has tight feedback loops, probably because the search space is too large to first categorize its entirety and then match the single branch you end up observing. I think the role of observation here is closer to moving attention to certain areas of the search space, rather than moving the search tree forward (see Richard Ngo's shortform on chess). The thing is, this process is unnecessary for simple things. You probably learned to solve tic-tac-toe by playing a bunch of games, but you could have just solved it. I think the concept of a tree is relatively simple, though of course if you want a refined concept like its protein composition or DNA sequencing, yeah that space is too big and you probably have to just go out and observe it.
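To illustrate what "just solving it" means for tic-tac-toe, here is a minimal sketch of exhaustive minimax search. The board encoding (a 9-character string) and the function names are hypothetical choices for this example. No games are played or observed; the value follows from the rules alone.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Game value for X under optimal play: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    outcomes = [value(board[:i] + player + board[i + 1:], other)
                for i, cell in enumerate(board) if cell == " "]
    # X picks the best outcome for X, O picks the worst outcome for X.
    return max(outcomes) if player == "X" else min(outcomes)


print(value(" " * 9, "X"))  # 0: tic-tac-toe is a draw under optimal play
```

The search visits every reachable position once (thanks to the cache), which is feasible only because the game is tiny. That is the sense in which theory-first works for simple things and not for large search spaces.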
I don't really understand your point about unsupervised learning. With unsupervised learning, you can just run a bunch of data through your model until it learns something. That's the observation -> theory pipeline and it's astoundingly inefficient and bad at generalization. Humans could do the same with 100x fewer examples, which is the gap models need to clear to solve ARC-AGI. Humans are probably doing something closer to theory -> observation.
Perhaps I'm missing something you're saying, or you're missing something I'm saying. In general, the process of finding a "conjugate space" is not an involution. We do not have inner products or Hilbert spaces. There are no pairs. We have to motivate the inner product, and it is motivated by first motivating multilinear forms, and then motivating bilinear forms. But bilinear forms only arise in the alternating representation.
There is a terminology issue here, because "dual" literally means a paired, mirror space, so treating the inner product as a bilinear form in a space ⊗ its dual only works in the alternating representation. What you're actually doing to generate that dual space is to look at the connected components of
$\{A : \rho_\lambda(A) \in \operatorname{Im}(\pi_\lambda)\}$.
For the alternating representation, there is the component connected to the origin, $SO(n)$, as well as the mirrored component $J^{-1}\,SO(n)\,J$. But there are many more connected components for other representations, exactly $|\operatorname{Im}(\pi_\lambda)|$ of them. Maybe some of them have involutions to each other, but not all of them. Not all of them are "dual" or "conjugate" in the literal sense of the word. This is the terminology issue, and I think the main source of confusion.
Also, my analogy does not break down for odd $n$. Note that $V_\lambda$ is an irreducible module of $(\mathbb{C}^n)^{\otimes n}$, not the same space, and $(V_\lambda)^{\otimes k}$ is an entirely new tensor product. I was trying to keep my comment from growing longer than it already was, so I may have left out many other little details like this that would help with interpretation.