What's Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers

Having read the original article, I was surprised at how long it was (compared to the brief excerpts), and how scathing it was, and how funny it was <3

Criticizing bad science from an abstract, 10000-foot view is pleasant: you hear about some stuff that doesn't replicate, some methodologies that seem a bit silly. "They should improve their methods", "p-hacking is bad", "we must change the incentives", you declare Zeuslike from your throne in the clouds, and then go on with your day.
But actually diving into the sea of trash that is social science gives you a more tangible perspective, a more visceral revulsion, and perhaps even a sense of Lovecraftian awe at the sheer magnitude of it all: a vast landfill—a great agglomeration of garbage extending as far as the eye can see, effluvious waves crashing and throwing up a foul foam of p=0.049 papers. As you walk up to the diving platform, the deformed attendant hands you a pair of flippers. Noticing your reticence, he gives a subtle nod as if to say: "come on then, jump in".
Social Capital Paradoxes

It is a free country. No apology necessary <3

Also, maybe I'm at fault for NOT publishing and perishing, but rather (it could be argued) lurking and then enacting some kind of morally dubious "gotcha" maneuver?

In any case, it has generally been my ambition to be cited more in the manner of Socrates than Plato ;-)

Social Capital Paradoxes

The viruses themselves are the prototypical genes that move horizontally. The bacteria use their control mostly to resist these "new genes", but sometimes they can't keep out the horizontal genes of the virus, and then the bacteria spend non-trivial energy generating viral particles that can invade other bacteria, and so on.

The bacterial genes that jump from one vertical lineage to another (that make bacteria phylogenetic tree building a bit wonky) are sometimes carried by viruses. Incidental bacterial genes get packaged in viral capsids by accident, then those viral particles get into a bacterium and somehow fail to exploit the host optimally and the host has many descendants anyway. You are right that this is somewhat random/accidental. Neither viruses nor bacteria seem to generally "intend" it in coherent ways.

Sometimes bacteria have a second little genome called a plasmid, and these often contain genes for the construction of a tube that injects neighboring bacteria with the little secondary genome (but not the main genome). These "conjugative plasmids" are engaged in non-random horizontal gene transmission. Conjugative plasmids tend to be more aligned than viral prophages (that lurk in the host genome for several generations) which are more aligned than pure lytic viruses.

The more horizontal things are worse for the bacteria's genetic interests in very simple and concrete ways, related to the normal operation of the genes following their normal "selfish gene" lifecycles.

Social Capital Paradoxes

1. Why do so many good things have horizontal transmission structures?

Memetic horizontal transmission that is mediated by human normative judgement routes around this filter... in some manner. Maybe they are slightly hacking your perceptions of goodness? Also, maybe these filters improve things some.

Far be it from me to claim that modern horizontally transmitted cultural ideas are bad. I would never...

However... between 1800 and 1950 it would have seemed to little children that smoking was terrible, but then if their peers smoke, smoking starts to seem like a way to minimize the disgust, and shortly it begins to seem pretty great, and this becomes the widely shared common wisdom among adults in a society with very high smoking rates. With smoking there was careful centralized analysis, with data collection, and peer review, and careful reasoning about causal models. Eventually we figured out: nope. I can tell you a story about how my mom stopped smoking when I was a kid, and then my brother and I copied her by not starting.

I would argue that "good careful reasoning" is the exception that proves the rule in some sense, because lots of so-called Official Science(!) is pretty shit (parts of tongues that taste different things? wtf? is it all just gossip? when did academia give up on "nullius in verba"?) and the good stuff tends to be invented by a TINY group of people and spreads via *baroquely* cautious transmission patterns.

2. The conclusion seems severe and counterintuitive...

In memetics, this is what trusted priests or scholars are an attempted patch on, I think? I'm sorry. I don't know any good news here.

Biologically, viruses prey on bacteria. Both are made of nucleic acid but some nucleic acid content is aligned with the protein inside the membrane... and some isn't. The "better" viruses are prophages (integrating with the genome and conferring useful phenotypes)... but often they go lytic eventually... and then the infected bacteria's daughter's daughter's daughter's daughters have a regret-worthy outcome.

If people have lots of unprotected sex, a venereal disease eventually finds the niche created by that aggregate behavior. If people fly around in airplanes while sneezing on each other, an aerosolized disease eventually finds that niche. If people drink from a river downstream of where other people poop in the river (especially if some of the drinkers then travel back upstream), cholera happens. When you feed cows to cows, prions grow exponentially and eventually there's mad cow disease. If elementary school teachers who have never been outside of the school system teach school children who become teachers who teach children who become teachers... You will end up with the curricular equivalent of bovine spongiform encephalopathy.

How long until Twitter collapses? Has Twitter died already? I'm sorry. The circle of life is best when it circles very VERY widely. Gotta turn it to mulch. Then have fungus eat it. Then let the fungus dry out in direct sun for two years. Then use it CAREFULLY. It makes me sad, but I think it is true. Do not recycle "vital" things!

3. What about The Moral Economy by Samuel Bowles?

I have not read the book you cite. I want to defy the data. I would suggest that high social capital causes prosperity and enables trusted third party mediation, then, because people socially trust the third party mediators, it enables quick interactions based on shared traditions (that affirm trust and that often rely on deeper "trust rails" that go back decades or often centuries (often literally to shared ancestors)). This could cause correlations in single temporal snapshots of data. Massaging such snapshots in modern academic writing, people have an incentive to tell happy lies in public like "prosperity causes social capital". The traditional theories here (and the long term economic demography) suggest to me that great wealth is generally squandered by the fourth generation, so the data collection I'd like to see would span 6 generations over various cultural cross-sections, or else it would span maybe like 10 generations (to hopefully see two full cycles)? I would love to be wrong about this, but my priors are strong enough that I want to see very very rigorous data collection methods as part of the presentation of why my priors here should be weakened. Maybe writing a rigorous book review of the contents of The Moral Economy would be virtuous!

If there was a key countervailing idea here, for me it is "acceleration itself". Progress. The increase in the number of humans, and per capita energy use, and humane culture-making activities. Old functional things are being copied and the "oomph" has not burned out... yet! :-)

0. Where did this theory come from and is it horizontal or vertical itself?

I'm going to assume you asked this, and answer it! I invented the theory, basically.

The germinating idea is: Dawkins is just wrong. He used to go around constantly dunking on TRADITIONAL religion about how it was a virus, and he was just... wrong. Many many many generations of shared co-evolution often tames parasites by aligning them deeply with more "metabolic" vertical replicators. Mitochondria are tamed bacterial parasites. The V(D)J combinatorial immune system is a tamed viral parasite. Endosymbiosis is a thing, but it works in a certain way.

Tiny fast evolving things (like cults) are sources of novelty, and larger slower things (like 1000 year old civilizations with old co-evolved religions) must tame them, or be devoured. Novel horizontal culture is often pretty bad. I have extended this theory in various conversational domains going back maybe 15 years to before the launch of Overcoming Bias but it always seemed gauche (and inconsistent with the theory itself) to bring it up ONLINE in a community deeply built around "the rejection of the supernatural mumbo-jumbo of one's parents".

I have talked about the importance of vertically transmitted ideas with my parents (who are themselves second generation atraditionalists), and they roll their eyes, but are happy enough to tolerate my antics when I "larp" "filial piety". In the meantime, filial piety occurs in many religions. The Abrahamic injunction is obvious. If Confucianism has ONE PUNCH, that punch is arguably "filial piety". I have purposefully not talked about horizontal meme transmission where Google can see, but if that goal is to fail at horizontal at this particular historical junction during a horizontally transmitted global plague then I guess I'm ok with it? Naturally it would be better if my children could teach the theory "as taught by their mother" (me), but they do not exist (yet?), and so they can't.

(I would not strongly object if you deleted this post before it can be seen by Google and generally become less of a vertical meme and more of a horizontal meme... Evangelism just seems mildly evil to me, but I'm not evangelical about evangelism being bad... because that would kinda defeat the point? My interest here is mostly... credit assignment I guess? I'm a HUGE fan of thinking about The Credit Assignment Problem. If I have done wrongly, or well, then it seems generally proper that I be credited as having done wrongly or well. Similarly for you. Similarly for all choice-making beings.)

Covid-19 6/18: The Virus Goes South

Air conditioning! As near as I can tell, "indoor air conditioning" is the key mechanistic story for "covid in June".

You can skip the rest if you like, but for details and speculation... This result is actually kind of happy/surprising to me!

When the right was protesting against the covid shutdown I saw a lot of morbid covid speculation about how bad it would be on the left. Then the left was protesting against the police, and some on the right were complaining about how bad it could make covid... But I haven't been able to find any big structural/demographic signals related to any of these protests. We seem to have gotten lucky with the protesting: it didn't increase the plague much! I was worried about it, and so this feels like a relief to me.

More happy news in the structural department is a salon in Missouri that functioned as a natural experiment. Two hairdressers test positive. One was cutting while symptomatic and may have given it to the other somehow during a work week in the same room. Both wore masks. All customers wore masks. 140 customers tracked by computer. 46 of them consented to testing. All came back negative! Maybe there's structural censorship of key data (lying officials, or broken medical test) but if not then ZERO positive customers is a non-trivial signal about mask efficacy! :-)
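For a rough sense of how strong "0 positives out of 46 tested" is, here is a minimal sketch. It assumes (my assumption, not something the salon report establishes) that the 46 tested customers behave like independent draws with one shared infection probability, and it ignores the fact that the 46 were self-selected from the 140. The function names are made up for illustration.

```python
import math


def upper_bound_zero_events(n_tested: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on the infection probability p when 0 of
    n_tested people test positive.

    Under the independence assumption, the chance of seeing zero positives
    is (1 - p) ** n_tested. The bound is the p at which that chance drops
    to 1 - confidence.
    """
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_tested)


def rule_of_three(n_tested: int) -> float:
    """Common shortcut for the same 95% bound: roughly 3 / n."""
    return 3.0 / n_tested


TESTED = 46  # customers who consented to testing, all negative

exact = upper_bound_zero_events(TESTED)
approx = rule_of_three(TESTED)

print(f"exact 95% upper bound: {exact:.3f}")
print(f"rule-of-three bound:   {approx:.3f}")
```

Under those assumptions the per-customer infection probability is plausibly below roughly 6%, which is a real signal but not proof of zero risk.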

The one person who seems to have gotten infected by the hairdresser was maybe the other hairdresser in the shop. This loops back around, in my mind, to "sharing a building" and also calls attention to "air conditioning" as a key mechanistic driver...

As in the OP "Houston and Phoenix and Miami" are hotspots now and I think the common denominator is that in June those are all pretty hot places where the heat drives people indoors to get some AC (which tends to be recirculated air).

Spending a lot of time breathing recirculated indoor air looks like the boogeyman to me at this point.

Credibility of the CDC on SARS-CoV-2
I don't recommend the site to friends or family because I know posts like this always pop up and I don't want to expose people to this...

This is just basically correct! Good job! :-)

Arguably, most thoughts that most humans have are either original or good but not both. People seriously attempting to have good, original, pragmatically relevant thoughts about nearly any topic normally just shoot themselves in the foot. This has been discussed ad nauseam.

This place is not good for cognitive children, and indeed it MIGHT not be good for ANYONE! It could be that "speech to persuade" is simply a cultural and biological adaptation of the brain which primarily exists to allow people to trick other people into giving them more resources, and the rest is just a spandrel at best.

It is admirable that you have restrained yourself from spreading links to this website to people you care about and you should continue this practice in the future. One experiment per family is probably more than enough.


HOWEVER, also, you should not try to regulate speech here so that it is safe for dumb people without the ability to calculate probabilities, detect irony, doubt things they read, or otherwise tolerate cognitive "ickiness" that may adhere to various ideas not normally explored or taught.

There is a possibility that original thinking is valuable, and it is possible that developing the capacity for such thinking through the consideration of complex topics is also valuable. This site presupposes the value of such cognitive experimentation, and then follows that impulse to whatever conclusions it leads to.

Regulating speech here to a level so low as to be "safe for anyone to be exposed to" would basically defeat the point of the site.

Credibility of the CDC on SARS-CoV-2

The word "cuarenta", in Spanish, means 40.

In English, if the word "quarantine" is applied to an infection-avoiding isolation period of either more or less than 40 days, that's arguably an abuse of linguistic tradition that reveals whoever says it to be in need of remedial education.

Maybe? *I* probably need remedial education, too! Very prestigious linguists have asserted here or there that linguistics is a descriptivist science, and so, from their very prestigious perspective, any use of language is as good as any other use of language...

Still, it does give one pause.

How many people in public health read or write Latin anymore? Maybe there are some things that people used to take so MUCH for granted that no one thought to spell them out? Like "40 day periods should last 40 days" is basically a tautology. Should THAT go into a medical book and become testable knowledge for doctors?

It would be scary for medical inferences based in the obvious literal meaning of words to be valid, so they are probably not valid. I'm sure everything is fine.

The LessWrong 2018 Review

I hunted your comment down here and upvoted it strongly.

I basically only write comments, and when I write "comments for the ages" that I feel proud of, I consider it a good sign if they (1) get many upvotes (especially votes that arrive after lots of competing sibling comments already exist) and (2) do not get any responses (except "Wow! Good! Thanks!" kind of stuff).

Looking at "first level comments" to worthwhile OPs according to a measure like this might provide some interesting and reasonably brief postscripts.

Applying the same basic measure to posts themselves, if an OP gets a large number of direct replies that are highly upvoted, that OP may not be dense with relatively useful and/or flawless content. (Though there are probably exceptions that could be detected by thoughtful curating... for example, if the OP is a request for ideas then a lot of highly voted comments are kinda the point.)
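The comment-level measure could be sketched roughly like this. Everything here is hypothetical: the `Comment` fields, the `min_upvotes` threshold, and the idea that someone has already counted which replies are "substantive" (anything beyond "Wow! Good! Thanks!"). It is an illustration of the heuristic, not how the site actually stores or ranks anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Comment:
    author: str
    upvotes: int
    # Replies that push back or add material, i.e. anything other than
    # "Wow! Good! Thanks!" style responses (counted by hand, hypothetically).
    substantive_replies: int = 0


def postscript_candidates(comments: List[Comment], min_upvotes: int = 10) -> List[Comment]:
    """Return comments with many upvotes and no substantive replies,
    highest-voted first."""
    kept = [
        c for c in comments
        if c.upvotes >= min_upvotes and c.substantive_replies == 0
    ]
    return sorted(kept, key=lambda c: c.upvotes, reverse=True)


# Made-up first-level comments on some worthwhile OP.
first_level = [
    Comment("a", upvotes=40, substantive_replies=0),
    Comment("b", upvotes=55, substantive_replies=3),  # popular but contested
    Comment("c", upvotes=12, substantive_replies=0),
    Comment("d", upvotes=2, substantive_replies=0),   # uncontested but ignored
]

selected = postscript_candidates(first_level)
print([c.author for c in selected])
```

The threshold and the "substantive" judgment are where thoughtful curating would have to come in.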

The unexpected difficulty of comparing AlphaStar to humans

I think the abstract question of how to cognitively manage a "large action space" and "fog of war" is central here.

In some sense StarCraft could be seen as turn based, with each turn lasting for 1 microsecond, but this framing makes the action space of a beginning-to-end game *enormous*. Maybe not so enormous that a bigger data center couldn't fix it? In some sense, brute force can eventually solve ANY problem tractable to a known "vaguely O(N*log(N))" algorithm.

BUT facing "a limit that forces meta-cognition" is a key idea for "the reason to apply AI to an RTS next, as opposed to a turn based game."

If DeepMind solves it with "merely a bigger data center" then there is a sense in which maybe DeepMind has not yet found the kinds of algorithms that deal with "nebulosity" as an explicit part of the action space (and which are expected by numerous people (including me) to be widely useful in many domains).

(Tangent: The Portia spider is relevant here because it seems that its whole schtick is that it scans with its (limited, but far seeing) eyes, builds up a model of the world via an accumulation of glances, re-uses (limited) neurons to slowly imagine a route through that space, and then follows the route to sneak up on other (similarly limited, but less "meta-cognitive"?) spiders which are its prey.)

No matter how fast something can think or react, SOME game could hypothetically be invented that forces a finitely speedy mind to need action space compression and (maybe) even compression of compression choices. Also, the physical world itself appears to contain huge computational depths.

In some sense then, the "idea of an AI getting good *at an RTS*" is an attempt (which might have failed or might be poorly motivated) to point at issues related to cognitive compression and meta-cognition. There is an implied research strategy aimed at learning to use a pragmatically finite mind to productively work on a pragmatically infinite challenge.

The hunch is that maybe object level compression choices should always have the capacity to suggest not just a move IN THE GAME of doing certain things, but also a move IN THE MIND to re-parse the action space, compress it differently, and hope to bring a different (and more appropriate) set of "reflexes" to bear.

The idea of a game with "fog of war" helps support this research vision. Some actions are pointless for the game, but essential to ensuring the game is "being understood correctly" and game designers adding fog of war to a video game could be seen as an attempt to represent this possibly universally inevitable cognitive limitation in a concretely-ludic symbolic form.

If an AI is trained by programmers "to learn to play an RTS" but that AI doesn't seem to be learning lessons about meta-cognition or clock/calendar management, then it feels a little bit like the AI is not learning what we hoped it was supposed to learn from "an RTS".

This is why these points made by maximkazhenkov in a neighboring comment are central:

The agents on [the public game] ladder don't scout much and can't react accordingly. They don't tech switch midgame and some of them get utterly confused in ways a human wouldn't.

I think this is conceptually linked (through the idea of having strategic access to the compression strategy currently employed) to this thing you said:

You can have a conversation with a starcraft player while he's playing. It will be clear the player is not paying you his full attention at particularly demanding moments, however... I considered using system 1 and 2 analogies, but because of certain reservations I have with the dichotomy... [that said] there is some deep strategical thinking being done at the instinctual level. This intelligence is just as real as system 2 intelligence and should not be dismissed as being merely reflexes.

In the story about metacognition, verbal powers seem to come up over and over.

I think a lot of people who think hard about this understand that "mere reflexes" are not mere (especially when deeply linked to a reasoning engine that has theories about reflexes).

Also, I think that human meta-cognitive processes might reveal themselves to some degree in the apparent fact that a verbal summary can be generated by a human *in parallel without disrupting the "reflexes" very much*... then sometimes there is a pause in the verbalization while a player concentrates on <something>, and then the verbalization resumes (possibly with a summary of the 'strategic meaning' of the actions that just occurred).

Arguably, to close the loop and make the system more like the general intelligence of a human, part of what should be happening is that any reasoning engine bolted onto the (constrained) reflex engine should be able to be queried by ML programmers to get advice about what kinds of "practice" or "training" needs to be attempted next.

The idea is that by *constraining* the "reflex engine" (to be INadequate for directly mastering the game) we might be forced to develop a reasoning engine for understanding the reflex engine and squeezing the most performance out of it in the face of constraints on what is known and how much time there is to correlate and integrate what is known.

A decent "reflexive reasoning engine" (ie a reasoning engine focused on reflexive engines) might be able to nudge the reflex engine (every 1-30 seconds or so?) to do things that allow the reflex engine to scout brand new maps or change tech trees or do whatever else "seems meta-cognitively important".

A good reasoning engine might be able to DESIGN new maps that would stress test a specific reflex repertoire that it thinks it is currently bad at.

A *great* reasoning engine might be able to predict in the first 30 seconds of a game that it is facing a "stronger player" (with a more relevant reflex engine for this game) such that it will probably lose the game for lack of "the right pre-computed way of thinking about the game".

A really FANTASTIC reflexive reasoning engine might even be able to notice a weaker opponent and then play a "teaching game" that shows that opponent a technique (a locally coherent part of the action space that is only sometimes relevant) that the opponent doesn't understand yet, in a way that might cause the opponent's own reflexive reasoning engine to understand its own weakness and be correctly motivated to practice a way to fix that weakness.

(Tangent: To recall the tangent above to the Portia spider. It preyed on other spiders with similar spider limits. One of the fears here is that all this metacognition, when it occurs in nature, is often deployed in service to competition, either with other members of the same species or else to catch prey. Giving these powers to software entities that ALREADY have better thinking hardware than humans in many ways... well... it certainly gives ME pause. Interesting to think about... but scary to imagine being deployed in the midst of WW3.)

It sounds, Mathias, like you understand a lot of the centrality and depth of "trained reflexes" intuitively from familiarity with BOTH StarCraft and ML, and part of what I'm doing here is probably just restating large areas of agreement in a new way. Hopefully I am also pointing to other things that are relevant and unknown to some readers :-)

If what we really care about is proving that it can do long term thinking and planning in a game with a large actionspace and imperfect information, why choose starcraft? Why not select something like Frozen Synapse where the only way to win is to fundamentally understand these concepts?

Personally, I did not know that Frozen Synapse existed before I read your comment here. I suspect a lot of people didn't... and also I suspect that part of using StarCraft was simply for its PR value as a beloved RTS classic with a thriving pro scene and deep emotional engagement by many people.

I'm going to go explore Frozen Synapse now. Thank you for calling my attention to it!

The Power to Demolish Bad Arguments
"...go ahead and tell me your causal model and I'll probably cook up an obvious example to satisfy myself in the first minute of your explanation."

I think maybe we agree... verbosely... with different emphasis? :-)

At least I think we could communicate reasonably well. I feel like the danger, if any, would arise from playing example ping pong and having the serious disagreements arise from how we "cook (instantiate?)" examples into models, and "uncook (generalize?)" models into examples.

When people just say what their model "actually is", I really like it.

When people only point to instances I feel like the instances often under-determine the hypothetical underlying idea and leave me still confused as to how to generate novel instances for myself that they would assent to as predictions consistent with the idea that they "meant to mean" with the instances.

Maybe: intensive theories > extensive theories?
