Virtue signaling is sometimes the best or the only metric we have

I understand "virtue signalling" as a novel term that is only loosely related to the concepts of "virtue" or "signalling".

It's a little annoying to have to mentally translate it into "that thing that people mean when they say 'virtue signalling'" (or sometimes a little amusing, when there's an interesting contrast with the literal meaning that a person's actions have signalled their virtues).

Playing with DALL·E 2

  • this photo was taken with perfect timing as the water balloon was breaking
  • note the symmetry in this ancient Roman mosaic from the National Museum of Roman Art in Merida
  • hot air balloons over the rolling hills of Tuscany
  • literal puppy love, awwwww
  • a sculpture at the Museum of Modern Art based on a child's drawing of a monster
  • traffic on the busy streets of Sydney

Down By 30

Down by 30, I probably put in backups to make sure my best players don't get injured, and play to run out the clock. Winning would require a record-breaking comeback, and even if we go all-out to win, the chances of pulling it off are tiny, maybe 1 in a million.

Though I guess if it's the playoffs then I keep playing for the win. Regular season it would be worth playing for the win if we're down by 20 instead of 30.

Generally teams do adjust their tactics in the right direction in these sorts of situations, but not by enough on average. NFL teams play faster when trailing by multiple scores, but they usually don't shift to the full-fledged hurry-up offense that they use when time is running low at the end of the half. Teams go for it more on 4th down, but there are still plenty of cowardly punts. Quarterbacks throw more interceptions due to trying to make more risky passes down the field into tight coverage - I guess I'm less sure about how to tell whether they do too little of that. We don't see many trick plays in these situations - something like a flea flicker might not be that effective when the defense is playing it safe, but I expect that using more laterals (like a hook & ladder play) would be worth the risk (though I'm not sure about the cost-benefit of putting those on tape vs. saving them for higher leverage situations).

Confidence Levels in Forecasts and Psychological Surveys

Response bias or "response style" is the keyword for phenomena like these. "Extreme responding" or "extreme response style" are terms for gravitating away from the neutral option.

I don't know if there's research on whether this is correlated with overconfidence.

Compute Trends — Comparison to OpenAI’s AI and Compute

It looks like the AlphaGo models play a huge role in the "trend" of large-scale models.

In your spreadsheet, AlphaGo Master (January 2017) was much larger than anything that came before it (22x the training compute). AlphaGo Zero doubled that later the same year (October 2017) and remained the largest model by training compute for almost 4 years, until August 2021. By late 2021 the trend had caught up and now 5 models have exceeded AlphaGo Zero, all language models, led by Megatron-Turing NLG 530B (October 2021) which has 4x the training compute.

The trend looked steeper when AlphaGo Master & Zero were near the end of the trend line, when OpenAI was analyzing it in 2018. The trend for the "large-scale era" looks shallower now that those models are in the first half of the era. To me it looks like those models are off-trend, even compared to other "large scale" models. DeepMind was much more willing to throw compute at the AlphaGo models than anyone has been with any other model.

On Bounded Distrust

The masks story fits the template of bounded distrust. You sum it up here as:

Let me tell you a story, in three acts.

  1. All masks don’t work unless you’re a health professional.
  2. All masks work.
  3. Cloth masks don’t work.

At each stage of this story, scientists got on television to tout the current line. At each stage of this story, the ‘science denier’ style labels got used and contrary views were considered ‘dangerous misinformation.’

Those were the gestalt of the era, not the words that the top experts were saying. Focusing in just on the first act:

Look at the words that Fauci or the CDC were saying in March 2020, and it wasn't "wearing a mask won't reduce your chances of getting covid at all."

The quote from the CDC which got discussed on LW was "CDC does not recommend that people who are well wear a facemask to protect themselves from respiratory diseases, including COVID-19." This isn't directly saying anything about the world (except about what the CDC is recommending); the CDC generally speaks in imperatives rather than declaratives.

Fauci said sentences like "When you’re in the middle of an outbreak, wearing a mask might make people feel a little bit better and it might even block a droplet, but it’s not providing the perfect protection that people think that it is" (from this 90 sec video - IMO watching the video gives a better sense of what Fauci is up to). The content of that sentence is that the risk reduction from masks is probably > 0% but is < 100%, but it's said in a way that can easily give an impression that rounds off to "don't work." Some interesting rhetorical moves here, like comparing to the standard of "perfect protection", and switching to very concrete language ("block a droplet") and tentative phrasing ("might even") when discussing the benefits rather than using any word like "effective" or "helps" or "prevents" which fits neatly into the concept templates that people are ready to act on.

It's the sort of situation where, if you're half paying attention to the chatter and inclined to believe what you hear, then you'll come away thinking that masks don't work. But my recollection is that, once I had reason to think that mask-wearing would reduce my chances of getting covid (late Feb?), I didn't come across any instances where a person who would know clearly said 'no, you're wrong about that, masks don't work at all.'

French long COVID study: Belief vs Infection

For comparison: imagine some medical researchers are interested in whether a particular medicine helps with a particular medical condition, so they set up a placebo controlled trial. A bunch of people with the medical condition all get their symptoms tested, then they flip a coin and half get pills with the medicine while the other half get sugar pills, and they don't know whether they have the real pills. Then, some time later, they all get their symptoms tested again.

Now, imagine that I'm interested in "placebo effects" - I want to see if the ritual of taking sugar pills which you think might be medicine improves people's health, or causes side effects, and I want to piggyback on this medical trial. I could just look at the pre vs post results for the set of people who got the sugar pills, but unfortunately this medical condition varies over time so I can't disentangle effects of the pill-taking ritual from changes over time. I wish the study had a third "no-pill" group who (knowingly) didn't get any treatment, in addition to the medical pill group and the inert pill group. Then I could just compare the results of the sugar pill group to the no pill group. But it doesn't.

So I have the clever idea of getting the researchers to add a question to the tests at the end of the study, where they ask the patients whether they think they got the medicine pill or the sugar pill. That gives me a nice 2x2 design, where patients differ both in whether they got the medicine pill or the sugar pill, and separately in whether they believe they got the medicine pill or the sugar pill. So I can look separately at each of the 4 groups to see how much their condition improved, and what side effects they got. Changes that are associated with beliefs, I can claim, are based on the psychological effects of this pill taking ritual rather than the physiological effects of the substance they ingested.

This is a terrible study design. Who's going to believe they got the real medicine? Well, people whose condition improved will tend to think they must've gotten the real medicine. And people who noticed physiological states like nausea or dry mouth will tend to think they've gotten the real medicine. This study design will say that improved condition & nausea are caused by people's beliefs about whether they got the medicine, when in reality it's the reverse: the beliefs are caused by these physical changes.

If I'm especially meddlesome, I might even tell the original researchers that they should use this 2x2 design to evaluate the original study. Instead of just comparing the outcomes for the medicine pill group and the sugar pill group, they should compare the outcomes while controlling for people's beliefs about whether they got the medicine. That would mess up their study. It would be asking how effective the medicine is, after removing any effects that allowed patients to realize that they'd gotten the medicine (as if belief-entangled effects couldn't be physiological effects of the substance).
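A minimal simulation sketch of the problem described above. Every number here is a made-up assumption for illustration (the size of the drug effect, the noise levels, the rule by which patients guess their group), and none of it comes from any real trial. The simulated world has a real drug effect and no causal effect of belief at all; belief is generated purely from how much the patient improved.

```python
import random
from statistics import mean

DRUG_EFFECT = 1.0  # hypothetical true physiological benefit of the medicine

def simulate(n=20000, seed=0):
    """Return (got_drug, believes_drug, improvement) for n simulated patients."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        got_drug = rng.random() < 0.5  # coin-flip assignment
        # Improvement = natural variation + drug effect. Belief has NO causal effect.
        improvement = rng.gauss(0, 1) + (DRUG_EFFECT if got_drug else 0.0)
        # Belief is caused by improvement: patients who feel better guess "medicine".
        believes_drug = improvement + rng.gauss(0, 1) > DRUG_EFFECT / 2
        rows.append((got_drug, believes_drug, improvement))
    return rows

def cell_mean(rows, got_drug=None, believes_drug=None):
    """Mean improvement among patients matching the filters (None = any)."""
    return mean(
        imp for d, b, imp in rows
        if (got_drug is None or d == got_drug)
        and (believes_drug is None or b == believes_drug)
    )

rows = simulate()

# Within the sugar-pill group: believers vs non-believers.
belief_gap_in_placebo = cell_mean(rows, False, True) - cell_mean(rows, False, False)
# The ordinary trial comparison: medicine group vs sugar-pill group.
raw_drug_effect = cell_mean(rows, True) - cell_mean(rows, False)
# The "meddlesome" comparison: medicine vs sugar pill among believers only.
drug_effect_among_believers = cell_mean(rows, True, True) - cell_mean(rows, False, True)

print(f"belief gap within placebo group: {belief_gap_in_placebo:.2f}")
print(f"raw drug effect:                 {raw_drug_effect:.2f}")
print(f"drug effect among believers:     {drug_effect_among_believers:.2f}")
```

Under these assumptions, believers in the sugar-pill group look much more improved than non-believers even though belief does nothing, and conditioning on belief shrinks the estimated drug effect below its true value.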

French long COVID study: Belief vs Infection

It's tricky to run studies with beliefs as a variable, because beliefs have causes, so you're setting yourself up to have confounds. I haven't looked that closely at this study, but here are some possibilities:

Severity: people who had covid but believed that they didn't had mild symptoms. So 'severe cases have more long-term symptoms than mild/asymptomatic cases' would look like 'covid+belief leads to more reported long-term symptoms than covid without belief'.

Other illnesses: people who didn't have covid but thought they did had some other illness like flu or pneumonia. If there's long flu, then the long-term symptoms could be from that.

Long-term symptoms: a person who thinks that they probably just have a cold and not covid, but then is still fatigued a month later, might conclude that actually it probably was covid. So medium-to-long-term symptoms can cause belief, rather than belief causing long-term symptoms.

Testing inaccuracy: if the test that they're using to establish the ground truth of whether a person had covid isn't that accurate, then people who they're counting as 'covid but no belief' might actually be false positives, and people who they're counting as 'no covid but yes belief' might be false negatives.

Hypochondria: people who are prone to imagining that their health is worse than it actually is might mistakenly believe that they had covid (when they didn't) and also imagine that they have long-term symptoms like fatigue or difficulty breathing. If people who did get covid have similar reported long-term symptoms, that means that the actual long-term symptoms of people who had covid are as bad as the imagined level of symptoms among people who imagined they had covid.

Denial: the reverse of hypochondria - people who say they're fine even when they have some health symptoms might say that they didn't have covid even though they did, and then downplay their long-term symptoms. 

Trolling: if data slipped into the study from any people who find it funny to give extreme answers, they would look like hypochondriacs, claiming to have covid & long-term symptoms even if they didn't have covid.

The first few of these possibilities are cases where facts about the world influence beliefs, and those facts also influence long-term symptoms. The last few of these possibilities are where the person's traits influence their beliefs (or stated beliefs), and those traits also influence their reports of what long-term symptoms they've had.

If you wanted to independently assess the effects of getting covid and believing that you had covid, ideally (for scientific rigor unconstrained by ethics or practicality) you'd randomly assign some people to get covid or not and also randomly assign some people to believe they had covid or not (e.g. by lying to them). If you couldn't have perfect random assignment + blinding, then you'd want to measure a whole bunch of other variables to account for them statistically. In reality, without anything like random assignment, who gets covid is maybe close enough to random for an observational study to work, especially if you control for some high-level variables like age. Beliefs about whether you had covid are heavily entangled with relevant stuff, in a way that makes it really hard to study them as an independent variable.

Is there good reason to think that this study overcomes these problems?

Omicron Post #11

Chance that Omicron has a 100% or bigger transmission advantage in practice versus Delta: 65% → 70%.

The new study says 161% in vaccinated people, 266% in the boosted, 17% in the unvaccinated. If you average that out, it’s higher than 100% in the populations we care about, but it’s somewhat close. Thus I’m creeping back up a bit.

Can you show your work? My quick BOTEC (making up plausible numbers for the other inputs) came out to a bit under 100%.
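To illustrate how much the answer depends on the other inputs, here is a rough sketch of one way to do the averaging. The three per-group advantages are the ones quoted above; both sets of weights are invented for illustration and are not the numbers from my BOTEC or from the study. It also assumes "vaccinated" means vaccinated but not boosted, and that the overall advantage is a simple weighted average of the group advantages.

```python
# Per-group Omicron-vs-Delta transmission advantage, in percent, as quoted above.
ADVANTAGE_PCT = {"unvaccinated": 17, "vaccinated": 161, "boosted": 266}

def weighted_advantage(weights):
    """Weighted average of the per-group advantages (weights need not sum to 1)."""
    total = sum(weights.values())
    return sum(ADVANTAGE_PCT[group] * w for group, w in weights.items()) / total

# HYPOTHETICAL weights: share of the population in each group.
population_shares = {"unvaccinated": 0.35, "vaccinated": 0.50, "boosted": 0.15}

# HYPOTHETICAL weights: share of Delta infections in each group, which tilts
# toward the unvaccinated if they are more likely to be infected with Delta.
infection_shares = {"unvaccinated": 0.55, "vaccinated": 0.38, "boosted": 0.07}

by_population = weighted_advantage(population_shares)
by_infections = weighted_advantage(infection_shares)

print(f"weighted by population share: {by_population:.1f}%")
print(f"weighted by infection share:  {by_infections:.1f}%")
```

With these made-up weights the first average lands above 100% and the second below it, so the conclusion turns on which weighting is appropriate and what the actual shares are.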
