COVID-19's Household Secondary Attack Rate Is Unknown

by jimrandomh, 16th Mar 2020



For group houses, one of the most important factors when deciding how to relate to COVID-19 is the question: If one of my housemates gets infected, how likely am I to also get infected? This is known as the household secondary attack rate, and it determines how much you need to worry about your housemates' level of precaution (as compared to your own), and how much within-house social distancing is necessary.

Household secondary attack rate is context dependent; it could mean one of several things, which I will illustrate with two scenarios:

  • Scenario 1: A member of your household is returning from a trip to Wuhan, which you have heard is high risk. You react… however you react to that. You might find a way to be out of the house for a while, or prepare an isolated room they can lock themself in. If they're your spouse, you might decide that isolation isn't worth it. Some time not long after they become symptomatic, they check into a hospital, where they remain until after their infectious period is over.
  • Scenario 2: You live in a group house. You all have separate rooms, but share a kitchen and living room. One of your housemates is infected during a trip to the grocery store. Some time later they become symptomatic, and you react… however you react to that. You might or might not have somewhere to move to, or somewhere to move them to, but they can't check into a hospital because they're all full. You might or might not have been exposed to presymptomatic transmission. If neither of you has the ability to move, you'll be in the same house until after they've recovered.

Right now there are two studies which purport to measure the household secondary attack rate:

I'll refer to these as the CDC report and the Shenzhen study, respectively. These are the only studies I have been able to find which make any quantitative claim about COVID-19's secondary attack rate, and all other claims I've found have traced back to one of these two. The CDC report finds a household secondary attack rate of 10%; the Shenzhen study finds a household secondary attack rate of 15%.

When I started writing this post, I thought I'd be focusing on the difference between scenarios 1 and 2. Unfortunately, as I dug into the studies in detail, I found evidence of severe problems which make me think that these two studies provide almost no evidence whatsoever about COVID-19's household secondary attack rate, even in scenario 1.

CDC Report

On March 3, CDC published a report on the results of a contact-tracing program started on January 20. The report gives statistics on contacts of the first 10 patients with travel-related confirmed COVID-19 reported in the US; presumably, all of these travellers came from Hubei on or after January 20. They traced 445 contacts in total, of whom 54 developed concerning symptoms, became "persons under investigation" (PUIs), and were tested. It doesn't sound like anyone besides those 54 was tested.

The 445 contacts break down as follows:

  • 222 were health care personnel
  • 100 were "community members who were exposed to a patient in a health care setting"
  • 104 were "community members who spent at least 10 minutes within 6 feet of a patient with confirmed disease"
  • 19 were members of a patient's household

Out of the 54 people who were tested, two were positive; of those two, both were members of a patient's household. The CDC report does not provide any further information about those two positive cases, but they can be pretty easily matched to public news coverage. The first case was in Illinois and is described in detail in this Lancet paper; the second was in San Benito and is described in this announcement from a local public health agency. Both transmissions were to the spouses of travellers who returned from Wuhan.

The Lancet paper describes the first instance of person-to-person spread in detail, with a complete timeline of travel, symptoms, and tests. A woman who returned from Wuhan to Illinois on January 13 tested positive on January 20; her husband tested positive on January 24. In the Lancet paper, a few things are striking.

The first striking thing is that the husband was not tested until he developed a fever, at which point his wife had been hospitalized with a positive test for 4 days. So, testing was very much not proactive.

The second striking thing is that they ran many different tests in parallel, and appear to have been grappling with false negatives.

The third striking thing is that the Lancet paper involved monitoring 372 contacts, of whom 44 became PUIs and were tested. Of these 44, one was her husband and was positive, and he was the only household contact. The CDC report had 445 contacts and 54 people tested. So after subtracting out the Lancet study and the San Benito case, we're left with 17 household members, 56 miscellaneous contacts, and... only 9 people tested. There is no information on how those 9 tests were allocated, except that they were negative.
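The subtraction above can be redone as a quick check. This is a minimal sketch that assumes the Lancet paper's 372 contacts are a subset of the CDC report's 445, and that the San Benito spouse counts as one household contact and one test within the CDC totals; the variable names are mine, not from either source.

```python
# Figures quoted from the CDC report and the Lancet paper.
cdc_contacts, cdc_household, cdc_tested = 445, 19, 54
lancet_contacts, lancet_household, lancet_tested = 372, 1, 44

# Assumption: the San Benito spouse is one contact, one household member,
# and one test inside the CDC totals, and is not part of the Lancet cohort.
san_benito = 1

remaining_contacts = cdc_contacts - lancet_contacts - san_benito
remaining_household = cdc_household - lancet_household - san_benito
remaining_other = remaining_contacts - remaining_household
remaining_tested = cdc_tested - lancet_tested - san_benito

print(remaining_household, remaining_other, remaining_tested)
```

Under these assumptions the household and test counts match the 17 and 9 above, while the miscellaneous count comes out to 55 rather than 56. The one-contact difference comes down to whether the San Benito spouse is also subtracted from the overall contact total, and it does not change the conclusion.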

Of the 19 household members in the CDC report, five stayed in the house with an infected person after they were diagnosed.

So to summarize: Two family members of index cases were tested and were positive. Nine more tests were allocated between 17 household members and 56 miscellaneous other contacts; none of those nine tests were positive. From this, the CDC report concludes that the household secondary attack rate is 2/19 (~10%).
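To make concrete how little 2/19 pins down, here is a rough sketch of two separate checks. The first bounds the household attack rate by asking what the untested household members could have shown; the second takes 2/19 at face value and computes a 95% Wilson score interval for it. Both function names are hypothetical, the choice of the Wilson interval is mine rather than anything the CDC report uses, and the bounds ignore false negatives among those who were tested.

```python
import math

def attack_rate_bounds(household, confirmed, tested_negative):
    """Lowest and highest attack rate consistent with the untested members."""
    untested = household - confirmed - tested_negative
    low = confirmed / household                # every untested member uninfected
    high = (confirmed + untested) / household  # every untested member infected
    return low, high

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Case A: all 9 remaining tests went to household members.
bounds_all_tests = attack_rate_bounds(household=19, confirmed=2, tested_negative=9)
# Case B: none of the 9 remaining tests went to household members.
bounds_no_tests = attack_rate_bounds(household=19, confirmed=2, tested_negative=0)
# Sampling uncertainty alone, ignoring the testing problem entirely.
ci_low, ci_high = wilson_interval(2, 19)
```

Depending on how the 9 tests were allocated, the data are consistent with anything from 2/19 up to 10/19 (case A) or 19/19 (case B). Even ignoring that, the sampling interval around 2/19 runs from roughly 3% to 31%.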

I would say that this is laughable, but unfortunately it isn't funny. The practical upshot of all this is that the CDC report provides almost no information whatsoever about the household secondary attack rate.

Shenzhen Study

Shenzhen is a Chinese city in Guangdong province. The Shenzhen study looks at 391 cases and 1286 close contacts between Jan 14 and Feb 12, and estimates a household secondary attack rate of 15%.

298 (76%) of the index cases were travelers. Sick people were isolated an average of 2.57 days after symptom onset (if they were being monitored for symptoms because they had been labelled as at-risk by contact tracing) or 4.64 days after symptom onset (if they weren't). The study estimates R during the observation period to have been 0.4, implying successful containment.
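As a rough illustration of why an R below 1 implies containment: if each case infects R others on average, the expected size of each successive generation shrinks geometrically, and the expected total chain started by one case is 1/(1-R). The sketch below assumes a simple branching model with a constant R, which is my simplification and not the study's method; the function names are hypothetical.

```python
def expected_generation_sizes(r, generations):
    """Expected number of cases in each generation of a chain started by one case."""
    return [r ** g for g in range(generations)]

def expected_chain_size(r):
    """Expected total cases per introduced case; finite only when r < 1."""
    if r >= 1:
        raise ValueError("expected chain size is unbounded when r >= 1")
    return 1 / (1 - r)

r = 0.4  # the Shenzhen study's estimate of R during the observation period
sizes = expected_generation_sizes(r, 5)
total = expected_chain_size(r)
```

With R = 0.4, each introduced case leads to about 1.7 cases in total, including itself, so chains die out quickly.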

I have a few concerns with this study.

My first concern is that the household secondary attack rate is an important factor in people's decisions about whether to stay put when a household member is sick, which might create political pressure to find a low number. If people tried to move out when their housemates got sick, they wouldn't lower their own risk much, but they would spread it wherever they moved to.

My second concern is that 9 days before the Shenzhen study was published as a preprint, the Report of the WHO-China Joint Mission on COVID-19 stated:

"Household transmission studies are currently underway, but preliminary studies ongoing in Guangdong estimate the secondary attack rate in households ranges from 3-10%."

I believe the Shenzhen study is the preliminary study referred to (the geographic location matches, and I can find no other studies in that geographic region which attempt to measure the rate). This seems like evidence of political pressure to report a low attack rate. (10% was the household secondary attack rate for SARS, and was used in some preliminary modeling of COVID-19 transmission dynamics before data was available.)

My third concern is that the paper contains three different household secondary attack rates: 15% in the Findings section, 14.9% in the Transmission Characteristics section, and 12.9% in Table 3. I cannot reconcile these numbers, and my attempts to cross-check numbers between different sections and tables within the paper all ended in mismatches and muddle.

My fourth concern is that in Table 3, adding up the numbers within the category labels implies a substantial amount of data is missing, in ways that make no sense. 19% of contacts are missing a gender, 17% are missing an age, 10% are missing the annotation of whether they're a household member or not, and 14% are missing the annotation for whether they interacted with the index case rarely, moderately often, or often. I am having a hard time imagining what sort of data collection process could do this, without being such a mess that serious errors are likely.

My fifth concern is that during the period studied, China was having significant issues with false negatives. Feb 12, the last day covered in the Shenzhen study, is the day before China changed its diagnostic criteria and reported a 34% one-day increase in cases. The study itself states that its definition of a confirmed case changed on Feb 7, to require symptoms, "but sensitivity analyses show that truncating the data at this point does not qualitatively impact results". The paper reports results for many variables, and does not state which variables had sensitivity analysis performed.

These issues add up to extremely low confidence in the paper. I might change my mind if the authors release data that someone else can analyze, or someone manages to make sense of the seeming inconsistencies within it. Either of these things would surprise me.


The unfortunate practical upshot is that there's no good quantitative estimate of the household secondary attack rate (or attack rates in general). My belief, based on priors and on the observed large values for R0, is that it's probably quite high, and I will be acting accordingly; but even a small amount of non-terrible evidence could shift this belief greatly.