[UPDATED] COVID-19 cabin secondary attack rates on Diamond Princess

Thank you, this is exactly the sort of clever analysis I was hoping people would come up with when I wrote my post.

This site has floor-plan images of Diamond Princess cabins, from which we can make a few inferences about cabin occupancy. Most of the cabin layouts contain a single bed which fits two people, so two-person cabins will almost exclusively be couples sharing a bed. If I assume the rate at which people in single-person cabins get infected (8%) is the rate of infection outside the cabin, and that the higher rate of infection in two-person cabins is caused entirely by within-cabin secondary transmission, then it looks like each person would have to infect their partner an average of 1.5 times each. This also tells us that the transmission rate between elderly couples sharing a bed is likely to be extremely high, and also that people in single-person cabins must be different in some way--perhaps they spent less time in the ship's common areas.
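To make the arithmetic behind that "1.5 times" figure explicit, here is a minimal sketch. It assumes a simple model that the comment does not spell out: each person is infected from outside the cabin with probability `b`, and a person who escaped outside infection catches it from a partner who was infected outside with probability `s`. The two-person-cabin rate used at the bottom is a placeholder chosen for illustration, not a number taken from the post.

```python
def implied_within_cabin_rate(background: float, two_person_rate: float) -> float:
    """Solve r2 = b + (1 - b) * b * s for s (assumed model, see lead-in)."""
    if not 0 < background < 1:
        raise ValueError("background rate must be strictly between 0 and 1")
    return (two_person_rate - background) / ((1 - background) * background)


b = 0.08   # single-person-cabin rate quoted in the comment
r2 = 0.19  # PLACEHOLDER two-person-cabin rate, not from the post

s = implied_within_cabin_rate(b, r2)
print(f"implied within-cabin transmission probability: {s:.2f}")
if s > 1:
    print("s > 1 is impossible for a probability, so an assumption must be wrong")
```

Any result above 1 signals that the two assumptions cannot both hold, which is the point the comment is making about single-person cabins being different in some way.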

Three- and four-person cabins seem harder to interpret. These would originally have been couples with children, but there aren't many children aboard as of Feb 5th, and they probably moved people around to free up single cabins for extra-vulnerable people and for confirmed cases that they needed to isolate.

Bucky (6y)

If I assume the rate at which people in single-person cabins get infected (8%) is the rate of infection outside the cabin, and that the higher rate of infection in two-person cabins is caused entirely by within-cabin secondary transmission, then it looks like each person would have to infect their partner an average of 1.5 times each. This also tells us that the transmission rate between elderly couples sharing a bed is likely to be extremely high, and also that people in single-person cabins must be different in some way--perhaps they spent less time in the ship's common areas.

This was my original thought too. However, as the 8% is based on only 6 positive cases it isn't a very precise figure.

As an example, across all the parameter pairs in my models, the maximum likelihood comes at a background infection rate of 0.133 and a secondary attack rate of 0.55, with no tertiary attack (I didn't mention this in the OP for fear of people taking the 0.55 to be especially relevant). In this case the probability of getting 6 or fewer infections in 1-berth cabins would be 0.11: unlikely, but not massively so.

The corresponding probabilities for 2, 3 and 4-berth cabins are 0.68, 0.14 and 0.50. Those 4 numbers seem fairly random, suggesting that there's no need to stipulate base rates which vary based on cabin size to explain the data.
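For anyone who wants to check a tail probability like the 0.11 for 1-berth cabins, a rough sketch follows. The number of people in 1-berth cabins is not stated in this thread; the value 75 is an assumption inferred from "6 cases is about 8%", so treat it as a stand-in for the real count in the post's data.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed directly from the pmf."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_single = 75        # ASSUMPTION: inferred from 6 cases being roughly 8%
background = 0.133   # maximum-likelihood background rate quoted above
observed_cases = 6   # positive cases in 1-berth cabins quoted above

tail = binom_cdf(observed_cases, n_single, background)
print(f"P(at most {observed_cases} cases) = {tail:.2f}")
```

With that assumed count the result lands near the quoted 0.11; a different true count would shift it.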

In truth I suspect that there may be differences in the base rate between cabin sizes, but I wouldn't have known in advance which size would have had a higher base rate. With only 4 data points, even using 2 variables in the model is pushing it - if I used any more I could have explained almost anything!

***

Edit: Section below is no longer endorsed

Regarding the effect of quarantine measures, only 115 of the 536 passenger infections analysed had onset after the quarantine started. Figure 1 here suggests to me that almost all of the infections occurred before quarantine and onset was delayed by incubation period.

johnswentworth (6y)

I went back-and-forth with Bucky a bit, looked at the formulas, and I now think the current graph is correct. The main surprising thing was that the likelihood isn't sharper; apparently there are actually pretty few 1-berth cabins, so we don't have a sharp estimate for the background infection rate. Most of the uncertainty in the secondary rate is tightly coupled to the uncertainty in the background rate.

johnswentworth (6y)

That graph looks fishy. Wouldn't a secondary attack rate of 1 mean that everyone in a cabin with someone sick catches it immediately? Shouldn't that be deterministically ruled out by the data, and therefore have exactly-zero likelihood?

Also, in general, seeing likelihood graphed on a linear scale makes me think something is very wrong.

Maybe a bug somewhere?

Bucky (6y)

An attack rate of 1 within a cabin would mean that everyone catches it at some point (but not necessarily immediately), provided that someone brings it in in the first place - it's a rate per sick person rather than per unit time. I don't have data on whether this is the case, although I doubt it.

Technically I suppose having 18 cases in 4-berth cabins does rule that out. My model isn't sophisticated enough to catch something like that - I look at average illness rate as an input to the binomial distribution, I never check whether the total number is likely. Adding that complexity might help narrow down the true secondary attack rate.
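A minimal sketch of that kind of model, for readers who want to play with it. This is a guess at the structure rather than the code behind the graph: it assumes each person is infected from outside with probability `background`, that each cabin-mate infected from outside passes it on with probability `secondary`, and that there is no tertiary spread. The function names and the `(berths, people, cases)` data layout are made up for illustration, and the real counts have to come from the post's table.

```python
from math import comb, log


def illness_rate(berths: int, background: float, secondary: float) -> float:
    """Average per-person illness rate in a cabin of the given size."""
    # Chance that at least one cabin-mate caught it outside AND passed it on.
    caught_inside = 1 - (1 - background * secondary) ** (berths - 1)
    return background + (1 - background) * caught_inside


def log_likelihood(cabin_data, background: float, secondary: float) -> float:
    """Binomial log-likelihood; needs 0 < background < 1 to avoid log(0).

    cabin_data: iterable of (berths, people, cases) tuples.
    """
    total = 0.0
    for berths, people, cases in cabin_data:
        p = illness_rate(berths, background, secondary)
        total += log(comb(people, cases))
        total += cases * log(p) + (people - cases) * log(1 - p)
    return total


# Per-person rates at the parameter pair quoted earlier in the thread.
for berths in (1, 2, 3, 4):
    print(berths, round(illness_rate(berths, 0.133, 0.55), 3))
```

Because only the average rate feeds the binomial, nothing here notices that `secondary = 1` would force case counts in fully occupied 4-berth cabins to be multiples of 4, which is the gap described above.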

I've added a log graph.
