Yesterday I read Zvi's 11/5 Covid update which had:

Researchers claim to have a machine learning system that can diagnose Covid-19 over the phone by analyzing recordings of a forced cough (pdf). They claim sensitivity of 98.5% and specificity of 94.2%, and for asymptomatic cases a sensitivity of 100% (?!?) and specificity of 83.2%. I'm curious to what extent errors in the test are correlated from day to day. This can all be offered at unlimited scale for essentially zero additional cost. So, of course, it will presumably be illegal indefinitely because it's not as accurate as a PCR test and/or hasn't gone through the proper approval process, and no one will ever use it. Then again, if one were to somehow download or recreate such a program and run it, who would know?

You can see how they collected data, and contribute your own cough, on their collection site. It doesn't seem to me like they actually have great data on which recordings correspond to people who have it? For example, I submitted mine this morning and said that I don't think I have Covid. If, however, I later learned that I did have Covid at the time I submitted the sample, there doesn't seem to be any way for me to tell them.

You could get better data, however, by collecting alongside regular Covid tests. Have everyone record a sample of a forced cough when you test them, label the recording with the result once it comes back, and you end up with high-quality labeled samples. They trained their AI on 5,320 samples, but at current testing rates we could get 80k samples in a single day in just Massachusetts.
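The labeling step above is just a join between recordings and test results by a shared identifier. A minimal sketch, with invented names (`label_samples`, the tuple/dict shapes) standing in for whatever a real testing site would use; a real system would also need consent, privacy, and audio-storage handling:

```python
def label_samples(samples, pcr_results):
    """Attach PCR outcomes to cough recordings collected at test sites.

    samples: list of (test_id, audio_path) pairs recorded at swab time.
    pcr_results: dict mapping test_id -> "positive" / "negative".
    Returns labeled examples, skipping samples whose result is still pending.
    """
    labeled = []
    for test_id, audio_path in samples:
        result = pcr_results.get(test_id)
        if result is not None:  # result may not be back yet
            labeled.append({"audio": audio_path, "label": result})
    return labeled
```

Because the cough is recorded before the swab result is known, the label arrives later but can't bias how the person coughs, which matters for the selection-effect worry raised in the comments below.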

It might turn out that even with higher quality data you still end up with a test that is less accurate than the standard of care, and so are unable to convince the FDA to allow it. (This is an unreasonable threshold, since even a less accurate test can be very useful as a screening tool, but my understanding is the FDA is very set on this point.) Is there another way we could scale out auditory diagnosis?

Very roughly, their system is one where you take lots of samples of coughs, labeled with whether you think they were produced by someone with coronavirus, and train a neural network to predict the label from the sample. What if instead of artificial neural networks, we used brains?
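The supervised setup described here can be illustrated with a toy example. The actual paper uses MFCC audio features and deep ResNets; the stdlib-only logistic regression below, on made-up one-dimensional features, is just a stand-in for "learn to predict the label from the sample":

```python
import math

def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by stochastic gradient descent.

    X: list of feature vectors (stand-ins for features of a cough recording).
    y: list of 0/1 labels (non-Covid / Covid).
    Returns weights and bias.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
            g = p - yi                      # gradient of the log loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Binary prediction: 1 if the decision function is positive."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if z > 0 else 0
```

The whole pipeline, neural network or not, is just this: featurize the audio, fit a classifier to labeled examples, and apply it to new coughs.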

People are really smart, and I suspect that if you spent some time with a program that played you a sample and you guessed which one it was, and then were told whether you were right, you could learn to be quite good at telling them apart. You could speed up the process by starting with prototypical standard and Covid coughs, as identified by the AI, and then showing progressively borderline ones as people get better at it. In fact, I suspect many medical professionals who have worked on a Covid ward already have a good sense of what the cough sounds like.
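The curriculum idea above (prototypical examples first, progressively borderline ones as trainees improve) can be sketched as a simple sample-selection rule. The `confidence` score is an assumed output of the existing AI, and all names here are hypothetical:

```python
def next_sample(pool, recent_accuracy):
    """Pick the next training sample for a human trainee.

    pool: list of (sample_id, confidence) pairs, where confidence is the
        AI's certainty about the sample's label (1.0 = prototypical,
        0.5 = borderline).
    recent_accuracy: trainee's accuracy on recent guesses, in [0, 1].
    Returns the id of the sample whose difficulty best matches skill.
    """
    # Target confidence slides from 1.0 (easy) toward 0.5 (borderline)
    # as the trainee's accuracy improves.
    target = 1.0 - 0.5 * recent_accuracy
    return min(pool, key=lambda s: abs(s[1] - target))[0]
```

A beginner (accuracy near 0) gets the clearest examples; someone guessing nearly everything right gets the cases the model itself finds ambiguous.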

I don't know the regulations around what needs a license, but I think there's a good chance that hosting a tool like this does not require one, or that it requires one that is relatively practical to get? If so, we could train medical professionals (or even the general public?) to identify these coughs.

Automated screening would be much better, since the cost is so much lower and it could be rolled out extremely widely. But teaching humans to discriminate would be substantially cheaper than what we have today, and with much weaker supply restrictions.

(I looked to see whether the researchers made their samples available, but they don't seem to. Listening to some would've been a great way to check how practical this seems.)

8 comments

It's the new Captcha!  "please identify which of these sounds is a COVID-19 cough".  More seriously, I think your fundamental premise is incorrect.

People are really smart

Umm, no.  People have very wide domains of talent and learning, and are insanely flexible in how they can generalize (sometimes even usefully) and apply models across domains.  They're strictly worse than computers in precision and cost at narrow, well-defined classification tasks.

If this replicates, I expect diagnose-by-phone to become a thing.  But it won't be humans doing most of it (it may include human experts for annotation and data prep).

It's unlikely that a human would be better than an AI at diagnosing Covid, as we already know that AI outperforms doctors on other diagnoses, breast cancer for example.

It sounds like they identify coughs of people who know/believe they have COVID. I suspect that might affect how those people perform their forced cough as opposed to people who don't think they have COVID.

Reposting my comment under Zvi's post: 

Due to the online collection method I suspect that most of the positive samples were already quite advanced in their disease progression.  Since Covid-19 deposits in the lungs mainly in the latter part of the disease it is easier to identify them at that point, but also not that useful anymore because most of the transmission happens during the earlier part of the infection (both for symptomatic and asymptomatic people). 

These researchers had a much better sampling procedure: cough samples were mostly acquired at testing sites, where participants did not yet know whether they had Covid (much less risk of subconscious bias) and were presumably at an earlier stage of their disease. They also had much worse results, which I suspect are more realistic for a real-world setting.

What actually needs to be done is a longitudinal analysis: you record your baseline cough when you are healthy. Then if you want to check whether you are infected, you cough again and compare that "potentially sick" cough against your baseline "non-Covid cough". The potential of this approach is much higher, since baseline characteristics of the cough can be accounted for (smoker, asthmatic, crappy mic in phone).
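The longitudinal comparison described above reduces, in the simplest case, to measuring how far a new cough's features have drifted from the same person's healthy baseline. A minimal sketch, assuming features have already been extracted from the audio; the feature vectors and threshold are placeholders, not anything from the paper:

```python
import math

def drift_from_baseline(baseline, current):
    """Euclidean distance between a baseline and a current feature vector."""
    return math.sqrt(sum((b - c) ** 2 for b, c in zip(baseline, current)))

def flag_for_followup(baseline, current, threshold=1.0):
    """True when the cough has drifted far enough from baseline to re-test."""
    return drift_from_baseline(baseline, current) > threshold
```

Because each person is compared only against themselves, stable traits (smoking, asthma, a bad phone microphone) cancel out instead of confusing a population-level model.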

I have been thinking that it should be possible to gather training data for this quickly by identifying a subset of people who are somewhat likely to get sick in the near future, e.g. people attending big parties, and acquiring coughs from them before and after infection. If somebody has ideas for how to collect such data quickly, feel free to share.

I don't know the regulations around what needs a license, but I think there's a good chance that hosting a tool like this does not require one, or that it requires one that is relatively practical to get?

In Germany you need approval when you either diagnose or cure illnesses. I would expect that it works similarly in the US. COVID-19 clearly is an illness and as such diagnosing it will require approval.

I think you would likely need to have the tool hosted by an organization in a different jurisdiction than the EU or the US, or go to the FDA and pray.

How exactly do the regulations work here? What happens if someone releases an app that converts sound data into a binary result, and while it doesn't claim to be a Covid test, it just-so-happens that when a cough sound is input the True/False output correlates very highly with Covid infection?

From the paper, they are actually planning to try this within a Fortune 100 company, so this at least must be allowable:

To that end, we have reached an agreement with a Fortune 100 company to demonstrate the value of our tool as part of their COVID-19 management practices. As we have shown there are cultural and age differences in coughs, future work could focus on tailoring the model to different age groups and regions of the world using the metadata captured, something we would like to test at the company site. 

The regulation problem may be primarily in the handling/collection of training data.  Sound files are considered (at least by the big voice-product providers) to be personally-identifiable, and medical information about whether someone has (or probably-has, or claims-to-have) COVID-19 is very restricted in lots of jurisdictions.  

This research data and the production system should perhaps be based outside of the US or Europe.