Since the world began its battle against the outbreak of Covid-19, mobile apps have promised to do it all: pinpoint infections, predict who may be at highest risk, learn how long the virus survives on surfaces, estimate the fraction of asymptomatic carriers, target medical resources, prevent people from being exposed, the list goes on. And while some mobile apps can indeed be useful as we adapt to life with this virus, there is also evidence that by skewing our understanding of this disease, certain apps are more harmful than helpful.
Kaiser Fung has been the data science lead at various companies. He is the author of Numbers Rule Your World. You can find all three installments of his comprehensive review of this study on his blog, Junk Charts.
Currently, there are no fewer than seven major Covid-19-related apps in the US, if we count only those backed by governments or reputable health care organizations. Most will, of course, attract few users and fade away, but one mobile app has already garnered attention for its surprising discoveries. The COVID Symptom Tracker was first launched in the United Kingdom by a research team at King’s College London, and it has been promoted in the US by Harvard and Stanford Medical Schools. The Symptom Tracker boasted over 1.6 million downloads in its first week of launch in late March. The response was so rapid and remarkable that the researchers needed just five days of data to fire off the first preprint of scientific findings. But if the initial analytics coming out of the COVID Symptom Tracker are a sign of what’s to come, then app developers have much heavy lifting ahead as they battle a large volume of low-quality data.
Each day, users of the COVID Symptom Tracker are asked to file a report of their health. They can also see an estimate of cases in their area. The app offers a self-diagnosis of Covid-19, which is not necessarily accurate but undoubtedly useful while testing is triaged and rationed by governments. (Facing a shortage of diagnostic testing, both the UK and the US governments have restricted tests to people with severe symptoms.) An undesirable side effect of targeted testing, however, is that it contaminates the data feeding downstream analytics, such as estimates of the population prevalence of Covid-19 and the identification of the most relevant symptoms. This harm is laid bare in the preprint of scientific findings published by the Symptom Tracker team. As this app and others like it become more popular, it’s critical that we understand what the data coming out of self-reporting symptom trackers can and cannot tell us.
At the outset, the researchers’ goal was to use the self-reported symptoms to predict test results. They began by assembling an analytical sample (also known as the training data) containing symptoms and self-reported diagnostic test results. Given test rationing, we can assume that most of the users included in the sample experienced severe symptoms.
The Tracker App study provided the first scientific evidence to support loss of smell and taste, or anosmia, as a symptom of Covid-19, which some doctors and patients had suspected since the early days of the pandemic. The researchers went further, though, declaring anosmia the single best predictor of infection—even better than the usual suspects like persistent cough. But anosmia’s predictive prowess is nothing more than a mirage of triage testing.
Notably, about one-third of the analytical sample tested positive for coronavirus while two-thirds tested negative, meaning a perfect predictive model would flag one in three of the app users as positive for the virus. The obvious symptoms, such as persistent cough, afflicted half of the analytical sample, so using cough as a single predictor would have meant flagging half of the users as likely to be infected. So, out of every 100 users, 50 are predicted positive, but since only 33 have reported testing positive, we know immediately that at least 17 of those 50 predicted positives are false positives. To have the best chance of correctly predicting all true positives, while making the fewest possible false-positive errors, the symptom must affect about a third of the sample. What proportion of users are reporting loss of smell and taste? You guessed it: one out of three. No wonder anosmia is the Nate Silver of Covid-19 diagnosis.
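The arithmetic in the preceding paragraph can be checked with a minimal sketch. It uses the article's rounded figures (100 users, 33 positive, cough in 50 users, anosmia in 33); the function names are hypothetical and are not taken from the preprint. It computes best-case bounds only, not the researchers' actual model.

```python
def false_positive_bounds(n_users, n_positive, n_flagged):
    """Smallest and largest possible number of false positives
    when a single symptom flags n_flagged of n_users as infected."""
    n_negative = n_users - n_positive
    # Best case: every truly positive user is among those flagged.
    min_fp = max(0, n_flagged - n_positive)
    # Worst case: the flagged users come from the negatives first.
    max_fp = min(n_flagged, n_negative)
    return min_fp, max_fp


def best_case_accuracy(n_users, n_positive, n_flagged):
    """Highest share of users a single yes/no symptom could classify correctly."""
    # Maximum possible overlap between flagged users and true positives.
    true_pos = min(n_flagged, n_positive)
    false_pos = n_flagged - true_pos
    false_neg = n_positive - true_pos
    return (n_users - false_pos - false_neg) / n_users


# Rounded figures from the article (per 100 users in the analytical sample).
N_USERS, N_POSITIVE = 100, 33
cough_bounds = false_positive_bounds(N_USERS, N_POSITIVE, n_flagged=50)
cough_ceiling = best_case_accuracy(N_USERS, N_POSITIVE, n_flagged=50)
anosmia_ceiling = best_case_accuracy(N_USERS, N_POSITIVE, n_flagged=33)
```

Under these assumptions, cough cannot avoid at least 17 false positives per 100 users, while a symptom reported by exactly one-third of the sample is the only kind that could, even in principle, match the test results perfectly. The ceiling depends only on how common the symptom is, not on any biological link to the virus.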
Anosmia appears to predict infection better than coughing only because it’s less common in the analytical sample. And it’s only less common because of triage testing. Unlike coughing, loss of smell and taste wasn’t yet recognized as a symptom of Covid-19 in March. So in order to get a test, you had to have a cough, but not necessarily loss of smell and taste. And since the analytical sample included only people who had been tested, it overrepresented the qualifying symptoms. That is why this finding is unlikely to apply to the larger population.
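To see how triage testing can inflate a qualifying symptom in the tested sample, here is a toy calculation. Every number in it is an illustrative assumption, not a figure from the study: 10 percent of the population has a cough, half of those with a cough get tested, and 5 percent of those without one do. The function name is hypothetical.

```python
def share_with_symptom_among_tested(symptom_rate, test_rate_with, test_rate_without):
    """Share of tested people who have the symptom, given how common the
    symptom is in the population and how likely each group is to be tested."""
    tested_with = symptom_rate * test_rate_with
    tested_without = (1 - symptom_rate) * test_rate_without
    return tested_with / (tested_with + tested_without)


# Illustrative assumptions only; none of these rates come from the preprint.
population_cough_rate = 0.10
sample_cough_rate = share_with_symptom_among_tested(
    symptom_rate=population_cough_rate,
    test_rate_with=0.50,     # assumed: coughers often qualify for a test
    test_rate_without=0.05,  # assumed: non-coughers rarely do
)

# If testing ignored the symptom, the sample would mirror the population.
unbiased_rate = share_with_symptom_among_tested(population_cough_rate, 0.20, 0.20)
```

With these made-up rates, a symptom carried by one in ten people shows up in roughly half of the tested sample. A symptom that does not influence who gets tested keeps its population frequency, which is the position anosmia was in during March.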