Tuesday, February 26, 2013

Flu surveillance news - and using data to do it


Update, March 8: The New York Times has run an article about companies using similar searches to identify unreported drug interactions.

A few weeks ago I posted a link to the Google Flu Trends website, along with links to the research Google cites stating that Google Flu Trends does a good job anticipating flu surveillance data. A couple of weeks ago, though, the journal Nature published an update titled "When Google got flu wrong." It seems that Google's algorithms suggested a sharp uptick in flu prevalence in the US in early January 2013 - which the surveillance data did not bear out. That has prompted some comments and columns in the press about the accuracy of the Google site.

How to interpret all this? I think it's inaccurate to say, as Nick Bilton did in his Times column, that Google's algorithm was looking only at the numbers, not the context of the search results. Only humans can look at context - that's why it's important to review numbers, not take them at face value. That's exactly what the author of the first column I cited did - he raised questions. But it's not as if the Nature article says that everything about Google Flu Trends or other crowd-sourced flu information is wrong. Declan Butler writes:
Google Flu Trends has continued to perform remarkably well, and researchers in many countries have confirmed that its ILI estimates are accurate. But the latest US flu season seems to have confounded its algorithms. Its estimate for the Christmas national peak of flu is almost double the CDC’s (see ‘Fever peaks’), and some of its state data show even larger discrepancies.
It is not the first time that a flu season has tripped Google up. In 2009, Flu Trends had to tweak its algorithms after its models badly underestimated ILI [influenza-like illness] in the United States at the start of the H1N1 (swine flu) pandemic — a glitch attributed to changes in people’s search behaviour as a result of the exceptional nature of the pandemic (S. Cook et al. PLoS ONE 6, e23610; 2011).
Google would not comment on this year’s difficulties. But several researchers suggest that the problems may be due to widespread media coverage of this year’s severe US flu season, including the declaration of a public-health emergency by New York state last month. The press reports may have triggered many flu-related searches by people who were not ill. Few doubt that Google Flu will bounce back after its models are refined, however. [Links in original]

One of the US crowd-sourced flu websites is Flu Near You. You join the website (you can use your Facebook login, and as far as I can tell they don't send an annoying notice to everyone you know that you've joined), and then you report weekly on the flu status of some or all of your household. That's a screenshot of its recent reports in the northeast. According to its website, Flu Near You has more than 44,000 participants in the US; Nature reports that the participants are representative in terms of age distribution. And this is only the beginning:
Already, web data mining and crowdsourced tracking systems are becoming a part of the flu-surveillance landscape. “I’m in charge of flu surveillance in the United States and I look at Google Flu Trends and Flu Near You all the time, in addition to looking at US-supported surveillance systems,” says [Lyn] Finelli. “I want to see what’s happening and if there is something that we are missing, or whether there is a signal represented somewhat differently in one of these other systems that I could learn from.”
