Update, March 8: The New York Times has run an
article about companies using similar searches to identify unreported drug interactions.
A few weeks ago I
posted a link to the Google Flu Trends website, along with links to the research Google cites stating that Google Flu Trends does a good job of anticipating flu surveillance data. A couple of weeks ago, though, the journal
Nature published an
update titled "When Google got flu wrong." It seems that Google's
algorithms suggested a sharp uptick in flu prevalence in the US in early January 2013 - which the surveillance data did not bear out. That has prompted some
comments and
columns in the press about the accuracy of the Google site.
How to interpret all this? I think it's inaccurate to say, as Nick Bilton did in his
Times column, that Google's algorithm was looking only at the numbers, not the context of the search results. Only humans can look at context - that's why it's important to review numbers, not take them at face value. That's exactly what the author of the first column I cited did - he raised questions. But it's not as if the
Nature article says that everything about Google Flu Trends or other crowd-sourced flu information is wrong. Declan Butler writes:
Google Flu Trends has continued to perform remarkably well, and
researchers in many countries have confirmed that its ILI estimates are
accurate. But the latest US flu season seems to have confounded its
algorithms. Its estimate for the Christmas national peak of flu is
almost double the CDC’s (see ‘Fever peaks’), and some of its state data show even larger discrepancies.
It is not the first time that a flu season has tripped
Google up. In 2009, Flu Trends had to tweak its algorithms after its
models badly underestimated ILI [influenza-like illness] in the United States at the start of the
H1N1 (swine flu) pandemic — a glitch attributed to changes in people’s
search behaviour as a result of the exceptional nature of the pandemic (S. Cook et al. PLoS ONE 6, e23610; 2011).
Google would not comment on this year’s difficulties. But
several researchers suggest that the problems may be due to widespread
media coverage of this year’s severe US flu season, including the
declaration of a public-health emergency by New York state last month.
The press reports may have triggered many flu-related searches by people
who were not ill. Few doubt that Google Flu will bounce back after its
models are refined, however. [Links in original]
One of the US crowd-sourced websites is
Flu Near You. You join the website (you can use your Facebook login, and as far as I can tell they don't send an annoying notice to everyone you know that you've joined), and then you report weekly on the flu status of some or all of your household. That's a screenshot of its recent reports in the northeast. According to its website, Flu Near You has more than 44,000 participants in the US;
Nature reports that the participants are representative in terms of age distribution. And this is only the beginning:
Already, web data mining and crowdsourced tracking systems are
becoming a part of the flu-surveillance landscape. “I’m in charge of flu
surveillance in the United States and I look at Google Flu Trends and
Flu Near You all the time, in addition to looking at US-supported
surveillance systems,” says [Lyn] Finelli. “I want to see what’s happening and
if there is something that we are missing, or whether there is a signal
represented somewhat differently in one of these other systems that I
could learn from.”