{"id":642972,"date":"2020-03-13T10:22:06","date_gmt":"2020-03-13T17:22:06","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=642972"},"modified":"2020-03-13T16:59:33","modified_gmt":"2020-03-13T23:59:33","slug":"covid-19-highlights-the-wisdom-of-the-academic-crowd","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/covid-19-highlights-the-wisdom-of-the-academic-crowd\/","title":{"rendered":"COVID-19 Highlights the Wisdom of the Academic Crowd"},"content":{"rendered":"
For more than a quarter century, our online activities have been closely monitored, collected, and exploited by internet giants in exchange for useful services offered without fees. Today, it is unfathomable to have to pay for Apple\u2019s FaceTime, Google\u2019s Hangouts, Facebook\u2019s Messenger or WhatsApp, etc., the same way that we pay our wireless providers hefty fees for similar services such as phone calls and text messages. These internet businesses prosper because user data are worth a lot in our connected lives. Aside from supporting detailed user profiles for more targeted advertisements, the data hold the keys to identifying trends and projecting needs for new products and services.<\/p>\n
One example is Google Flu Trends (GFT), which made headlines around the world in February 2013. It was reported that web search queries, with their timestamps and locations, could serve as a good indicator of flu epidemics. Unfortunately, it was later found that GFT\u2019s predictions often diverged widely from Centers for Disease Control and Prevention (CDC) data, even though GFT was specifically trained on CDC reports. Nevertheless, the \u201cwisdom of the crowd\u201d hidden in search queries is certainly valuable, and a recent approach finds it beneficial to combine search data with electronic health records to improve prediction accuracy.<\/p>\n To be sure, web search data are tricky to use correctly, especially for reliably tracking epidemics. At the time of writing, COVID-19 is rampaging around the world, with Italy and China both taking drastic measures, such as locking down cities and closing schools, to slow the spread of infection. In Microsoft\u2019s backyard, the city of Kirkland, whose name Costco made famous as its store brand, has seen more than 25 deaths, with worrisome evidence of community spread, since the first confirmed case was reported on January 21st<\/sup>, three weeks after the outbreak was noticed in Wuhan, China, and one day after China\u2019s CDC declared an emergency.<\/p>\n According to Google Trends, however, search volume for coronavirus was insignificant until the day China declared the emergency, and interest subsided within 10 days. For the first three weeks of February, the query volume continued to drop. It was not until February 21st<\/sup>, 10 days after the World Health Organization (WHO) announced the official name COVID-19, that search volume started to increase again, as can be seen in the Figure below. 
To ensure the trend plot is interpreted correctly, we contrast searches for \u201ccoronavirus\u201d with searches for \u201cgoogle.\u201d The latter query likely comes from the portion of internet users who type \u201cgoogle\u201d into their web browser address bar to signal an intent to search. As such, it tracks daily search activity and shows the cyclic nature of search queries. The activity-normalized curve for \u201ccoronavirus\u201d is shown as the dotted line against the secondary axis in the Figure.<\/p>\n <\/p>\n <\/p>\n Meanwhile, the research community has had no illusions about the danger this novel coronavirus can pose to the world. Articles sounding the alarm began appearing in journals in the second week of January, one full week before China\u2019s emergency declaration. Scientific activities prior to January 20th<\/sup> include the events highlighted below:<\/p>\n