Continuous Artificial Prediction Markets as a Syndromic Surveillance Technique

Abstract

The main goal of syndromic surveillance systems is early detection of an outbreak in a society using available data sources. In this paper, we discuss the challenges of syndromic surveillance systems and how the continuous Artificial Prediction Market (c-APM) [Jahedpari et al., 2017] can effectively be applied to the problem of syndromic surveillance. We use two well-known models, (i) Google Flu Trends and (ii) the latest improvement of the Google Flu Trends model, named GP [Lampos et al., 2015], as our case study, and we show how c-APM can improve upon their performance. Our results demonstrate that c-APM typically has a lower MAE than Google Flu Trends in each year. Though this difference is relatively small in some years, such as 2004 and 2007, it is relatively large in most years and very large between 2011 and 2013.


Continuous Artificial Prediction Markets as a Syndromic Surveillance Technique

Fatemeh Jahedpari

Appearance of highly virulent viruses warrants early detection of outbreaks to protect community health. The main goal of public health surveillance, and more specifically of ‘syndromic surveillance systems’, is early detection of an outbreak in a society using available data sources.

In this paper, we discuss the challenges of syndromic surveillance systems and how the continuous Artificial Prediction Market (c-APM) [Jahedpari et al., 2017] can effectively be applied to the problem of syndromic surveillance. c-APM does so by analysing each data source with a selection of algorithms and integrating their results according to an adaptive weighting scheme. Section 2 provides an introduction and explains syndromic surveillance. Then, we discuss the syndromic surveillance data sources in Section 3 and present some syndromic surveillance systems in Section 4. The statement of the problem in this field is covered in Section 5. After that, we discuss Google Flu Trends (GFT) and the GP model [Lampos et al., 2015], which was proposed by the Google Flu Trends team to improve the performance of the GFT engine, in Sections 6 and 7 respectively. In these sections, we also evaluate the performance of c-APM as a syndromic surveillance system. Finally, Section 8 provides the conclusion of this paper.

According to the World Health Organisation (WHO) [World Health Organization, 2013], the United Nations directing and coordinating health authority, public health surveillance is:

The continuous, systematic collection, analysis and interpretation of health-related data needed for the planning, implementation, and evaluation of public health practice.

Public health surveillance practice has evolved over time. Although it was limited to pen and paper at the beginning of the 20th century, it is now facilitated by huge advances in informatics. Information technology enhancements have changed the traditional approaches to capturing, storing, sharing and analysing data and have resulted in efficient and reliable health surveillance techniques [Lombardo and Buckeridge, 2007]. The main objective and challenge of a health surveillance system is the earliest possible detection of a disease outbreak within a society for the purpose of protecting community health.

In the past, before the widespread deployment of computers, health surveillance was based on reports received from medical care centres and laboratories. Although such reports are very specific, they decrease the timeliness and sensitivity of a surveillance system [Lombardo and Buckeridge, 2007], whereas preventing mortality among infected people requires, for some diseases, rapid identification and treatment. Clearly, the earlier a health threat within a population is detected, the lower the morbidity and the higher the number of lives saved. Consequently, syndromic surveillance systems have been created to monitor indirect signals of disease activity, such as call volume to telephone triage advice lines and over-the-counter drug sales, to provide faster detection [Ginsberg et al., 2008].

Syndromic surveillance is an alternative to the traditional health surveillance system, which mainly depends on confirmed diagnoses, and aims to detect an outbreak as early as possible.
Syndromic surveillance refers to techniques relying on population health indicators which are apparent before confirmatory diagnostic tests become available [Mandl et al., 2004]. Syndromic surveillance systems mostly concentrate on infectious diseases such as severe acute respiratory syndrome (SARS), anthrax and influenza. In order to decide whether an outbreak is evolving, syndromic surveillance systems monitor the quantity of patients with similar syndromes from the time indicators of a disease first appear.

Specificity: the proportion of people without the disease that a test finds negative.
Sensitivity: the proportion of people with the disease that a test finds positive.

Syndromic surveillance aims to exploit information which is not primarily generated for the purpose of public health, but can be an indicator of an abnormal health event. Syndromic surveillance data sources include, but are not limited to, coding of diagnoses at admission to or discharge from emergency departments, confirmatory diagnostic cases, medical encounter pre-diagnostic data, absentee rates at schools and workplaces, over-the-counter pharmacy sales and posts on social media. Each of these data sources can generate a signal during disease development. Figure 1 shows the timeline of different data sources to detect an outbreak. The following section describes some of the syndromic surveillance data sources in more detail.

Figure 1: Conceptual timeline of pre-diagnosis data types and sources for syndromic surveillance [Chen et al., 2010].

Syndromic surveillance data sources should supply timely and pre-diagnosis health indicators. Most of this data is originally collected for other purposes and now serves a dual purpose [Chen et al., 2010]. Syndromic surveillance data sources include:

1. Chief complaint record: These records include signs and symptoms of patient illness from emergency departments (ED) and ambulatory visits to hospitals. These records normally become available on the same day as the patient is seen.
2. Over the counter (OTC) sales: Since some people may consider visiting a pharmacy rather than a physician in the early stage of their sickness, these data might be more timely. They include detailed information and are available in near real time in electronic format. However, they might be affected by factors such as sales promotions, stockpiling of medicines during a season, and product placement changes in pharmacies.
3. School or work absenteeism: Although absenteeism data seem to have good timeliness, their lack of medical detail complicates interpretation [Van den Wijngaard et al., 2008].
4. Hospital admission records: These data are not sufficiently timely, as it might take several days from a patient’s first visit until his/her hospitalisation.
5. Pre-diagnostic clinical data: These are indications of an illness before it is confirmed via laboratory tests and include comments of health care practitioners, patient encounter information, triage nurse calls, 911 calls and ambulance dispatch calls. They are relatively timely.
6. International Classification of Disease, 9th edition (ICD-9) and International Classification of Disease, 9th edition, Clinical Modification (ICD-9-CM): These are widely used in many syndromic surveillance systems due to their electronic format. They are usually generated for billing and insurance reimbursement purposes.
7. Laboratory test orders and results: Although laboratory test results are very reliable, they lack timeliness as they usually take a week to be completed.
8. Emergency Department (ED) diagnostic data: These are regularly available in electronic format but take several days to be prepared.
9. Internet and open source information: These contain a huge source of health information and can be obtained via discussion forums, social media, government websites, news outlets, blogs, discussion sites, individual search queries, web crawling, use of click stream data, mass media and news reports. For example, some approaches have applied data mining techniques to:
   • Search engine logs [Eysenbach, 2006], [Polgreen et al., 2008], [Eysenbach, 2009], [Ginsberg et al., 2009], [Lampos and Cristianini, 2010] and [Lampos et al., 2015]
   • Twitter [Culotta, 2010], [Achrekar et al., 2011], [Signorini et al., 2011], [Culotta, 2013] and [Paul et al., 2014]
   • News articles [Reilly et al., 1968], [Grishman et al., 2002], [Mawudeku and Blench, 2006], [Brownstein et al., 2008], [Collier et al., 2008] and [Linge et al., 2009]
   • Web browsing patterns [Johnson et al., 2004] and blogs [Corley et al., 2010]

Figure 2 graphs the popularity of various data sources in existing syndromic surveillance systems in the USA. As can be seen from the figure, while emergency department visit reports are widely used in such systems, work absenteeism is the least popular source.

In recent years, a number of syndromic surveillance approaches have been proposed. Roughly 100 syndromic surveillance systems had been deployed in the USA by 2003 [Buehler et al., 2003]. Although they share similar goals, they differ in their system architecture, information processing, analysis algorithms and disease focus, and they cover different geographic locations. Chen et al. [2010] summarise the main international and USA local, state and national syndromic surveillance systems. In Europe, an inventory of syndromic surveillance systems is delivered through a new Public Health Action Programme called Triple-S (Syndromic Surveillance Survey, Assessment towards Guidelines for Europe).

The following two sections survey some of the major existing syndromic surveillance systems around the globe. Based on the utilised data sources, we divide the existing syndromic surveillance systems into two categories: i) traditional syndromic surveillance systems, described in Section 4.1, and ii) modern syndromic surveillance systems, described in Section 4.2.

We refer to syndromic surveillance systems that do not utilise social media and internet based data as traditional syndromic surveillance. Some of them are listed below:

1. Electronic Surveillance System for the Early Notification of Community-based Epidemics (ESSENCE) [Lewis et al., 2002] is a syndromic surveillance system in the Washington D.C. area, undertaken by the Department of Defense with the primary goal of early detection of disease outbreaks due to bioterrorism attacks.
2. Real time Outbreak and Disease Surveillance (RODS) [Tsui et al., 2003] is a public health surveillance system, in operation in western Pennsylvania since 1999, developed at the RODS laboratory of the Center for Biomedical Informatics at the University of Pittsburgh.
3. Composite Occupational Health and Operational Risk Tracking (COHORT) [Reichard et al., 2004] delivers real-time surveillance of the medical care of specified groups of military employees worldwide.
4. Syndromic Surveillance Information Collection (SSIC) has been developed jointly by the Clinical Information Research Group at the University of Washington and Public Health-Seattle and King County [Lober et al., 2003].
5. Infectious Disease Surveillance Information System (ISIS) [Widdowson et al., 2003] is an automated outbreak detection system for all types of pathogens in the Netherlands.
6. Early Aberration Reporting System (EARS) was developed by the Centers for Disease Control and Prevention (CDC) [Hutwagner et al., 2003] and enables national, state and local health departments to analyse public health surveillance data using a collection of anomaly detection methods.
7. Japan's National Institute of Infectious Diseases (NIID) [Ohkusa et al., 2005] has developed a syndromic surveillance system to analyse over the counter sales data, outpatient visits, and ambulance transfer data in Tokyo.

We now provide a detailed description of two of the popular traditional syndromic surveillance systems, namely BioSense and PHE ReSST.

BioSense

BioSense is a syndromic surveillance system in the United States which is part of the CDC’s Public Health Information Network framework. By monitoring the size, location and rate of spread of an outbreak, it detects outbreaks at the local, state and national levels. It monitors seasonal trends for influenza and other disease indicators. BioSense concentrates on syndrome categories including fever, respiratory, gastrointestinal illness (GI), hemorrhagic illness, localised cutaneous lesion, lymphadenitis, neurologic, rash, severe illness and death, specific infection, and botulism.

BioSense collects and shares information on emergency department visits, hospitalisations, clinical laboratory test orders, over-the-counter (OTC) drug sales and other health related data from multiple sources, including the Department of Veterans Affairs (VA), the Department of Defense (DoD), and civilian hospitals from around the USA. BioSense uses multiple analysis methods such as CUSUM [Page, 1954], EWMA [Roberts, 1959] and SMART [Kleinman et al., 2004].

PHE ReSST

The Public Health England (PHE) Real-time Syndromic Surveillance Team (ReSST) generates regular syndromic surveillance reports by collaborating with numerous national syndromic surveillance systems, including the NHS Direct syndromic surveillance system. The NHS Direct syndromic surveillance system monitors the nurse-led telephone helpline data collected electronically by NHS Direct sites and generates alarms when call numbers are considerably higher than in preceding years, after considering holiday and seasonal effects. It has the potential to detect large scale events, but is less likely to detect smaller, localised outbreaks [Doroshenko et al., 2005]. In addition, ReSST obtains data from the GP In-Hours and GP Out-of-Hours syndromic surveillance systems, which monitor daily consultations for a range of clinical syndromic indicators and community-based morbidity, recorded by GP practices inside and outside of routine surgery opening times, respectively.
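To illustrate the control-chart style of detection used by the traditional systems above, the following is a minimal sketch of an EWMA [Roberts, 1959] alarm rule applied to a series of counts such as daily helpline calls. It is an illustration only, not the implementation used by BioSense or ReSST: the function name `ewma_alarms`, the parameters `lam`, `k` and `baseline_len`, and the example counts are all assumptions made for this sketch, and holiday and seasonal adjustments are omitted.

```python
from statistics import mean, stdev


def ewma_alarms(counts, baseline_len=8, lam=0.3, k=3.0):
    """Return the indices at which the EWMA statistic exceeds its upper limit.

    The first `baseline_len` observations are assumed to be outbreak-free and
    are used to estimate the in-control mean and standard deviation.
    """
    baseline = counts[:baseline_len]
    mu = mean(baseline)
    sigma = stdev(baseline)
    # Asymptotic standard deviation of an EWMA of independent observations.
    sigma_ewma = sigma * (lam / (2.0 - lam)) ** 0.5
    upper_limit = mu + k * sigma_ewma

    z = mu  # the EWMA statistic starts at the in-control mean
    alarms = []
    for t in range(baseline_len, len(counts)):
        z = lam * counts[t] + (1.0 - lam) * z
        if z > upper_limit:
            alarms.append(t)
    return alarms


# Hypothetical daily call counts: stable at first, then a sharp rise.
calls = [100, 104, 98, 101, 97, 103, 99, 102, 100, 101, 140, 150, 160]
alarm_days = ewma_alarms(calls)
```

With these hypothetical counts, the alarm is raised from the first day of the rise onwards. A smaller `lam` smooths more and reacts later, which is one instance of the timeliness versus false-alarm trade-off discussed in this paper.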

There are other real-time disease event detection systems which employ different approaches from the systems discussed in Section 4.1. They monitor online media from global sources, instead of monitoring disease cases reported by health related organisations such as hospitals and clinics. These “systems are built on top of open sources, exemplifying an idea of open development for public health informatics applications” [Chen et al., 2010]. Though the modern systems are faster than traditional syndromic surveillance systems in detecting an anomaly in public health [Signorini et al., 2011, Ginsberg et al., 2008], they are vulnerable to a high rate of false positives in case of an unusual event within a population [Ginsberg et al., 2008]. This section describes some of the well known modern syndromic surveillance systems.

Google Flu Trends

Google Flu Trends, established by Google, is a Web-based tool for near real-time detection of regional outbreaks of influenza [Ginsberg et al., 2008]. It monitors and analyses health-care seeking behaviour in the form of queries to its online search engine. According to Carneiro and Mylonakis [2009], “all the people searching for influenza-related topics are not ill, but trends emerge when all influenza-related searches are added together”. Consequently, there is a close relationship between the number of people searching for influenza-related topics and the number who have influenza symptoms. Section 6 provides more information about Google Flu Trends.

Argus

The Argus system is a web-based global biosurveillance system designed to report and track the development of biological events threatening human, plant and animal health globally, excluding the USA [on Homeland Security. Subcommittee on Emerging Threats and Cybersecurity, 2009]. It is developed at Georgetown University and funded by the United States Government.

It automatically collects local and native language internet media reports, including blogs and official sources such as the World Health Organisation (WHO) and the World Organisation for Animal Health (OIE), and infers their importance according to keywords appropriate to infectious disease surveillance [Nelson et al., 2010]. It relies on a human team of multilingual data analysts to assess the relations between the online media and the presence of adverse health events [Chen et al., 2010]. In particular, the data analysts monitor several thousand Internet sources daily. Then, six times each day, they use Boolean keyword searching and Bayesian model tools [McCallum and Nigam, 1998] to select relevant media reports [Nelson et al., 2010]. Based on the selected media reports, they write their own reports and post them on a secure Internet portal to be accessed by Argus users.

Since its operation in July 2000, “it has logged more than 30,000 biological events involving pathogens such as avian influenza, the Ebola virus, cholera, and other unusual pathogens that have caused varying states of social disruption throughout the world” [CDC, First Quarter 2008].

GermTrax

GermTrax is a freely accessible website which gathers sickness and disease data from people worldwide and exhibits trends through an interactive map. More specifically, GermTrax is a collaborative disease tracking system which primarily relies on reports filled in by ordinary people who are sick. This system collects information through users' personal updates on social media websites such as Facebook and Twitter. The system then saves users' geo-location data when they connect their social media accounts to the site. According to their website, GermTrax can help people by informing them of places where they might get sick and can help health experts to discover large-scale sickness trends. Since it principally relies on disease reports from ordinary people, it is suitable for non-specific conditions such as colds and flu [Lan et al., 2012].

Health Map

Health Map is a freely accessible, multi-stream, real-time surveillance system. It monitors online information in order to obtain a comprehensive view of current infectious disease outbreaks globally. It observes, filters, visualises, and distributes online information about emerging infectious diseases for the benefit of a diverse audience, from public health officials to international tourists [Lemon et al., 2007]. Health Map gathers reports from 14 sources, which in turn embody information from over 20,000 web sites every hour. Information is obtained automatically through screen scraping, natural language interpretation, text mining, and parsing [Brownstein et al., 2008]. More specifically, Health Map uses multiple web based data sources including online news sources, expert-curated discussion, and validated official reports from organisations such as the World Health Organisation (WHO). Then, the alerts are classified by location and disease using automated text processing algorithms. Next, the system overlays the alerts on an interactive geographic map. According to Freifeld et al. [2008], “The filtering and visualization features of HealthMap thus serve to bring structure to an otherwise overwhelming amount of information, enabling the user to quickly and easily see those elements pertinent to her area of interest”.

While traditional syndromic surveillance systems can detect an outbreak with high accuracy, they suffer from slow response. For example, the Centers for Disease Control and Prevention (CDC) publishes USA national and regional data typically with a 1-2 week reporting lag, using outpatient reporting and virological test results provided by laboratories nationally [Culotta, 2010, 2013, Ginsberg et al., 2008]. Therefore, such systems cannot predict an outbreak, but can only detect it after the onset.

On the other hand, modern syndromic surveillance systems monitor online media from global sources. Such modern syndromic surveillance systems resort to internet based data such as search engine queries, health news, and people's posts on social networks to predict an outbreak earlier [Signorini et al., 2011, Carneiro and Mylonakis, 2009, Corley et al., 2010]. While some of them claim that they could achieve high accuracy, the rate of false alarms is unknown. Ginsberg et al. [2008] state, regarding Google Flu Trends, that “Despite strong historical correlations, our system remains susceptible to false alerts caused by a sudden increase in ILI-related queries. An unusual event, such as a drug recall for a popular cold or flu remedy, could cause such a false alert”. Therefore, an issue with internet based data sources is that their data quality fluctuates over time.

Moreover, most of these modern syndromic surveillance systems rely on one type of internet based data source and disregard the advantages of the other types of data sources, which are discussed in Section 3. Consequently, they are only suitable for places where their source data is sufficiently available. For example, Twitter based systems cannot achieve high accuracy in places where Twitter is not commonly used or is not accessible. In addition, the quality and availability of data sources may change over time. For instance, Twitter may lose its popularity or become inaccessible in a place. Hence, integrating available data sources according to an adaptive weighting scheme over time seems necessary.

The other area that has received attention in the syndromic surveillance literature is the topic of alternative analysis algorithms for a given data source.
Given that the quality of data sources changes over time, and the most suitable algorithm for a given data source is not known a priori, a reasonable response is to analyse each data source with a variety of algorithms and integrate their results.

Against this background, we believe that, given the plentiful available data sources and analysis techniques, a state of the art syndromic surveillance mechanism should:

1. Perform as an ensemble to combine various analysis algorithms with the objective of increasing syndromic surveillance system performance. There are many different techniques with different strengths and weaknesses. An ensemble which utilises a combination of them seems likely to be able to provide higher performance than systems which are dependent on only one technique.
2. Extract information which resides in different data sources. In addition to obtaining information, it should be capable of integrating the sources according to their relevance and varying quality.
3. Be flexible to changes in the composition of algorithms and data sources over time, as any of them might be deleted, temporarily unavailable, or added to the system at any time.
4. Be able to adapt to the behaviour and habits of its monitored population. For example, if people of a particular region are more prone to tweet their feelings on social media such as Twitter than to search for a solution using online search engines, then a syndromic surveillance system should weight Twitter results higher than search engine queries in that particular region.
5. Be able to adapt to changes in the behaviour of its monitored population. For example, if Twitter becomes more popular in a place and people start tweeting their sickness symptoms earlier, rather than visiting a physician, the system must give more attention and weight to Twitter than previously.
6. Minimise the effect of misleading factors and noise such as advertisement, promotions, and holidays on different data sources and, consequently, diminish the rate of false positives.

Jahedpari et al. [2017] proposed the Continuous Artificial Prediction Market (c-APM), which utilises the concept of prediction markets in which the traders are modelled as intelligent agents. The model can be used as a machine learning ensemble by integrating different data sources and techniques.

Here, we suggest that c-APM can be used as a syndromic surveillance technique as it fulfils the aforementioned requirements, as we discuss below:

1) c-APM can behave as an ensemble method by including numerous agents, each having a different analysis algorithm.
2) Prediction markets are specially designed for the purpose of information aggregation [Perols et al., 2009]. c-APM adapts the concepts of prediction markets and incentivises its participating agents to share their private information through the market mechanism, and hence to make accurate predictions. In addition, c-APM dynamically weights the predictions of different agents according to their varying quality.
3) In c-APM, the market and the agents operate independently, and hence the absence or presence of an agent does not impact the system considerably. Therefore, if one of the existing data sources becomes unavailable for any reason, c-APM can simply respond to the issue. If a new data source or model is discovered, c-APM can simply create an agent to access that data source or model, participate in the market and share its knowledge.
4) In c-APM, the agents can be trained in the market using historical data of a given place and, consequently, will be adapted to the behaviour of people in that place.
5) c-APM can respond to changes in the behaviour of its monitored population, since its agents keep learning and their weights keep changing according to their current performance in each market.
6) c-APM can minimise the effect of misleading factors and noise by fusing various data sources and models using an adaptive scheme.

In the following sections, we use two well-known models, (i) Google Flu Trends and (ii) the latest improvement of the Google Flu Trends model, named GP [Lampos et al., 2015], as our case study, and we show how c-APM can improve upon their performance.
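The adaptive weighting idea behind requirements 2 to 5 can be illustrated with a deliberately simplified sketch. This is not the c-APM market mechanism of Jahedpari et al. [2017], which relies on trading agents and market rounds; it swaps in a plain multiplicative-weights update so that the effect of performance-based weighting is easy to see. The function names `combine` and `update_weights`, the learning rate `eta`, the source names and all numbers are hypothetical.

```python
import math


def combine(predictions, weights):
    """Weighted average over the sources that reported a prediction this week."""
    available = [name for name in predictions if name in weights]
    total = sum(weights[name] for name in available)
    return sum(weights[name] * predictions[name] for name in available) / total


def update_weights(weights, predictions, truth, eta=1.0):
    """Shrink each reporting source's weight by exp(-eta * |error|), then renormalise."""
    updated = dict(weights)
    for name, value in predictions.items():
        if name in updated:
            updated[name] *= math.exp(-eta * abs(value - truth))
    total = sum(updated.values())
    return {name: w / total for name, w in updated.items()}


# Hypothetical weekly ILI-rate predictions from two sources, with the observed rate.
weights = {"search": 0.5, "twitter": 0.5}
history = [
    ({"search": 2.0, "twitter": 3.0}, 2.1),
    ({"search": 2.4, "twitter": 4.0}, 2.5),
]
for predictions, truth in history:
    weights = update_weights(weights, predictions, truth)

# The source with the smaller past errors now dominates the combined estimate,
# and a week in which "twitter" is unavailable is handled by skipping it.
estimate_both = combine({"search": 2.6, "twitter": 3.5}, weights)
estimate_one = combine({"search": 2.6}, weights)
```

In this sketch a source that becomes unavailable simply drops out of the weighted average for that week, and a new source could be added by giving it an initial weight, which mirrors requirement 3 in spirit only.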

Google Flu Trends (GFT) was launched by Google in 2008 to alert health professionals to outbreaks early by indicating when and where influenza is striking in real time using aggregate web searches. GFT publishes flu predictions (ILI rate) for more than 25 countries. Google Flu Trends is typically more immediate, up to 2 weeks ahead of traditional methods such as the CDC’s official reports. The basic idea behind GFT is that when people get sick, they turn to the Web for information.

Google Flu Trends algorithms recognise a small subgroup of the millions of search engine query terms that deliver the maximum correlation with the CDC published ILI rate. Then a subset of these queries which fit the historical CDC ILI rate data most accurately is chosen. Finally, a univariate linear regression model is trained and used to predict the future ILI rate from each day's queries. According to Copeland et al. [2013], the challenge of their approach is the varying volume of a particular query over time. For instance, during the holiday season, more people search for ‘gift’ than at any other period. Similarly, overall usage of Google search varies throughout the year and is growing over time. GFT used the official CDC data only in the initial training and did not use it to re-train its model regularly.

The early Google paper indicated that the Google Flu Trends predictions were 97% accurate compared with CDC data [Ginsberg et al., 2009]. However, in 2013, Olson et al. [2013] and Butler [2013] reported that GFT was predicting more than double the ILI rate published by the CDC. Later, in 2014, Lazer et al. [2014] stated that GFT had been overestimating flu occurrence for most weeks after August 2011, and by a very large margin in the 2011-2012 flu season. They further stated that GFT can achieve better performance by combining its prediction with other near real-time health data such as lagged CDC data.
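The regression step described above can be sketched as an ordinary least squares fit of the ILI rate on a single query-based predictor. This is a minimal illustration rather than Google's implementation: the query selection step and any transformation of the variables are omitted, the function names `fit_univariate` and `predict` are introduced here for illustration, and the training data are hypothetical.

```python
def fit_univariate(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x_new):
    """Apply a fitted (intercept, slope) pair to a new predictor value."""
    intercept, slope = model
    return intercept + slope * x_new


# Hypothetical training data: weekly share of flu-related queries (%)
# and the CDC ILI rate (%) for the same weeks.
query_share = [1.0, 2.0, 3.0, 4.0]
ili_rate = [3.0, 5.0, 7.0, 9.0]

model = fit_univariate(query_share, ili_rate)
estimate = predict(model, 2.5)
```

Because such a model is fitted once on historical data, any later drift in search behaviour feeds straight into the predicted ILI rate, which is consistent with the overestimation reported in this section.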
Also, the Google Flu Trends team announced:

“We found that heightened media coverage on the severity of the flu season resulted in an extended period in which users were searching for terms we’ve identified as correlated with flu levels. In early 2013, we saw more flu-related searches in the US than ever before.”

GFT subsequently updated the model in response to concerns about accuracy. On 9th August 2015, GFT stopped publishing flu predictions without formally presenting any reasons. However, GFT historical predictions are still available for download.

In this section, we use c-APM as a syndromic surveillance system and compare the performance of c-APM and Google Flu Trends.

http://googleresearch.blogspot.ae/2014/10/google-flu-trends-gets-brand-new-engine.html (Retrieved Oct 4, 2015).
http://blog.google.org/2013/10/flu-trends-updates-model-to-help.html (Retrieved Oct 4, 2015).

Model Full Name                           Model Short Name
----------------------------------------  ----------------
Bagged CART                               treebag
Conditional Inference Random Forest       cforest
Random Forest                             rf
Multi-Layer Perceptron                    mlp
Model Averaged Neural Network             avNNet
Boosted Generalized Linear Model          glmboost
Boosted Tree Linear Regression            blackboost
Linear Regression                         lm
Radial Basis Function Network             rbf
Gaussian Process                          gaussprLinear
CART                                      rpart
Generalized Linear Model                  glm
k-Nearest Neighbors                       knn
Gaussian Process with Polynomial Kernel   gaussprPoly
Multivariate Adaptive Regression Spline   earth
Self-Organizing Map                       bdk

Table 1: R’s caret package models. c-APM instantiates one participant for each of these models.

In these experiments, c-APM predicts the disease activity level of influenza-like illnesses (ILI) in a given week in the whole of the USA using publicly available data sources. The data used here contain more than 100 real data streams covering the period from 4th January 2004 (when GFT provides data for most USA states and cities) to 9th August 2015 (when GFT stopped publishing its results online), drawn from two sources: Google Flu Trends (GFT) and the Centers for Disease Control and Prevention (CDC).

Data Sources

In these experiments, we use weekly Google Flu Trends predictions for different areas of the United States, including states, cities and regions, for which GFT data has been available since January 2004. Here, we use the calendar definition of a year, where a year starts on 1st January and finishes on 31st December.

The CDC Influenza Division produces a weekly report on influenza-like illness activity in the USA. We use CDC statistics including: i) ILI rate disaggregated by age groups (0-4 years, 5-24 years, 25-64 years, and older than 65 years), ii) USA national ILI rate, iii) total number of patients and iv) total number of outpatient healthcare providers in the U.S. Outpatient Influenza-like Illness Surveillance Network (ILI network). Since the CDC reports ILI rates with a two-week time lag, we use the CDC data of two weeks earlier for each week of the experimentation period. In this way, we can align CDC data with the other data sources used in these experiments.

Models

We use different machine learning models in R’s caret package (version 6.0-37), which are capable of performing regression. Table 1 presents the models we use in this experiment. Model parameters are set to their default values.

Experiment Settings

We constructed a c-APM in which every agent has a unique analysis model corresponding to one of the models listed in Table 1. The data source for each agent is the entire data set. All agents use the Q-learning trading strategy, which is proposed in Jahedpari et al. [2017]. The results are based on one run only, as they are deterministic. All c-APM parameters are set to their default values (see [Jahedpari, 2016]). Hence:

i) the number of rounds is set to 2,
ii) MaxRPT and MinRPT are both set to 90% in the first round, and
iii) MinRPT and MaxRPT are set to 0.01% and 1% respectively in the second round.

This data can be accessed from . (Retrieved Oct 4, 2015).
ILI is defined as fever (temperature of 100 °F [37.8 °C] or greater) and a cough and/or a sore throat without a known cause other than influenza ( ) (Retrieved Oct 4, 2015).
This data can be accessed from http://gis.cdc.gov/grasp/fluview/fluportaldashboard.html . (Retrieved Oct 4, 2015).

Figure 3: Comparing the performance of c-APM and GFT for different periods using Mean Absolute Error (MAE).

We measure the performance of c-APM by comparing the prediction of c-APM against the ground truth, which is the weekly ILI rate published by the CDC. We use Mean Absolute Error (MAE), which is a common measure in this literature.
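For concreteness, the MAE over n weeks is the mean of |predicted ILI rate − CDC ILI rate|, and the per-year comparison reported in this paper groups the weekly errors by calendar year. The following minimal sketch shows that computation; the function names `mae` and `mae_by_year` and the example values are hypothetical and are not taken from the experiments.

```python
from collections import defaultdict


def mae(predicted, observed):
    """Mean absolute error between two equally long, non-empty sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def mae_by_year(records):
    """records: iterable of (year, predicted, observed) tuples, one per week."""
    grouped = defaultdict(lambda: ([], []))
    for year, predicted, observed in records:
        grouped[year][0].append(predicted)
        grouped[year][1].append(observed)
    return {year: mae(p, o) for year, (p, o) in sorted(grouped.items())}


# Hypothetical weekly records: (calendar year, predicted ILI rate, CDC ILI rate).
weekly = [(2011, 2.0, 1.5), (2011, 3.0, 2.0), (2012, 4.0, 4.5)]
yearly_mae = mae_by_year(weekly)
```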

In this section, we compare the performance of c-APM and Google Flu Trends. Figure 3 and Figure 4 compare the error of c-APM and Google Flu Trends for the period from 2004 to 2015.

As Figure 3 shows, c-APM typically has a lower, and sometimes much lower, MAE than Google Flu Trends in each year. Though this difference is relatively small in some years, such as 2004 and 2007, it is relatively large in most years and very large between 2011 and 2013. Table 2 shows the exact MAE values of c-APM and GFT in addition to t-test p-values. The null hypothesis is that the two accuracies compared are not significantly different. Therefore, within a tolerance α = 0.05, when p-value < [...]

Figure 4: c-APM and GFT error in predicting ILI rate from 2004 to 2015.

[...] data with a two-week time lag, c-APM uses the CDC data of the previous two weeks. This explains the existence of a two-week time lag between the c-APM and GFT errors in some periods, such as early 2008 and late 2012.
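The two-week lag applied to the CDC data can be sketched as a simple shift of the weekly series. This is only an illustration under stated assumptions: the function name `lag_series` and the sample ILI values are hypothetical, and the paper does not describe how the alignment was implemented.

```python
def lag_series(values, lag=2):
    """Shift a weekly series so that position t holds the value from week t - lag.

    Positions without an earlier observation are filled with None.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    values = list(values)
    kept = values[: max(len(values) - lag, 0)]
    padding = [None] * min(lag, len(values))
    return padding + kept


# Hypothetical weekly CDC ILI rates (percent), for illustration only.
cdc_ili = [1.1, 1.3, 1.8, 2.4, 2.0]
cdc_ili_available = lag_series(cdc_ili, lag=2)
```

With this shift, the value an agent sees in a given week is the ILI rate reported for two weeks earlier, which is consistent with an error curve that trails GFT by about two weeks when the ILI rate changes quickly.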

Lampos et al. [2015] published a paper in Nature Scientific Reports on 3rd August 2015 proposing a new model, called ‘GP’. Their model includes three improvements to the original Google Flu Trends. Firstly, they expand and re-weight the set of queries originally used by GFT. Then, they extend this improvement by using a nonlinear regression framework based on a Gaussian Process (GP) to investigate nonlinear relationships between query fractions and the ground truth (CDC ILI rate). Finally, they utilise time series structure. More specifically, they use an ARMAX model [Hyndman and Khandakar, 2008] to find a relationship between previously available data and the current one.

They perform an evaluation using five consecutive influenza seasons, as defined by CDC, from 2008 to 2013. Based on their experiments, they conclude that the GP approach performs better than GFT and a well-established model, namely Elastic Net. They also mention that the 2009-10 flu season is a unique flu period since, during the peak of that flu season, GFT over-predicted the ILI rate, while GP and Elastic Net underestimated it.

This section compares the performance of c-APM and the model proposed by Lampos et al. [2015], known as the ‘GP’ model. We contacted the author and received their exact predictions for each period of the experiment to use in our experiments.

All settings are similar to the settings covered in Section 6.1.1, except that c-APM includes one additional agent which uses the GP prediction as its data source. This agent uses a simple algorithm which outputs a prediction equal to the data it receives; hence, the agent performs no analysis on that data.

[Table 2: Periods | c-APM MAE | GFT MAE | P-value (table values lost in extraction)]

Figure 5: Comparing the performance of c-APM and GP for different periods using Mean Absolute Error (MAE).

In these experiments, we follow the same evaluation format as the work by Lampos et al. [2015]; therefore, we compare the performance of c-APM and GP in the flu seasons 2008 to 2013 as defined by CDC. These flu seasons include different numbers of weeks (see Table 3).
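Because the flu seasons contain different numbers of weeks, a per-season MAE has to be averaged within each season separately. The following sketch shows one way to do this; the function name, the record layout and the sample values are assumptions made for illustration, not the evaluation code used in the paper.

```python
from collections import defaultdict


def mae_by_season(records):
    """Group (season, prediction, truth) records by season and return each season's MAE."""
    abs_errors = defaultdict(list)
    for season, prediction, truth in records:
        abs_errors[season].append(abs(prediction - truth))
    return {season: sum(errs) / len(errs) for season, errs in abs_errors.items()}


# Hypothetical weekly records; seasons may contain different numbers of weeks.
weekly_records = [
    ("2008-09", 1.0, 1.5),
    ("2008-09", 2.0, 2.0),
    ("2009-10", 3.0, 2.0),
]
season_mae = mae_by_season(weekly_records)
```

Each season's error is divided by that season's own week count, so a long season does not dominate a short one.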

Figure 6: c-APM and GP error in predicting ILI rate from 2004 to 2015

Figure 5 and Table 3 compare the performance of c-APM and GP for different influenza seasons between 2008 and 2013. Figure 6 compares the error of c-APM and GP in each week of the entire period. In Table 3, the first column shows the influenza seasons used in the experiment and the second column presents the number of weeks in each season. The third and the fourth columns show the Mean Absolute Error (MAE) of c-APM and GP respectively. The last column shows p-values for the paired t-tests comparing the error of c-APM and GP.

As Table 3 and Figure 5 show, c-APM outperforms GP in most seasons except 2012-2013, where c-APM achieves an MAE of 0.220 and GP achieves an MAE of 0.[...]

c-APM outperforms both the Google Flu Trends and GP models because: [...]

[Table 3: Period | Weeks | c-APM MAE | GP MAE | P-value (table values lost in extraction)]
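A paired t-test such as the one reported in the last column of Table 3 compares two models week by week on the same periods. The sketch below computes only the test statistic and its degrees of freedom from two matched error series; the function name and the sample numbers are hypothetical, and the exact test configuration used in the paper (for example, one-sided or two-sided) is not stated in this section. The p-value would then be read from a Student t distribution with n - 1 degrees of freedom, for instance with `scipy.stats.t.sf` if SciPy is available.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """Return (t, degrees_of_freedom) for a paired t-test on two matched error series."""
    if len(errors_a) != len(errors_b):
        raise ValueError("both error series must cover the same weeks")
    n = len(errors_a)
    if n < 2:
        raise ValueError("at least two paired observations are required")
    differences = [a - b for a, b in zip(errors_a, errors_b)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical weekly absolute errors of two models, for illustration only.
t_value, dof = paired_t_statistic([2, 4, 6], [1, 2, 3])
```

A large absolute value of t relative to the t distribution with `dof` degrees of freedom gives a small p-value, which is the condition under which the null hypothesis of equal errors is rejected at α = 0.05.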

References

H. Achrekar, A. Gandhe, R. Lazarus, S.-H. Yu, and B. Liu. Predicting flu trends using twitter data. In Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on, pages 702–707. IEEE, 2011.

J. S. Brownstein, C. C. Freifeld, B. Y. Reis, and K. D. Mandl. Surveillance sans frontieres: Internet-based emerging infectious disease intelligence and the healthmap project. PLoS medicine, 5(7):e151, 2008.

J. W. Buehler, R. L. Berkelman, D. M. Hartley, and C. J. Peters. Syndromic surveillance and bioterrorism-related epidemics. Emerging infectious diseases, 9(10):1197, 2003.

J. W. Buehler, A. Sonricker, M. Paladini, P. Soper, and F. Mostashari. Syndromic surveillance practice in the united states: findings from a survey of state, territorial, and selected local health departments. Advances in Disease Surveillance, 6(3):1–20, 2008.

D. Butler. When google got flu wrong. Nature, 494(7436):155, 2013.

H. A. Carneiro and E. Mylonakis. Google trends: a web-based tool for real-time surveillance of disease outbreaks. Clinical infectious diseases, 49(10):1557–1564, 2009.

CDC. CDC global health E-Brief, building usg interagency collaboration through global health engagement, First Quarter 2008.

H. Chen, D. Zeng, P. Yan, and P. Yan. Infectious Disease Informatics: Syndromic Surveillance for Public Health and Biodefense. Integrated series in information systems. Springer Science + Business Media, 2010.

N. Collier, S. Doan, A. Kawazoe, R. M. Goodwin, M. Conway, Y. Tateno, Q.-H. Ngo, D. Dien, A. Kawtrakul, K. Takeuchi, et al. Biocaster: detecting public health rumors with a web-based text mining system. Bioinformatics, 24(24):2940–2941, 2008.

P. Copeland, R. Romano, T. Zhang, G. Hecht, D. Zigmond, and C. Stefansen. Google disease trends: an update. Nature, 457:1012–1014, 2013.

C. D. Corley, D. J. Cook, A. R. Mikler, and K. P. Singh. Text and structural data mining of influenza mentions in web and social media. International journal of environmental research and public health, 7(2):596–615, 2010.

A. Culotta. Detecting influenza outbreaks by analyzing twitter messages. CoRR, abs/1007.4748, 2010.

A. Culotta. Lightweight methods to estimate influenza rates and alcohol sales volume from twitter messages. Language Resources and Evaluation, pages 1–22, 2013.

A. Doroshenko, D. Cooper, G. Smith, E. Gerard, F. Chinemana, N. Verlander, A. Nicoll, et al. Evaluation of syndromic surveillance based on national health service direct derived data: England and wales. MMWR Morb Mortal Wkly Rep, 54(Suppl):117–122, 2005.

G. Eysenbach. Infodemiology: tracking flu-related searches on the web for syndromic surveillance. In AMIA Annual Symposium Proceedings, volume 2006, page 244. American Medical Informatics Association, 2006.

G. Eysenbach. Infodemiology and infoveillance: framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the internet. Journal of medical Internet research, 11(1), 2009.

C. C. Freifeld, K. D. Mandl, B. Y. Reis, and J. S. Brownstein. Healthmap: global infectious disease monitoring through automated classification and visualization of internet media reports. Journal of the American Medical Informatics Association, 15(2):150–157, 2008.

J. Ginsberg, M. H. Mohebbi, R. S. Patel, L. Brammer, M. S. Smolinski, and L. Brilliant. Detecting influenza epidemics using search engine query data. Nature, 457(7232):1012–1014, 2008.

J. Ginsberg, M. H. Mohebbi, R. S. Patel, L. Brammer, M. S. Smolinski, and L. Brilliant. Detecting influenza epidemics using search engine query data. Nature, 457(7232):1012–1014, 2009.

R. Grishman, S. Huttunen, and R. Yangarber. Information extraction for enhanced access to disease outbreak reports. Journal of biomedical informatics, 35(4):236–246, 2002.

M. L. Hutwagner, M. W. Thompson, G. M. Seeman, and T. Treadwell. The bioterrorism preparedness and response early aberration reporting system (ears). Journal of Urban Health, 80(1):i89–i96, 2003.

R. Hyndman and Y. Khandakar. Automatic time series forecasting: The forecast package for r. Journal of Statistical Software, 27(1):1–22, 2008.

F. Jahedpari. Artificial prediction markets for online prediction of continuous variables. PhD thesis, University of Bath, 2016.

F. Jahedpari, T. Rahwan, S. Hashemi, T. P. Michalak, M. De Vos, J. Padget, and W. L. Woon. Online prediction via continuous artificial prediction markets. IEEE Intelligent Systems, 32(1):61–68, 2017.

H. A. Johnson, M. M. Wagner, W. R. Hogan, W. Chapman, R. T. Olszewski, J. Dowling, G. Barnas, et al. Analysis of web access logs for surveillance of influenza. Stud Health Technol Inform, 107(Pt 2):1202–1206, 2004.

K. Kleinman, R. Lazarus, and R. Platt. A generalized linear mixed models approach for detecting incident clusters of disease in small areas, with an application to biological terrorism. American Journal of Epidemiology, 159(3):217–224, 2004.

V. Lampos and N. Cristianini. Tracking the flu pandemic by monitoring the social web. In Cognitive Information Processing (CIP), 2010 2nd International Workshop on, pages 411–416. IEEE, 2010.

V. Lampos, A. C. Miller, S. Crossan, and C. Stefansen. Advances in nowcasting influenza-like illness rates using search query logs. Scientific reports, 5, 2015.

R. Lan, M. D. Lieberman, and H. Samet. The picture of health: map-based, collaborative spatio-temporal disease tracking. In Proceedings of the First ACM SIGSPATIAL International Workshop on Use of GIS in Public Health, pages 27–35. ACM, 2012.

D. Lazer, R. Kennedy, G. King, and A. Vespignani. The parable of google flu: traps in big data analysis. Science, 343(14 March), 2014.

S. M. Lemon, M. A. Hamburg, P. F. Sparling, E. R. Choffnes, A. Mack, et al. Global Infectious Disease Surveillance and Detection: Assessing the Challenges–Finding Solutions, Workshop Summary. National Academies Press, 2007.

M. D. Lewis, J. A. Pavlin, J. L. Mansfield, S. O'Brien, L. G. Boomsma, Y. Elbert, and P. W. Kelley. Disease outbreak detection system using syndromic data in the greater Washington DC area. American journal of preventive medicine, 23(3):180–186, 2002.

J. P. Linge, R. Steinberger, T. Weber, R. Yangarber, E. van der Goot, D. Al Khudhairy, and N. Stilianakis. Internet surveillance systems for early alerting of health threats. Euro surveillance, 14(AVRJUIN):200–201, 2009.

W. B. Lober, M. L. J. Trigg, B. T. Karras, M. D. Bliss, J. Ciliberti, M. L. Stewart, and J. S. Duchin. Syndromic surveillance using automated collection of computerized discharge diagnoses. Journal of Urban Health, 80(1):i97–i106, 2003.

J. Lombardo and D. Buckeridge. Disease Surveillance: A Public Health Informatics Approach. Wiley, 2007.

K. D. Mandl, J. M. Overhage, M. M. Wagner, W. B. Lober, P. Sebastiani, F. Mostashari, J. A. Pavlin, P. H. Gesteland, T. Treadwell, E. Koski, et al. Implementing syndromic surveillance: a practical guide informed by the early experience. Journal of the American Medical Informatics Association, 11(2):141–150, 2004.

A. Mawudeku and M. Blench. Global public health intelligence network (gphin). In , pages 8–12, 2006.

A. McCallum and K. Nigam. A comparison of event models for naive bayes text classification. In AAAI-98 workshop on learning for text categorization, volume 752, pages 41–48, 1998.

N. Nelson, J. Brownstein, D. Hartley, et al. Event-based biosurveillance of respiratory disease in mexico, 2007-2009: connection to the 2009 influenza a (h1n1) pandemic. Euro Surveill, 15(30), 2010.

Y. Ohkusa, M. Shigematsu, K. Taniguchi, and N. Okabe. Experimental surveillance using data on sales of over-the-counter medications-japan, November 2003–April 2004. MMWR Morb Mortal Wkly Rep, 54:47–52, 2005.

D. R. Olson, K. J. Konty, M. Paladini, C. Viboud, and L. Simonsen. Reassessing google flu trends data for detection of seasonal and pandemic influenza: a comparative epidemiological study at three geographic scales. PLoS Comput Biol, 9(10):e1003256, 2013.

U. S. C. H. C. on Homeland Security. Subcommittee on Emerging Threats and Cybersecurity. One year later: implementing the biosurveillance requirements of the 9/11 Act: hearing before the Subcommittee on Emerging Threats, Cybersecurity, and Science and Technology of the Committee on Homeland Security, House of Representatives, One Hundred Tenth Congress, second session, July 16, 2008, volume 4. Government Printing Office, 2009.

E. Page. Continuous inspection schemes. Biometrika, 41(1/2):100–115, 1954.

M. J. Paul, M. Dredze, and D. Broniatowski. Twitter improves influenza forecasting. PLoS currents, 6, 2014.

J. Perols, K. Chari, and M. Agrawal. Information market-based decision fusion. Management Science, 55(5):827–842, 2009.

P. M. Polgreen, Y. Chen, D. M. Pennock, F. D. Nelson, and R. A. Weinstein. Using internet searches for influenza surveillance. Clinical infectious diseases, 47(11):1443–1448, 2008.

G. Reichard, P. Demitry, and J. Catalino. Cohort: An integrated information approach to decision support for military subpopulation health care. Technical report, Air Force Medical Operations Agency, AFMOA/SGZI, 5201 Leesburg Pike Sky 3, Ste 1400, Falls Church, VA, 22041-3203, 2004.

A. R. Reilly, E. A. Iarocci, C. M. Jung, D. M. Hartley, and N. P. Nelson. Indications and warning of pandemic influenza compared to seasonal influenza. Indicator, 1967(128), 1968.

S. W. Roberts. Control chart tests based on geometric moving averages. Technometrics, 1(3):239–250, 1959.

A. Signorini, A. M. Segre, and P. M. Polgreen. The use of twitter to track levels of disease activity and public concern in the U.S. during the Influenza A H1N1 Pandemic. PLoS ONE, 6:e19467, 05 2011.

C. Stefansen. Flu trends updates model to help estimate flu levels in the us, 10 2013.

F.-C. Tsui, J. U. Espino, V. M. Dato, P. H. Gesteland, J. Hutman, and M. M. Wagner. Technical description of rods: a real-time public health surveillance system. Journal of the American Medical Informatics Association, 10(5):399–408, 2003.

C. Van den Wijngaard, L. Van Asten, W. Van Pelt, N. J. Nagelkerke, R. Verheij, A. J. De Neeling, A. Dekkers, M. A. Van der Sande, H. Van Vliet, and M. P. Koopmans. Validation of syndromic surveillance for respiratory pathogen activity. Emerging infectious diseases, 14(6):917, 2008.

M.-A. Widdowson, A. Bosman, E. van Straten, M. Tinga, S. Chaves, L. van Eerden, and W. van Pelt. Automated, laboratory-based system using the internet for disease outbreak detection, the netherlands.