Publication


Featured research published by Andrew Walsh.


Online Journal of Public Health Informatics | 2018

Nonparametric Models for Identifying Gaps in Message Feeds

Andrew Walsh

Objective: Characterize the behavior of nonparametric regression models for message arrival probability as outage detection tools.

Introduction: Timely and accurate syndromic surveillance depends on continuous data feeds from healthcare facilities. Typical outlier detection methodologies in syndromic surveillance compare predicted counts for an interval to observed event counts, either to detect increases in volume associated with public health incidents or decreases in volume associated with compromised data transmission. Accurate predictions of total facility volume need to account for significant variance associated with the time of day and week; at the extreme are facilities which are only open during limited hours and on select days. Models need to account for the cross-product of all hours and days, creating a significant data burden. Timely detection of outages may require sub-hour aggregation, which adds to this burden by increasing the number of intervals for which parameters need to be estimated. Nonparametric models for the probability of message arrival offer an alternative approach to generating predictions. The data requirements are reduced by assuming some time-dependent structure in the data rather than allowing each interval to be independent of all others, allowing for predictions at sub-hour intervals.

Methods: Healthcare facility data were collected as HL7 messages via the EpiCenter syndromic surveillance system from June 1, 2017 through August 31, 2017. A total of 713 facilities sent at least 1,000 messages during this period and were included in the analysis. Standard Poisson regression models were fit to counts of messages per quarter hour. Predictors were indicators for day of week, hour of day, and quarter of hour, along with interaction terms between them. Nonparametric logistic regression models were fit to data on the presence or absence of any message for each minute of the first two months of the study period, using the minute within the week as a predictor. The last month of data was scanned for outages at 15-minute intervals by calculating, for each facility, the probability of no messages since the last received message:

$$P(\text{gap from } m_{\text{last}} \text{ to } m_{\text{now}}) = \prod_{t} \bigl(1 - P_{\text{message}}(t)\bigr)$$

where the product runs over the minutes t from m_last to m_now. Four consecutive intervals with probability below 1e-10 were considered outages.

Results: A total of 12,710,275 ADT A04 messages were received from 713 facilities from June 1, 2017 through August 31, 2017. Estimation of Poisson regression models averaged 1 minute, while nonparametric models averaged 1.5 minutes to estimate. Poisson models required 672 parameters to specify, whereas nonparametric models required 29. Calculating predictions from fitted models averaged 0.2 seconds for Poisson models and 2 seconds for nonparametric models. Although predictions from the two models are not on identical scales and thus not directly comparable, they correlated well with each other, with an average correlation of 0.8. The nonparametric regression method detected 175 resolved outages and 9 open outages in August 2017. The resolved outages lasted an average of 1.5 days (range: 1.75 hours to 15 days). The likelihood of these outages averaged 6e-13 (range: 3e-160 to 4e-11). Figure 1 illustrates how the nonparametric models can be used in a dashboard for all 713 connections. Likelihood of an outage is available for each facility based on how long it has been since the last message was received; this can be updated every minute as needed. Figure 2 illustrates the predictions from a nonparametric model for a single facility and a detected outage.

Conclusions: Nonparametric regression models of message arrival demonstrated suitable performance for use in detecting connection outages. Compared to standard Poisson regression models, computation time for nonparametric models was longer but within acceptable ranges for operational needs, and storage was significantly reduced. Further, storage and computation time for standard models will increase if greater time granularity is desired, whereas the nonparametric models require no additional storage or computation. Model predictions were sufficiently similar for the two models to give comparable performance in detecting outages. Given the greater time flexibility of the nonparametric models and the smaller data requirements for initial model estimation (due to fewer estimated parameters), the nonparametric approach represents a promising new option for monitoring syndromic surveillance data quality.
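The gap probability and the four-interval outage rule described in the Methods can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the names `log_gap_probability` and `detect_outage` are hypothetical, `p_message` is assumed to be a list of 10,080 fitted per-minute arrival probabilities (one per minute of the week), and the product is assumed to cover the minutes after the last message up to and including the current minute. The product is accumulated in log space because the reported likelihoods are as small as 3e-160 and longer gaps could underflow a direct product.

```python
import math

MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minute-of-week slots


def log_gap_probability(p_message, m_last, m_now):
    """Log-probability of receiving no message in minutes m_last+1 .. m_now.

    p_message: per-minute arrival probabilities indexed by minute of week
               (assumed to come from an already fitted model).
    m_last, m_now: absolute minute indices of the last message and of now.
    """
    total = 0.0
    for t in range(m_last + 1, m_now + 1):
        # Clamp so that a fitted probability of exactly 1 does not give log(0).
        p = min(p_message[t % MINUTES_PER_WEEK], 1.0 - 1e-12)
        total += math.log1p(-p)  # log(1 - p), accurate for small p
    return total


def detect_outage(log_probs, threshold=1e-10, run_length=4):
    """True if any `run_length` consecutive scans fall below the threshold.

    log_probs: log gap probabilities from successive 15-minute scans.
    """
    log_threshold = math.log(threshold)
    run = 0
    for lp in log_probs:
        run = run + 1 if lp < log_threshold else 0
        if run >= run_length:
            return True
    return False


# Toy example: a facility with a constant 50% chance of a message each minute.
p_message = [0.5] * MINUTES_PER_WEEK
scans = [log_gap_probability(p_message, 0, m) for m in range(15, 91, 15)]
outage = detect_outage(scans)
```

As a side note on model size, the 672 Poisson parameters reported in the Results are consistent with one parameter per day-of-week, hour-of-day, and quarter-of-hour cell, since 7 × 24 × 4 = 672.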


Online Journal of Public Health Informatics | 2017

Comparing Emergency Department Gunshot Wound Data with Mass Casualty Shooting Reports

Andrew Walsh

Objective: To determine whether mass casualty shooting events are captured via syndromic surveillance data.

Introduction: Shootings with multiple victims are a concern for public safety and public health. The precise impact of such events and the trends associated with them depend on which events are counted. Some reports only consider events with multiple deaths, typically four or more, while other reports also include events with multiple victims and at least one death [1]. Underreporting is also a concern. Some commonly cited databases for these events are based on media reports of shootings, which may or may not capture the complete set of events that meet whatever criteria are being considered. Many gunshot wounds are treated in the emergency department setting. Emergency department registrations routinely collected for syndromic surveillance will capture all of those visits. Analysis of that data may be useful as a supplement to mass shooting databases by identifying unreported events. In addition, clusters of gunshot wound incidents which are not the result of a single shooting event but still represent significant public safety and public health concerns may also be identified.

Methods: Emergency department registration data were collected from hospitals via the EpiCenter syndromic surveillance system. Gunshot-related visits were identified based on chief complaint contents using EpiCenter’s regular expression-based classification system. The gunshot wound classifier attempts to exclude patients with pre-existing wounds and shooting incidents involving weapon classes that are lesser concerns for public safety, such as nail guns and toy guns. Gunshot-related visits were clustered by day of registration and separately by facility, by patient home zip code, and by patient home county. The largest clusters of each type were compared via manual search against media reports of shootings and against the Gun Violence Archive mass shooting database.

Results: A total of 23,132 gunshot-related visits were identified from 635 healthcare facilities from 2013 to 2015. From these, the five largest clusters by facility, by zip code, and by county were identified. The clusters included 112 gunshot wounds in total, ranging in size from 4 to 12 with a median of 7. Of the 5 facility clusters, 5 had a corresponding media story and 2 were located in the shooting database. Of the 5 zip code clusters, 1 had a corresponding media story and none were located in the shooting database. Of the 5 county clusters, 4 had a corresponding media story and 1 was located in the shooting database.

Conclusions: Multiple gunshot wound patients being treated on the same day were not necessarily all shot during the same incident or by the same shooter. The information available in a syndromic surveillance feed does not allow for direct identification of the shooter or shooters. Given that limitation, a complete correspondence between clusters identified in syndromic surveillance data and mass shootings was not expected. The strong correlation between clusters and media coverage indicates that the news is a reasonable source for shooting data. The smaller overlap with the mass shooting database is likely due to the more stringent criteria required for an incident to qualify as a mass shooting. It is still notable that the majority of gunshot clusters were not associated with any particular mass shooting incident. This serves as a reminder that mass shootings represent only a small portion of the total gun violence in the United States. Healthcare data represents a significant additional data source for understanding the complete impact of gun violence on public health and safety.

Figure: Weekly time series of gunshot-related emergency department visits.
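The classification and clustering steps in the Methods can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the `INCLUDE` and `EXCLUDE` patterns are hypothetical and are not EpiCenter's actual classifier, the record field names are invented, and the sample visits are made up rather than taken from the study.

```python
import re
from collections import Counter

# Hypothetical patterns; a production classifier would be far more thorough.
INCLUDE = re.compile(r"\b(?:gsw|gun\s*shot)\b", re.IGNORECASE)
# Exclusions mirror the abstract: pre-existing wounds, nail guns, toy guns.
EXCLUDE = re.compile(
    r"\b(?:nail\s*gun|toy\s*gun|old|prior|previous|recheck)\b", re.IGNORECASE
)


def is_gunshot_visit(chief_complaint):
    """True if the chief complaint looks gunshot-related and is not excluded."""
    return bool(INCLUDE.search(chief_complaint)) and not EXCLUDE.search(
        chief_complaint
    )


def largest_clusters(visits, key, top=5):
    """Count gunshot visits per (registration date, key) and return the largest.

    key is the grouping field, e.g. "facility", "zip", or "county".
    """
    counts = Counter(
        (v["date"], v[key])
        for v in visits
        if is_gunshot_visit(v["chief_complaint"])
    )
    return counts.most_common(top)


# Made-up sample records for demonstration only.
visits = [
    {"date": "2014-07-04", "facility": "A", "chief_complaint": "GSW to leg"},
    {"date": "2014-07-04", "facility": "A", "chief_complaint": "gunshot wound chest"},
    {"date": "2014-07-04", "facility": "A", "chief_complaint": "GSW hand, nail gun"},
    {"date": "2014-07-04", "facility": "B", "chief_complaint": "old GSW, wound check"},
    {"date": "2014-07-05", "facility": "B", "chief_complaint": "GSW abdomen"},
]
clusters = largest_clusters(visits, "facility")
```

Running the same function with a zip code or county field as `key` would give the other two cluster types described in the abstract.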


Online Journal of Public Health Informatics | 2016

Enhancing EpiCenter Data Quality Analytics with R

Andrew Walsh

The R ecosystem provides a wide range of analytic methods. Previous efforts to integrate R in EpiCenter were hindered by the state of the R/Java interface. By using the PL/R extension to make R available to EpiCenter via the PostgreSQL backend, this limitation can be avoided. Regression models were used to improve data quality analytics options, demonstrating the advantages and potential of making R available within EpiCenter.


Online Journal of Public Health Informatics | 2016

Automating Ambulatory Practice Surveillance for Influenza-Like Illness

Andrew Walsh

ILINet data is a central element of influenza surveillance, but data collection is resource-intensive. Increasingly, ambulatory practices are submitting data automatically to syndromic surveillance systems. Because automated submission reduces the burden on practices, these syndromic surveillance feeds could potentially provide data to ILINet for a larger number of practices. This work demonstrates that syndromic surveillance data can show trends comparable to existing ILINet data. However, some allowances in ILI definition need to be made to account for symptom summarization by registrars.


Online Journal of Public Health Informatics | 2015

Game Plan: Communicable Disease Surveillance for Super Bowl XLVIII – New Jersey, 2014.

Teresa Hamby; Andrew Walsh; Lisa McHugh; Stella Tsai; Edward Lifshitz

This oral presentation will describe the surveillance planning and activities for a large-scale event (Super Bowl XLVIII) using New Jersey's syndromic surveillance system (EpiCenter).


Online Journal of Public Health Informatics | 2015

Using Ambulatory Syndromic Surveillance Data for Chronic Disease: A BMI Case Study

Andrew Walsh

Ambulatory practice syndromic surveillance data needs to demonstrate utility beyond infectious disease outbreak detection to warrant integration into existing systems. The nature of ambulatory practice care makes it well suited for monitoring health domains not covered by emergency departments. This project demonstrates collection of height and weight measurements from ambulatory practice syndromic surveillance data. These data are used to calculate patient BMI, an important risk factor for many chronic diseases. This work is presented as a proof-of-principle for applying syndromic surveillance data to additional health domains.
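The BMI calculation itself is standard: weight in kilograms divided by the square of height in meters. A minimal Python sketch follows. The function names are hypothetical, and the units in which the surveillance feed reports height and weight are not stated in the abstract, so the US-unit helper is an assumption added for illustration.

```python
LB_TO_KG = 0.45359237  # exact definition of the avoirdupois pound
IN_TO_M = 0.0254       # exact definition of the inch


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_from_us_units(weight_lb, height_in):
    """BMI from pounds and inches, converting to metric first."""
    return bmi(weight_lb * LB_TO_KG, height_in * IN_TO_M)
```

In practice, measurements pulled from free-text or loosely structured fields would likely also need plausibility checks (for example, rejecting heights recorded in the wrong unit) before BMI is computed.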


Online Journal of Public Health Informatics | 2013

Paralysis Analysis: Investigating Paralysis Visit Anomalies in New Jersey

Teresa Hamby; Stella Tsai; Carol Genese; Andrew Walsh; Lauren Bradford; Edward Lifshitz

Objective: To describe the investigation of a statewide anomaly detected by a newly established state syndromic surveillance system and usage of that system.

Introduction: On July 11, 2012, New Jersey Department of Health (DOH) Communicable Disease Service (CDS) surveillance staff received email notification of a statewide anomaly in EpiCenter for Paralysis. Two additional anomalies followed within three hours. Since Paralysis anomalies are uncommon, staff initiated an investigation to determine if there was an outbreak or other event of concern taking place. Also in question was whether receipt of multiple anomalies in such a short time span was statistically or epidemiologically significant.

Methods: In New Jersey, 68 of 81 total acute care and satellite Emergency Departments (EDs) are connected to EpiCenter, an online syndromic surveillance system developed by Health Monitoring Systems, Inc. (HMS) that incorporates statistical management and analytical techniques to process health-related data in real time. Chief complaint text is classified, using text recognition methods, into various public health-related and other categories. Anomalies occur when any of several statistical methods detect increases in incoming data that are outside of established thresholds. After receiving three anomaly notifications related to Paralysis in a 4-hour time period, NJDOH surveillance data staff enlisted CDS and local epidemiologist colleagues to review the data and determine if there was an infectious cause.

Results: The first EpiCenter anomaly notification was received on July 11, 2012 at 1:22 pm as a result of increased ED visits classified as Paralysis based on facility location for the period beginning at noon on July 10, 2012. Using Cusum EMA analysis, 76 reported interactions exceeded the predicted value of 50.49 and the threshold of 70.72. The second anomaly, also based on facility location, was received at 3:20 pm and the third anomaly notification, based on home location, was received at 4:32 pm. Cusum EMA and Exponential Moving Average analysis methods detected these anomalies. Table 1 describes the anomalies in more detail. Compiled data from all anomalies were reviewed by CDS epidemiology and surveillance staff to determine whether there was a public health event taking place. A total of 89 patients were seen in 39 (57%) of the 68 NJ facilities reporting to EpiCenter with no geographic centralization. Age and gender of patients were reviewed with no clear pattern discerned. Figure 1 shows the time distribution of these visits. Upon further investigation, it was determined that a moderate increase in Paralysis visits over a relatively short time span was sufficient to create an anomaly under the default threshold for those visits. Multiple analysis methods created multiple anomalies, which gave the impression that the event was of greater significance than a single anomaly would have suggested. To follow up, NJDOH requested that local epidemiologists investigate within their jurisdictions by contacting hospitals directly where EpiCenter data proved inconclusive. Their reports confirmed NJDOH’s findings that the anomalies did not signal an event of public health concern.

Conclusions: This investigation of three Paralysis anomalies is an important introduction to the newly implemented system’s capabilities in anomaly detection, and also to anomaly investigation procedures developed by NJDOH for local surveillance staff. As a result of this experience, these anomaly investigation procedures are being fine-tuned. The fact that these sequential anomalies resulted in an investigation being undertaken highlights the importance of setting investigation-generating alert thresholds within EpiCenter at a level that will minimize “false” positives without risking the missing of “true” positives.
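The general idea of comparing an observed count to a prediction and a threshold, as in the first anomaly (76 observed against a predicted 50.49 and a threshold of 70.72), can be sketched in Python. This is a plain exponential moving average with a standard-deviation band, not EpiCenter's Cusum EMA method, whose details the abstract does not give. The function names, the smoothing factor `alpha`, the multiplier `k`, and the sample history are all assumptions for illustration.

```python
import statistics


def ema(values, alpha=0.3):
    """Exponentially weighted moving average of a non-empty sequence."""
    smoothed = values[0]
    for v in values[1:]:
        smoothed = alpha * v + (1 - alpha) * smoothed
    return smoothed


def check_anomaly(history, observed, alpha=0.3, k=3.0):
    """Flag `observed` if it exceeds the EMA prediction by k standard deviations.

    history: past counts for the same classification (at least two values).
    """
    predicted = ema(history, alpha)
    threshold = predicted + k * statistics.stdev(history)
    return {
        "predicted": predicted,
        "threshold": threshold,
        "anomaly": observed > threshold,
    }


# Made-up daily counts hovering around 50, for demonstration only.
history = [50, 52, 48, 51, 49, 50, 53, 47]
high = check_anomaly(history, 76)
normal = check_anomaly(history, 52)
```

A tighter band (smaller `k`) raises more alerts and a wider one fewer, which is the false-positive versus missed-event trade-off the Conclusions describe.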


Online Journal of Public Health Informatics | 2013

Evaluation of Heat-related Illness Surveillance Based on Chief Complaint Data from New Jersey Hospital Emergency Rooms

Michael Berry; Jerald Fagliano; Stella Tsai; Katharine McGreevy; Andrew Walsh; Teresa Hamby


Online Journal of Public Health Informatics | 2014

Identifying Clusters of Rare and Novel Words in Emergency Department Chief Complaints.

Andrew Walsh; Teresa Hamby; Tonya Lowery St. John


Online Journal of Public Health Informatics | 2018

Assessing Prior Pain Visits and Medical History Risk Factors for Opioid Overdose

Andrew Walsh

Collaboration

Top Co-Authors

Teresa Hamby, New Jersey Department of Health and Senior Services
Stella Tsai, New Jersey Department of Health and Senior Services
Tonya Lowery St. John, Oklahoma State Department of Health
Victor Pomary, New Jersey Department of Health and Senior Services