Publication


Featured research published by Matthew Scotch.


Journal of Biomedical Informatics | 2008

Methodological Review: HCLS 2.0/3.0: Health care and life sciences data mashup using Web 2.0/3.0

Kei-Hoi Cheung; Kevin Y. Yip; Jeffrey P. Townsend; Matthew Scotch

We describe the potential of current Web 2.0 technologies to achieve data mashup in the health care and life sciences (HCLS) domains, and compare that potential to the nascent trend of performing semantic mashup. After providing an overview of Web 2.0, we demonstrate two scenarios of data mashup, facilitated by the following Web 2.0 tools and sites: Yahoo! Pipes, Dapper, Google Maps and GeoCommons. In the first scenario, we exploited Dapper and Yahoo! Pipes to implement a challenging data integration task in the context of DNA microarray research. In the second scenario, we exploited Yahoo! Pipes, Google Maps, and GeoCommons to create a geographic information system (GIS) interface that allows visualization and integration of diverse categories of public health data, including cancer incidence and pollution prevalence data. Based on these two scenarios, we discuss the strengths and weaknesses of these Web 2.0 mashup technologies. We then describe Semantic Web, the mainstream Web 3.0 technology that enables more powerful data integration over the Web. We discuss the areas of intersection of Web 2.0 and Semantic Web, and describe the potential benefits that can be brought to HCLS research by combining these two sets of technologies.


Journal of the American Medical Informatics Association | 2011

The Yale cTAKES extensions for document classification: architecture and application.

Vijay Garla; Vincent Lo Re; Zachariah Dorey-Stein; Farah Kidwai; Matthew Scotch; Julie A. Womack; Amy C. Justice; Cynthia Brandt

BACKGROUND Open-source clinical natural-language-processing (NLP) systems have lowered the barrier to the development of effective clinical document classification systems. Clinical natural-language-processing systems annotate the syntax and semantics of clinical text; however, feature extraction and representation for document classification pose technical challenges. METHODS The authors developed extensions to the clinical Text Analysis and Knowledge Extraction System (cTAKES) that simplify feature extraction, experimentation with various feature representations, and the development of both rule-based and machine-learning-based document classifiers. The authors describe and evaluate their system, the Yale cTAKES Extensions (YTEX), on the classification of radiology reports that contain findings suggestive of hepatic decompensation. RESULTS AND DISCUSSION The F1 score of the system for the retrieval of abdominal radiology reports was 96%, and was 79%, 91%, and 95% for the presence of liver masses, ascites, and varices, respectively. The authors released YTEX as open source, available at http://code.google.com/p/ytex.


Hawaii International Conference on System Sciences | 2005

SOVAT: Spatial OLAP Visualization and Analysis Tool

Matthew Scotch; Bambang Parmanto

Community health research is practiced with disparate, independently used tools that by themselves do not allow for the type of comprehensive and thorough analysis needed for effective public health evaluation. The Spatial OLAP Visualization and Analysis Tool (SOVAT) is a new type of research application for community health assessments. SOVAT integrates into one system many of the characteristics needed to make comprehensive community health decisions. By combining On-Line Analytical Processing (OLAP) with Geospatial Information System (GIS) capabilities, our system can handle large amounts of data, perform geospatial and statistical calculations, and then display this information in both a numerical and spatial view within the same interface. It is anticipated that this unique system will provide researchers with the ability to perform more comprehensive assessments while enabling more informed public health decisions.


BMC Bioinformatics | 2014

Comparison of ARIMA and Random Forest time series models for prediction of avian influenza H5N1 outbreaks

Michael J. Kane; Natalie Price; Matthew Scotch; Peter M. Rabinowitz

Background: Time series models can play an important role in disease prediction. Incidence data can be used to predict the future occurrence of disease events. Developments in modeling approaches provide an opportunity to compare different time series models for predictive power. Results: We applied ARIMA and Random Forest time series models to incidence data of outbreaks of highly pathogenic avian influenza (H5N1) in Egypt, available through the online EMPRES-I system. We found that the Random Forest model outperformed the ARIMA model in predictive ability. Furthermore, we found that the Random Forest model is effective for predicting outbreaks of H5N1 in Egypt. Conclusions: Random Forest time series modeling provides enhanced predictive ability over existing time series models for the prediction of infectious disease outbreaks. This result, along with those showing the concordance between bird and human outbreaks (Rabinowitz et al. 2012), provides a new approach to predicting these dangerous outbreaks in bird populations based on existing, freely available data. Our analysis uncovers the time-series structure of outbreak severity for highly pathogenic avian influenza (H5N1) in Egypt.
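The abstract does not say how the incidence series was presented to the Random Forest model. One common approach, assumed here purely for illustration, is to turn the series into lagged predictor rows so that a tabular learner can be fitted to it. The function name `make_lagged` and the sample counts are hypothetical, not taken from the paper.

```python
from typing import List, Tuple


def make_lagged(series: List[float], n_lags: int) -> Tuple[List[List[float]], List[float]]:
    """Build (X, y) where each row of X holds the n_lags values preceding y."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    X: List[List[float]] = []
    y: List[float] = []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # the n_lags observations before time t
        y.append(series[t])             # the value to predict at time t
    return X, y


# Made-up outbreak counts per period, for illustration only.
counts = [3, 5, 2, 0, 1, 4]
X, y = make_lagged(counts, n_lags=2)
print(X)
print(y)
```

The resulting rows could then be passed to any regression learner; an ARIMA model, by contrast, is fitted to the series directly.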


International Journal of Medical Informatics | 2006

Development of SOVAT: A numerical-spatial decision support system for community health assessment research

Matthew Scotch; Bambang Parmanto

INTRODUCTION The development of numerical-spatial routines is frequently required to solve complex community health problems. Community health assessment (CHA) professionals who use information technology need a complete system that is capable of supporting the development of numerical-spatial routines. BACKGROUND Currently, there is no decision support system (DSS) that is effectively able to accomplish this task, as the majority of public health geospatial information systems (GIS) are based on traditional (relational) database architecture. On-Line Analytical Processing (OLAP) is a multidimensional data warehouse technique that is commonly used as a decision support system in standard industry. OLAP alone is not sufficient for solving numerical-spatial problems that frequently occur in CHA research. Coupling it with GIS technology offers the potential for a very powerful and useful system. METHODOLOGY A community health OLAP cube was created by integrating health and population data from various sources. OLAP and GIS technologies were then combined to develop the Spatial OLAP Visualization and Analysis Tool (SOVAT). RESULTS The synergy of numerical and spatial environments within SOVAT is shown through an elaborate and easy-to-use drag-and-drop and direct manipulation graphical user interface (GUI). Community health problem-solving examples (routines) using SOVAT are shown through a series of screen shots. DISCUSSION The impact of the difference between SOVAT and existing GIS public health applications can be seen by considering the numerical-spatial problem-solving examples. These examples are facilitated using OLAP-GIS functions. These functions can be mimicked in existing GIS public health applications, but their performance and system response would be significantly worse since GIS is based on a traditional (relational) backend. CONCLUSION OLAP-GIS systems offer great potential for powerful numerical-spatial decision support in community health analysis. The functionality of an OLAP-GIS system has been shown through a series of example community health numerical-spatial problems. Efforts are now focused on determining its usability during human-computer interaction (HCI). Later work will focus on performing summative evaluations comparing SOVAT to existing decision support tools used during community health assessment research.
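To make the idea of a multidimensional cube concrete, the sketch below shows a roll-up, the basic OLAP operation of aggregating a measure over the dimensions that are dropped. It is a toy illustration, not SOVAT's implementation: the dimension names, the `roll_up` helper, and every row in `FACTS` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Dimensions of the toy cube; the last field of each fact row is the measure.
DIMS = ("county", "year", "cause")

# Hypothetical fact rows (county, year, cause, cases). Illustrative values only.
FACTS: List[Tuple[str, int, str, int]] = [
    ("A", 2000, "cancer", 10),
    ("A", 2001, "cancer", 12),
    ("B", 2000, "cancer", 7),
    ("B", 2000, "asthma", 4),
]


def roll_up(facts: Sequence[tuple], keep: Sequence[str]) -> Dict[tuple, int]:
    """Sum the measure over every dimension not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals: Dict[tuple, int] = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]
    return dict(totals)


by_county = roll_up(FACTS, ["county"])
by_county_cause = roll_up(FACTS, ["county", "cause"])
print(by_county)
print(by_county_cause)
```

In an OLAP-GIS setting, a roll-up keyed on a geographic dimension such as `county` is the kind of result that could be drawn on a map alongside its numerical view.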


Medical Care | 2010

Rural Residence Is Associated With Delayed Care Entry and Increased Mortality Among Veterans With Human Immunodeficiency Virus Infection

Michael E. Ohl; Janet P. Tate; Mona Duggal; Melissa Skanderson; Matthew Scotch; Peter J. Kaboli; Mary Vaughan-Sarrazin; Amy C. Justice

Context: Rural persons with human immunodeficiency virus (HIV) face many barriers to care, but little is known about rural-urban variation in HIV outcomes. Objective: To determine the association between rural residence and HIV outcomes. Design, Setting, and Patients: Retrospective cohort study of mortality among persons initiating HIV care in Veterans Administration (VA) during 1998–2006, with mortality follow-up through 2008. Rural residence was determined using Rural Urban Commuting Area codes. We identified 8489 persons initiating HIV care in VA with no evidence of combination antiretroviral therapy (cART) use at care entry, of whom 705 (8.3%) were rural. Outcome Measure: All-cause mortality. Results: At care entry, rural persons were less likely than urban persons to have drug use problems (10.6% vs. 19.5%, P < 0.001) or hepatitis C (34.3% vs. 41.2%, P = 0.001), but had more advanced HIV infection (median CD4: 186 vs. 246, P < 0.001). By 2 years after care entry, 5874 persons had initiated cART (528 rural [74.9%] and 5346 urban [68.7%], P = 0.001), and there were 1022 deaths (108 rural [15.3%] and 914 urban [11.7%], P = 0.004). The mortality hazard ratio for rural persons compared with urban was 1.34 (95% confidence interval: 1.05–1.69). The hazard ratio decreased to 1.18 (95% confidence interval: 0.93–1.50) after adjustment for HIV severity (CD4 and AIDS-defining illnesses) at care entry, and was 1.17 (95% confidence interval: 0.92–1.50) in a model adjusting for age, HIV severity at care entry, substance use, hepatitis B or C diagnoses, and cART initiation. Conclusions: Later entry into care drives increased mortality for rural compared with urban veterans with HIV. Future studies should explore the person, care system, and community-level determinants of late care entry for rural persons with HIV.
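The percentages in the abstract can be reproduced from the reported counts. The short check below uses only numbers stated above; the urban denominator is derived as total minus rural, which is an assumption since the abstract does not state it. The hazard ratios come from a survival model and cannot be recomputed from these counts, so they are not part of the check.

```python
# Counts as reported in the abstract.
total = 8489
rural = 705
urban = total - rural  # derived here; not stated explicitly in the abstract


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


rural_share = pct(rural, total)
cart = {"rural": pct(528, rural), "urban": pct(5346, urban)}
deaths = {"rural": pct(108, rural), "urban": pct(914, urban)}

print(rural_share)  # share of the cohort that was rural
print(cart)         # cART initiation by 2 years after care entry
print(deaths)       # deaths by 2 years after care entry
```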


International Journal of Health Geographics | 2009

Risk factors for human infection with West Nile Virus in Connecticut: a multi-year analysis

Ann Liu; Vivian Lee; Deron Galusha; Martin D. Slade; Maria A. Diuk-Wasser; Theodore G. Andreadis; Matthew Scotch; Peter M. Rabinowitz

Background: The optimal method for early prediction of human West Nile virus (WNV) infection risk remains controversial. We analyzed the predictive utility of risk factor data for human WNV over a six-year period in Connecticut. Results and Discussion: Using only environmental variables or animal sentinel data was less predictive than a model that considered all variables. In the final parsimonious model, population density, growing degree-days, temperature, WNV positive mosquitoes, dead birds and WNV positive birds were significant predictors of human infection risk, with an ROC value of 0.75. Conclusion: A real-time model using climate, land use, and animal surveillance data to predict WNV risk appears feasible. The dynamic patterns of WNV infection suggest a need to periodically refine such prediction systems. Methods: Using multiple logistic regression, the 30-day risk of human WNV infection by town was modeled using environmental variables as well as mosquito and wild bird surveillance.
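Assuming the "ROC value" above refers to the area under the ROC curve (AUC), it can be read as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative one. The sketch below computes AUC directly from that definition; the `roc_auc` helper and the four labelled scores are made up for illustration and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks for four town-periods (1 = human case occurred).
labels = [1, 1, 0, 0]
scores = [0.8, 0.4, 0.5, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

This pairwise form is quadratic in the number of cases, which is fine for a small illustration; rank-based formulas give the same value more efficiently on large datasets.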


Journal of Medical Internet Research | 2010

Biomedical Informatics Techniques for Processing and Analyzing Web Blogs of Military Service Members

Sergiy Konovalov; Matthew Scotch; Lori A. Post; Cynthia Brandt

Introduction Web logs (“blogs”) have become a popular mechanism for people to express their daily thoughts, feelings, and emotions. Many of these expressions contain health care-related themes, both physical and mental, similar to information discussed during a clinical interview or medical consultation. Thus, some of the information contained in blogs might be important for health care research, especially in mental health where stress-related conditions may be difficult and expensive to diagnose and where early recognition is often key to successful treatment. In the field of biomedical informatics, techniques such as information retrieval (IR) and natural language processing (NLP) are often used to unlock information contained in free-text notes. These methods might assist the clinical research community to better understand feelings and emotions post deployment and the burden of symptoms of stress among US military service members. Methods In total, 90 military blog posts describing deployment situations and 60 control posts of Operation Enduring Freedom/Operation Iraqi Freedom (OEF/OIF) were collected. After “stop” word exclusion and stemming, a “bag-of-words” representation and term weighting were performed, and the most relevant words were manually selected out of the high-weight words. A pilot ontology was created using Collaborative Protégé, a knowledge management application. The word lists and the ontology were then used within General Architecture for Text Engineering (GATE), an NLP framework, to create an automated pipeline for recognition and analysis of blogs related to combat exposure. An independent expert opinion was used to create a reference standard and evaluate the results of the GATE pipeline. Results The 2 dimensions of combat exposure descriptors identified were: words dealing with physical exposure and the soldiers’ emotional reactions to it. The GATE pipeline was able to retrieve blog texts describing combat exposure with precision 0.9, recall 0.75, and F-score 0.82. Discussion Natural language processing and automated information retrieval might potentially provide valuable tools for retrieving and analyzing military blog posts and uncovering military service members’ emotions and experiences of combat exposure.
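The reported F-score is consistent with the stated precision and recall if the balanced F1 measure (the harmonic mean of the two) is assumed; the abstract does not name the variant. The `f_score` helper below is an illustrative sketch of that calculation.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision <= 0.0 and recall <= 0.0:
        return 0.0  # avoid dividing by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract.
reported = f_score(0.9, 0.75)
print(round(reported, 2))
```

With these inputs the value is 1.35 / 1.65, which rounds to the 0.82 given in the abstract.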


Journal of Biomedical Informatics | 2011

Enhancing phylogeography by improving geographical information from GenBank

Matthew Scotch; Indra Neil Sarkar; Changjiang Mei; Robert Leaman; Kei-Hoi Cheung; Pierina Ortiz; Ashutosh Singraur; Graciela Gonzalez

Phylogeography is a field that focuses on the geographical lineages of species such as vertebrates or viruses. Here, geographical data, such as the location of a species or viral host, is as important as the sequence information extracted from the species. Together, this information can help illustrate the migration of the species over time within a geographical area, the impact of geography over the evolutionary history, or the expected population of the species within the area. Molecular sequence data from NCBI, specifically GenBank, provide an abundance of available sequence data for phylogeography. However, geographical data is inconsistently represented and sparse across GenBank entries. This can impede analysis and, in situations where the geographical information is inferred, potentially lead to erroneous results. In this paper, we describe the current state of geographical data in GenBank, and illustrate how automated processing techniques such as named entity recognition can enhance the geographical data available for phylogeographic studies.


PLOS ONE | 2012

Comparison of Human and Animal Surveillance Data for H5N1 Influenza A in Egypt 2006–2011

Peter M. Rabinowitz; Deron Galusha; Sally Vegso; Jennifer Michalove; Seppo T. Rinne; Matthew Scotch; Michael J. Kane

Background The majority of emerging infectious diseases are zoonotic (transmissible between animals and humans) in origin, and therefore integrated surveillance of disease events in humans and animals has been recommended to support effective global response to disease emergence. While in the past decade there has been extensive global surveillance for highly pathogenic avian influenza (HPAI) infection in both animals and humans, there have been few attempts to compare these data streams and evaluate the utility of such integration. Methodology We compared reports of bird outbreaks of HPAI H5N1 in Egypt for 2006–2011 compiled by the World Organisation for Animal Health (OIE) and the UN Food and Agriculture Organization (FAO) EMPRESi reporting system with confirmed human H5N1 cases reported to the World Health Organization (WHO) for Egypt during the same time period. Principal Findings Both human cases and bird outbreaks showed a cyclic pattern for the country as a whole, and there was a statistically significant temporal correlation between the data streams. At the governorate level, the first outbreak in birds in a season usually but not always preceded the first human case, and the time lag between events varied widely, suggesting regional differences in zoonotic risk and/or surveillance effectiveness. In a multivariate risk model, lower temperature, lower urbanization, higher poultry density, and the recent occurrence of a bird outbreak were associated with increased risk of a human case of HPAI in the same governorate, although the positive predictive value of a bird outbreak was low. Conclusions Integrating data streams of surveillance for human and animal cases of zoonotic disease holds promise for better prediction of disease risk and identification of environmental and regional factors that can affect risk. Such efforts can also point out gaps in human and animal surveillance systems and generate hypotheses regarding disease transmission.

Collaboration


Top Co-Authors

Daniel Magee | Arizona State University

Graciela Gonzalez | University of Pennsylvania

Rachel Beard | Arizona State University

Tasnia Tahsin | Arizona State University

Changjiang Mei | Arizona State University