Publications


Featured research published by Marcus A. Badgeley.


Briefings in Bioinformatics | 2017

Translational bioinformatics in the era of real-time biomedical, health care and wellness data streams

Khader Shameer; Marcus A. Badgeley; Riccardo Miotto; Benjamin S. Glicksberg; Joseph W. Morgan; Joel T. Dudley

Monitoring and modeling biomedical, health care and wellness data from individuals and converging data on a population scale have tremendous potential to improve understanding of the transition from the healthy state of human physiology to the disease setting. Wellness monitoring devices and companion software applications capable of generating alerts and sharing data with health care providers or social networks are now available. The accessibility and clinical utility of such data for disease or wellness research are currently limited. Designing methods for streaming data capture, real‐time data aggregation, machine learning, predictive analytics and visualization solutions to integrate wellness or health monitoring data elements with the electronic medical records (EMRs) maintained by health care providers permits better utilization. Integration of population‐scale biomedical, health care and wellness data would help to stratify patients for active health management and to understand clinically asymptomatic patients and underlying illness trajectories. In this article, we discuss various health‐monitoring devices, their ability to capture the unique state of health represented in a patient and their application in individualized diagnostics, prognosis, clinical or wellness intervention. We also discuss examples of translational bioinformatics approaches to integrating patient‐generated data with existing EMRs, personal health records, patient portals and clinical data repositories. Briefly, translational bioinformatics methods, tools and resources are at the center of these advances in implementing real‐time biomedical and health care analytics in the clinical setting. Furthermore, these advances are poised to play a significant role in clinical decision‐making and implementation of data‐driven medicine and wellness care.


Nature Biotechnology | 2017

The Asthma Mobile Health Study, a large-scale clinical observational study using ResearchKit

Yu-Feng Yvonne Chan; Pei Wang; Linda Rogers; Nicole Tignor; Micol Zweig; Steven Gregory Hershman; Nicholas Genes; Erick R. Scott; Eric Krock; Marcus A. Badgeley; Ron Edgar; Samantha Violante; Rosalind J. Wright; Charles A. Powell; Joel T. Dudley; Eric E. Schadt

The feasibility of using mobile health applications to conduct observational clinical studies requires rigorous validation. Here, we report initial findings from the Asthma Mobile Health Study, a research study, including recruitment, consent, and enrollment, conducted entirely remotely by smartphone. We achieved secure bidirectional data flow between investigators and 7,593 participants from across the United States, including many with severe asthma. Our platform enabled prospective collection of longitudinal, multidimensional data (e.g., surveys, devices, geolocation, and air quality) in a subset of users over the 6-month study period. Consistent trending and correlation of interrelated variables support the quality of data obtained via this method. We detected increased reporting of asthma symptoms in regions affected by heat, pollen, and wildfires. Potential challenges with this technology include selection bias, low retention rates, reporting bias, and data security. These issues require attention to realize the full potential of mobile platforms in research and patient care.


BMJ Open | 2016

EHDViz: clinical dashboard development using open-source technologies

Marcus A. Badgeley; Khader Shameer; Benjamin S. Glicksberg; Max S Tomlinson; Patrick J. McCormick; Andrew Kasarskis; David L. Reich; Joel T. Dudley

Objective: To design, develop and prototype clinical dashboards to integrate high-frequency health and wellness data streams using interactive and real-time data visualisation and analytics modalities.
Materials and methods: We developed a clinical dashboard development framework called the electronic healthcare data visualization (EHDViz) toolkit for generating web-based, real-time clinical dashboards for visualising heterogeneous biomedical, healthcare and wellness data. EHDViz is an extensible toolkit that uses R packages for data management, normalisation and producing high-quality visualisations over the web using the R/Shiny web server architecture. We have developed use cases to illustrate the utility of EHDViz in different clinical and wellness settings as a visualisation aid for improving healthcare delivery.
Results: Using EHDViz, we prototyped clinical dashboards to demonstrate the contextual versatility of the EHDViz toolkit. An outpatient cohort was used to visualise population health management tasks (n=14 221), an inpatient cohort was used to visualise real-time acuity risk in a clinical unit (n=445), and a quantified-self example using wellness data from a fitness activity monitor worn by a single individual was also discussed (n-of-1). The back-end system retrieves relevant data from the data source, populates the main panel of the application, integrates user-defined data features in real time and renders output using modern web browsers. The visualisation elements can be customised using health features, disease names, procedure names or medical codes to populate the visualisations. The source code of EHDViz and various prototypes developed using EHDViz are available in the public domain at http://ehdviz.dudleylab.org.
Conclusions: Collaborative data visualisations, wellness trend predictions, risk estimation, proactive acuity status monitoring and knowledge of complex disease indicators are essential components of implementing data-driven precision medicine. As an open-source visualisation framework capable of integrating health assessment, EHDViz aims to be a valuable toolkit for rapid design, development and implementation of scalable clinical data visualisation dashboards.


Bioinformatics | 2016

Comparative analyses of population-scale phenomic data in electronic medical records reveal race-specific disease networks

Benjamin S. Glicksberg; Li Li; Marcus A. Badgeley; Khader Shameer; Roman Kosoy; Noam D. Beckmann; Nam H. Pho; Jörg Hakenberg; Meng Ma; Kristin L. Ayers; Gabriel E. Hoffman; Shuyu Dan Li; Eric E. Schadt; Chirag Patel; Rong Chen; Joel T. Dudley

Motivation: Underrepresentation of racial groups represents an important challenge and major gap in phenomics research. Most of the current human phenomics research is based primarily on European populations; hence it is an important challenge to expand it to consider other population groups. One approach is to utilize data from EMR databases that contain patient data from diverse demographics and ancestries. The implications of this racial underrepresentation of data can be profound regarding effects on the healthcare delivery and actionability. To the best of our knowledge, our work is the first attempt to perform comparative, population-scale analyses of disease networks across three different populations, namely Caucasian (EA), African American (AA) and Hispanic/Latino (HL).
Results: We compared susceptibility profiles and temporal connectivity patterns for 1988 diseases and 37 282 disease pairs represented in a clinical population of 1 025 573 patients. Accordingly, we revealed appreciable differences in disease susceptibility, temporal patterns, network structure and underlying disease connections between EA, AA and HL populations. We found 2158 significantly comorbid diseases for the EA cohort, 3265 for AA and 672 for HL. We further outlined key disease pair associations unique to each population as well as categorical enrichments of these pairs. Finally, we identified 51 key ‘hub’ diseases that are the focal points in the race-centric networks and of particular clinical importance. Incorporating race-specific disease comorbidity patterns will produce a more accurate and complete picture of the disease landscape overall and could support more precise understanding of disease relationships and patient management towards improved clinical outcomes.
Supplementary information: Supplementary data are available at Bioinformatics online.
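The abstract does not say which statistic was used to call a disease pair comorbid. As a minimal sketch only, the Python below assumes a relative-risk measure, RR = (C_ab × N) / (P_a × P_b), computed separately for each cohort. The function name, the disease codes and the toy cohort are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def comorbidity_relative_risk(patients):
    """Relative risk for every disease pair seen together in at least one patient.

    patients: list of sets, one set of disease codes per patient.
    RR = (C_ab * N) / (P_a * P_b), where C_ab is the number of patients with
    both diseases, P_a and P_b are the single-disease counts and N is the
    cohort size. This statistic is an assumption, not the paper's stated method.
    """
    n = len(patients)
    prevalence = Counter()
    pair_counts = Counter()
    for diseases in patients:
        prevalence.update(diseases)
        # Sort so each unordered pair maps to a single dictionary key.
        pair_counts.update(combinations(sorted(diseases), 2))
    return {
        (a, b): (count * n) / (prevalence[a] * prevalence[b])
        for (a, b), count in pair_counts.items()
    }


# Hypothetical toy cohort; real input would be one list per population group.
toy_cohort = [{"A", "B"}, {"A", "B"}, {"A"}, {"C"}]
toy_rr = comorbidity_relative_risk(toy_cohort)
print(toy_rr)
```

Running the same function on each population's patient list and comparing the resulting dictionaries gives a rough picture of pairs that are enriched in one cohort but not another. A real analysis would also need a significance test and multiple-testing correction.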


Briefings in Bioinformatics | 2018

Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning

Khader Shameer; Benjamin S. Glicksberg; Rachel Hodos; Kipp W. Johnson; Marcus A. Badgeley; Ben Readhead; Max S Tomlinson; Timothy O’Connor; Riccardo Miotto; Brian Kidd; Rong Chen; Avi Ma’ayan; Joel T. Dudley

Increase in global population and growing disease burden due to the emergence of infectious diseases (Zika virus), multidrug‐resistant pathogens, drug‐resistant cancers (cisplatin‐resistant ovarian cancer) and chronic diseases (arterial hypertension) necessitate effective therapies to improve health outcomes. However, the rapid increase in drug development cost demands innovative and sustainable drug discovery approaches. Drug repositioning, the discovery of new or improved therapies by reevaluation of approved or investigational compounds, solves a significant gap in the public health setting and improves the productivity of drug development. As the number of drug repurposing investigations increases, a new opportunity has emerged to understand factors driving drug repositioning through systematic analyses of drugs, drug targets and associated disease indications. However, such analyses have so far been hampered by the lack of a centralized knowledgebase, benchmarking data sets and reporting standards. To address these knowledge and clinical needs, here, we present RepurposeDB, a collection of repurposed drugs, drug targets and diseases, which was assembled, indexed and annotated from public data. RepurposeDB combines information on 253 drugs [small molecules (74.30%) and protein drugs (25.29%)] and 1125 diseases. Using RepurposeDB data, we identified pharmacological (chemical descriptors, physicochemical features and absorption, distribution, metabolism, excretion and toxicity properties), biological (protein domains, functional process, molecular mechanisms and pathway cross talks) and epidemiological (shared genetic architectures, disease comorbidities and clinical phenotype similarities) factors mediating drug repositioning. Collectively, RepurposeDB is developed as the reference database for drug repositioning investigations. The pharmacological, biological and epidemiological principles of drug repositioning identified from the meta‐analyses could augment therapeutic development.


Bioinformatics | 2015

Hybrid Bayesian-Rank Integration Approach Improves the Predictive Power of Genomic Dataset Aggregation

Marcus A. Badgeley; Stuart C. Sealfon; Maria Chikina

Motivation: Modern molecular technologies allow the collection of large amounts of high-throughput data on the functional attributes of genes. Often multiple technologies and study designs are used to address the same biological question such as which genes are overexpressed in a specific disease state. Consequently, there is considerable interest in methods that can integrate across datasets to present a unified set of predictions.
Results: An important aspect of data integration is being able to account for the fact that datasets may differ in how accurately they capture the biological signal of interest. While many methods to address this problem exist, they always rely either on dataset internal statistics, which reflect data structure and not necessarily biological relevance, or external gold standards, which may not always be available. We present a new rank aggregation method for data integration that requires neither external standards nor internal statistics but relies on Bayesian reasoning to assess dataset relevance. We demonstrate that our method outperforms established techniques and significantly improves the predictive power of rank-based aggregations. We show that our method, which does not require an external gold standard, provides reliable estimates of dataset relevance and allows the same set of data to be integrated differently depending on the specific signal of interest.
Availability: The method is implemented in R and is freely available at http://www.pitt.edu/~mchikina/BIRRA/.
Supplementary information: Supplementary data are available at Bioinformatics online.
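To make the idea of rank aggregation concrete, here is a minimal Python sketch of a weighted mean-rank baseline. It is not the hybrid Bayesian-rank method described in the abstract, which is implemented in R and estimates dataset relevance itself. In this sketch the per-dataset weights are supplied by hand, and the function name and gene labels are hypothetical.

```python
def mean_rank_aggregate(rankings, weights=None):
    """Combine several best-first rankings of the same items by weighted mean rank.

    rankings: list of lists, each an ordering of the same items (best first).
    weights:  optional per-ranking relevance weights; equal weights by default.
    Returns the items sorted by aggregate rank, with ties broken by item name.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    total = sum(weights)
    score = {item: 0.0 for item in rankings[0]}
    for ranking, weight in zip(rankings, weights):
        for position, item in enumerate(ranking, start=1):
            score[item] += weight * position / total
    return sorted(score, key=lambda item: (score[item], item))


# Three hypothetical datasets ranking the same three genes.
gene_rankings = [
    ["g1", "g2", "g3"],
    ["g2", "g1", "g3"],
    ["g1", "g3", "g2"],
]
equal_weight_order = mean_rank_aggregate(gene_rankings)
second_only_order = mean_rank_aggregate(gene_rankings, weights=[0.0, 1.0, 0.0])
print(equal_weight_order, second_only_order)
```

Changing the weights changes the aggregate ordering. This is the property the abstract points to when it says the same data can be integrated differently depending on the signal of interest.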


Radiology | 2018

Natural Language–based Machine Learning Models for the Annotation of Clinical Radiology Reports

John Zech; Margaret Pain; J. Titano; Marcus A. Badgeley; Javin Schefflein; Andres Su; Anthony B. Costa; Joshua B. Bederson; Joseph Lehar; Eric K. Oermann

Purpose: To compare different methods for generating features from radiology reports and to develop a method to automatically identify findings in these reports.
Materials and Methods: In this study, 96 303 head computed tomography (CT) reports were obtained. The linguistic complexity of these reports was compared with that of alternative corpora. Head CT reports were preprocessed, and machine-analyzable features were constructed by using bag-of-words (BOW), word embedding, and latent Dirichlet allocation-based approaches. Ultimately, 1004 head CT reports were manually labeled for findings of interest by physicians, and a subset of these were deemed critical findings. Lasso logistic regression was used to train models for physician-assigned labels on 602 of 1004 head CT reports (60%) using the constructed features, and the performance of these models was validated on a held-out 402 of 1004 reports (40%). Models were scored by area under the receiver operating characteristic curve (AUC), and aggregate AUC statistics were reported for (a) all labels, (b) critical labels, and (c) the presence of any critical finding in a report. Sensitivity, specificity, accuracy, and F1 score were reported for the best-performing model's (a) predictions of all labels and (b) identification of reports containing critical findings.
Results: The best-performing model (BOW with unigrams, bigrams, and trigrams plus average word embeddings vector) had a held-out AUC of 0.966 for identifying the presence of any critical head CT finding and an average 0.957 AUC across all head CT findings. Sensitivity and specificity for identifying the presence of any critical finding were 92.59% (175 of 189) and 89.67% (191 of 213), respectively. Average sensitivity and specificity across all findings were 90.25% (1898 of 2103) and 91.72% (18 351 of 20 007), respectively. Simpler BOW methods achieved results competitive with those of more sophisticated approaches, with an average AUC for presence of any critical finding of 0.951 for unigram BOW versus 0.966 for the best-performing model. The Yule I of the head CT corpus was 34, markedly lower than that of the Reuters corpus (at 103) or I2B2 discharge summaries (at 271), indicating lower linguistic complexity.
Conclusion: Automated methods can be used to identify findings in radiology reports. The success of this approach benefits from the standardized language of these reports. With this method, a large labeled corpus can be generated for applications such as deep learning.
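The percentages quoted above follow directly from the counts given in parentheses. The short Python check below recomputes them. The variable names are illustrative, and the accuracy figure is derived here from the same counts because the abstract does not quote it.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to two decimals."""
    return round(100.0 * numerator / denominator, 2)


# Counts reported for "any critical finding" on the held-out reports.
true_positives, positives = 175, 189
true_negatives, negatives = 191, 213

critical_sensitivity = percent(true_positives, positives)
critical_specificity = percent(true_negatives, negatives)
# Derived from the counts above; not a figure quoted in the abstract.
critical_accuracy = percent(true_positives + true_negatives, positives + negatives)

# Pooled counts across all findings, as reported.
all_findings_sensitivity = percent(1898, 2103)
all_findings_specificity = percent(18351, 20007)

# Positives plus negatives should equal the number of held-out reports.
held_out_reports = positives + negatives

print(critical_sensitivity, critical_specificity, critical_accuracy)
print(all_findings_sensitivity, all_findings_specificity, held_out_reports)
```

The positive and negative counts sum to 402, which matches the stated size of the held-out set.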


Nature Medicine | 2018

Automated deep-neural-network surveillance of cranial images for acute neurologic events

J. Titano; Marcus A. Badgeley; Javin Schefflein; Margaret Pain; Andres Su; Michael Cai; Nathaniel C. Swinburne; John Zech; Jun Kim; Joshua B. Bederson; J Mocco; Burton P. Drayer; Joseph Lehar; Samuel K. Cho; Anthony B. Costa; Eric K. Oermann

Rapid diagnosis and treatment of acute neurological illnesses such as stroke, hemorrhage, and hydrocephalus are critical to achieving positive outcomes and preserving neurologic function: ‘time is brain’ [1–5]. Although these disorders are often recognizable by their symptoms, the critical means of their diagnosis is rapid imaging [6–10]. Computer-aided surveillance of acute neurologic events in cranial imaging has the potential to triage radiology workflow, thus decreasing time to treatment and improving outcomes. Substantial clinical work has focused on computer-assisted diagnosis (CAD), whereas technical work in volumetric image analysis has focused primarily on segmentation. 3D convolutional neural networks (3D-CNNs) have primarily been used for supervised classification on 3D modeling and light detection and ranging (LiDAR) data [11–15]. Here, we demonstrate a 3D-CNN architecture that performs weakly supervised classification to screen head CT images for acute neurologic events. Features were automatically learned from a clinical radiology dataset comprising 37,236 head CTs and were annotated with a semisupervised natural-language processing (NLP) framework [16]. We demonstrate the effectiveness of our approach to triage radiology workflow and accelerate the time to diagnosis from minutes to seconds through a randomized, double-blinded, prospective trial in a simulated clinical environment.
Summary: A deep-learning algorithm is developed to provide rapid and accurate diagnosis of clinical 3D head CT-scan images to triage and prioritize urgent neurological events, thus potentially accelerating time to diagnosis and care in clinical settings.
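The abstract describes triaging the radiology workflow but not how the reading queue is implemented. The Python sketch below shows one way such reprioritization could work, assuming the classifier's output is available as a set of flagged study identifiers. The `Study` class, the `build_worklist` function and the accession labels are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Study:
    priority: int                      # 0 = flagged as possibly acute, 1 = routine
    arrival: int                       # arrival order, used to break ties
    accession: str = field(compare=False)


def build_worklist(accessions, flagged):
    """Return accessions in reading order: flagged studies first, then routine.

    accessions: study identifiers in arrival order.
    flagged:    identifiers a screening model marked as possibly acute
                (a stand-in for the classifier output, not the paper's system).
    """
    heap = []
    for arrival, accession in enumerate(accessions):
        priority = 0 if accession in flagged else 1
        heapq.heappush(heap, Study(priority, arrival, accession))
    return [heapq.heappop(heap).accession for _ in range(len(heap))]


worklist = build_worklist(["CT-1", "CT-2", "CT-3", "CT-4"], flagged={"CT-3"})
print(worklist)
```

Flagged studies move to the front while arrival order is preserved within each priority level, so routine studies are delayed but never reordered among themselves.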


Journal of NeuroInterventional Surgery | 2017

Deep learning guided stroke management: a review of clinical applications

Rui Feng; Marcus A. Badgeley; J Mocco; Eric K. Oermann

Stroke is a leading cause of long-term disability, and outcome is directly related to timely intervention. Not all patients benefit from rapid intervention, however. Thus a significant amount of attention has been paid to using neuroimaging to assess potential benefit by identifying areas of ischemia that have not yet experienced cellular death. The perfusion–diffusion mismatch is used as a simple metric for potential benefit with timely intervention, yet penumbral patterns provide an inaccurate predictor of clinical outcome. Deep learning techniques, a form of machine learning (artificial intelligence) that uses deep neural networks (DNNs), excel at working with complex inputs. The key areas where deep learning may be imminently applied to stroke management are image segmentation, automated featurization (radiomics), and multimodal prognostication. The application of convolutional neural networks, the family of DNN architectures designed to work with images, to stroke imaging data is a perfect match between a mature deep learning technique and a data type that is naturally suited to benefit from deep learning’s strengths. These powerful tools have opened up exciting opportunities for data-driven stroke management for acute intervention and for guiding prognosis. Deep learning techniques are useful for the speed and power of results they can deliver and will become an increasingly standard tool in the modern stroke specialist’s arsenal for delivering personalized medicine to patients with ischemic stroke.


Bioinformatics | 2018

CANDI: an R package and Shiny app for annotating radiographs and evaluating computer-aided diagnosis

Marcus A. Badgeley; Manway Liu; Benjamin S. Glicksberg; Mark Shervey; John Zech; Khader Shameer; Joseph Lehar; Eric K. Oermann; Michael V. McConnell; Thomas M Snyder; Joel T. Dudley

Motivation: Radiologists have used algorithms for Computer-Aided Diagnosis (CAD) for decades. These algorithms use machine learning with engineered features, and there have been mixed findings on whether they improve radiologists’ interpretations. Deep learning offers superior performance but requires more training data and has not been evaluated in joint algorithm-radiologist decision systems.
Results: We developed the Computer-Aided Note and Diagnosis Interface (CANDI) for collaboratively annotating radiographs and evaluating how algorithms alter human interpretation. The annotation app collects classification, segmentation, and image captioning training data, and the evaluation app randomizes the availability of CAD tools to facilitate clinical trials on radiologist enhancement.
Availability and implementation: Demonstrations and source code are hosted at https://candi.nextgenhealthcare.org and https://github.com/mbadge/candi, respectively, under the GPL-3 license.
Supplementary information: Supplementary material is available at Bioinformatics online.
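The abstract says only that the evaluation app randomizes the availability of CAD tools. As an illustration of that idea, the Python sketch below assigns CAD assistance to half of a reader's images using a seeded shuffle. CANDI itself is an R package and Shiny app, so this is not its code. The balanced allocation, the function name and the image labels are all assumptions.

```python
import random


def assign_cad_availability(image_ids, seed=0):
    """Randomly decide, per image, whether the reader sees CAD output.

    Half of the images (rounded down) get CAD enabled. The seed makes the
    allocation reproducible. Balanced allocation is an assumption made for
    this sketch, not a documented CANDI behaviour.
    """
    rng = random.Random(seed)
    n_with_cad = len(image_ids) // 2
    arms = [True] * n_with_cad + [False] * (len(image_ids) - n_with_cad)
    rng.shuffle(arms)
    return dict(zip(image_ids, arms))


# Hypothetical image identifiers for one reading session.
session_images = ["img1", "img2", "img3", "img4", "img5", "img6"]
allocation = assign_cad_availability(session_images, seed=42)
print(allocation)
```

Comparing a reader's accuracy on the CAD-enabled images with accuracy on the remaining images then gives a within-reader estimate of the tool's effect.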

Collaboration


Top co-authors of Marcus A. Badgeley.

Khader Shameer
Icahn School of Medicine at Mount Sinai

Eric K. Oermann
Icahn School of Medicine at Mount Sinai

Anthony B. Costa
Icahn School of Medicine at Mount Sinai

John Zech
Icahn School of Medicine at Mount Sinai

Andrew Kasarskis
Icahn School of Medicine at Mount Sinai

J. Titano
Icahn School of Medicine at Mount Sinai

Li Li
Icahn School of Medicine at Mount Sinai