Publication


Featured research published by Kipp W. Johnson.


Heart | 2018

Machine learning in cardiovascular medicine: are we there yet?

Khader Shameer; Kipp W. Johnson; Benjamin S. Glicksberg; Joel T. Dudley; Partho P. Sengupta

Artificial intelligence (AI) broadly refers to analytical algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. These include a family of operations encompassing several terms like machine learning, cognitive learning, deep learning and reinforcement learning-based methods that can be used to integrate and interpret complex biomedical and healthcare data in scenarios where traditional statistical methods may not be able to perform. In this review article, we discuss the basics of machine learning algorithms and what potential data sources exist; evaluate the need for machine learning; and examine the potential limitations and challenges of implementing machine learning in the context of cardiovascular medicine. The most promising avenues for AI in medicine are the development of automated risk prediction algorithms which can be used to guide clinical care; use of unsupervised learning techniques to more precisely phenotype complex disease; and the implementation of reinforcement learning algorithms to intelligently augment healthcare providers. The utility of a machine learning-based predictive model will depend on factors including data heterogeneity, data depth, data breadth, nature of modelling task, choice of machine learning and feature selection algorithms, and orthogonal evidence. A critical understanding of the strengths and limitations of various methods and tasks amenable to machine learning is vital. By leveraging the growing corpus of big data in medicine, we detail pathways by which machine learning may facilitate optimal development of patient-specific models for improving diagnoses, intervention and outcome in cardiovascular medicine.


Briefings in Bioinformatics | 2018

Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning

Khader Shameer; Benjamin S. Glicksberg; Rachel Hodos; Kipp W. Johnson; Marcus A. Badgeley; Ben Readhead; Max S Tomlinson; Timothy O’Connor; Riccardo Miotto; Brian Kidd; Rong Chen; Avi Ma’ayan; Joel T. Dudley

Increase in global population and growing disease burden due to the emergence of infectious diseases (Zika virus), multidrug‐resistant pathogens, drug‐resistant cancers (cisplatin‐resistant ovarian cancer) and chronic diseases (arterial hypertension) necessitate effective therapies to improve health outcomes. However, the rapid increase in drug development cost demands innovative and sustainable drug discovery approaches. Drug repositioning, the discovery of new or improved therapies by reevaluation of approved or investigational compounds, solves a significant gap in the public health setting and improves the productivity of drug development. As the number of drug repurposing investigations increases, a new opportunity has emerged to understand factors driving drug repositioning through systematic analyses of drugs, drug targets and associated disease indications. However, such analyses have so far been hampered by the lack of a centralized knowledgebase, benchmarking data sets and reporting standards. To address these knowledge and clinical needs, here, we present RepurposeDB, a collection of repurposed drugs, drug targets and diseases, which was assembled, indexed and annotated from public data. RepurposeDB combines information on 253 drugs [small molecules (74.30%) and protein drugs (25.29%)] and 1125 diseases. Using RepurposeDB data, we identified pharmacological (chemical descriptors, physicochemical features and absorption, distribution, metabolism, excretion and toxicity properties), biological (protein domains, functional process, molecular mechanisms and pathway cross talks) and epidemiological (shared genetic architectures, disease comorbidities and clinical phenotype similarities) factors mediating drug repositioning. Collectively, RepurposeDB is developed as the reference database for drug repositioning investigations. The pharmacological, biological and epidemiological principles of drug repositioning identified from the meta‐analyses could augment therapeutic development.


Pacific Symposium on Biocomputing | 2017

Predictive modeling of hospital readmission rates using electronic medical record-wide machine learning: a case-study using Mount Sinai heart failure cohort

Khader Shameer; Kipp W. Johnson; Alexandre Yahi; Riccardo Miotto; Li Li; Doran Ricks; Jebakumar Jebakaran; Patricia Kovatch; Partho P. Sengupta; Sengupta Gelijns; Alan Moskovitz; Bruce Darrow; David L David; Andrew Kasarskis; Nicholas P. Tatonetti; Sean Pinney; Joel T. Dudley

Reduction of preventable hospital readmissions that result from chronic or acute conditions like stroke, heart failure, myocardial infarction and pneumonia remains a significant challenge for improving the outcomes and decreasing the cost of healthcare delivery in the United States. Patient readmission rates are relatively high for conditions like heart failure (HF) despite the implementation of high-quality healthcare delivery operation guidelines created by regulatory authorities. Multiple predictive models are currently available to evaluate potential 30-day readmission rates of patients. Most of these models are hypothesis driven and repetitively assess the predictive abilities of the same set of biomarkers as predictive features. In this manuscript, we discuss our attempt to develop a data-driven, electronic medical record-wide (EMR-wide) feature selection approach and subsequent machine learning to predict readmission probabilities. We have assessed a large repertoire of variables from electronic medical records of heart failure patients in a single center. The cohort included 1,068 patients, of whom 178 were readmitted within a 30-day interval (16.66% readmission rate). A total of 4,205 variables were extracted from the EMR, including diagnosis codes (n=1,763), medications (n=1,028), laboratory measurements (n=846), surgical procedures (n=564) and vital signs (n=4). We designed a multistep modeling strategy using the Naïve Bayes algorithm. In the first step, we created individual models to classify the cases (readmitted) and controls (non-readmitted). In the second step, features contributing to predictive risk from independent models were combined into a composite model using a correlation-based feature selection (CFS) method. All models were trained and tested using a 5-fold cross-validation method, with 70% of the cohort used for training and the remaining 30% for testing. Compared to existing predictive models for HF readmission rates (AUCs in the range of 0.6-0.7), results from our EMR-wide predictive model (AUC=0.78; Accuracy=83.19%) and phenome-wide feature selection strategies are encouraging and reveal the utility of such data-driven machine learning. Fine tuning of the model, replication using multi-center cohorts and a prospective clinical trial to evaluate the clinical utility would help the adoption of the model as a clinical decision system for evaluating readmission status.
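As a rough illustration of the classification step described above, the following is a minimal sketch of a Bernoulli Naïve Bayes classifier in plain Python. It is not the authors' pipeline: the three binary features, the toy patients and the function names `train_bernoulli_nb` and `predict_proba` are all hypothetical, and the correlation-based feature selection and cross-validation steps are omitted.

```python
import math

def train_bernoulli_nb(rows, labels):
    """Fit a Bernoulli Naive Bayes model with Laplace (add-one) smoothing."""
    n_features = len(rows[0])
    model = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        prior = len(members) / len(rows)
        # P(feature_j = 1 | class c), smoothed so it is never exactly 0 or 1
        probs = [
            (sum(r[j] for r in members) + 1) / (len(members) + 2)
            for j in range(n_features)
        ]
        model[c] = (prior, probs)
    return model

def predict_proba(model, row):
    """Return posterior class probabilities for one binary feature vector."""
    log_scores = {}
    for c, (prior, probs) in model.items():
        score = math.log(prior)
        for x, p in zip(row, probs):
            score += math.log(p if x else 1.0 - p)
        log_scores[c] = score
    # Normalise in log space for numerical stability
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {c: math.exp(s - top) / total for c, s in log_scores.items()}

# Hypothetical toy cohort; columns are made-up binary EMR flags:
# [prior admission, abnormal lab value, on diuretic]
rows = [
    [1, 1, 0], [1, 1, 1], [1, 0, 1], [0, 1, 1],   # readmitted (label 1)
    [0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0],   # not readmitted (label 0)
    [1, 0, 0], [0, 0, 0],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

model = train_bernoulli_nb(rows, labels)
high_risk = predict_proba(model, [1, 1, 1])
low_risk = predict_proba(model, [0, 0, 0])
```

In this toy setting, a patient with all three flags set receives a higher readmission probability than a patient with none, which is the kind of ranking an AUC would then summarize across a cohort.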


JACC: Basic to Translational Science | 2017

Enabling Precision Cardiology Through Multiscale Biology and Systems Medicine

Kipp W. Johnson; Khader Shameer; Benjamin S. Glicksberg; Ben Readhead; Partho P. Sengupta; Johan Björkegren; Jason C. Kovacic; Joel T. Dudley

Summary: The traditional paradigm of cardiovascular disease research derives insight from large-scale, broadly inclusive clinical studies of well-characterized pathologies. These insights are then put into practice according to standardized clinical guidelines. However, stagnation in the development of new cardiovascular therapies and variability in therapeutic response imply that this paradigm is insufficient for reducing the cardiovascular disease burden. In this state-of-the-art review, we examine 3 interconnected ideas we put forth as key concepts for enabling a transition to precision cardiology: 1) precision characterization of cardiovascular disease with machine learning methods; 2) the application of network models of disease to embrace disease complexity; and 3) using insights from the previous 2 ideas to enable pharmacology and polypharmacology systems for more precise drug-to-patient matching and patient-disease stratification. We conclude by exploring the challenges of applying a precision approach to cardiology, which arise from a deficit of the required resources and infrastructure, and emerging evidence for the clinical effectiveness of this nascent approach.


Proceedings of the Pacific Symposium | 2018

Causal inference on electronic health records to assess blood pressure treatment targets: an application of the parametric g formula

Kipp W. Johnson; Benjamin S. Glicksberg; Rachel Hodos; Khader Shameer; Joel T. Dudley

Hypertension is a major risk factor for ischemic cardiovascular disease and cerebrovascular disease, which are respectively the first and second most common causes of morbidity and mortality across the globe. To alleviate the risks of hypertension, there are a number of effective antihypertensive drugs available. However, the optimal treatment blood pressure goal for antihypertensive therapy remains an area of controversy. The results of the recent Systolic Blood Pressure Intervention Trial (SPRINT), which found benefits for intensive lowering of systolic blood pressure, have been debated for several reasons. We aimed to assess the benefits of treating to four different blood pressure targets and to compare our results to those of SPRINT using a method for causal inference called the parametric g formula. We applied this method to blood pressure measurements obtained from the electronic health records of approximately 200,000 patients who visited the Mount Sinai Hospital in New York, NY. We simulated the effect of four clinically relevant dynamic treatment regimes, assessing the effectiveness of treating to four different blood pressure targets: 150 mmHg, 140 mmHg, 130 mmHg, and 120 mmHg. In contrast to current American Heart Association guidelines and in concordance with SPRINT, we find that targeting 120 mmHg systolic blood pressure is significantly associated with decreased incidence of major adverse cardiovascular events. Causal inference methods applied to electronic health records are a powerful and flexible technique, and medicine may benefit from their increased usage.


Proceedings of the Pacific Symposium | 2018

Automated disease cohort selection using word embeddings from Electronic Health Records

Benjamin S. Glicksberg; Riccardo Miotto; Kipp W. Johnson; Khader Shameer; Li Li; Rong Chen; Joel T. Dudley

Accurate and robust cohort definition is critical to biomedical discovery using Electronic Health Records (EHR). Similar to prospective study designs, high quality EHR-based research requires rigorous selection criteria to designate case/control status particular to each disease. Electronic phenotyping algorithms, which are manually built and validated per disease, have been successful in filling this need. However, these approaches are time-consuming, so only a relatively small number of disease algorithms have been developed. Methodologies that automatically learn features from EHRs have been used for cohort selection as well. To date, however, there has been no systematic analysis of how these methods perform against current gold standards. Accordingly, this paper compares the performance of a state-of-the-art automated feature learning method in extracting research-grade cohorts for five diseases against their established electronic phenotyping algorithms. In particular, we use word2vec to create unsupervised embeddings of the phenotype space within an EHR system. Using medical concepts as a query, we then rank patients by their proximity in the embedding space and automatically extract putative disease cohorts via a distance threshold. Experimental evaluation shows promising results with average F-score of 0.57 and AUC-ROC of 0.98. However, we noticed that results varied considerably between diseases, thus necessitating further investigation and/or phenotype-specific refinement of the approach before it can be readily deployed across all diseases.
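The ranking-and-threshold step described above can be sketched as follows. This is only an illustration under stated assumptions: the two-dimensional vectors are made up rather than learned with word2vec, cosine distance is assumed as the proximity measure (the abstract does not name one), and `cosine_distance` and `select_cohort` are hypothetical helper names.

```python
import math

def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def select_cohort(query_vec, patient_vecs, max_distance):
    """Rank patients by distance to the query and keep those within the threshold."""
    ranked = sorted(
        (cosine_distance(query_vec, vec), pid)
        for pid, vec in patient_vecs.items()
    )
    return [pid for dist, pid in ranked if dist <= max_distance]

# Hypothetical embedding of a disease concept and four patients
query = [1.0, 0.0]
patients = {
    "p1": [0.9, 0.1],
    "p2": [0.1, 0.9],
    "p3": [0.7, 0.3],
    "p4": [-1.0, 0.0],
}

cohort = select_cohort(query, patients, max_distance=0.2)
```

Here the two patients whose vectors point in nearly the same direction as the query are kept, closest first. In practice the threshold would need tuning per disease, which is consistent with the variability between diseases that the abstract reports.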


Scientific Reports | 2017

Intracoronary Imaging, Cholesterol Efflux, and Transcriptomics after Intensive Statin Treatment in Diabetes

Surbhi Chamaria; Kipp W. Johnson; Yuliya Vengrenyuk; Usman Baber; Khader Shameer; Aparna A. Divaraniya; Benjamin S. Glicksberg; Li Li; Samit Bhatheja; Pedro R. Moreno; Akiko Maehara; Roxana Mehran; Joel T. Dudley; Jagat Narula; Samin K. Sharma; Annapoorna Kini

Residual atherothrombotic risk remains higher in patients with versus without diabetes mellitus (DM) despite statin therapy. The underlying mechanisms are unclear. This is a retrospective post-hoc analysis of the YELLOW II trial, comparing patients with and without DM (non-DM) who received rosuvastatin 40 mg for 8–12 weeks and underwent intracoronary multimodality imaging of an obstructive nonculprit lesion, before and after therapy. In addition, blood samples were drawn to assess cholesterol efflux capacity (CEC) and changes in gene expression in peripheral blood mononuclear cells (PBMC). There was a significant reduction in low density lipoprotein-cholesterol (LDL-C), an increase in CEC and beneficial changes in plaque morphology including increase in fibrous cap thickness and decrease in the prevalence of thin cap fibro-atheroma by optical coherence tomography in DM and non-DM patients. While differential gene expression analysis did not demonstrate differences in PBMC transcriptome between the two groups on the single-gene level, weighted gene coexpression network analysis revealed two modules of coexpressed genes associated with DM, Collagen Module and Platelet Module, related to collagen catabolism and platelet function respectively. Bayesian network analysis revealed key driver genes within these modules. These transcriptomic findings might provide potential mechanisms responsible for the higher cardiovascular risk in DM patients.


bioRxiv | 2018

Rapid Therapeutic Recommendations in the Context of a Global Public Health Crisis using Translational Bioinformatics Approaches: A proof-of-concept study using Nipah Virus Infection

Khader Shameer; Kipp W. Johnson; Ben Readhead; Benjamin S. Glicksberg; Claire McCallum; Amjesh R; Jamie S. Hirsch; Kevin Bock; John Chelico; Negin Hajizadeh; Michael I. Oppenheim; Joel Dudley

We live in a world of emerging new diseases and old diseases resurging in more aggressive forms. Drug development by pharmaceutical companies is a market-driven and costly endeavor, and thus it is often a challenge when drugs are needed for diseases endemic only to certain regions or which affect only a few patients. However, biomedical open data is accessible and reusable for reanalysis and generation of new hypotheses and discovery. In this study, we leverage biomedical data and tools to analyze available data on Nipah Virus (NiV) infection. NiV infection is an emerging zoonosis that is transmissible to humans and is associated with high mortality rates. We explored the application of computational drug repositioning and chemogenomic enrichment analyses using host transcriptome data to match drugs that could reverse the virus-induced gene signature. We performed analyses using two gene signatures: i) a previously published gene signature (n=34), and ii) a gene signature generated using the characteristic direction method (n=5,533). Our predictive framework suggests that several drugs, including FDA-approved therapies like beclometasone, trihexyphenidyl and S-propranolol, could modulate the NiV infection-induced gene signatures in endothelial cells. A target-specific analysis of CXCL10 also suggests the potential application of Eldelumab, an investigative therapy for Crohn’s disease and ulcerative colitis, as a putative candidate for drug repositioning. To conclude, we also discuss challenges and opportunities in clinical trials (n-of-1 and adaptive trials) for repositioned drugs. Further follow-up studies including biochemical assays and clinical trials are required to identify effective therapies for clinical use. Our proof-of-concept study highlights that translational bioinformatics methods including gene expression analyses and computational drug repositioning could augment epidemiological investigations in the context of an emerging disease with no effective treatment.


bioRxiv | 2018

Prioritizing Small Molecule as Candidates for Drug Repositioning using Machine Learning

Khader Shameer; Kipp W. Johnson; Benjamin S. Glicksberg; Rachel Hodos; Ben Readhead; Max S Tomlinson; Joel Dudley

Drug repositioning, i.e. identifying new uses for existing drugs and research compounds, is a cost-effective drug discovery strategy that is continuing to grow in popularity. Prioritizing and identifying drugs capable of being repositioned may improve the productivity and success rate of the drug discovery cycle, especially if the drug has already proven to be safe in humans. In previous work, we have shown that drugs that have been successfully repositioned have different chemical properties than those that have not. Hence, there is an opportunity to use machine learning to prioritize drug-like molecules as candidates for future repositioning studies. We have developed a feature engineering and machine learning approach that leverages data from publicly available drug discovery resources: RepurposeDB and DrugBank. ChemVec is the chemoinformatics-based feature engineering strategy designed to compile molecular features representing the chemical space of all drug molecules in the study. ChemVec features were used to train a variety of supervised classification algorithms (Naïve Bayes, Random Forest, Support Vector Machines and an ensemble model combining the three algorithms). Models were created using various combinations of datasets: a Connectivity Map based model, a DrugBank approved compounds based model, and a model based on the full set of DrugBank compounds. Of these, the Random Forest model trained using Connectivity Map based data performed the best (AUC=0.674). In brief, our study represents a novel approach to evaluating a small molecule for drug repositioning opportunity and may further improve discovery of pleiotropic drugs, i.e. those that treat multiple indications.


Vaccine | 2018

Incidence and aetiology of bacterial meningitis among children aged 1–59 months in South Asia: systematic review and meta-analysis

Mohsin Ali; Brian Chang; Kipp W. Johnson; Shaun K. Morris

BACKGROUND: Bacterial meningitis is a significant cause of morbidity and mortality worldwide among children aged 1-59 months. We aimed to describe its burden in South Asia, focusing on vaccine-preventable aetiologies. METHODS: We searched five databases for studies published from January 1, 1990, to April 25, 2017. We estimated incidence and aetiology-specific proportions using random-effects meta-analysis. In secondary analyses, we described vaccine impact and pneumococcal meningitis serotypes. RESULTS: We included 48 articles cumulatively reporting 20,707 cases from 1987 to 2013. Mean annual incidence was 105 (95% confidence interval [CI], 53-173) cases per 100,000 children. On average, Haemophilus influenzae type b (Hib) accounted for 13% (95% CI, 8-19%) of cases, pneumococcus for 10% (95% CI, 6-15%), and meningococcus for 1% (95% CI, 0-2%). These meta-analyses had substantial between-study heterogeneity (I² > 78%, P < 0.0001). Among studies reporting only confirmed cases, these three bacteria caused a median of 78% of cases (IQR, 50-87%). Hib meningitis incidence declined by 72-83% at sentinel hospitals in Pakistan and Bangladesh, respectively, within two years of implementing nationwide vaccination. On average, PCV10 covered 49% (95% CI, 39-58%), PCV13 covered 51% (95% CI, 40-61%), and PPSV23 covered 74% (95% CI, 67-80%) of pneumococcal meningitis serotypes. Lower PCV10 and PCV13 serotype coverage in Bangladesh was associated with higher prevalence of serotype 2, compared to India and Pakistan. CONCLUSIONS: South Asia has a relatively high incidence of bacterial meningitis among children aged 1-59 months, with vaccine-preventable bacteria causing a substantial proportion. These estimates are likely underestimates due to multiple epidemiological and microbiological factors. Further research on vaccine impact and distribution of pneumococcal serotypes will inform vaccine policymaking and implementation.
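For readers unfamiliar with random-effects pooling and the I² statistic mentioned above, here is a minimal sketch in plain Python. It assumes the DerSimonian-Laird estimator of between-study variance, which the abstract does not specify, and it pools untransformed estimates. The inputs are made-up numbers, not data from the review, and `dersimonian_laird` is a hypothetical function name.

```python
import math

def dersimonian_laird(estimates, variances):
    """Pool study estimates with a DerSimonian-Laird random-effects model."""
    k = len(estimates)
    w = [1.0 / v for v in variances]               # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q measures dispersion of studies around the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]    # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0     # share of variation due to heterogeneity
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical inputs: two studies that disagree strongly, then three that agree
heterogeneous = dersimonian_laird([0.05, 0.30], [0.001, 0.001])
homogeneous = dersimonian_laird([0.10, 0.10, 0.10], [0.01, 0.02, 0.04])
```

When studies disagree by more than their sampling variances would explain, the between-study variance becomes positive, I² approaches 1 and the confidence interval widens. When they agree, the model reduces to the fixed-effect inverse-variance average.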

Collaboration

Top co-authors of Kipp W. Johnson:

Khader Shameer
Icahn School of Medicine at Mount Sinai

Ben Readhead
Icahn School of Medicine at Mount Sinai

Li Li
Icahn School of Medicine at Mount Sinai

Jagat Narula
Icahn School of Medicine at Mount Sinai

Andrew Kasarskis
Icahn School of Medicine at Mount Sinai

Annapoorna Kini
Icahn School of Medicine at Mount Sinai

Partho P. Sengupta
Icahn School of Medicine at Mount Sinai

Riccardo Miotto
Icahn School of Medicine at Mount Sinai