Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Riccardo Miotto is active.

Publication


Featured researches published by Riccardo Miotto.


Scientific Reports | 2016

Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records

Riccardo Miotto; Li Li; Brian A. Kidd; Joel T. Dudley

Secondary use of electronic health records (EHRs) promises to advance clinical research and better inform clinical decision making. Challenges in summarizing and representing patient data prevent widespread practice of predictive modeling using EHRs. Here we present a novel unsupervised deep feature learning method to derive a general-purpose patient representation from EHR data that facilitates clinical predictive modeling. In particular, a three-layer stack of denoising autoencoders was used to capture hierarchical regularities and dependencies in the aggregated EHRs of about 700,000 patients from the Mount Sinai data warehouse. The result is a representation we name “deep patient”. We evaluated this representation as broadly predictive of health states by assessing the probability of patients to develop various diseases. We performed evaluation using 76,214 test patients comprising 78 diseases from diverse clinical domains and temporal windows. Our results significantly outperformed those achieved using representations based on raw EHR data and alternative feature learning strategies. Prediction performance for severe diabetes, schizophrenia, and various cancers were among the top performing. These findings indicate that deep learning applied to EHRs can derive patient representations that offer improved clinical predictions, and could provide a machine learning framework for augmenting clinical decision systems.


Briefings in Bioinformatics | 2017

Deep learning for healthcare: review, opportunities and challenges.

Riccardo Miotto; Fei Wang; Shuang Wang; Xiaoqian Jiang; Joel T. Dudley

Gaining knowledge and actionable insights from complex, high-dimensional and heterogeneous biomedical data remains a key challenge in transforming health care. Various types of data have been emerging in modern biomedical research, including electronic health records, imaging, -omics, sensor data and text, which are complex, heterogeneous, poorly annotated and generally unstructured. Traditional data mining and statistical learning approaches typically need to first perform feature engineering to obtain effective and more robust features from those data, and then build prediction or clustering models on top of them. There are lots of challenges on both steps in a scenario of complicated data and lacking of sufficient domain knowledge. The latest advances in deep learning technologies provide new effective paradigms to obtain end-to-end learning models from complex data. In this article, we review the recent literature on applying deep learning technologies to advance the health care domain. Based on the analyzed work, we suggest that deep learning approaches could be the vehicle for translating big biomedical data into improved human health. However, we also note limitations and needs for improved methods development and applications, especially in terms of ease-of-understanding for domain experts and citizen scientists. We discuss such challenges and suggest developing holistic and meaningful interpretable architectures to bridge deep learning models and human interpretability.


Briefings in Bioinformatics | 2017

Translational bioinformatics in the era of real-time biomedical, health care and wellness data streams

Khader Shameer; Marcus A. Badgeley; Riccardo Miotto; Benjamin S. Glicksberg; Joseph W. Morgan; Joel T. Dudley

Abstract Monitoring and modeling biomedical, health care and wellness data from individuals and converging data on a population scale have tremendous potential to improve understanding of the transition to the healthy state of human physiology to disease setting. Wellness monitoring devices and companion software applications capable of generating alerts and sharing data with health care providers or social networks are now available. The accessibility and clinical utility of such data for disease or wellness research are currently limited. Designing methods for streaming data capture, real‐time data aggregation, machine learning, predictive analytics and visualization solutions to integrate wellness or health monitoring data elements with the electronic medical records (EMRs) maintained by health care providers permits better utilization. Integration of population‐scale biomedical, health care and wellness data would help to stratify patients for active health management and to understand clinically asymptomatic patients and underlying illness trajectories. In this article, we discuss various health‐monitoring devices, their ability to capture the unique state of health represented in a patient and their application in individualized diagnostics, prognosis, clinical or wellness intervention. We also discuss examples of translational bioinformatics approaches to integrating patient‐generated data with existing EMRs, personal health records, patient portals and clinical data repositories. Briefly, translational bioinformatics methods, tools and resources are at the center of these advances in implementing real‐time biomedical and health care analytics in the clinical setting. Furthermore, these advances are poised to play a significant role in clinical decision‐making and implementation of data‐driven medicine and wellness care.


Journal of Biomedical Informatics | 2013

A human-computer collaborative approach to identifying common data elements in clinical trial eligibility criteria

Zhihui Luo; Riccardo Miotto; Chunhua Weng

OBJECTIVE To identify Common Data Elements (CDEs) in eligibility criteria of multiple clinical trials studying the same disease using a human-computer collaborative approach. DESIGN A set of free-text eligibility criteria from clinical trials on two representative diseases, breast cancer and cardiovascular diseases, was sampled to identify disease-specific eligibility criteria CDEs. In this proposed approach, a semantic annotator is used to recognize Unified Medical Language Systems (UMLSs) terms within the eligibility criteria text. The Apriori algorithm is applied to mine frequent disease-specific UMLS terms, which are then filtered by a list of preferred UMLS semantic types, grouped by similarity based on the Dice coefficient, and, finally, manually reviewed. MEASUREMENTS Standard precision, recall, and F-score of the CDEs recommended by the proposed approach were measured with respect to manually identified CDEs. RESULTS Average precision and recall of the recommended CDEs for the two diseases were 0.823 and 0.797, respectively, leading to an average F-score of 0.810. In addition, the machine-powered CDEs covered 80% of the cardiovascular CDEs published by The American Heart Association and assigned by human experts. CONCLUSION It is feasible and effort saving to use a human-computer collaborative approach to augment domain experts for identifying disease-specific CDEs from free-text clinical trial eligibility criteria.


Journal of Biomedical Informatics | 2013

Unsupervised mining of frequent tags for clinical eligibility text indexing

Riccardo Miotto; Chunhua Weng

Clinical text, such as clinical trial eligibility criteria, is largely underused in state-of-the-art medical search engines due to difficulties of accurate parsing. This paper proposes a novel methodology to derive a semantic index for clinical eligibility documents based on a controlled vocabulary of frequent tags, which are automatically mined from the text. We applied this method to eligibility criteria on ClinicalTrials.gov and report that frequent tags (1) define an effective and efficient index of clinical trials and (2) are unlikely to grow radically when the repository increases. We proposed to apply the semantic index to filter clinical trial search results and we concluded that frequent tags reduce the result space more efficiently than an uncontrolled set of UMLS concepts. Overall, unsupervised mining of frequent tags from clinical text leads to an effective semantic index for the clinical eligibility documents and promotes their computational reuse.


Journal of the American Medical Informatics Association | 2015

Case-based reasoning using electronic health records efficiently identifies eligible patients for clinical trials

Riccardo Miotto; Chunhua Weng

Abstract Objective To develop a cost-effective, case-based reasoning framework for clinical research eligibility screening by only reusing the electronic health records (EHRs) of minimal enrolled participants to represent the target patient for each trial under consideration. Materials and Methods The EHR data—specifically diagnosis, medications, laboratory results, and clinical notes—of known clinical trial participants were aggregated to profile the “target patient” for a trial, which was used to discover new eligible patients for that trial. The EHR data of unseen patients were matched to this “target patient” to determine their relevance to the trial; the higher the relevance, the more likely the patient was eligible. Relevance scores were a weighted linear combination of cosine similarities computed over individual EHR data types. For evaluation, we identified 262 participants of 13 diversified clinical trials conducted at Columbia University as our gold standard. We ran a 2-fold cross validation with half of the participants used for training and the other half used for testing along with other 30 000 patients selected at random from our clinical database. We performed binary classification and ranking experiments. Results The overall area under the ROC curve for classification was 0.95, enabling the highlight of eligible patients with good precision. Ranking showed satisfactory results especially at the top of the recommended list, with each trial having at least one eligible patient in the top five positions. Conclusions This relevance-based method can potentially be used to identify eligible patients for clinical trials by processing patient EHR data alone without parsing free-text eligibility criteria, and shows promise of efficient “case-based reasoning” modeled only on minimal trial participants.


Journal of Biomedical Informatics | 2013

eTACTS: A Method for Dynamically Filtering Clinical Trial Search Results

Riccardo Miotto; Silis Y. Jiang; Chunhua Weng

OBJECTIVE Information overload is a significant problem facing online clinical trial searchers. We present eTACTS, a novel interactive retrieval framework using common eligibility tags to dynamically filter clinical trial search results. MATERIALS AND METHODS eTACTS mines frequent eligibility tags from free-text clinical trial eligibility criteria and uses these tags for trial indexing. After an initial search, eTACTS presents to the user a tag cloud representing the current results. When the user selects a tag, eTACTS retains only those trials containing that tag in their eligibility criteria and generates a new cloud based on tag frequency and co-occurrences in the remaining trials. The user can then select a new tag or unselect a previous tag. The process iterates until a manageable number of trials is returned. We evaluated eTACTS in terms of filtering efficiency, diversity of the search results, and user eligibility to the filtered trials using both qualitative and quantitative methods. RESULTS eTACTS (1) rapidly reduced search results from over a thousand trials to ten; (2) highlighted trials that are generally not top-ranked by conventional search engines; and (3) retrieved a greater number of suitable trials than existing search engines. DISCUSSION eTACTS enables intuitive clinical trial searches by indexing eligibility criteria with effective tags. User evaluation was limited to one case study and a small group of evaluators due to the long duration of the experiment. Although a larger-scale evaluation could be conducted, this feasibility study demonstrated significant advantages of eTACTS over existing clinical trial search engines. CONCLUSION A dynamic eligibility tag cloud can potentially enhance state-of-the-art clinical trial search engines by allowing intuitive and efficient filtering of the search result space.


Methods of Information in Medicine | 2013

Feasibility of feature-based indexing, clustering, and search of clinical trials. A case study of breast cancer trials from ClinicalTrials.gov.

Mary Regina Boland; Riccardo Miotto; Junfeng Gao; Chunhua Weng

BACKGROUND When standard therapies fail, clinical trials provide experimental treatment opportunities for patients with drug-resistant illnesses or terminal diseases. Clinical Trials can also provide free treatment and education for individuals who otherwise may not have access to such care. To find relevant clinical trials, patients often search online; however, they often encounter a significant barrier due to the large number of trials and in-effective indexing methods for reducing the trial search space. OBJECTIVES This study explores the feasibility of feature-based indexing, clustering, and search of clinical trials and informs designs to automate these processes. METHODS We decomposed 80 randomly selected stage III breast cancer clinical trials into a vector of eligibility features, which were organized into a hierarchy. We clustered trials based on their eligibility feature similarities. In a simulated search process, manually selected features were used to generate specific eligibility questions to filter trials iteratively. RESULTS We extracted 1,437 distinct eligibility features and achieved an inter-rater agreement of 0.73 for feature extraction for 37 frequent features occurring in more than 20 trials. Using all the 1,437 features we stratified the 80 trials into six clusters containing trials recruiting similar patients by patient-characteristic features, five clusters by disease-characteristic features, and two clusters by mixed features. Most of the features were mapped to one or more Unified Medical Language System (UMLS) concepts, demonstrating the utility of named entity recognition prior to mapping with the UMLS for automatic feature extraction. CONCLUSIONS It is feasible to develop feature-based indexing and clustering methods for clinical trials to identify trials with similar target populations and to improve trial search efficiency.


Briefings in Bioinformatics | 2018

Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning

Khader Shameer; Benjamin S. Glicksberg; Rachel Hodos; Kipp W. Johnson; Marcus A. Badgeley; Ben Readhead; Max S Tomlinson; Timothy O’Connor; Riccardo Miotto; Brian Kidd; Rong Chen; Avi Ma’ayan; Joel T. Dudley

&NA; Increase in global population and growing disease burden due to the emergence of infectious diseases (Zika virus), multidrug‐resistant pathogens, drug‐resistant cancers (cisplatin‐resistant ovarian cancer) and chronic diseases (arterial hypertension) necessitate effective therapies to improve health outcomes. However, the rapid increase in drug development cost demands innovative and sustainable drug discovery approaches. Drug repositioning, the discovery of new or improved therapies by reevaluation of approved or investigational compounds, solves a significant gap in the public health setting and improves the productivity of drug development. As the number of drug repurposing investigations increases, a new opportunity has emerged to understand factors driving drug repositioning through systematic analyses of drugs, drug targets and associated disease indications. However, such analyses have so far been hampered by the lack of a centralized knowledgebase, benchmarking data sets and reporting standards. To address these knowledge and clinical needs, here, we present RepurposeDB, a collection of repurposed drugs, drug targets and diseases, which was assembled, indexed and annotated from public data. RepurposeDB combines information on 253 drugs [small molecules (74.30%) and protein drugs (25.29%)] and 1125 diseases. Using RepurposeDB data, we identified pharmacological (chemical descriptors, physicochemical features and absorption, distribution, metabolism, excretion and toxicity properties), biological (protein domains, functional process, molecular mechanisms and pathway cross talks) and epidemiological (shared genetic architectures, disease comorbidities and clinical phenotype similarities) factors mediating drug repositioning. Collectively, RepurposeDB is developed as the reference database for drug repositioning investigations. The pharmacological, biological and epidemiological principles of drug repositioning identified from the meta‐analyses could augment therapeutic development.


Nature Genetics | 2017

A functional genomics predictive network model identifies regulators of inflammatory bowel disease

Lauren A. Peters; Jacqueline Perrigoue; Arthur Mortha; Alina C. Iuga; Won Min Song; Eric M. Neiman; Sean R. Llewellyn; Antonio Di Narzo; Brian A. Kidd; Shannon Telesco; Yongzhong Zhao; Aleksandar Stojmirovic; Jocelyn Sendecki; Khader Shameer; Riccardo Miotto; Bojan Losic; Hardik Shah; Eunjee Lee; Minghui Wang; Jeremiah J. Faith; Andrew Kasarskis; Carrie Brodmerkel; Mark E. Curran; Anuk Das; Joshua R. Friedman; Yoshinori Fukui; Mary Beth Humphrey; Brian M. Iritani; Nicholas Sibinga; Teresa K. Tarrant

A major challenge in inflammatory bowel disease (IBD) is the integration of diverse IBD data sets to construct predictive models of IBD. We present a predictive model of the immune component of IBD that informs causal relationships among loci previously linked to IBD through genome-wide association studies (GWAS) using functional and regulatory annotations that relate to the cells, tissues, and pathophysiology of IBD. Our model consists of individual networks constructed using molecular data generated from intestinal samples isolated from three populations of patients with IBD at different stages of disease. We performed key driver analysis to identify genes predicted to modulate network regulatory states associated with IBD, prioritizing and prospectively validating 12 of the top key drivers experimentally. This validated key driver set not only introduces new regulators of processes central to IBD but also provides the integrated circuits of genetic, molecular, and clinical traits that can be directly queried to interrogate and refine the regulatory framework defining IBD.

Collaboration


Dive into the Riccardo Miotto's collaboration.

Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Khader Shameer

Icahn School of Medicine at Mount Sinai

View shared research outputs
Top Co-Authors

Avatar

Li Li

Icahn School of Medicine at Mount Sinai

View shared research outputs
Top Co-Authors

Avatar

Kipp W. Johnson

Icahn School of Medicine at Mount Sinai

View shared research outputs
Top Co-Authors

Avatar

Andrew Kasarskis

Icahn School of Medicine at Mount Sinai

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Brian A. Kidd

Icahn School of Medicine at Mount Sinai

View shared research outputs
Researchain Logo
Decentralizing Knowledge