Ravikumar Komandur Elayavilli
Mayo Clinic
Publication
Featured research published by Ravikumar Komandur Elayavilli.
Journal of the American Medical Informatics Association | 2013
Sunghwan Sohn; Kavishwar B. Wagholikar; Dingcheng Li; Siddhartha Jonnalagadda; Cui Tao; Ravikumar Komandur Elayavilli; Hongfang Liu
BACKGROUND Temporal information detection systems have been developed by the Mayo Clinic for the 2012 i2b2 Natural Language Processing Challenge. OBJECTIVE To construct automated systems for EVENT/TIMEX3 extraction and temporal link (TLINK) identification from clinical text. MATERIALS AND METHODS The i2b2 organizers provided 190 annotated discharge summaries as the training set and 120 discharge summaries as the test set. Our Event system used a conditional random field classifier with a variety of features including lexical information, natural language elements, and medical ontology. The TIMEX3 system employed a rule-based method using regular expression pattern match and systematic reasoning to determine normalized values. The TLINK system employed both rule-based reasoning and machine learning. All three systems were built in an Apache Unstructured Information Management Architecture framework. RESULTS Our TIMEX3 system performed the best (F-measure of 0.900, value accuracy 0.731) among the challenge teams. The Event system produced an F-measure of 0.870, and the TLINK system an F-measure of 0.537. CONCLUSIONS Our TIMEX3 system demonstrated good capability of regular expression rules to extract and normalize time information. Event and TLINK machine learning systems required well-defined feature sets to perform well. We could also leverage expert knowledge as part of the machine learning features to further improve TLINK identification performance.
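The rule-based TIMEX3 idea described above (regular expression matching followed by reasoning to a normalized value) can be illustrated with a minimal Python sketch. This is not the Mayo Clinic system: the three patterns, the `extract_timex` function, and the use of an admission date as the anchor for relative expressions are all assumptions chosen for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical patterns; a real system would need far more rules than these.
MONTHS = {name: i + 1 for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"])}
DATE_NUMERIC = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")
DATE_TEXT = re.compile(
    r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),\s*(\d{4})\b", re.IGNORECASE)
RELATIVE = re.compile(
    r"\b(\d+)\s+days?\s+(before|after)\s+admission\b", re.IGNORECASE)


def extract_timex(text, admission):
    """Return (matched span, ISO 8601 value) pairs in order of appearance."""
    found = []
    for m in DATE_NUMERIC.finditer(text):
        month, day, year = (int(g) for g in m.groups())
        try:
            found.append((m.start(), m.group(0), date(year, month, day).isoformat()))
        except ValueError:
            continue  # skip impossible dates such as 13/45/2012
    for m in DATE_TEXT.finditer(text):
        try:
            value = date(int(m.group(3)), MONTHS[m.group(1).lower()], int(m.group(2)))
        except ValueError:
            continue
        found.append((m.start(), m.group(0), value.isoformat()))
    for m in RELATIVE.finditer(text):
        # Relative expressions are resolved against the admission date (an assumption).
        offset = timedelta(days=int(m.group(1)))
        value = admission - offset if m.group(2).lower() == "before" else admission + offset
        found.append((m.start(), m.group(0), value.isoformat()))
    return [(span, value) for _, span, value in sorted(found)]


note = ("Admitted on 03/14/2012. Symptoms began 2 days before admission. "
        "Follow-up on March 20, 2012.")
print(extract_timex(note, date(2012, 3, 14)))
```

The sketch separates extraction (the patterns) from normalization (the conversion to ISO dates), which mirrors the distinction between the F-measure and the value accuracy reported in the abstract.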
Biomedical Informatics Insights | 2016
Vinod Kaggal; Ravikumar Komandur Elayavilli; Saeed Mehrabi; Joshua J. Pankratz; Sunghwan Sohn; Yanshan Wang; Dingcheng Li; Majid Mojarad Rastegar; Sean P. Murphy; Jason L. Ross; Rajeev Chaudhry; James D. Buntrock; Hongfang Liu
The concept of optimizing health care by understanding and generating knowledge from previous evidence, i.e., the Learning Health-care System (LHS), has gained momentum and now has national prominence. Meanwhile, the rapid adoption of electronic health records (EHRs) enables the data collection required to form the basis for facilitating the LHS. A prerequisite for using EHR data within the LHS is an infrastructure that enables access to EHR data longitudinally for health-care analytics and in real time for knowledge delivery. Additionally, significant clinical information is embedded in the free text, making natural language processing (NLP) an essential component in implementing an LHS. Herein, we share our institutional implementation of a big data-empowered clinical NLP infrastructure, which not only enables health-care analytics but also has real-time NLP capability. The infrastructure has been utilized for multiple institutional projects including the MayoExpertAdvisor, an individualized care recommendation solution for clinical care. We compared the advantages of big data over two other environments. Big data infrastructure significantly outperformed other infrastructure in terms of computing speed, demonstrating its value in making the LHS a possibility in the near future.
international conference on bioinformatics | 2015
Dingcheng Li; Majid Rastegar-Mojarad; Ravikumar Komandur Elayavilli; Yanshan Wang; Saeed Mehrabi; Yue Yu; Sunghwan Sohn; Yanpeng Li; Naveed Afzal; Hongfang Liu
Clinical natural language processing (NLP) has become indispensable in the secondary use of electronic medical records (EMRs). However, current clinical NLP tools are not readily portable across institutions. An ideal solution to this problem is cross-institutional data sharing. However, legal prohibitions on disclosing protected health information (PHI) obstruct this practice even with the availability of state-of-the-art de-identification tools. In this paper, we investigated the use of a frequency-filtering approach to extract PHI-free sentences utilizing the Enterprise Data Trust (EDT), a large collection of EMRs at Mayo Clinic. Our approach is based on the assumption that sentences appearing frequently tend to contain no PHI. This assumption originates from the observation that there exist a large number of redundant descriptions of similar patient conditions in EDT. Both manual and automatic evaluations of the set of sentences with frequencies higher than one found no PHI. The promising results demonstrate the potential of sharing highly frequent sentences among institutions.
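The frequency-filtering assumption above can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the naive sentence splitter, the lowercase normalization, and the choice to count a sentence once per note are all assumptions, and the sample notes are invented.

```python
import re
from collections import Counter


def normalize(sentence):
    """Lowercase and collapse whitespace so trivial variants count together."""
    return re.sub(r"\s+", " ", sentence.strip().lower())


def frequent_sentences(notes, min_count=2):
    """Keep sentences that occur in at least `min_count` notes.

    Counting each sentence once per note (an assumption of this sketch)
    prevents a single note that repeats itself from inflating a frequency.
    """
    counts = Counter()
    for note in notes:
        # Naive splitter: break after sentence-final punctuation.
        sentences = re.split(r"(?<=[.!?])\s+", note)
        counts.update({normalize(s) for s in sentences if s.strip()})
    return sorted(s for s, c in counts.items() if c >= min_count)


notes = [
    "Patient denies chest pain. John Smith was seen on Monday.",
    "Patient denies chest pain. Follow up in two weeks.",
    "Follow up in two weeks. Mary Jones called.",
]
print(frequent_sentences(notes))
```

With `min_count=2` the filter corresponds to the "frequency higher than one" threshold mentioned in the abstract; sentences carrying the invented names appear only once and are dropped.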
Studies in health technology and informatics | 2015
Majid Rastegar-Mojarad; Ravikumar Komandur Elayavilli; Dingcheng Li; Hongfang Liu
Relation extraction typically involves the extraction of relations between two or more entities occurring within a single sentence or across multiple sentences. In this study, we investigated the significance of extracting information from multiple sentences specifically in the context of drug-disease relation discovery. We used multiple resources such as Semantic Medline, a literature-based resource, and Medline search (for filtering spurious results) and inferred 8,772 potential drug-disease pairs. Our analysis revealed that 6,450 (73.5%) of the 8,772 potential drug-disease relations did not occur in a single sentence. Moreover, only 537 of the drug-disease pairs matched the curated gold standard in the Comparative Toxicogenomics Database (CTD), a trusted resource for drug-disease relations. Among the 537, nearly 75% (407) of the drug-disease pairs occur in multiple sentences. Our analysis revealed that the drug-disease pairs inferred from Semantic Medline or retrieved from CTD could be extracted from multiple sentences in the literature. This highlights the need for discourse-level analysis in extracting relations from biomedical literature.
international conference on bioinformatics | 2016
Majid Rastegar-Mojarad; Ravikumar Komandur Elayavilli; Liwei Wang; Rashmi Prasad; Hongfang Liu
Literature-based discovery (LBD) is a well-known paradigm to discover hidden knowledge in scientific literature. By identifying and utilizing reported findings in literature, LBD hypothesizes novel discoveries. Most often, LBD systems generate a long list of potential discoveries, and it would be time consuming and expensive to validate all of them. Preliminary validation or prioritization of the discoveries can improve the significance of LBD systems. In this study, we proposed a method utilizing information surrounding causal findings to prioritize discoveries generated by LBD systems. As a case study, we focused on discovering drug-disease relations, which have the potential to identify drug repositioning candidates or adverse drug reactions. Our LBD system used drug-gene and gene-disease semantic predications in SemMedDB as causal findings and Swanson's ABC model to generate potential drug-disease relations. Using the sentences from which the causal findings were extracted, our ranking method trained a binary classifier to classify the generated drug-disease relations into the desired classes. We trained and tested our classifier for three different purposes: (a) drug repositioning, (b) adverse drug events, and (c) drug-disease relation detection. The classifier obtained F-measures of 0.78, 0.86, and 0.83, respectively, for these tasks. The number of causal findings of each hypothesis that were classified as positive by the classifier is the main metric for ranking the hypotheses in the proposed method. To evaluate the ranking method, we counted and compared the number of true relations in the top 100 pairs as ranked by our method and by a previous method. Out of 181 true relations in the test dataset, the proposed method ranked 20 in the top 100 relations, compared with 13 for the other method.
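The generation and ranking steps described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the published system: the `abc_hypotheses` and `rank` functions, the `is_positive` callback standing in for the trained binary classifier, and the placeholder entity names are all hypothetical.

```python
from collections import defaultdict


def abc_hypotheses(drug_gene, gene_disease):
    """Swanson-style ABC join: drug (A) - gene (B) - disease (C).

    Returns a mapping from each (drug, disease) pair to the set of genes
    that link the two, i.e. the intermediate findings behind the hypothesis.
    """
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)
    support = defaultdict(set)
    for drug, gene in drug_gene:
        for disease in diseases_by_gene.get(gene, ()):
            support[(drug, disease)].add(gene)
    return dict(support)


def rank(support, is_positive):
    """Order hypotheses by the number of linking findings judged positive.

    `is_positive(pair, gene)` is a stand-in for a classifier decision on the
    sentences behind one finding; ties are broken alphabetically.
    """
    scored = {
        pair: sum(1 for gene in genes if is_positive(pair, gene))
        for pair, genes in support.items()
    }
    return sorted(scored.items(), key=lambda item: (-item[1], item[0]))


# Placeholder identifiers, not real biomedical findings.
drug_gene = [("drugA", "gene1"), ("drugA", "gene2"), ("drugB", "gene3")]
gene_disease = [("gene1", "diseaseX"), ("gene2", "diseaseX"),
                ("gene3", "diseaseX"), ("gene3", "diseaseY")]
support = abc_hypotheses(drug_gene, gene_disease)
print(rank(support, lambda pair, gene: True))
```

Passing a callback keeps the join independent of how findings are judged, so a real classifier could be swapped in without changing the ranking logic.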
bioinformatics and biomedicine | 2015
Majid Rastegar-Mojarad; Ravikumar Komandur Elayavilli; Dingcheng Li; Rashmi Prasad; Hongfang Liu
Drug repositioning has been a topic of great attention to researchers and pharmaceutical companies due to its significant impact on the cost of drug discovery. There are several approaches to identify potentially novel drug candidates through repurposing. Literature mining has played a critical role in mining such information from scientific articles. In this paper, we used drug-gene and gene-disease semantic predications extracted from Medline abstracts to generate a list of potential drug-disease pairs. We further ranked the generated pairs, by assigning scores based on the predicates that qualify drug-gene and gene-disease relationships. On comparing the top-ranked drug-disease pairs against the Comparative Toxicogenomics Database (CTD), a curated database for drug-disease relations, we found that a significant percentage of top ranked pairs appeared in CTD. Co-occurrence of these high-ranked pairs in Medline abstracts further improves the confidence in our approach to rank the inferred drug-disease relations higher in the list. Finally, manual evaluation of top ten pairs ranked by our approach revealed that nine of them have some biological significance based on expert judgment.
Mayo Clinic Proceedings: Innovations, Quality & Outcomes | 2018
Alisha P. Chaudhry; Naveed Afzal; Mohamed M. Abidian; Vishnu Priya Mallipeddi; Ravikumar Komandur Elayavilli; Christopher G. Scott; Iftikhar J. Kullo; Paul W. Wennberg; Joshua J. Pankratz; Hongfang Liu; Rajeev Chaudhry; Adelaide M. Arruda-Olson
Objective To quantify compliance with guideline recommendations for secondary prevention in peripheral artery disease (PAD) using natural language processing (NLP) tools deployed to an electronic health record (EHR) and investigate provider opinions regarding clinical decision support (CDS) to promote improved implementation of these strategies. Patients and Methods Natural language processing was used for automated identification of moderate to severe PAD cases from narrative clinical notes of an EHR of patients seen in consultation from May 13, 2015, to July 27, 2015. Guideline-recommended strategies assessed within 6 months of PAD diagnosis included therapy with statins, antiplatelet agents, angiotensin-converting enzyme inhibitors or angiotensin receptor blockers, and smoking abstention. Subsequently, a provider survey was used to assess provider knowledge regarding PAD clinical practice guidelines, comfort in recommending secondary prevention strategies, and potential role for CDS. Results Among 73 moderate to severe PAD cases identified by NLP, only 12 (16%) were on 4 guideline-recommended strategies. A total of 207 of 760 (27%) providers responded to the survey; of these 141 (68%) were generalists and 66 (32%) were specialists. Although 183 providers (88%) managed patients with PAD, 51 (25%) indicated they were uncomfortable doing so; 138 providers (67%) favored the development of a CDS system tailored for their practice and 146 (71%) agreed that an automated EHR-derived mortality risk score calculator for patients with PAD would be helpful. Conclusion Natural language processing tools can identify cases from EHRs to support quality metric studies. Findings of this pilot study demonstrate gaps in application of guideline-recommended strategies for secondary risk prevention for patients with moderate to severe PAD. Providers strongly support the development of CDS systems tailored to assist them in providing evidence-based care to patients with PAD at the point of care.
Database | 2016
Majid Rastegar-Mojarad; Ravikumar Komandur Elayavilli; Hongfang Liu
Biological expression language (BEL) is one of the main formal representation models of biological networks. The primary source of information for curating biological networks in BEL representation has been literature. It remains a challenge to identify relevant articles and the corresponding evidence statements for curating and validating BEL statements. In this paper, we describe BELTracker, a tool used to retrieve and rank evidence sentences from PubMed abstracts and full-text articles for a given BEL statement (per the 2015 task requirements of BioCreative V BEL Task). The system comprises three main components: (i) translation of a given BEL statement to an information retrieval (IR) query, (ii) retrieval of relevant PubMed citations and (iii) finding and ranking the evidence sentences in those citations. BELTracker uses a combination of multiple approaches based on traditional IR, machine learning, and heuristics to accomplish the task. The system identified and ranked at least one fully relevant evidence sentence in the top 10 retrieved sentences for 72 out of 97 BEL statements in the test set. BELTracker achieved a precision of 0.392, 0.532 and 0.615 when evaluated with three criteria, namely full, relaxed and context criteria, respectively, by the task organizers. Our team at Mayo Clinic was the only participant in this task. BELTracker is publicly available as a RESTful API. Database URL: http://www.openbionlp.org:8080/BelTracker/finder/Given_BEL_Statement
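Component (i) above, translating a BEL statement into an IR query, might look roughly like the following Python sketch. It does not reproduce BELTracker's actual translation rules, which the abstract does not specify: the `bel_to_query` function, the regular expressions, the `RELATION_HINTS` keyword lists, and the placeholder entity names are assumptions made for illustration.

```python
import re

# Matches NAMESPACE:value or NAMESPACE:"quoted value" inside a BEL term.
TERM = re.compile(r'\b[A-Z][A-Z0-9]*:(?:"([^"]+)"|([A-Za-z0-9_\-]+))')
RELATION = re.compile(r"\b(?:directly)?(increases|decreases)\b", re.IGNORECASE)

# Hypothetical keyword expansions for each relation type.
RELATION_HINTS = {
    "increases": ["increases", "activates", "induces"],
    "decreases": ["decreases", "inhibits", "reduces"],
}


def bel_to_query(statement):
    """Build a boolean keyword query from the entities and relation of a BEL statement."""
    entities = [quoted or bare for quoted, bare in TERM.findall(statement)]
    # Multi-word entity names are quoted so they are searched as phrases.
    parts = ['"%s"' % e if " " in e else e for e in entities]
    query = " AND ".join(parts)
    match = RELATION.search(statement)
    if match:
        hints = RELATION_HINTS[match.group(1).lower()]
        query += " AND (" + " OR ".join(hints) + ")"
    return query


# GENEA and GENEB are placeholder names, not real HGNC symbols.
print(bel_to_query("p(HGNC:GENEA) decreases p(HGNC:GENEB)"))
print(bel_to_query('a(CHEBI:"nitric oxide") directlyIncreases p(HGNC:GENEB)'))
```

A query string of this shape could then be passed to whichever search backend handles component (ii); that step is outside the scope of the sketch.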
Database | 2018
Sijia Liu; Feichen Shen; Ravikumar Komandur Elayavilli; Yanshan Wang; Majid Rastegar-Mojarad; Vipin Chaudhary; Hongfang Liu
Relation extraction is an important task in the field of natural language processing. In this paper, we describe our approach for the BioCreative VI Task 5: text mining chemical–protein interactions. We investigate multiple deep neural network (DNN) models, including convolutional neural networks, recurrent neural networks (RNNs) and attention-based RNNs (ATT-RNNs), to extract chemical–protein relations. Our experimental results indicate that ATT-RNN models outperform the same models without attention, and the attention-based gated recurrent unit (ATT-GRU) achieves the best micro-average F1 score of 0.527 on the test set among the tested DNNs. In addition, the word-level attention weights show that the attention mechanism is effective at selecting the most important trigger words when trained with semantic relation labels, without the need for semantic parsing or feature engineering. The source code of this work is available at https://github.com/ohnlp/att-chemprot.
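The word-level attention weights mentioned above can be illustrated with a small, dependency-free Python sketch of attention pooling over recurrent hidden states. It is a generic dot-product formulation chosen for illustration, not necessarily the scoring function used in the linked repository; `attention_pool`, the `query` vector, and the toy hidden states are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its dot product with a query vector.

    Returns the per-word attention weights and the weighted sum of the
    hidden states, which would feed the relation classifier.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(len(query))
    ]
    return weights, context


# Toy hidden states for a three-word sentence; in a trained model the query is learned.
hidden = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
weights, context = attention_pool(hidden, query=[1.0, 1.0])
print([round(w, 3) for w in weights])
```

Inspecting `weights` per word is what allows trigger words to be read off a trained model: the step with the largest weight contributes most to the pooled representation.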
text retrieval conference | 2017
Yanshan Wang; Ravikumar Komandur Elayavilli; Majid Rastegar-Mojarad; Hongfang Liu