Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Kirk Roberts is active.

Publications


Featured research published by Kirk Roberts.


Journal of the American Medical Informatics Association | 2011

Automatic extraction of relations between medical concepts in clinical texts

Bryan Rink; Sanda M. Harabagiu; Kirk Roberts

OBJECTIVE A supervised machine learning approach to discover relations between medical problems, treatments, and tests mentioned in electronic medical records. MATERIALS AND METHODS A single support vector machine classifier was used to identify relations between concepts and to assign their semantic type. Several resources such as Wikipedia, WordNet, General Inquirer, and a relation similarity metric inform the classifier. RESULTS The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task. When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. F1 is defined as 2*Precision*Recall/(Precision+Recall). Alternatively, when concepts and assertions were discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7. DISCUSSION Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation discovery was treated independently. Joint classification of relations may further improve the quality of results. Also, joint learning of the discovery of concepts, assertions, and relations may also improve the results of automatic relation extraction. CONCLUSION Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. In addition, features based on similarity contribute to a decrease of 1.1% when they are not available.
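The F1 definition quoted in this abstract is the harmonic mean of precision and recall. A minimal sketch on the percentage-scale values reported above (small differences from the published 73.7 reflect rounding of the reported precision and recall):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2*P*R / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Gold-standard-concepts setting reported above: precision 72.0, recall 75.3
f1 = f1_score(72.0, 75.3)  # close to the reported 73.7, up to rounding
```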


Journal of the American Medical Informatics Association | 2011

A flexible framework for deriving assertions from electronic medical records

Kirk Roberts; Sanda M. Harabagiu

OBJECTIVE This paper describes natural-language-processing techniques for two tasks: identification of medical concepts in clinical text, and classification of assertions, which indicate the existence, absence, or uncertainty of a medical problem. Because so many resources are available for processing clinical texts, there is interest in developing a framework in which features derived from these resources can be optimally selected for the two tasks of interest. MATERIALS AND METHODS The authors used two machine-learning (ML) classifiers: support vector machines (SVMs) and conditional random fields (CRFs). Because SVMs and CRFs can operate on a large set of features extracted from both clinical texts and external resources, the authors address the following research question: Which features need to be selected for obtaining optimal results? To this end, the authors devise feature-selection techniques which greatly reduce the amount of manual experimentation and improve performance. RESULTS The authors evaluated their approaches on the 2010 i2b2/VA challenge data. Concept extraction achieves 79.59 micro F-measure. Assertion classification achieves 93.94 micro F-measure. DISCUSSION Approaching medical concept extraction and assertion classification through ML-based techniques has the advantage of easily adapting to new data sets and new medical informatics tasks. However, ML-based techniques perform best when optimal features are selected. By devising promising feature-selection techniques, the authors obtain results that outperform the current state of the art. CONCLUSION This paper presents two ML-based approaches for processing language in the clinical texts evaluated in the 2010 i2b2/VA challenge. By using novel feature-selection methods, the techniques presented in this paper are unique among the i2b2 participants.
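The abstract does not spell out the authors' feature-selection procedure; a generic greedy forward-selection sketch of the kind such work commonly uses, where the hypothetical `evaluate` callback stands in for training and scoring a classifier (e.g., an SVM or CRF) on held-out data:

```python
def greedy_forward_selection(candidates, evaluate):
    """Greedy forward feature selection (generic sketch, not the
    authors' exact method).

    Repeatedly adds whichever remaining feature most improves the
    score returned by evaluate(feature_list); stops when no
    candidate yields an improvement.
    """
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        round_best, pick = best, None
        for feat in remaining:
            score = evaluate(selected + [feat])
            if score > round_best:
                round_best, pick = score, feat
        if pick is None:  # no remaining feature improves the score
            break
        selected.append(pick)
        remaining.remove(pick)
        best = round_best
    return selected, best
```

Compared with exhaustively trying every feature subset, this reduces the number of training runs from exponential to quadratic in the number of candidate features, which matches the stated goal of reducing manual experimentation.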


Computer Vision and Pattern Recognition | 2016

Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation

Hoo-Chang Shin; Kirk Roberts; Le Lu; Dina Demner-Fushman; Jianhua Yao; Ronald M. Summers

Despite the recent advances in automatically describing image contents, their applications have been mostly limited to image caption datasets containing natural images (e.g., Flickr 30k, MSCOCO). In this paper, we present a deep learning model to efficiently detect a disease from an image and annotate its contexts (e.g., location, severity and the affected organs). We employ a publicly available radiology dataset of chest x-rays and their reports, and use its image annotations to mine disease names to train convolutional neural networks (CNNs). In doing so, we adopt various regularization techniques to circumvent the large normal-vs-diseased cases bias. Recurrent neural networks (RNNs) are then trained to describe the contexts of a detected disease, based on the deep CNN features. Moreover, we introduce a novel approach to use the weights of the already trained pair of CNN/RNN on the domain-specific image/text dataset, to infer the joint image/text contexts for composite image labeling. Significantly improved image annotation results are demonstrated using the recurrent neural cascade model by taking the joint image/text contexts into account.


Information Retrieval | 2016

State-of-the-art in biomedical literature retrieval for clinical cases: a survey of the TREC 2014 CDS track

Kirk Roberts; Matthew S. Simpson; Dina Demner-Fushman; Ellen M. Voorhees; William R. Hersh

Providing access to relevant biomedical literature in a clinical setting has the potential to bridge a critical gap in evidence-based medicine. Here, our goal is specifically to provide relevant articles to clinicians to improve their decision-making in diagnosing, treating, and testing patients. To this end, the TREC 2014 Clinical Decision Support Track evaluated a system’s ability to retrieve relevant articles in one of three categories (Diagnosis, Treatment, Test) using an idealized form of a patient medical record. Over 100 submissions from over 25 participants were evaluated on 30 topics, resulting in over 37k relevance judgments. In this article, we provide an overview of the task, a survey of the information retrieval methods employed by the participants, an analysis of the results, and a discussion on the future directions for this challenging yet important task.


Journal of the American Medical Informatics Association | 2013

A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text

Kirk Roberts; Bryan Rink; Sanda M. Harabagiu

OBJECTIVE To provide a natural language processing method for the automatic recognition of events, temporal expressions, and temporal relations in clinical records. MATERIALS AND METHODS A combination of supervised, unsupervised, and rule-based methods were used. Supervised methods include conditional random fields and support vector machines. A flexible automated feature selection technique was used to select the best subset of features for each supervised task. Unsupervised methods include Brown clustering on several corpora, which result in our method being considered semisupervised. RESULTS On the 2012 Informatics for Integrating Biology and the Bedside (i2b2) shared task data, we achieved an overall event F1-measure of 0.8045, an overall temporal expression F1-measure of 0.6154, an overall temporal link detection F1-measure of 0.5594, and an end-to-end temporal link detection F1-measure of 0.5258. The most competitive system was our event recognition method, which ranked third out of the 14 participants in the event task. DISCUSSION Analysis reveals the event recognition method has difficulty determining which modifiers to include/exclude in the event span. The temporal expression recognition method requires significantly more normalization rules, although many of these rules apply only to a small number of cases. Finally, the temporal relation recognition method requires more advanced medical knowledge and could be improved by separating the single discourse relation classifier into multiple, more targeted component classifiers. CONCLUSIONS Recognizing events and temporal expressions can be achieved accurately by combining supervised and unsupervised methods, even when only minimal medical knowledge is available. Temporal normalization and temporal relation recognition, however, are far more dependent on the modeling of medical knowledge.


Journal of the American Medical Informatics Association | 2016

Interactive use of online health resources: a comparison of consumer and professional questions

Kirk Roberts; Dina Demner-Fushman

OBJECTIVE To understand how consumer questions on online resources differ from questions asked by professionals, and how such consumer questions differ across resources. MATERIALS AND METHODS Ten online question corpora, 5 consumer and 5 professional, with a combined total of over 40 000 questions, were analyzed using a variety of natural language processing techniques. These techniques analyze questions at the lexical, syntactic, and semantic levels, exposing differences in both form and content. RESULTS Consumer questions tend to be longer than professional questions, more closely resemble open-domain language, and focus far more on medical problems. Consumers ask more sub-questions, provide far more background information, and ask different types of questions than professionals. Furthermore, there is substantial variance of these factors between the different consumer corpora. DISCUSSION The form of consumer questions is highly dependent upon the individual online resource, especially in the amount of background information provided. Professionals, on the other hand, provide very little background information and often ask much shorter questions. The content of consumer questions is also highly dependent upon the resource. While professional questions commonly discuss treatments and tests, consumer questions focus disproportionately on symptoms and diseases. Further, consumers place far more emphasis on certain types of health problems (eg, sexual health). CONCLUSION Websites for consumers to submit health questions are a popular online resource filling important gaps in consumer health information. By analyzing how consumers write questions on these resources, we can better understand these gaps and create solutions for improving information access. This article is part of the Special Focus on Person-Generated Health and Wellness Data, which published in the May 2016 issue, Volume 23, Issue 3.


Journal of the American Medical Informatics Association | 2012

A supervised framework for resolving coreference in clinical records

Bryan Rink; Kirk Roberts; Sanda M. Harabagiu

OBJECTIVE A method for the automatic resolution of coreference between medical concepts in clinical records. MATERIALS AND METHODS A multiple pass sieve approach utilizing support vector machines (SVMs) at each pass was used to resolve coreference. Information such as lexical similarity, recency of a concept mention, synonymy based on Wikipedia redirects, and local lexical context were used to inform the method. Results were evaluated using an unweighted average of MUC, CEAF, and B(3) coreference evaluation metrics. The datasets used in these research experiments were made available through the 2011 i2b2/VA Shared Task on Coreference. RESULTS The method achieved an average F score of 0.821 on the ODIE dataset, with a precision of 0.802 and a recall of 0.845. These results compare favorably to the best-performing system with a reported F score of 0.827 on the dataset and the median system F score of 0.800 among the eight teams that participated in the 2011 i2b2/VA Shared Task on Coreference. On the i2b2 dataset, the method achieved an average F score of 0.906, with a precision of 0.895 and a recall of 0.918 compared to the best F score of 0.915 and the median of 0.859 among the 16 participating teams. DISCUSSION Post hoc analysis revealed significant performance degradation on pathology reports. The pathology reports were characterized by complex synonymy and very few patient mentions. CONCLUSION The use of several simple lexical matching methods had the most impact on achieving competitive performance on the task of coreference resolution. Moreover, the ability to detect patients in electronic medical records helped to improve coreference resolution more than other linguistic analysis.
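The evaluation score described above is a plain unweighted mean of the three coreference metrics' F scores; a minimal sketch (the values shown are illustrative, not from the paper):

```python
def average_coref_f(muc_f: float, b_cubed_f: float, ceaf_f: float) -> float:
    """Unweighted average of the MUC, B-cubed, and CEAF F scores,
    as used to summarize coreference performance above."""
    return (muc_f + b_cubed_f + ceaf_f) / 3

# Illustrative per-metric F scores for one system
overall = average_coref_f(0.90, 0.80, 0.70)
```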


Journal of Biomedical Informatics | 2015

The role of fine-grained annotations in supervised recognition of risk factors for heart disease from EHRs

Kirk Roberts; Sonya E. Shooshan; Laritza Rodriguez; Swapna Abhyankar; Halil Kilicoglu; Dina Demner-Fushman

This paper describes a supervised machine learning approach for identifying heart disease risk factors in clinical text, and assessing the impact of annotation granularity and quality on the system's ability to recognize these risk factors. We utilize a series of support vector machine models in conjunction with manually built lexicons to classify triggers specific to each risk factor. The features used for classification were quite simple, utilizing only lexical information and ignoring higher-level linguistic information such as syntax and semantics. Instead, we incorporated high-quality data to train the models by annotating additional information on top of a standard corpus. Despite the relative simplicity of the system, it achieves the highest scores (micro- and macro-F1, and micro- and macro-recall) out of the 20 participants in the 2014 i2b2/UTHealth Shared Task. This system obtains a micro- (macro-) precision of 0.8951 (0.8965), recall of 0.9625 (0.9611), and F1-measure of 0.9276 (0.9277). Additionally, we perform a series of experiments to assess the value of the annotated data we created. These experiments show how manually-labeled negative annotations can improve information extraction performance, demonstrating the importance of high-quality, fine-grained natural language annotations.
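The micro- and macro-averaged scores reported above differ in whether counts are pooled across risk factors before the metric is computed. A sketch of the distinction for precision, using hypothetical per-class counts (not the task's actual data):

```python
def micro_macro_precision(per_class):
    """per_class: list of (true_positives, false_positives) tuples,
    one per risk-factor class (hypothetical counts for illustration).

    Micro-averaging pools counts across classes before dividing;
    macro-averaging computes precision per class, then averages,
    weighting rare and common classes equally.
    """
    micro = sum(tp for tp, _ in per_class) / sum(tp + fp for tp, fp in per_class)
    macro = sum(tp / (tp + fp) for tp, fp in per_class) / len(per_class)
    return micro, macro

# Three hypothetical risk factors: one common, one mid-sized, one rare
micro, macro = micro_macro_precision([(90, 10), (40, 10), (5, 5)])
```

Because the rare class is weighted equally under macro-averaging, a system that struggles on infrequent risk factors shows a larger micro/macro gap; the near-identical micro and macro figures reported above indicate consistent performance across classes.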


Meeting of the Association for Computational Linguistics | 2014

Decomposing Consumer Health Questions

Kirk Roberts; Halil Kilicoglu; Marcelo Fiszman; Dina Demner-Fushman

This paper presents a method for decomposing long, complex consumer health questions. Our approach largely decomposes questions using their syntactic structure, recognizing independent questions embedded in clauses, as well as coordinations and exemplifying phrases. Additionally, we identify elements specific to disease-related consumer health questions, such as the focus disease and background information. To achieve this, our approach combines rank-and-filter machine learning methods with rule-based methods. Our results demonstrate significant improvements over the heuristic methods typically employed for question decomposition that rely only on the syntactic parse tree.


Database | 2017

A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge

Trevor Cohen; Kirk Roberts; Anupama E. Gururaj; Xiaoling Chen; Saeid Pournejati; George Alter; William R. Hersh; Dina Demner-Fushman; Lucila Ohno-Machado; Hua Xu

The rapid proliferation of publicly available biomedical datasets has provided abundant resources that are potentially of value as a means to reproduce prior experiments, and to generate and explore novel hypotheses. However, there are a number of barriers to the re-use of such datasets, which are distributed across a broad array of dataset repositories, focusing on different data types and indexed using different terminologies. New methods are needed to enable biomedical researchers to locate datasets of interest within this rapidly expanding information ecosystem, and new resources are needed for the formal evaluation of these methods as they emerge. In this paper, we describe the design and generation of a benchmark for information retrieval of biomedical datasets, which was developed and used for the 2016 bioCADDIE Dataset Retrieval Challenge. In the tradition of the seminal Cranfield experiments, and as exemplified by the Text Retrieval Conference (TREC), this benchmark includes a corpus (biomedical datasets), a set of queries, and relevance judgments relating these queries to elements of the corpus. This paper describes the process through which each of these elements was derived, with a focus on those aspects that distinguish this benchmark from typical information retrieval reference sets. Specifically, we discuss the origin of our queries in the context of a larger collaborative effort, the biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) consortium, and the distinguishing features of biomedical dataset retrieval as a task. The resulting benchmark set has been made publicly available to advance research in the area of biomedical dataset retrieval. Database URL: https://biocaddie.org/benchmark-data
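A benchmark of this shape (a corpus, a set of queries, and relevance judgments) supports standard ranked-retrieval metrics. A minimal precision-at-k sketch over hypothetical judgments, for illustration only (not the challenge's official scoring):

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents judged relevant.

    ranked_ids:   system ranking for one query (best first)
    relevant_ids: set of ids judged relevant for that query
    """
    top_k = ranked_ids[:k]
    return sum(1 for doc in top_k if doc in relevant_ids) / k

# Hypothetical ranking and relevance judgments for a single query
ranked = ["d3", "d1", "d7", "d2", "d9"]
qrels = {"d1", "d2", "d5"}
p_at_5 = precision_at_k(ranked, qrels, 5)
```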

Collaboration


Dive into Kirk Roberts's collaborations.

Top Co-Authors

Dina Demner-Fushman
National Institutes of Health

Sanda M. Harabagiu
University of Texas at Dallas

Bryan Rink
University of Texas at Dallas

Halil Kilicoglu
National Institutes of Health

Hua Xu
University of Texas Health Science Center at Houston

Laritza Rodriguez
National Institutes of Health

Sonya E. Shooshan
National Institutes of Health

Ellen M. Voorhees
National Institute of Standards and Technology

Marcelo Fiszman
National Institutes of Health