Publication


Featured research published by Scott L. DuVall.


Journal of the American Medical Informatics Association | 2011

2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text

Özlem Uzuner; Brett R. South; Shuying Shen; Scott L. DuVall

The 2010 i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records presented three tasks: a concept extraction task focused on the extraction of medical concepts from patient reports; an assertion classification task focused on assigning assertion types for medical problem concepts; and a relation classification task focused on assigning relation types that hold between medical problems, tests, and treatments. i2b2 and the VA provided an annotated reference standard corpus for the three tasks. Using this reference standard, 22 systems were developed for concept extraction, 21 for assertion classification, and 16 for relation classification. These systems showed that machine learning approaches could be augmented with rule-based systems to determine concepts, assertions, and relations. Depending on the task, the rule-based systems can either provide input for machine learning or post-process the output of machine learning. Ensembles of classifiers, information from unlabeled data, and external knowledge sources can help when the training data are inadequate.


Journal of the American Medical Informatics Association | 2012

Automated extraction of ejection fraction for quality measurement using regular expressions in Unstructured Information Management Architecture (UIMA) for heart failure

Jennifer H. Garvin; Scott L. DuVall; Brett R. South; Bruce E. Bray; Daniel Bolton; Julia Heavirland; Steve Pickard; Paul A. Heidenreich; Shuying Shen; Charlene R. Weir; Matthew H. Samore; Mary K. Goldstein

OBJECTIVES Left ventricular ejection fraction (EF) is a key component of heart failure quality measures used within the Department of Veterans Affairs (VA). Our goals were to build a natural language processing system to extract the EF from free-text echocardiogram reports to automate measurement reporting and to validate the accuracy of the system using a comparison reference standard developed through human review. This project was a Translational Use Case Project within the VA Consortium for Healthcare Informatics. MATERIALS AND METHODS We created a set of regular expressions and rules to capture the EF using a random sample of 765 echocardiograms from seven VA medical centers. The documents were randomly assigned to two sets: a set of 275 used for training and a second set of 490 used for testing and validation. To establish the reference standard, two independent reviewers annotated all documents in both sets; a third reviewer adjudicated disagreements. RESULTS System test results for document-level classification of EF of <40% had a sensitivity (recall) of 98.41%, a specificity of 100%, a positive predictive value (precision) of 100%, and an F measure of 99.2%. System test results at the concept level had a sensitivity of 88.9% (95% CI 87.7% to 90.0%), a positive predictive value of 95% (95% CI 94.2% to 95.9%), and an F measure of 91.9% (95% CI 91.2% to 92.7%). DISCUSSION An EF value of <40% can be accurately identified in VA echocardiogram reports. CONCLUSIONS An automated information extraction system can be used to accurately extract EF for quality measurement.
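As a rough illustration of the kind of rule the abstract above describes, the following minimal sketch extracts EF percentages with a single regular expression and flags documents with an EF below 40%. The pattern, the function names, and the 40-character filler window are assumptions made for this example; they are not the expressions or rules used in the study, and the sketch does not use UIMA.

```python
import re

# Hypothetical pattern for illustration only; not the study's expressions.
EF_PATTERN = re.compile(
    r"""
    (?:\bLVEF\b|\bEF\b|ejection\s+fraction)  # the concept mention
    [^\d%]{0,40}?                            # short filler such as "is", "of", ":"
    (\d{1,2})                                # low (or only) value
    (?:\s*(?:-|to)\s*(\d{1,2}))?             # optional upper end of a range
    \s*%                                     # percent sign
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract_ef_values(text):
    """Return a (low, high) percent tuple for each EF mention found in text."""
    values = []
    for match in EF_PATTERN.finditer(text):
        low = int(match.group(1))
        # A single value is treated as a range whose ends coincide.
        high = int(match.group(2)) if match.group(2) else low
        values.append((low, high))
    return values


def has_ef_below_40(text):
    """Document-level flag: True if any mention's upper value is below 40%."""
    return any(high < 40 for _, high in extract_ef_values(text))


report = "The left ventricular ejection fraction is estimated at 30-35%."
print(extract_ef_values(report))
print(has_ef_below_40(report))
```

Using the upper end of a range for the document-level flag is one possible design choice; a production rule set would also need to handle negation, historical values, and qualitative descriptions, which this sketch ignores.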


Arthritis Care and Research | 2014

Risk of hospitalized bacterial infections associated with biologic treatment among US veterans with rheumatoid arthritis

Jeffrey R. Curtis; Shuo Yang; Nivedita M. Patkar; Lang Chen; Jasvinder A. Singh; Grant W. Cannon; Ted R. Mikuls; Elizabeth Delzell; Kenneth G. Saag; Monika M. Safford; Scott L. DuVall; K. Alexander; Pavel Napalkov; Kevin L. Winthrop; Mary Jane Burton; Aaron W. C. Kamauu; John W. Baddley

The comparative risk of infection associated with non–anti–tumor necrosis factor (anti‐TNF) biologic agents is not well established. Our objective was to compare risk for hospitalized infections between anti‐TNF and non–anti‐TNF biologic agents in US veterans with rheumatoid arthritis (RA).


Journal of the American Medical Informatics Association | 2012

Evaluation of record linkage between a large healthcare provider and the Utah Population Database

Scott L. DuVall; Alison Fraser; Kerry Rowe; Alun Thomas; Geraldine P. Mineau

OBJECTIVE Electronically linked datasets have become an important part of clinical research. Information from multiple sources can be used to identify comorbid conditions and patient outcomes, measure use of healthcare services, and enrich demographic and clinical variables of interest. Innovative approaches for creating research infrastructure beyond a traditional data system are necessary. MATERIALS AND METHODS Records from a large healthcare system's enterprise data warehouse (EDW) were linked to a statewide population database, and a master subject index was created. The authors evaluate the linkage, along with the impact of missing information in EDW records and the coverage of the population database. The makeup of the EDW and population database provides a subset of cancer records that exist in both resources, which allows a cancer-specific evaluation of the linkage. RESULTS About 3.4 million records (60.8%) in the EDW were linked to the population database with a minimum accuracy of 96.3%. It was estimated that approximately 24.8% of target records were absent from the population database, which enabled the effect of the amount and type of information missing from a record on the linkage to be estimated. However, 99% of the records from the oncology data mart linked; they had fewer missing fields and this correlated positively with the number of patient visits. DISCUSSION AND CONCLUSION A general-purpose research infrastructure was created which allows disease-specific cohorts to be identified. The usefulness of creating an index between institutions is that it allows each institution to maintain control and confidentiality of its own information.


BMC Medical Informatics and Decision Making | 2012

Identification of methicillin-resistant Staphylococcus aureus within the Nation’s Veterans Affairs Medical Centers using natural language processing

Makoto L. Jones; Scott L. DuVall; Joshua Spuhl; Matthew H. Samore; Christopher Nielson; Michael A. Rubin

Background Accurate information is needed to direct healthcare systems’ efforts to control methicillin-resistant Staphylococcus aureus (MRSA). Assembling complete and correct microbiology data is vital to understanding and addressing the multiple drug-resistant organisms in our hospitals. Methods Herein, we describe a system that securely gathers microbiology data from the Department of Veterans Affairs (VA) network of databases. Using natural language processing methods, we applied an information extraction process to extract organisms and susceptibilities from the free-text data. We then validated the extraction against independently derived electronic data and expert annotation. Results We estimate that the collected microbiology data are 98.5% complete and that methicillin-resistant Staphylococcus aureus was extracted accurately 99.7% of the time. Conclusions Applying natural language processing methods to microbiology records appears to be a promising way to extract accurate and useful nosocomial pathogen surveillance data. Both scientific inquiry and the data’s reliability will be dependent on the surveillance system’s capability to compare from multiple sources and circumvent systematic error. The dataset constructed and methods used for this investigation could contribute to a comprehensive infectious disease surveillance system or other pressing needs.


Journal of Biomedical Informatics | 2010

Extending the Fellegi-Sunter probabilistic record linkage method for approximate field comparators

Scott L. DuVall; Richard A. Kerber; Alun Thomas

Probabilistic record linkage is a method commonly used to determine whether demographic records refer to the same person. The Fellegi-Sunter method is a probabilistic approach that uses field weights based on log likelihood ratios to determine record similarity. This paper introduces an extension of the Fellegi-Sunter method that incorporates approximate field comparators in the calculation of field weights. The data warehouse of a large academic medical center was used as a case study. The approximate comparator extension was compared with the Fellegi-Sunter method in its ability to find duplicate records previously identified in the data warehouse using different demographic fields and matching cutoffs. The approximate comparator extension misclassified 25% fewer pairs and had a larger Welch's t statistic than the Fellegi-Sunter method for all field sets and matching cutoffs. The accuracy gain provided by the approximate comparator extension grew as less information was provided and as the matching cutoff increased. Given the ubiquity of linkage in both clinical and research settings, the incremental improvement of the extension has the potential to make a considerable impact.
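To make the idea of log-likelihood field weights concrete, here is a minimal sketch. The exact-match weights follow the standard Fellegi-Sunter form, log2(m/u) for agreement and log2((1-m)/(1-u)) for disagreement. The approximate variant shown here simply interpolates between those two weights using a string similarity score; that interpolation, the use of `difflib.SequenceMatcher` as the comparator, and the m- and u-probabilities are all assumptions for illustration and should not be read as the formulation in the paper.

```python
import math
from difflib import SequenceMatcher

# Hypothetical m- and u-probabilities per field (assumed values).
FIELD_PARAMS = {
    "last_name": {"m": 0.95, "u": 0.01},
    "first_name": {"m": 0.90, "u": 0.05},
    "birth_year": {"m": 0.98, "u": 0.02},
}


def agreement_weight(m, u):
    """Log likelihood ratio contributed when a field agrees."""
    return math.log2(m / u)


def disagreement_weight(m, u):
    """Log likelihood ratio contributed when a field disagrees."""
    return math.log2((1 - m) / (1 - u))


def exact_field_weight(a, b, m, u):
    """Classic binary comparison: full agreement or full disagreement."""
    return agreement_weight(m, u) if a == b else disagreement_weight(m, u)


def approximate_field_weight(a, b, m, u):
    """Assumed extension: scale the weight by a similarity score in [0, 1]."""
    similarity = SequenceMatcher(None, a, b).ratio()
    low = disagreement_weight(m, u)
    high = agreement_weight(m, u)
    return low + similarity * (high - low)


def record_score(rec_a, rec_b, field_weight):
    """Sum the field weights over all configured fields."""
    return sum(
        field_weight(rec_a[field], rec_b[field], p["m"], p["u"])
        for field, p in FIELD_PARAMS.items()
    )


a = {"last_name": "johnson", "first_name": "mary", "birth_year": "1975"}
b = {"last_name": "jonson", "first_name": "mary", "birth_year": "1975"}
print(record_score(a, b, exact_field_weight))
print(record_score(a, b, approximate_field_weight))
```

With the binary comparator, the one-letter difference in the last name counts as a full disagreement; with the approximate comparator the pair keeps most of the agreement weight, so it is more likely to clear a given matching cutoff.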


European Conference on Machine Learning | 2011

Active supervised domain adaptation

Avishek Saha; Piyush Rai; Hal Daumé; Suresh Venkatasubramanian; Scott L. DuVall

In this paper, we harness the synergy between two important learning paradigms, namely, active learning and domain adaptation. We show how active learning in a target domain can leverage information from a different but related source domain. Our proposed framework, Active Learning Domain Adapted (Alda), uses source domain knowledge to transfer information that facilitates active learning in the target domain. We propose two variants of Alda: a batch B-Alda and an online O-Alda. Empirical comparisons with numerous baselines on real-world datasets establish the efficacy of the proposed methods.


Annals of the Rheumatic Diseases | 2016

Association of hyperlipidaemia, inflammation and serological status and coronary heart disease among patients with rheumatoid arthritis: data from the National Veterans Health Administration

Iris Navarro-Millán; Shuo Yang; Scott L. DuVall; Lang Chen; John W. Baddley; Grant W. Cannon; Elizabeth Delzell; Jie Zhang; Monika M. Safford; Nivedita M. Patkar; Ted R. Mikuls; Jasvinder A. Singh; Jeffrey R. Curtis

Objective To examine the association of serum lipids, inflammation and seropositivity with coronary heart disease (CHD) and stroke in patients with rheumatoid arthritis (RA). Methods The incidence of hospitalised myocardial infarction (MI) or stroke was calculated in a cohort of patients with RA receiving care within the national Veterans Health Administration from 1998 to 2011. Cox proportional hazard models were used to examine the association between these outcomes and low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), C reactive protein (CRP) and erythrocyte sedimentation rate (ESR) as time-varying variables, divided into quintiles. Results There were 37 568 patients with RA in the cohort with mean age of 63 years (SD 12.1); 90% were men. There was no clear association between LDL-C and CHD/stroke. Compared with lower HDL-C (<34 mg/dL), higher HDL-C (≥54 mg/dL) was inversely associated with MI (hazard ratio (HR)=0.68, 95% CI 0.55 to 0.85) and stroke (HR=0.69, 95% CI 0.50 to 0.96). Higher CRP >2.17 mg/dL (vs CRP <0.26 mg/dL) was associated with increased risk of MI (HR=2.43, 95% CI 1.77 to 3.33) and stroke (HR=2.02, 95% CI 1.32 to 3.08). ESR >47 mm/h compared with <8 mm/h had an HR of 1.87 (95% CI 1.39 to 2.52) for MI and 2.00 (95% CI 1.26 to 3.18) for stroke. RA seropositivity was significantly associated with MI (HR=1.23, 95% CI 1.03 to 1.48). Conclusions In this predominantly older male RA cohort, there was no clear association between LDL-C and CHD, whereas higher HDL-C was inversely associated with MI and stroke. CRP and ESR were similarly associated with increased risk of MI and stroke, reflecting the prominent role of inflammation in CHD risk in RA.


Journal of the American Medical Informatics Association | 2013

Improving performance of natural language processing part-of-speech tagging on clinical narratives through domain adaptation

Jeffrey P. Ferraro; Hal Daumé; Scott L. DuVall; Wendy W. Chapman; Henk Harkema; Peter J. Haug

OBJECTIVE Natural language processing (NLP) tasks are commonly decomposed into subtasks, chained together to form processing pipelines. The residual error produced in these subtasks propagates, adversely affecting the end objectives. Limited availability of annotated clinical data remains a barrier to reaching state-of-the-art operating characteristics using statistically based NLP tools in the clinical domain. Here we explore the unique linguistic constructions of clinical texts and demonstrate the loss in operating characteristics when out-of-the-box part-of-speech (POS) tagging tools are applied to the clinical domain. We test a domain adaptation approach integrating a novel lexical-generation probability rule used in a transformation-based learner to boost POS performance on clinical narratives. METHODS Two target corpora from independent healthcare institutions were constructed from high frequency clinical narratives. Four leading POS taggers with their out-of-the-box models trained from general English and biomedical abstracts were evaluated against these clinical corpora. A high performing domain adaptation method, Easy Adapt, was compared to our newly proposed method ClinAdapt. RESULTS The evaluated POS taggers drop in accuracy by 8.5-15% when tested on clinical narratives. The highest performing tagger reports an accuracy of 88.6%. Domain adaptation with Easy Adapt reports accuracies of 88.3-91.0% on clinical texts. ClinAdapt reports 93.2-93.9%. CONCLUSIONS ClinAdapt successfully boosts POS tagging performance through domain adaptation requiring a modest amount of annotated clinical data. Improving the performance of critical NLP subtasks is expected to reduce pipeline error propagation leading to better overall results on complex processing tasks.
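The Easy Adapt baseline named in the abstract above is commonly described as a feature-augmentation scheme: every feature is copied into a shared "general" version and a domain-specific version, so a standard classifier can learn which features transfer across domains. The sketch below shows that augmentation on sparse feature dictionaries. The feature names and the `general:`/`source:`/`target:` prefixes are hypothetical, and the sketch does not reproduce ClinAdapt or the taggers evaluated in the study.

```python
def easy_adapt(features, domain):
    """Augment a sparse feature dict for feature-augmentation domain adaptation.

    Each feature is emitted twice: once in a shared "general" copy and once in
    a copy specific to the example's domain. Features for the other domain are
    simply absent, which a sparse learner treats as zero.
    """
    if domain not in ("source", "target"):
        raise ValueError("domain must be 'source' or 'target'")
    augmented = {}
    for name, value in features.items():
        augmented[f"general:{name}"] = value
        augmented[f"{domain}:{name}"] = value
    return augmented


# Hypothetical POS-tagging features for one token.
token_features = {"word=pt": 1, "suffix=t": 1}
print(easy_adapt(token_features, "source"))
print(easy_adapt(token_features, "target"))
```

The augmented dictionaries from both domains can then be fed to any linear classifier; weights on the `general:` copies capture behavior shared by both domains, while the domain-specific copies absorb the differences.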


JAMA Cardiology | 2017

Association Between HIV Infection and the Risk of Heart Failure With Reduced Ejection Fraction and Preserved Ejection Fraction in the Antiretroviral Therapy Era: Results From the Veterans Aging Cohort Study

Matthew S. Freiberg; Chung Chou H Chang; Melissa Skanderson; Olga V. Patterson; Scott L. DuVall; Cynthia Brandt; Kaku So-Armah; Kris Ann Oursler; John S. Gottdiener; Stephen S. Gottlieb; David A. Leaf; Maria C. Rodriguez-Barradas; Russell P. Tracy; Cynthia L. Gibert; David Rimland; Roger Bedimo; Sheldon T. Brown; Matthew Bidwell Goetz; Alberta Warner; Kristina Crothers; Hilary A. Tindle; Charles Alcorn; Justin M. Bachmann; Amy C. Justice; Adeel A. Butt

Importance With improved survival, heart failure (HF) has become a major complication for individuals with human immunodeficiency virus (HIV) infection. It is unclear if this risk extends to different types of HF in the antiretroviral therapy (ART) era. Determining whether HIV infection is associated with HF with reduced ejection fraction (HFrEF), HF with preserved ejection fraction (HFpEF), or both is critical because HF types differ with respect to underlying mechanism, treatment, and prognosis. Objectives To investigate whether HIV infection increases the risk of future HFrEF and HFpEF and to assess if this risk varies by sociodemographic and HIV-specific factors. Design, Setting, and Participants This study evaluated 98 015 participants without baseline cardiovascular disease from the Veterans Aging Cohort Study, an observational cohort of HIV-infected veterans and uninfected veterans matched by age, sex, race/ethnicity, and clinical site, enrolled on or after April 1, 2003, and followed up through September 30, 2012. The dates of the analysis were October 2015 to November 2016. Exposure Human immunodeficiency virus infection. Main Outcomes and Measures Outcomes included HFpEF (EF≥50%), borderline HFpEF (EF 40%-49%), HFrEF (EF<40%), and HF of unknown type (EF missing). Results Among 98 015 participants, the mean (SD) age at enrollment in the study was 48.3 (9.8) years, 97.0% were male, and 32.2% had HIV infection. During a median follow-up of 7.1 years, there were 2636 total HF events (34.6% were HFpEF, 15.5% were borderline HFpEF, 37.1% were HFrEF, and 12.8% were HF of unknown type). Compared with uninfected veterans, HIV-infected veterans had an increased risk of HFpEF (hazard ratio [HR], 1.21; 95% CI, 1.03-1.41), borderline HFpEF (HR, 1.37; 95% CI, 1.09-1.72), and HFrEF (HR, 1.61; 95% CI, 1.40-1.86). The risk of HFrEF was pronounced in veterans younger than 40 years at baseline (HR, 3.59; 95% CI, 1.95-6.58). 
Among HIV-infected veterans, time-updated HIV-1 RNA viral load of at least 500 copies/mL compared with less than 500 copies/mL was associated with an increased risk of HFrEF, and time-updated CD4 cell count less than 200 cells/mm3 compared with at least 500 cells/mm3 was associated with an increased risk of HFrEF and HFpEF. Conclusions and Relevance Individuals who are infected with HIV have an increased risk of HFpEF, borderline HFpEF, and HFrEF compared with uninfected individuals. The increased risk of HFrEF can manifest decades earlier than would be expected in a typical uninfected population. Future research should focus on prevention, risk stratification, and identification of the mechanisms for HFrEF and HFpEF in the HIV-infected population.
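The outcome definitions in the abstract above map an ejection fraction to a heart failure type using fixed thresholds. A minimal sketch of that mapping follows; the function name and the use of `None` for a missing EF are assumptions for illustration, and values between 49% and 50% are assigned to the borderline group here because the abstract does not say how they were handled.

```python
def classify_hf_type(ef_percent):
    """Map an ejection fraction (percent) to the HF categories in the abstract.

    HFpEF: EF >= 50%; borderline HFpEF: EF 40%-49%; HFrEF: EF < 40%;
    HF of unknown type: EF missing (represented here as None).
    """
    if ef_percent is None:
        return "HF of unknown type"
    if ef_percent >= 50:
        return "HFpEF"
    if ef_percent >= 40:
        return "borderline HFpEF"
    return "HFrEF"


for ef in (55, 45, 30, None):
    print(ef, classify_hf_type(ef))
```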
