Helen L. Corns
Purdue University
Publication
Featured research published by Helen L. Corns.
PLOS ONE | 2016
Santosh K. Verma; Joanna L. Willetts; Helen L. Corns; Helen R. Marucci-Wellman; David A. Lombardi; Theodore K. Courtney
Introduction Falls are the leading cause of unintentional injuries in the U.S.; however, national estimates for all community-dwelling adults are lacking. This study estimated the national incidence of falls and fall-related injuries among community-dwelling U.S. adults by age and gender and the trends in fall-related injuries across the adult life span. Methods Nationally representative data from the National Health Interview Survey (NHIS) 2008 Balance and Dizziness supplement was used to develop national estimates of falls, and pooled data from the NHIS was used to calculate estimates of fall-related injuries in the U.S. and related trends from 2004–2013. Costs of unintentional fall-related injuries were extracted from the CDC’s Web-based Injury Statistics Query and Reporting System. Results Twelve percent of community-dwelling U.S. adults reported falling in the previous year for a total estimate of 80 million falls at a rate of 37.2 falls per 100 person-years. On average, 9.9 million fall-related injuries occurred each year with a rate of 4.38 fall-related injuries per 100 person-years. In the previous three months, 2.0% of older adults (65+), 1.1% of middle-aged adults (45–64) and 0.7% of young adults (18–44) reported a fall-related injury. Of all fall-related injuries among community-dwelling adults, 32.3% occurred among older adults, 35.3% among middle-aged adults and 32.3% among younger adults. The age-adjusted rate of fall-related injuries increased 4% per year among older women (95% CI 1%–7%) from 2004 to 2013. Among U.S. adults, the total lifetime cost of annual unintentional fall-related injuries that resulted in a fatality, hospitalization or treatment in an emergency department was 111 billion U.S. dollars in 2010. Conclusions Falls and fall-related injuries represent a significant health and safety problem for adults of all ages. 
The findings suggest that adult fall prevention efforts should consider the entire adult lifespan to ensure a greater public health benefit.
Injury Prevention | 2009
Mark R. Lehto; Helen R. Marucci-Wellman; Helen L. Corns
To compare two Bayesian methods (Fuzzy and Naïve) for classifying injury narratives in large administrative databases into event cause groups, a dataset of 14 000 narratives was randomly extracted from claims filed with a worker’s compensation insurance provider. Two expert coders assigned one-digit and two-digit Bureau of Labor Statistics (BLS) Occupational Injury and Illness Classification event codes to each narrative. The narratives were separated into a training set of 11 000 cases and a prediction set of 3000 cases. The training set was used to develop two Bayesian classifiers that assigned BLS codes to narratives. Each model was then evaluated for the prediction set. Both models performed well and tended to predict one-digit BLS codes more accurately than two-digit codes. The overall sensitivity of the Fuzzy method was, respectively, 78% and 64% for one-digit and two-digit codes, specificity was 93% and 95%, and positive predictive value (PPV) was 78% and 65%. The Naïve method showed similar accuracy: a sensitivity of 80% and 70%, specificity of 96% and 97%, and PPV of 80% and 70%. For large administrative databases, Bayesian methods show significant promise as a means of classifying injury narratives into cause groups. Overall, Naïve Bayes provided slightly more accurate predictions than Fuzzy Bayes.
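As an illustration of the kind of classifier and accuracy measures described in this abstract, here is a minimal sketch of a word-based Naïve Bayes model with add-one smoothing, plus per-category sensitivity and PPV. It is not the authors' implementation: the function names, the smoothing choice, the toy narratives and the labels "fall" and "overexertion" are all assumptions made for the example (they are not real BLS codes or study data).

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(narratives, codes):
    """Count words per event code (hypothetical helper, not from the paper)."""
    word_counts = defaultdict(Counter)  # code -> word -> count
    code_counts = Counter(codes)        # code -> number of training narratives
    vocab = set()
    for text, code in zip(narratives, codes):
        words = text.lower().split()
        word_counts[code].update(words)
        vocab.update(words)
    return word_counts, code_counts, vocab

def predict(model, text):
    """Return the code with the highest log posterior, using add-one smoothing."""
    word_counts, code_counts, vocab = model
    total = sum(code_counts.values())
    best_code, best_score = None, -math.inf
    for code, n in code_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[code].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[code][word] + 1) / denom)
        if score > best_score:
            best_code, best_score = code, score
    return best_code

def sensitivity_and_ppv(true_codes, predicted_codes, code):
    """Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP) for one category."""
    pairs = list(zip(true_codes, predicted_codes))
    tp = sum(t == code and p == code for t, p in pairs)
    fn = sum(t == code and p != code for t, p in pairs)
    fp = sum(t != code and p == code for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, ppv

# Toy training data, invented for illustration only.
train_texts = [
    "slipped on wet floor and fell",
    "fell from ladder while painting",
    "strained back lifting heavy box",
    "lifting patient hurt shoulder",
]
train_codes = ["fall", "fall", "overexertion", "overexertion"]
model = train_naive_bayes(train_texts, train_codes)

pred_a = predict(model, "fell on icy floor")
pred_b = predict(model, "hurt back lifting box")
sens, ppv = sensitivity_and_ppv(
    ["fall", "fall", "overexertion"],
    ["fall", "overexertion", "overexertion"],
    "fall",
)
```

In this toy evaluation one of the two true "fall" cases is missed and no other case is wrongly coded as "fall", so the sensitivity for that category is 0.5 and the PPV is 1.0.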
Injury Prevention | 2011
Helen R. Marucci-Wellman; Mark R. Lehto; Helen L. Corns
Background Bayesian methods show promise for classifying injury narratives from large administrative datasets into cause groups. This study examined a combined approach where two Bayesian models (Fuzzy and Naïve) were used to either classify a narrative or select it for manual review. Methods Injury narratives were extracted from claims filed with a workers compensation insurance provider between January 2002 and December 2004. Narratives were separated into a training set (n=11,000) and prediction set (n=3,000). Expert coders assigned two-digit Bureau of Labor Statistics Occupational Injury and Illness Classification event codes to each narrative. Fuzzy and Naïve Bayesian models were developed using manually classified cases in the training set. Two semi-automatic machine coding strategies were evaluated. The first strategy assigned cases for manual review if the Fuzzy and Naïve models disagreed on the classification. The second strategy selected additional cases for manual review from the Agree dataset using prediction strength to reach a level of 50% computer coding and 50% manual coding. Results When agreement alone was used as the filtering strategy, the majority were coded by the computer (n=1,928, 64%) leaving 36% for manual review. The overall combined (human plus computer) sensitivity was 0.90 and positive predictive value (PPV) was >0.90 for 11 of 18 2-digit event categories. Implementing the 2nd strategy improved results with an overall sensitivity of 0.95 and PPV >0.90 for 17 of 18 categories. Conclusions A combined Naïve-Fuzzy Bayesian approach can classify some narratives with high accuracy and identify others most beneficial for manual review, reducing the burden on human coders.
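The two filtering strategies in this abstract can be sketched as follows. This is an illustrative outline under stated assumptions, not the study's code: the function names are hypothetical, and a single prediction-strength score per case is assumed because the abstract does not say how strength was derived from the two models.

```python
import math

def split_by_agreement(fuzzy_preds, naive_preds):
    """Strategy 1: machine-code a case only if both models assign the same code."""
    agree, review = [], []
    for i, (f, n) in enumerate(zip(fuzzy_preds, naive_preds)):
        (agree if f == n else review).append(i)
    return agree, review

def top_up_manual_review(agree, review, strength, manual_fraction=0.5):
    """Strategy 2: move the weakest agreed cases to manual review until the
    manual share reaches `manual_fraction` of all cases."""
    total = len(agree) + len(review)
    target = math.ceil(manual_fraction * total)
    needed = max(0, target - len(review))
    weakest_first = sorted(agree, key=lambda i: strength[i])
    moved = weakest_first[:needed]
    kept = sorted(set(agree) - set(moved))
    return kept, sorted(review + moved)

# Toy predictions and strengths, invented for illustration only.
fuzzy = ["a", "b", "a", "c", "b", "a"]
naive = ["a", "b", "b", "c", "b", "a"]
strength = [0.9, 0.4, 0.5, 0.8, 0.95, 0.6]

agree, review = split_by_agreement(fuzzy, naive)
machine_coded, manual = top_up_manual_review(agree, review, strength)
```

Here the models disagree only on case 2, so strategy 1 sends one case to review. Strategy 2 then adds the two weakest agreed cases (1 and 5), giving an even split between computer and manual coding.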
Injury Prevention | 2016
Kirsten Vallmuur; Helen R. Marucci-Wellman; Jennifer A. Taylor; Mark R. Lehto; Helen L. Corns; Gordon S. Smith
Objective Vast amounts of injury narratives are collected daily and are available electronically in real time and have great potential for use in injury surveillance and evaluation. Machine learning algorithms have been developed to assist in identifying cases and classifying mechanisms leading to injury in a much timelier manner than is possible when relying on manual coding of narratives. The aim of this paper is to describe the background, growth, value, challenges and future directions of machine learning as applied to injury surveillance. Methods This paper reviews key aspects of machine learning using injury narratives, providing a case study to demonstrate an application to an established human-machine learning approach. Results The range of applications and utility of narrative text has increased greatly with advancements in computing techniques over time. Practical and feasible methods exist for semiautomatic classification of injury narratives which are accurate, efficient and meaningful. The human-machine learning approach described in the case study achieved high sensitivity and PPV and reduced the need for human coding to less than a third of cases in one large occupational injury database. Conclusions The last 20 years have seen a dramatic change in the potential for technological advancements in injury surveillance. Machine learning of ‘big injury narrative data’ opens up many possibilities for expanded sources of data which can provide more comprehensive, ongoing and timely surveillance to inform future injury prevention policy and practice.
Accident Analysis & Prevention | 2015
Helen R. Marucci-Wellman; Mark R. Lehto; Helen L. Corns
Public health surveillance programs in the U.S. are undergoing landmark changes with the availability of electronic health records and advancements in information technology. Injury narratives gathered from hospital records, workers compensation claims or national surveys can be very useful for identifying antecedents to injury or emerging risks. However, classifying narratives manually can become prohibitive for large datasets. The purpose of this study was to develop a human-machine system that could be relatively easily tailored to routinely and accurately classify injury narratives from large administrative databases such as workers compensation. We used a semi-automated approach based on two Naïve Bayesian algorithms to classify 15,000 workers compensation narratives into two-digit Bureau of Labor Statistics (BLS) event (leading to injury) codes. Narratives were filtered out for manual review if the algorithms disagreed or made weak predictions. This approach resulted in an overall accuracy of 87%, with consistently high positive predictive values across all two-digit BLS event categories including the very small categories (e.g., exposure to noise, needle sticks). The Naïve Bayes algorithms were able to identify and accurately machine code most narratives leaving only 32% (4853) for manual review. This strategy substantially reduces the need for resources compared with manual review alone.
Journal of Safety Research | 2015
Helen R. Marucci-Wellman; Theodore K. Courtney; Helen L. Corns; Gary S. Sorock; Barbara S. Webster; Radoslaw Wasiak; Y. Ian Noy; Simon Matz; Tom B. Leamon
INTRODUCTION Although occupational injuries are among the leading causes of death and disability around the world, the burden due to occupational injuries has historically been under-recognized, obscuring the need to address a major public health problem. METHODS We established the Liberty Mutual Workplace Safety Index (LMWSI) to provide a reliable annual metric of the leading causes of the most serious workplace injuries in the United States based on direct workers compensation (WC) costs. RESULTS More than $600 billion in direct WC costs were spent on the most disabling compensable non-fatal injuries and illnesses in the United States from 1998 to 2010. The burden in 2010 remained similar to the burden in 1998 in real terms. The categories of overexertion ($13.6B, 2010) and fall on same level ($8.6B, 2010) were consistently ranked 1st and 2nd. PRACTICAL APPLICATION The LMWSI was created to establish the relative burdens of events leading to work-related injury so they could be better recognized and prioritized. Such a ranking might be used to develop research goals and interventions to reduce the burden of workplace injury in the United States.
Injury Prevention | 2012
Santosh K. Verma; Theodore K. Courtney; Helen L. Corns; Yueng-Hsiang Huang; David A. Lombardi; Wen-Ruey Chang; Melanye J. Brennan; Melissa J. Perry
conference on human interface | 2007
Helen R. Marucci; Mark R. Lehto; Helen L. Corns
conference on human interface | 2007
Helen L. Corns; Helen R. Marucci; Mark R. Lehto
Accident Analysis & Prevention | 2017
Helen R. Marucci-Wellman; Helen L. Corns; Mark R. Lehto
Injury narratives are now available real time and include useful information for injury surveillance and prevention. However, manual classification of the cause or events leading to injury found in large batches of narratives, such as workers compensation claims databases, can be prohibitive. In this study we compare the utility of four machine learning algorithms (Naïve Bayes, Single word and Bi-gram models, Support Vector Machine and Logistic Regression) for classifying narratives into Bureau of Labor Statistics Occupational Injury and Illness event leading to injury classifications for a large workers compensation database. These algorithms are known to do well classifying narrative text and are fairly easy to implement with off-the-shelf software packages such as Python. We propose human-machine learning ensemble approaches which maximize the power and accuracy of the algorithms for machine-assigned codes and allow for strategic filtering of rare, emerging or ambiguous narratives for manual review. We compare human-machine approaches based on filtering on the prediction strength of the classifier vs. agreement between algorithms. Regularized Logistic Regression (LR) was the best performing algorithm alone. Using this algorithm and filtering out the bottom 30% of predictions for manual review resulted in high accuracy (overall sensitivity/positive predictive value of 0.89) of the final machine-human coded dataset. The best pairings of algorithms included Naïve Bayes with Support Vector Machine whereby the triple ensemble NBSW=NBBI-GRAM=SVM had very high performance (0.93 overall sensitivity/positive predictive value and high accuracy (i.e. high sensitivity and positive predictive values)) across both large and small categories leaving 41% of the narratives for manual review. Integrating LR into this ensemble mix improved performance only slightly. 
For large administrative datasets we propose incorporation of methods based on human-machine pairings such as we have done here, utilizing readily-available off-the-shelf machine learning techniques and resulting in only a fraction of narratives that require manual review. Human-machine ensemble methods are likely to improve performance over total manual coding.
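The two filtering ideas compared in this abstract, prediction strength versus agreement between algorithms, can be sketched in a few lines. This is a hedged illustration only: the function names and toy values are hypothetical, and the abstract does not specify how prediction strength was computed or how ties were handled.

```python
def filter_weakest(strengths, fraction=0.30):
    """Flag the lowest `fraction` of cases by prediction strength for manual review."""
    n_review = round(fraction * len(strengths))
    order = sorted(range(len(strengths)), key=lambda i: strengths[i])
    return sorted(order[:n_review])

def unanimous(predictions_by_model):
    """Machine-code a case only when every model assigns the same code;
    all other cases go to manual review."""
    machine, review = [], []
    for i, votes in enumerate(zip(*predictions_by_model)):
        (machine if len(set(votes)) == 1 else review).append(i)
    return machine, review

# Toy values, invented for illustration only.
strengths = [0.99, 0.42, 0.87, 0.65, 0.30, 0.91, 0.78, 0.55, 0.96, 0.83]
weak_cases = filter_weakest(strengths)

# Three hypothetical models, three cases each.
model_preds = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "x", "c"],
]
machine, review = unanimous(model_preds)
```

With ten cases and a 30% threshold, the three weakest predictions (cases 1, 4 and 7) are flagged. In the three-model example only case 0 has a unanimous vote, so the other two go to manual review.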