Publication


Featured research published by Allen C. Browne.


Journal of the American Medical Informatics Association | 2014

The pattern of name tokens in narrative clinical text and a comparison of five systems for redacting them

Mehmet Kayaalp; Allen C. Browne; Fiona M. Callaghan; Zeyno A. Dodd; Guy Divita; Selcuk Ozturk; Clement J. McDonald

Objective: To understand the factors that influence success in scrubbing personal names from narrative text.

Materials and methods: We developed a scrubber, the NLM Name Scrubber (NLM-NS), to redact personal names from narrative clinical reports, hand-tagged words in a set of gold standard narrative reports as personal names or not, and measured the scrubbing success of NLM-NS and that of four other scrubbing/name recognition tools (MIST, MITdeid, LingPipe, and ANNIE/GATE) against the gold standard reports. We ran three comparisons that used increasingly large name lists.

Results: The test reports contained more than 1 million words, of which 2388 were patient name tokens and 20,160 were provider name tokens. NLM-NS failed to scrub only 2 of the 2388 instances of patient name tokens. Its sensitivity was 0.999 on both patient and provider name tokens, and it missed fewer instances of patient name tokens than the other scrubbers in all comparisons. MIST produced the best all-token specificity and F-measure for name instances in our most relevant study (study 2), with values of 0.997 and 0.938, respectively. In that same comparison, NLM-NS was second best, with values of 0.986 and 0.748, respectively, and MITdeid was a close third, with values of 0.985 and 0.796, respectively. With the addition of the Clinical Center name list to their native name lists, LingPipe, MITdeid, MIST, and ANNIE/GATE all improved substantially. MITdeid and LingPipe gained the most, reaching patient name sensitivity of 0.995 (F-measure=0.705) and 0.989 (F-measure=0.386), respectively.

Discussion: The privacy risk due to the two name tokens missed by NLM-NS was statistically negligible, since neither individual could be distinguished among more than 150,000 people listed in the US Social Security Registry.

Conclusions: The nature and size of name lists have substantial influences on scrubbing success. The use of very large name lists with frequency statistics accounts for much of the scrubbing success of NLM-NS.
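The token-level metrics quoted in this abstract can be illustrated with a short sketch. The function names and the balanced (F1) form of the F-measure are assumptions made for illustration; the abstract does not say which F-measure variant the authors used. Only the patient-token counts (2388 tokens, 2 missed) come from the abstract.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true name tokens that were scrubbed (recall)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-name tokens that were correctly left in place."""
    return tn / (tn + fp)


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1); the balanced form is an assumption."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts taken from the abstract: 2388 patient name tokens, 2 of them missed.
patient_tokens = 2388
missed = 2
patient_sensitivity = sensitivity(tp=patient_tokens - missed, fn=missed)
print(round(patient_sensitivity, 3))  # 0.999, matching the reported value
```

Specificity and F-measure need false-positive counts, which the abstract does not report, so they are defined here but not evaluated against the paper's figures.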


Computer-Based Medical Systems | 1998

Lexicon assistance reduces manual verification of OCR output

Susan E. Hauser; Allen C. Browne; George R. Thoma; Alexa T. McCray

An OCR system chosen for its high recognition rate and low percentage of false positives also assigns low confidence values to many characters that are actually correct. Human operators must verify all words containing low-confidence characters. We describe the creation of a lexicon optimized for automatically and selectively resetting confidence values to high, thus reducing operator verification time. Two word lists, OCR Correct and OCR Incorrect, were extracted from files that had already been processed and verified, and became the standard for comparing candidate lexicons. A lexicon was selected from several candidate word lists maintained by the National Library of Medicine (NLM). In operation for about six months, lexicon-assisted verification has reduced the number of words requiring operator verification by over 50%.
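The idea of resetting confidence values for words found in a lexicon can be sketched as follows. This is a minimal illustration, not the system described in the paper: the `OcrWord` type, the 0.8 threshold, the case-insensitive lookup, and the sample words are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set

LOW_CONFIDENCE = 0.8   # hypothetical threshold below which a character is flagged
HIGH_CONFIDENCE = 1.0  # value assigned when the lexicon vouches for the word


@dataclass
class OcrWord:
    text: str
    confidences: List[float]  # one confidence value per character


def needs_verification(word: OcrWord) -> bool:
    """A word goes to a human operator if any character is low-confidence."""
    return any(c < LOW_CONFIDENCE for c in word.confidences)


def apply_lexicon(words: List[OcrWord], lexicon: Set[str]) -> List[OcrWord]:
    """Reset confidences to high for flagged words that appear in the lexicon."""
    result = []
    for word in words:
        if needs_verification(word) and word.text.lower() in lexicon:
            result.append(OcrWord(word.text, [HIGH_CONFIDENCE] * len(word.confidences)))
        else:
            result.append(word)
    return result


# Hypothetical OCR output: both words contain a low-confidence character.
words = [
    OcrWord("patient", [0.9, 0.9, 0.5, 0.9, 0.9, 0.9, 0.9]),
    OcrWord("pat1ent", [0.9, 0.9, 0.9, 0.4, 0.9, 0.9, 0.9]),
]
lexicon = {"patient"}

before = sum(needs_verification(w) for w in words)
after = sum(needs_verification(w) for w in apply_lexicon(words, lexicon))
print(before, after)
```

In this sketch only the word missing from the lexicon still reaches the operator, which is the kind of reduction in manual verification the abstract reports.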


IT Professional | 2012

A Systematic Approach for Medical Language Processing: Generating Derivational Variants

Chris J. Lu; Lynn McCreedy; Destinee Tormey; Allen C. Browne

Medical language processing seeks to analyze linguistic patterns in electronic medical records, which requires managing lexical variations. A systematic approach to generating derivational variants, including prefixes, suffixes, and zero derivations, has improved precision and recall rates.


International Conference on Health Informatics | 2017

Generating a Distilled N-Gram Set - Effective Lexical Multiword Building in the SPECIALIST Lexicon.

Chris J. Lu; Destinee Tormey; Lynn McCreedy; Allen C. Browne

Multiwords are vital to better Natural Language Processing (NLP) systems: they support more effective and efficient parsers, refine information retrieval searches, and enhance precision and recall in Medical Language Processing (MLP) applications. The Lexical Systems Group has enhanced the coverage of multiwords in the Lexicon to provide a more comprehensive resource for such applications. This paper describes a new systematic approach to lexical multiword acquisition from MEDLINE through filters and matchers based on empirical models. The design goal, function description, various tests and applications of filters, matchers, and data are discussed. Results include: 1) generating a smaller (38%) distilled MEDLINE n-gram set with better precision and similar recall to the MEDLINE n-gram set; 2) establishing a system for generating high-precision multiword candidates for effective Lexicon building. We believe the MLP/NLP community can benefit from access to these big data (MEDLINE n-gram) sets. We also anticipate an accelerated growth of multiwords in the Lexicon with this system. Ultimately, improvement in recall or precision can be anticipated in NLP projects using the MEDLINE distilled n-gram set, the SPECIALIST Lexicon, and its applications.


Biomedical Engineering Systems and Technologies | 2016

Generating SD-Rules in the SPECIALIST Lexical Tools

Chris J. Lu; Destinee Tormey; Lynn McCreedy; Allen C. Browne

Suffix derivations (SDs) are used with query expansion in concept mapping as an effective Natural Language Processing (NLP) technique to improve recall without sacrificing precision. A systematic approach was proposed to generate derivations in the SPECIALIST Lexical Tools in which SD candidate rules were used to retrieve SD-pairs from the SPECIALIST Lexicon (Lu et al., 2012). Good SD candidate rules are gathered as SD-Rules in Lexical Tools for generating SDs that are not known to the Lexicon. This paper describes a methodology to select, from SD candidate rules, an optimized SD-Rule set that meets our requirement of 95% system precision with the best system performance. The results of the latest three releases of Lexical Tools show: 1) system precision and recall of selected SD-Rules above 95%; 2) consistency between a computational linguistic approach and traditional linguistic knowledge for selecting the best Parent-Child rules; 3) a consistent approach yielding similar SD-Rule sets and system performance. Ultimately, this results in better precision and recall for NLP applications using the derivation-related flow components of Lexical Tools.
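One way to picture selecting a rule set under a system-precision requirement is the greedy sketch below. It is an assumption-laden illustration, not the selection procedure from the paper: the `CandidateRule` type, the rule names, the counts, and the greedy ranking strategy are all hypothetical. Only the 95% precision requirement comes from the abstract.

```python
from typing import List, NamedTuple


class CandidateRule(NamedTuple):
    name: str     # hypothetical rule label
    valid: int    # SD-pairs produced by the rule that were judged valid
    invalid: int  # SD-pairs produced by the rule that were judged invalid


def select_rules(candidates: List[CandidateRule],
                 min_precision: float = 0.95) -> List[CandidateRule]:
    """Greedily add rules, best precision first, while the combined
    (system) precision of the selected set stays at or above the threshold."""
    usable = [r for r in candidates if r.valid + r.invalid > 0]
    ranked = sorted(usable, key=lambda r: r.valid / (r.valid + r.invalid), reverse=True)
    selected: List[CandidateRule] = []
    valid_total = 0
    invalid_total = 0
    for rule in ranked:
        v = valid_total + rule.valid
        i = invalid_total + rule.invalid
        if v / (v + i) < min_precision:
            break  # adding this rule would drop system precision below the bar
        selected.append(rule)
        valid_total, invalid_total = v, i
    return selected


# Hypothetical candidate rules and counts, for illustration only.
candidates = [
    CandidateRule("rule_a", valid=98, invalid=2),
    CandidateRule("rule_b", valid=60, invalid=40),
    CandidateRule("rule_c", valid=50, invalid=0),
]
chosen = select_rules(candidates)
print([r.name for r in chosen])
```

With these made-up counts, the two high-precision rules are kept and the third is rejected because including it would pull the combined precision under 95%.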


Annual Symposium on Computer Application in Medical Care | 1994

Lexical methods for managing variation in biomedical terminologies.

Alexa T. McCray; Suresh Srinivasan; Allen C. Browne


Journal of the American Medical Informatics Association | 2008

Consumer Health Information Seeking as Hypothesis Testing

Alla Keselman; Allen C. Browne; David R. Kaufman


Multimedia Information Retrieval | 1994

Exploiting a large thesaurus for information retrieval

Alan R. Aronson; Thomas C. Rindflesch; Allen C. Browne


Bulletin of the Medical Library Association | 1993

UMLS knowledge for biomedical language processing.

Alexa T. McCray; Alan R. Aronson; Allen C. Browne; Thomas C. Rindflesch; A. Razi; Suresh Srinivasan


American Medical Informatics Association Annual Symposium | 2003

UMLS language and vocabulary tools.

Allen C. Browne; Guy Divita; Alan R. Aronson; Alexa T. McCray

Collaboration


Dive into Allen C. Browne's collaborations.

Top Co-Authors

Chris J. Lu

National Institutes of Health

Alla Keselman

National Institutes of Health

Destinee Tormey

National Institutes of Health

Lynn McCreedy

National Institutes of Health

Anantha Bangalore

National Institutes of Health

Tony Tse

National Institutes of Health

Clement J. McDonald

National Institutes of Health

Mehmet Kayaalp

University of Pittsburgh
