Publications

Featured research published by John S. Aberdeen.


MUC6 '95 Proceedings of the 6th conference on Message understanding | 1995

A model-theoretic coreference scoring scheme

Marc B. Vilain; John D. Burger; John S. Aberdeen; Dennis Connolly; Lynette Hirschman

This note describes a scoring scheme for the coreference task in MUC6. It improves on the original approach by: (1) grounding the scoring scheme in terms of a model; (2) producing more intuitive recall and precision scores; and (3) not requiring explicit computation of the transitive closure of coreference. The principal conceptual difference is that we have moved from a syntactic scoring model based on following coreference links to an approach defined by the model theory of those links.
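
The scoring computation itself is compact. Below is a minimal Python sketch of the recall/precision formula the paper defines (an illustration, not the official MUC scorer; the function names are ours):

```python
def muc_score(key, response):
    """Model-theoretic coreference score in the style of Vilain et al.

    key, response: lists of coreference chains, each chain a set of
    mention ids.  For each key chain S, p(S) is the partition of S
    relative to the response chains; recall is
    sum(|S| - |p(S)|) / sum(|S| - 1), and precision is the same
    computation with key and response swapped.
    """
    def partition(chain, other_chains):
        # Split `chain` by the chains on the other side; mentions not
        # covered by any chain become singleton parts.
        parts = [chain & other for other in other_chains if chain & other]
        covered = set().union(*parts) if parts else set()
        return parts + [{m} for m in chain - covered]

    def directional(gold, system):
        numerator = sum(len(s) - len(partition(s, system)) for s in gold)
        denominator = sum(len(s) - 1 for s in gold)
        return numerator / denominator if denominator else 0.0

    recall = directional(key, response)
    precision = directional(response, key)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return recall, precision, f1
```

For a key chain {A, B, C, D} scored against response chains {A, B} and {C, D}, the key chain partitions into two parts, giving recall (4 - 2) / (4 - 1) = 2/3.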


MUC6 '95 Proceedings of the 6th conference on Message understanding | 1995

MITRE: description of the Alembic system used for MUC-6

John S. Aberdeen; John D. Burger; David S. Day; Lynette Hirschman; Patricia Robinson; Marc B. Vilain

As with several other veteran MUC participants, MITRE's Alembic system has undergone a major transformation in the past two years. The genesis of this transformation occurred during a dinner conversation at the last MUC conference, MUC-5. At that time, several of us reluctantly admitted that our major impediment to improved performance was reliance on then-standard linguistic models of syntax. We knew we would need an alternative to traditional linguistic grammars, even to the somewhat non-traditional categorial pseudo-parser we had in place at the time. The problem was, which alternative?


International Journal of Medical Informatics | 2010

The MITRE Identification Scrubber Toolkit: Design, training, and assessment

John S. Aberdeen; Samuel Bayer; Reyyan Yeniterzi; Benjamin Wellner; Cheryl Clark; David A. Hanauer; Bradley Malin; Lynette Hirschman

PURPOSE: Medical records must often be stripped of patient identifiers, or de-identified, before being shared. De-identification by humans is time-consuming, and existing software is limited in its generality. The open source MITRE Identification Scrubber Toolkit (MIST) provides an environment to support rapid tailoring of automated de-identification to different document types, using automatically learned classifiers to de-identify and protect sensitive information.

METHODS: MIST was evaluated with four classes of patient records from the Vanderbilt University Medical Center: discharge summaries, laboratory reports, letters, and order summaries. We trained and tested MIST on each class of record separately, as well as on pooled sets of records. We measured precision, recall, F-measure and accuracy at the word level for the detection of patient identifiers as designated by the HIPAA Safe Harbor Rule.

RESULTS: MIST was applied to medical records that differed in the amounts and types of protected health information (PHI): lab reports contained only two types of PHI (dates, names) compared to discharge summaries, which were much richer. Performance of the de-identification tool depended on record class; F-measure results were 0.996 for order summaries, 0.996 for discharge summaries, 0.943 for letters and 0.934 for laboratory reports. Experiments suggest the tool requires several hundred training exemplars to reach an F-measure of at least 0.9.

CONCLUSIONS: The MIST toolkit makes possible the rapid tailoring of automated de-identification to particular document types and supports the transition of the de-identification software to medical end users, avoiding the need for developers to have access to original medical records. We are making the MIST toolkit available under an open source license to encourage its application to diverse data sets at multiple institutions.
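
The word-level measurement described in METHODS reduces to a per-token comparison of gold and predicted labels. A minimal sketch, assuming a simplified two-label scheme rather than MIST's actual output format:

```python
def phi_word_metrics(gold, predicted):
    """Word-level precision, recall, F-measure, and accuracy for PHI
    detection.  gold and predicted are parallel per-word label
    sequences; here 'PHI' marks an identifier and 'O' everything else.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(g == 'PHI' and p == 'PHI' for g, p in pairs)
    fp = sum(g != 'PHI' and p == 'PHI' for g, p in pairs)
    fn = sum(g == 'PHI' and p != 'PHI' for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = sum(g == p for g, p in pairs) / len(pairs) if pairs else 0.0
    return precision, recall, f_measure, accuracy
```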


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2001

Comparing several aspects of human-computer and human-human dialogues

Christine Doran; John S. Aberdeen; Laurie E. Damianos; Lynette Hirschman

While researchers have many intuitions about the differences between human-computer and human-human interactions, most of these have not previously been subject to empirical scrutiny. This work presents some initial experiments in this direction, with the ultimate goal being to use what we learn to improve computer dialogue systems. Working with data from the air travel domain, we identified a number of striking differences between the human-human and human-computer interactions.


Database | 2015

Scaling drug indication curation through crowdsourcing

Ritu Khare; John D. Burger; John S. Aberdeen; David Tresner-Kirsch; Theodore J. Corrales; Lynette Hirschman; Zhiyong Lu

Motivated by the high cost of human curation of biological databases, there is an increasing interest in using computational approaches to assist human curators and accelerate the manual curation process. Towards the goal of cataloging drug indications from FDA drug labels, we recently developed LabeledIn, a human-curated drug indication resource for 250 clinical drugs. Its development required over 40 h of human effort across 20 weeks, despite using well-defined annotation guidelines. In this study, we aim to investigate the feasibility of scaling drug indication annotation through a crowdsourcing technique where an unknown network of workers can be recruited through the technical environment of Amazon Mechanical Turk (MTurk). To translate the expert-curation task of cataloging indications into human intelligence tasks (HITs) suitable for average workers on MTurk, we first simplify the complex task such that each HIT only involves a worker making a binary judgment of whether a highlighted disease, in the context of a given drug label, is an indication. In addition, this study is novel in its crowdsourcing interface design, where the annotation guidelines are encoded into user options. For evaluation, we assess the ability of our proposed method to achieve high-quality annotations in a time-efficient and cost-effective manner. We posted over 3000 HITs drawn from 706 drug labels on MTurk. Within 8 h of posting, we collected 18,775 judgments from 74 workers, and achieved an aggregated accuracy of 96% on 450 control HITs (where gold-standard answers are known), at a cost of $1.75 per drug label. On the basis of these results, we conclude that our crowdsourcing approach not only results in significant cost and time savings, but also leads to accuracy comparable to that of domain experts. Database URL: ftp://ftp.ncbi.nlm.nih.gov/pub/lu/LabeledIn/Crowdsourcing/
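
At the heart of this HIT design is the aggregation of redundant binary judgments. A simple majority-vote sketch (the study reports weighted aggregation; the triple format here is a hypothetical simplification):

```python
from collections import Counter, defaultdict

def majority_vote(judgments):
    """Aggregate redundant binary HIT judgments by majority vote.

    judgments: iterable of (hit_id, worker_id, answer) triples, where
    answer is True if the worker judged the highlighted disease to be
    an indication of the drug.  Returns {hit_id: aggregated answer}.
    """
    votes = defaultdict(Counter)
    for hit_id, _worker_id, answer in judgments:
        votes[hit_id][answer] += 1
    return {hit_id: counts.most_common(1)[0][0]
            for hit_id, counts in votes.items()}

def control_accuracy(aggregated, gold):
    """Accuracy on control HITs whose gold-standard answers are known."""
    scored = [h for h in gold if h in aggregated]
    return (sum(aggregated[h] == gold[h] for h in scored) / len(scored)
            if scored else 0.0)
```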


International Journal of Medical Informatics | 2013

Bootstrapping a de-identification system for narrative patient records: Cost-performance tradeoffs

David A. Hanauer; John S. Aberdeen; Samuel Bayer; Benjamin Wellner; Cheryl Clark; Kai Zheng; Lynette Hirschman

PURPOSE: We describe an experiment to build a de-identification system for clinical records using the open source MITRE Identification Scrubber Toolkit (MIST). We quantify the human annotation effort needed to produce a system that de-identifies at high accuracy.

METHODS: Using two types of clinical records (history and physical notes, and social work notes), we iteratively built statistical de-identification models by annotating 10 notes, training a model, applying the model to another 10 notes, correcting the model's output, and training from the resulting larger set of annotated notes. This was repeated for 20 rounds of 10 notes each, then an additional 6 rounds of 20 notes each, and a final round of 40 notes. At each stage, we measured precision, recall, and F-score, and compared these to the amount of annotation time needed to complete the round.

RESULTS: After the initial 10-note round (33 min of annotation time) we achieved an F-score of 0.89. After just over 8 h of annotation time (round 21) we achieved an F-score of 0.95. The number of annotation actions needed, as well as the time needed, decreased in later rounds as model performance improved. Accuracy on history and physical notes exceeded that of social work notes, suggesting that the wider variety of PHI types and contexts in social work notes is more difficult to model.

CONCLUSIONS: It is possible, with modest effort, to build a functioning de-identification system de novo using the MIST framework. The resulting system achieved performance comparable to other high-performing de-identification systems.
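
The round structure described in METHODS is a generic annotate-correct-retrain loop. A schematic sketch in which train, tag, correct, and evaluate are hypothetical stand-ins for the corresponding MIST operations:

```python
def bootstrap_deidentifier(notes, batch_sizes, train, tag, correct, evaluate):
    """Schematic of the annotate-correct-retrain protocol.

    batch_sizes mirrors the rounds in the study, e.g.
    [10] * 20 + [20] * 6 + [40].  `train` builds a model from annotated
    notes, `tag` pre-annotates a batch with the current model,
    `correct` is the human correction pass over the model output, and
    `evaluate` returns (precision, recall, f_score) on held-out notes.
    """
    annotated, model, history = [], None, []
    remaining = list(notes)
    for size in batch_sizes:
        batch, remaining = remaining[:size], remaining[size:]
        # Pre-tag with the current model so the annotator only corrects.
        pretagged = [tag(model, note) if model else note for note in batch]
        annotated.extend(correct(note) for note in pretagged)
        model = train(annotated)
        history.append(evaluate(model))
    return model, history
```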


Proceedings of the TIPSTER Text Program: Phase II | 1996

MITRE: description of the Alembic system as used in MET

John S. Aberdeen; John D. Burger; David S. Day; Lynette Hirschman; David D. Palmer; Patricia Robinson; Marc B. Vilain

Alembic is a comprehensive information extraction system that has been applied to a range of tasks. These include the now-standard components of the formal MUC evaluations: name tagging (NE in MUC-6), name normalization (TE), and template generation (ST). The system has also been exploited to help segment and index broadcast video and was used for early experiments on variants of the co-reference identification task. (For details, see [1].)


Data Integration in the Life Sciences | 2012

Validating candidate gene-mutation relations in MEDLINE abstracts via crowdsourcing

John D. Burger; Emily Doughty; Samuel Bayer; David Tresner-Kirsch; Ben Wellner; John S. Aberdeen; Kyungjoon Lee; Maricel G. Kann; Lynette Hirschman

We describe an experiment to elicit judgments on the validity of gene-mutation relations in MEDLINE abstracts via crowdsourcing. The biomedical literature contains rich information on such relations, but the correct pairings are difficult to extract automatically because a single abstract may mention multiple genes and mutations. We ran an experiment presenting candidate gene-mutation relations as Amazon Mechanical Turk HITs (human intelligence tasks). We extracted candidate mutations from a corpus of 250 MEDLINE abstracts using EMU combined with curated gene lists from NCBI. The resulting document-level annotations were projected into the abstract text to highlight mentions of genes and mutations for review. Reviewers returned results within 36 hours. Initial weighted results evaluated against a gold standard of expert curated gene-mutation relations achieved 85% accuracy, with the best reviewer achieving 91% accuracy. We expect performance to increase with further experimentation, providing a scalable approach for rapid manual curation of important biological relations.
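
Projecting document-level annotations into the abstract text, as described above, amounts to wrapping mention strings in highlight markup before rendering the HIT. A simplified sketch assuming case-insensitive exact string matching (the markup and matching strategy are our assumptions):

```python
import re
from html import escape

def highlight_relation(abstract, gene, mutation):
    """Project a candidate gene-mutation relation into the abstract text
    by wrapping mentions in <mark> tags for display in a HIT.
    Simplified to case-insensitive exact string matching.
    """
    text = escape(abstract)
    for mention, css_class in ((gene, "gene"), (mutation, "mutation")):
        pattern = re.compile(re.escape(escape(mention)), re.IGNORECASE)
        text = pattern.sub(
            lambda m, c=css_class: f'<mark class="{c}">{m.group(0)}</mark>',
            text)
    return text
```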


International Journal of Medical Informatics | 2014

De-identification of clinical narratives through writing complexity measures

Muqun Li; David Carrell; John S. Aberdeen; Lynette Hirschman; Bradley Malin

PURPOSE: Electronic health records contain a substantial quantity of clinical narrative, which is increasingly reused for research purposes. To share data on a large scale and respect privacy, it is critical to remove patient identifiers. De-identification tools based on machine learning have been proposed; however, model training is usually based on either a random group of documents or a pre-existing document type designation (e.g., discharge summary). This work investigates whether inherent features, such as writing complexity, can identify document subsets that enhance de-identification performance.

METHODS: We applied an unsupervised clustering method to group two corpora based on writing complexity measures: a collection of over 4500 documents of varying document types (e.g., discharge summaries, history and physical reports, and radiology reports) from Vanderbilt University Medical Center (VUMC) and the publicly available i2b2 corpus of 889 discharge summaries. We compared the performance (via recall, precision, and F-measure) of de-identification models trained on such clusters with models trained on documents grouped randomly or by VUMC document type.

RESULTS: For the Vanderbilt dataset, training and testing de-identification models on the same stylometric cluster (average F-measure of 0.917) tended to outperform models based on clusters of random documents (average F-measure of 0.881). Increasing the size of a training subset sampled from a specific cluster yielded improved results (e.g., for subsets from one stylometric cluster, the F-measure rose from 0.743 to 0.841 when the training size increased from 10 to 50 documents, and reached 0.901 at 200 documents). For the i2b2 dataset, training and testing on the same complexity-based clusters (average F-score 0.966) did not significantly surpass randomly selected clusters (average F-score 0.965).

CONCLUSIONS: Our findings illustrate that, in environments consisting of a variety of clinical documentation, de-identification models trained on writing complexity measures outperform models trained on random groups and, in many instances, on document types.
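
The clustering step in METHODS can be illustrated with generic readability-style features and k-means; the particular features and k below are assumptions for illustration, not the paper's configuration:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def complexity_features(doc):
    """Crude writing-complexity measures for one note: average sentence
    length, average word length, and type-token ratio.
    """
    sentences = [s for s in doc.split('.') if s.strip()]
    words = doc.split()
    return [
        len(words) / max(len(sentences), 1),        # words per sentence
        sum(map(len, words)) / max(len(words), 1),  # characters per word
        len({w.lower() for w in words}) / max(len(words), 1),
    ]

def cluster_by_complexity(docs, k=4):
    """Group documents by writing complexity; de-identification models
    are then trained and tested within each cluster.
    """
    features = np.array([complexity_features(d) for d in docs])
    scaled = StandardScaler().fit_transform(features)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
```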


Machine Translation | 2012

Evaluation of 2-way Iraqi Arabic-English speech translation systems using automated metrics

Sherri L. Condon; Mark Arehart; Dan Parvaz; Gregory A. Sanders; Christy Doran; John S. Aberdeen


Collaboration


Dive into John S. Aberdeen's collaboration.

Top Co-Authors


Audrey N. Le

National Institute of Standards and Technology


Bryan L. Pellom

University of Colorado Boulder
