Publication


Featured research published by Aurélie Névéol.


Database | 2009

Understanding PubMed® user search behavior through log analysis

Rezarta Islamaj Doğan; G. Craig Murray; Aurélie Névéol; Zhiyong Lu

This article reports on a detailed investigation of PubMed users’ needs and behavior as a step toward improving biomedical information retrieval. PubMed provides researchers with free access to more than 19 million citations for biomedical articles from MEDLINE and life science journals, and it is accessed by millions of users each day. Efficient search tools are crucial for biomedical researchers to keep abreast of the literature relating to their own research. The investigation was conducted through the analysis of one month of log data, consisting of more than 23 million user sessions and more than 58 million user queries. Multiple aspects of users’ interactions with PubMed are characterized in detail with evidence from these logs. Despite having many features in common with general Web searches, biomedical information searches have unique characteristics that are made evident in this study. PubMed users are more persistent in seeking information and they reformulate queries often. The three most frequent types of search are search by author name, search by gene/protein, and search by disease. Abbreviations are used very frequently in queries. Factors such as result set size influence users’ decisions. Analysis of characteristics such as these plays a critical role in identifying users’ information needs and search habits, and in turn provides useful insight for improving biomedical information retrieval. Database URL: http://www.ncbi.nlm.nih.gov/PubMed


Journal of the American Medical Informatics Association | 2011

Recommending MeSH terms for annotating biomedical articles

Minlie Huang; Aurélie Névéol; Zhiyong Lu

Background: Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their archiving and retrieval. Methods: Unlike previous methods for automatic MeSH term assignment, we reformulate the indexing task as a ranking problem such that relevant MeSH headings are ranked higher than irrelevant ones. Specifically, for each document we retrieve 20 neighbor documents, obtain a list of MeSH main headings from the neighbors, and rank the MeSH main headings using ListNet, a learning-to-rank algorithm. We trained our algorithm on 200 documents and tested on a previously used benchmark set of 200 documents and a larger dataset of 1000 documents. Results: Tested on the benchmark dataset, our method achieved a precision of 0.390, recall of 0.712, and mean average precision (MAP) of 0.626. In comparison to the state of the art, we observe statistically significant improvements as large as 39% in MAP (p-value <0.001). Similar significant improvements were also obtained on the larger document set. Conclusion: Experimental results show that our approach makes the most accurate MeSH predictions to date, which suggests its great potential in making a practical impact on MeSH indexing. Furthermore, as discussed, the proposed learning framework is robust and can be adapted to many other similar tasks beyond MeSH indexing in the biomedical domain. All data sets are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/indexing.
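The ranking formulation in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the paper ranks candidates with ListNet, whereas this sketch substitutes a simple vote count over neighbor documents, and the function names and toy headings are assumptions made for illustration only. It also shows how precision, recall, and average precision are computed for one ranked list.

```python
from collections import Counter


def rank_candidate_headings(neighbor_headings):
    """Rank headings by the number of neighbor documents that carry them.

    This vote count is a stand-in for the learned ListNet ranker in the paper.
    """
    votes = Counter()
    for headings in neighbor_headings:
        votes.update(set(headings))  # each neighbor votes once per heading
    # Descending vote count, then alphabetical order for determinism.
    ordered = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [heading for heading, _ in ordered]


def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall of the top-k ranked headings."""
    top = ranked[:k]
    hits = sum(1 for heading in top if heading in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Average of precision values at each rank where a relevant heading appears."""
    hits = 0
    total = 0.0
    for rank, heading in enumerate(ranked, start=1):
        if heading in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Toy data (hypothetical): headings of three neighbor documents.
neighbors = [
    ["Humans", "Neoplasms"],
    ["Humans", "Mice"],
    ["Humans", "Neoplasms", "Algorithms"],
]
gold = {"Humans", "Algorithms"}  # hypothetical gold-standard headings

ranked = rank_candidate_headings(neighbors)
print(ranked)
print(precision_recall_at_k(ranked, gold, k=2))
print(average_precision(ranked, gold))
```

Mean average precision (MAP), as reported in the abstract, is the mean of this per-document average precision over all test documents.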


Meeting of the Association for Computational Linguistics | 2016

Findings of the 2016 Conference on Machine Translation.

Ondřej Bojar; Rajen Chatterjee; Christian Federmann; Yvette Graham; Barry Haddow; Matthias Huck; Antonio Jimeno Yepes; Philipp Koehn; Varvara Logacheva; Christof Monz; Matteo Negri; Aurélie Névéol; Mariana L. Neves; Martin Popel; Matt Post; Raphael Rubino; Carolina Scarton; Lucia Specia; Marco Turchi; Karin Verspoor; Marcos Zampieri

This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), as well as an automatic post-editing task and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions and the Biomedical task received 15 system submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments). The quality estimation task had three subtasks, with a total of 14 teams submitting 39 entries. The automatic post-editing task had a total of 6 teams submitting 11 entries.


Journal of Biomedical Informatics | 2011

Semi-automatic semantic annotation of PubMed queries

Aurélie Névéol; Rezarta Islamaj Doğan; Zhiyong Lu

Information processing algorithms require significant amounts of annotated data for training and testing. The availability of such data is often hindered by the complexity and high cost of production. In this paper, we investigate the benefits of a state-of-the-art tool to help with the semantic annotation of a large set of biomedical queries. Seven annotators were recruited to annotate a set of 10,000 PubMed® queries with 16 biomedical and bibliographic categories. About half of the queries were annotated from scratch, while the other half were automatically pre-annotated and manually corrected. The impact of the automatic pre-annotations was assessed on several aspects of the task: time, number of actions, annotator satisfaction, inter-annotator agreement, and quality and number of the resulting annotations. The analysis of annotation results showed that the number of required hand annotations is 28.9% lower when using pre-annotated results from automatic tools. As a result, the overall annotation time was substantially lower when pre-annotations were used, while inter-annotator agreement was significantly higher. In addition, there was no statistically significant difference in the semantic distribution or number of annotations produced when pre-annotations were used. The annotated query corpus is freely available to the research community. This study shows that most annotators find automatic pre-annotations helpful. Our experience suggests using an automatic tool to assist large-scale manual annotation projects. This helps reduce annotation time and improve annotation consistency while maintaining high quality of the final annotations.


Meeting of the Association for Computational Linguistics | 2007

From indexing the biomedical literature to coding clinical text: experience with MTI and machine learning approaches

Alan R. Aronson; Olivier Bodenreider; Dina Demner-Fushman; Kin Wah Fung; Vivian K. Lee; James G. Mork; Aurélie Névéol; Lee B. Peters; Willie J. Rogers

This paper describes the application of an ensemble of indexing and classification systems, which have been shown to be successful in information retrieval and classification of medical literature, to a new task of assigning ICD-9-CM codes to the clinical history and impression sections of radiology reports. The basic methods used are: a modification of the NLM Medical Text Indexer system, SVM, k-NN and a simple pattern-matching method. The basic methods are combined using a variant of stacking. Evaluated in the context of a Medical NLP Challenge, fusion produced an F-score of 0.85 on the Challenge test set, which is considerably above the mean Challenge F-score of 0.77 for 44 participating groups.


Journal of Biomedical Informatics | 2009

A recent advance in the automatic indexing of the biomedical literature

Aurélie Névéol; Sonya E. Shooshan; Susanne M. Humphrey; James G. Mork; Alan R. Aronson

The volume of biomedical literature has experienced explosive growth in recent years. This is reflected in the corresponding increase in the size of MEDLINE, the largest bibliographic database of biomedical citations. Indexers at the US National Library of Medicine (NLM) need efficient tools to help them accommodate the ensuing workload. After reviewing issues in the automatic assignment of Medical Subject Headings (MeSH terms) to biomedical text, we focus more specifically on the new subheading attachment feature for NLM's Medical Text Indexer (MTI). Natural Language Processing, statistical, and machine learning methods of producing automatic MeSH main heading/subheading pair recommendations were assessed independently and combined. The best combination achieves 48% precision and 30% recall. After validation by NLM indexers, a suitable combination of the methods presented in this paper was integrated into MTI as a subheading attachment feature producing MeSH indexing recommendations compliant with current state-of-the-art indexing practice.


Journal of the American Medical Informatics Association | 2010

Extracting Rx information from clinical narrative

James G. Mork; Olivier Bodenreider; Dina Demner-Fushman; Rezarta Islamaj Doğan; François-Michel Lang; Zhiyong Lu; Aurélie Névéol; Lee B. Peters; Sonya E. Shooshan; Alan R. Aronson

Objective: The authors used the i2b2 Medication Extraction Challenge to evaluate their entity extraction methods, contribute to the generation of a publicly available collection of annotated clinical notes, and start developing methods for ontology-based reasoning using structured information generated from the unstructured clinical narrative. Design: Extraction of salient features of medication orders from the text of de-identified hospital discharge summaries was addressed with a knowledge-based approach using simple rules and lookup lists. The entity recognition tool, MetaMap, was combined with dose, frequency, and duration modules specifically developed for the Challenge as well as a prototype module for reason identification. Measurements: Evaluation metrics and corresponding results were provided by the Challenge organizers. Results: The results indicate that robust rule-based tools achieve satisfactory results in extraction of simple elements of medication orders, but more sophisticated methods are needed for identification of reasons for the orders and durations. Limitations: Owing to the time constraints and nature of the Challenge, some obvious follow-on analysis has not been completed yet. Conclusions: The authors plan to integrate the new modules with MetaMap to enhance its accuracy. This integration effort will provide guidance in retargeting existing tools for better processing of clinical text.


North American Chapter of the Association for Computational Linguistics | 2009

Exploring Two Biomedical Text Genres for Disease Recognition

Aurélie Névéol; Won Gu Kim; W. John Wilbur; Zhiyong Lu

In the framework of contextual information retrieval in the biomedical domain, this paper reports on the automatic detection of disease concepts in two genres of biomedical text: sentences from the literature and PubMed user queries. A statistical model and a Natural Language Processing algorithm for disease recognition were applied to both corpora. While both methods show good performance (F=77% vs. F=76%) on the sentence corpus, results on the query corpus indicate that the statistical model is more robust (F=74% vs. F=70%).


International Health Informatics Symposium | 2010

Automatic integration of drug indications from multiple health resources

Aurélie Névéol; Zhiyong Lu

Drug indication refers to what disease(s) a drug may treat, a type of information that is frequently sought by biomedical researchers, health care professionals and the general public. Although such information may be available online, it is often challenging for non-experts to glean unbiased, reliable information from multiple websites of varying quality. In addition, most drug indication information is only available in free text as opposed to a structured format, which makes further automatic analysis by computers difficult. In response, we focus on automatically extracting and integrating drug indication information from multiple resources such as DailyMed and MeSH Scope notes. We select trustworthy resources of drug/disease relationships and apply state-of-the-art relationship extraction methods, customized to improve recall and perform ellipsis and anaphora resolution. As a result, 7,670 unique TREATS relationships between 4,666 drugs and 1,293 diseases are integrated from 4 different sources with an estimated overall correctness of 77% and specificity of 84%.


Pacific Symposium on Biocomputing | 2006

Multiple approaches to fine-grained indexing of the biomedical literature.

Aurélie Névéol; Sonya E. Shooshan; Susanne M. Humphrey; Thomas C. Rindflesch; Alan R. Aronson

The number of articles in the MEDLINE database is expected to increase tremendously in the coming years. To ensure that all these documents are indexed with continuing high quality, it is necessary to develop tools and methods that help the indexers in their daily task. We present three methods addressing a novel aspect of automatic indexing of the biomedical literature, namely producing MeSH main heading/subheading pair recommendations. The methods (dictionary-based, post-processing rules and Natural Language Processing rules) are described and evaluated on a genetics-related corpus. The best overall performance is obtained for the subheading genetics (70% precision and 17% recall with post-processing rules, 48% precision and 37% recall with the dictionary-based method). Future work will address extending this work to all MeSH subheadings and a more thorough study of method combination.
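The two precision/recall pairs reported for the genetics subheading trade off in opposite directions, so a single combined score can help compare them. The sketch below computes the balanced F-measure (F1, the harmonic mean of precision and recall) for both. The abstract does not report F1 values; the numbers produced here are our own arithmetic on the reported figures, and the variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for the subheading "genetics".
post_processing_f1 = f1_score(0.70, 0.17)  # post-processing rules
dictionary_f1 = f1_score(0.48, 0.37)       # dictionary-based method

print(f"post-processing rules: F1 = {post_processing_f1:.3f}")
print(f"dictionary-based:      F1 = {dictionary_f1:.3f}")
```

Under this measure the dictionary-based method scores higher, because F1 penalizes the low recall of the post-processing rules more than it rewards their higher precision.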

Collaboration


Dive into Aurélie Névéol's collaborations.

Top Co-Authors

Cyril Grouin

Centre national de la recherche scientifique


Zhiyong Lu

National Institutes of Health


Pierre Zweigenbaum

Centre national de la recherche scientifique


Alan R. Aronson

National Institutes of Health


James G. Mork

National Institutes of Health


Thomas Lavergne

Centre national de la recherche scientifique


Sonya E. Shooshan

National Institutes of Health


Alexandrina Rogozan

Institut national des sciences appliquées de Rouen
