Publication


Featured research published by Christophe Roeder.


BMC Bioinformatics | 2010

The structural and content aspects of abstracts versus bodies of full text journal articles are different

K. Bretonnel Cohen; Helen L. Johnson; Karin Verspoor; Christophe Roeder; Lawrence Hunter

Background: An increase in work on the full text of journal articles and the growth of PubMedCentral have the opportunity to create a major paradigm shift in how biomedical text mining is done. However, until now there has been no comprehensive characterization of how the bodies of full text journal articles differ from the abstracts that until now have been the subject of most biomedical text mining research. Results: We examined the structural and linguistic aspects of abstracts and bodies of full text articles, the performance of text mining tools on both, and the distribution of a variety of semantic classes of named entities between them. We found marked structural differences, with longer sentences in the article bodies and much heavier use of parenthesized material in the bodies than in the abstracts. We found content differences with respect to linguistic features. Three out of four of the linguistic features that we examined were statistically significantly differently distributed between the two genres. We also found content differences with respect to the distribution of semantic features. There were significantly different densities per thousand words for three out of four semantic classes, and clear differences in the extent to which they appeared in the two genres. With respect to the performance of text mining tools, we found that a mutation finder performed equally well in both genres, but that a wide variety of gene mention systems performed much worse on article bodies than they did on abstracts. POS tagging was also more accurate in abstracts than in article bodies. Conclusions: Aspects of structure and content differ markedly between article abstracts and article bodies. A number of these differences may pose problems as the text mining field moves more into the area of processing full-text articles. However, these differences also present a number of opportunities for the extraction of data types, particularly that found in parenthesized text, that is present in article bodies but not in article abstracts.
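The abstract above compares semantic classes by their density per thousand words. As a minimal sketch of that normalization (the function name and the counts used below are hypothetical illustrations, not taken from the paper), the measure could be computed as follows:

```python
def density_per_thousand(mention_count: int, word_count: int) -> float:
    """Return the number of mentions per 1,000 words of text.

    Both arguments are plain counts; how mentions and words are counted
    is an assumption left to the caller.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1000.0 * mention_count / word_count


# Hypothetical counts for illustration only.
abstract_density = density_per_thousand(mention_count=12, word_count=4000)
body_density = density_per_thousand(mention_count=45, word_count=90000)
print(abstract_density, body_density)
```

Normalizing by length in this way lets short abstracts and long article bodies be compared on the same scale.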


BMC Bioinformatics | 2012

A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools

Karin Verspoor; Kevin Bretonnel Cohen; Arrick Lanfranchi; Colin Warner; Helen L. Johnson; Christophe Roeder; Jinho D. Choi; Christopher S. Funk; Yuriy Malenkiy; Miriam Eckert; Nianwen Xue; William A. Baumgartner; Michael Bada; Martha Palmer; Lawrence Hunter

Background: We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results: Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions: The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications.


BMC Bioinformatics | 2014

Large-scale biomedical concept recognition: an evaluation of current automatic annotators and their parameters

Christopher S. Funk; William A. Baumgartner; Benjamin Garcia; Christophe Roeder; Michael Bada; K. Bretonnel Cohen; Lawrence Hunter; Karin Verspoor

Background: Ontological concepts are useful for many different biomedical tasks. Concepts are difficult to recognize in text due to a disconnect between what is captured in an ontology and how the concepts are expressed in text. There are many recognizers for specific ontologies, but a general approach for concept recognition is an open problem. Results: Three dictionary-based systems (MetaMap, NCBO Annotator, and ConceptMapper) are evaluated on eight biomedical ontologies in the Colorado Richly Annotated Full-Text (CRAFT) Corpus. Over 1,000 parameter combinations are examined, and best-performing parameters for each system-ontology pair are presented. Conclusions: Baselines for concept recognition by three systems on eight biomedical ontologies are established (F-measures range from 0.14–0.83). Out of the three systems we tested, ConceptMapper is generally the best-performing system; it produces the highest F-measure of seven out of eight ontologies. Default parameters are not ideal for most systems on most ontologies; by changing parameters F-measure can be increased by up to 0.4. Not only are best performing parameters presented, but suggestions for choosing the best parameters based on ontology characteristics are presented.
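The results above are reported as F-measures. For readers unfamiliar with the metric, here is a minimal sketch of the standard F-measure computed from true-positive, false-positive, and false-negative counts. The function name and the counts are hypothetical, and the sketch does not reproduce the paper's evaluation setup (for example, how an annotation match is defined).

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1.0 gives the balanced F-measure (F1)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # no correct annotations at all
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for one system-ontology pair.
score = f_measure(tp=80, fp=20, fn=20)
print(round(score, 2))
```

With beta set to 1.0 the score is the harmonic mean of precision and recall, so a system must do well on both to score highly.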


North American Chapter of the Association for Computational Linguistics | 2009

High-precision biological event extraction with a concept recognizer

K. Bretonnel Cohen; Karin Verspoor; Helen L. Johnson; Christophe Roeder; Philip V. Ogren; William A. Baumgartner; Elizabeth K. White; Lawrence Hunter

We approached the problems of event detection, argument identification, and negation and speculation detection as one of concept recognition and analysis. Our methodology involved using the OpenDMAP semantic parser with manually-written rules. We achieved state-of-the-art precision for two of the three tasks, scoring the highest of 24 teams at precision of 71.81 on Task 1 and the highest of 6 teams at precision of 70.97 on Task 2. The OpenDMAP system and the rule set are available at bionlp.sourceforge.net.


Computational Intelligence | 2011

High-precision biological event extraction: effects of system and of data

K. Bretonnel Cohen; Karin Verspoor; Helen L. Johnson; Christophe Roeder; Philip V. Ogren; William A. Baumgartner; Elizabeth K. White; Hannah Tipney; Lawrence Hunter

We approached the problems of event detection, argument identification, and negation and speculation detection in the BioNLP’09 information extraction challenge through concept recognition and analysis. Our methodology involved using the OpenDMAP semantic parser with manually written rules. The original OpenDMAP system was updated for this challenge with a broad ontology defined for the events of interest, new linguistic patterns for those events, and specialized coordination handling. We achieved state‐of‐the‐art precision for two of the three tasks, scoring the highest of 24 teams at precision of 71.81 on Task 1 and the highest of 6 teams at precision of 70.97 on Task 2. We provide a detailed analysis of the training data and show that a number of trigger words were ambiguous as to event type, even when their arguments are constrained by semantic class. The data is also shown to have a number of missing annotations. Analysis of a sampling of the comparatively small number of false positives returned by our system shows that major causes of this type of error were failing to recognize second themes in two‐theme events, failing to recognize events when they were the arguments to other events, failure to recognize nontheme arguments, and sentence segmentation errors. We show that specifically handling coordination had a small but important impact on the overall performance of the system. The OpenDMAP system and the rule set are available at http://bionlp.sourceforge.net.


BMC Bioinformatics | 2011

U-Compare bio-event meta-service: compatible BioNLP event extraction services

Yoshinobu Kano; Jari Björne; Filip Ginter; Tapio Salakoski; Ekaterina Buyko; Udo Hahn; K. Bretonnel Cohen; Karin Verspoor; Christophe Roeder; Lawrence Hunter; Halil Kilicoglu; Sabine Bergler; Sofie Van Landeghem; Thomas Van Parys; Yves Van de Peer; Makoto Miwa; Sophia Ananiadou; Mariana Neves; Alberto Pascual-Montano; Arzucan Özgür; Dragomir R. Radev; Sebastian Riedel; Rune Sætre; Hong-Woo Chun; Jin-Dong Kim; Sampo Pyysalo; Tomoko Ohta; Jun’ichi Tsujii

Background: Bio-molecular event extraction from literature is recognized as an important task of bio text mining and, as such, many relevant systems have been developed and made available during the last decade. While such systems provide useful services individually, there is a need for a meta-service to enable comparison and ensemble of such services, offering optimal solutions for various purposes. Results: We have integrated nine event extraction systems in the U-Compare framework, making them intercompatible and interoperable with other U-Compare components. The U-Compare event meta-service provides various meta-level features for comparison and ensemble of multiple event extraction systems. Experimental results show that the performance improvements achieved by the ensemble are significant. Conclusions: While individual event extraction systems themselves provide useful features for bio text mining, the U-Compare meta-service is expected to improve the accessibility to the individual systems, and to enable meta-level uses over multiple event extraction systems such as comparison and ensemble.


Bioinformatics | 2010

A UIMA wrapper for the NCBO annotator

Christophe Roeder; Clement Jonquet; Nigam H. Shah; William A. Baumgartner; Karin Verspoor; Lawrence Hunter

Summary: The Unstructured Information Management Architecture (UIMA) framework and web services are emerging as useful tools for integrating biomedical text mining tools. This note describes our work, which wraps the National Center for Biomedical Ontology (NCBO) Annotator, an ontology-based annotation service, to make it available as a component in UIMA workflows. Availability: This wrapper is freely available on the web at http://bionlp-uima.sourceforge.net/ as part of the UIMA tools distribution from the Center for Computational Pharmacology (CCP) at the University of Colorado School of Medicine. It has been implemented in Java for support on Mac OS X, Linux and MS Windows.


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2010

Exploring Species-Based Strategies for Gene Normalization

Karin Verspoor; Christophe Roeder; Helen L. Johnson; Kevin Bretonnel Cohen; William A. Baumgartner; Lawrence Hunter


Language Resources and Evaluation | 2010

Test Suite Design for Biomedical Ontology Concept Recognition Systems

K. Bretonnel Cohen; Christophe Roeder; William A. Baumgartner; Lawrence Hunter; Karin Verspoor


Information Technology and Libraries | 2014

Negotiating a Text Mining License for Faculty Researchers

Leslie A. Williams; Lynne M. Fox; Christophe Roeder; Lawrence Hunter

Collaboration


Top co-authors of Christophe Roeder.

Lawrence Hunter

University of Colorado Denver

Helen L. Johnson

University of Colorado Denver

Cartic Ramakrishnan

University of Southern California

Chun-Nan Hsu

University of California

Eduard H. Hovy

Carnegie Mellon University