Florian Boudin | Researchain

Publication

Featured research published by Florian Boudin.

BMC Medical Informatics and Decision Making | 2010

Combining classifiers for robust PICO element detection

Florian Boudin; Jian-Yun Nie; Joan C. Bartlett; Roland Grad; Pierre Pluye; Martin Dawes

Background: Formulating a clinical information need in terms of the four atomic parts which are Population/Problem, Intervention, Comparison and Outcome (known as PICO elements) facilitates searching for a precise answer within a large medical citation database. However, using PICO defined items in the information retrieval process requires a search engine to be able to detect and index PICO elements in the collection in order for the system to retrieve relevant documents.

Methods: In this study, we tested multiple supervised classification algorithms and their combinations for detecting PICO elements within medical abstracts. Using the structural descriptors that are embedded in some medical abstracts, we have automatically gathered large training/testing data sets for each PICO element.

Results: Combining multiple classifiers using a weighted linear combination of their prediction scores achieves promising results with an f-measure score of 86.3% for P, 67% for I and 56.6% for O.

Conclusions: Our experiments on the identification of PICO elements showed that the task is very challenging. Nevertheless, the performance achieved by our identification method is competitive with previously published results and shows that this task can be achieved with a high accuracy for the P element but lower ones for I and O elements.
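The weighted linear combination of prediction scores mentioned in the results can be illustrated with a minimal sketch. The function names, the weights, the scores and the 0.5 decision threshold below are all hypothetical; the abstract does not say which classifiers were combined or how the weights were set.

```python
from typing import Sequence


def combine_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted linear combination of per-classifier scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


def is_element(scores: Sequence[float], weights: Sequence[float],
               threshold: float = 0.5) -> bool:
    """Label a sentence as a PICO element if the combined score reaches
    the threshold (the threshold value is an assumption)."""
    return combine_scores(scores, weights) >= threshold


# Hypothetical scores from three classifiers for one sentence (P element).
p_scores = [0.9, 0.6, 0.7]
# Hypothetical weights, e.g. tuned on held-out data; they sum to 1 here.
p_weights = [0.5, 0.25, 0.25]

combined = combine_scores(p_scores, p_weights)
decision = is_element(p_scores, p_weights)
```

In such a scheme, one set of weights would typically be tuned per element (P, I, O), since the abstract reports quite different performance for each.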

Empirical Methods in Natural Language Processing | 2015

Concept-based Summarization using Integer Linear Programming: From Concept Pruning to Multiple Optimal Solutions

Florian Boudin; Hugo Mougard; Benoit Favre

In concept-based summarization, sentence selection is modelled as a budgeted maximum coverage problem. As this problem is NP-hard, pruning low-weight concepts is required for the solver to find optimal solutions efficiently. This work shows that reducing the number of concepts in the model leads to lower ROUGE scores, and more importantly to the presence of multiple optimal solutions. We address these issues by extending the model to provide a single optimal solution, and eliminate the need for concept pruning using an approximation algorithm that achieves comparable performance to exact inference.
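To make the budgeted maximum coverage formulation concrete, here is a small sketch of a generic greedy heuristic: it repeatedly picks the sentence with the best ratio of newly covered concept weight to length, until the length budget is exhausted. This is a textbook-style heuristic chosen for illustration, not necessarily the approximation algorithm used in the paper; the function name, data layout and example values are assumptions.

```python
def greedy_summary(sentences, concept_weights, budget):
    """Greedy heuristic for budgeted maximum coverage.

    sentences: list of (length, set_of_concepts), with length > 0
    concept_weights: dict mapping concept -> weight
    budget: maximum total length of the selected sentences
    Returns (indices of selected sentences, set of covered concepts).
    """
    selected, covered = [], set()
    remaining = budget
    candidates = set(range(len(sentences)))
    while candidates:
        best, best_ratio = None, 0.0
        for i in sorted(candidates):
            length, concepts = sentences[i]
            if length > remaining:
                continue  # sentence does not fit in the remaining budget
            # Each concept is counted once, however many sentences contain it.
            gain = sum(concept_weights.get(c, 0.0) for c in concepts - covered)
            ratio = gain / length
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break  # nothing fits, or nothing adds new weight
        selected.append(best)
        covered |= sentences[best][1]
        remaining -= sentences[best][0]
        candidates.remove(best)
    return selected, covered


# Hypothetical toy input: three sentences with lengths and concepts.
toy_sentences = [(5, {"a", "b"}), (4, {"b", "c"}), (3, {"c"})]
toy_weights = {"a": 3.0, "b": 2.0, "c": 1.0}
chosen, covered_concepts = greedy_summary(toy_sentences, toy_weights, budget=9)
```

On this toy input the heuristic selects sentences 0 and 2, which together cover all three concepts within the budget of 9.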

European Conference on Information Retrieval | 2010

Improving medical information retrieval with PICO element detection

Florian Boudin; Lixin Shi; Jian-Yun Nie

Without a well-formulated and structured question, it can be very difficult and time-consuming for physicians to identify appropriate resources and search for the best available evidence for medical treatment in evidence-based medicine (EBM). In EBM, clinical studies and questions involve four aspects: Population/Problem (P), Intervention (I), Comparison (C) and Outcome (O), which are known as PICO elements. It is intuitively more advantageous to use these elements in Information Retrieval (IR). In this paper, we first propose an approach to automatically identify the PICO elements in documents and queries. We then test several possible approaches to using the identified elements in IR. Experiments show that accurately determining PICO elements is a challenging task. However, even with noisy tagging results, we can still take advantage of some PICO elements, namely I and P elements, to enhance the retrieval process, and this allows us to obtain significantly better retrieval effectiveness than the state-of-the-art methods.

Meeting of the Association for Computational Linguistics | 2015

Reducing Over-generation Errors for Automatic Keyphrase Extraction using Integer Linear Programming

Florian Boudin

We introduce a global inference model for keyphrase extraction that reduces overgeneration errors by weighting sets of keyphrase candidates according to their component words. Our model can be applied on top of any supervised or unsupervised word weighting function. Experimental results show a substantial improvement over commonly used word-based ranking approaches.

European Conference on Information Retrieval | 2012

Using a medical thesaurus to predict query difficulty

Florian Boudin; Jian-Yun Nie; Martin Dawes

Estimating query performance is the task of predicting the quality of results returned by a search engine in response to a query. In this paper, we focus on pre-retrieval prediction methods for the medical domain. We propose a novel predictor that exploits a thesaurus to ascertain how difficult queries are. In our experiments, we show that our predictor outperforms the state-of-the-art methods that do not use a thesaurus.

Meeting of the Association for Computational Linguistics | 2015

LINA: Identifying Comparable Documents from Wikipedia

Emmanuel Morin; Amir Hazem; Florian Boudin; Elizaveta Loginova-Clouet

This paper describes the LINA system for the BUCC 2015 shared track. Following (Enright and Kondrak, 2007), our system identifies comparable documents by collecting counts of hapax words. We extend this method by filtering out document pairs sharing target documents using pigeonhole reasoning and cross-lingual information.
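The idea of matching documents by counts of hapax words (words occurring exactly once in a document) can be sketched as follows. This is only an illustration under simplifying assumptions: whitespace tokenization, lowercasing, exact string matching across languages, and a hypothetical `best_match` helper. It does not reproduce the LINA system or its pigeonhole and cross-lingual filtering steps.

```python
from collections import Counter


def hapax_words(text):
    """Return the set of words that occur exactly once in the text."""
    counts = Counter(text.lower().split())
    return {word for word, count in counts.items() if count == 1}


def best_match(source, targets):
    """Pick the target document sharing the most hapax words with the source.

    targets: dict mapping a document name to its text.
    Returns (name of best target, dict of shared-hapax counts per target).
    """
    src_hapax = hapax_words(source)
    overlaps = {name: len(src_hapax & hapax_words(text))
                for name, text in targets.items()}
    return max(overlaps, key=overlaps.get), overlaps


# Hypothetical toy documents; shared hapax words are mostly proper nouns.
source_doc = "Nantes is a city in western France on the Loire river"
target_docs = {
    "fr_a": "Nantes est une ville de l'ouest de la France sur la Loire",
    "fr_b": "Paris est la capitale de la France",
}
match, overlap_counts = best_match(source_doc, target_docs)
```

Here the source shares three hapax words with `fr_a` (nantes, france, loire) and one with `fr_b` (france), so `fr_a` is chosen.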

Proceedings of the ACM fourth international workshop on Data and text mining in biomedical informatics | 2010

Deriving a test collection for clinical information retrieval from systematic reviews

Florian Boudin; Jian-Yun Nie; Martin Dawes

In this paper, we describe the construction of a test collection for evaluating clinical information retrieval. The purpose of this test collection is to provide a basis for researchers to experiment with PECO-structured queries. Systematic reviews are used as a starting point for generating queries and relevance judgments. We give some details on the difficulties encountered in building this resource and report the results achieved by current state-of-the-art approaches.

International Joint Conference on Natural Language Processing | 2013