Delphine Bernhard
Centre national de la recherche scientifique
Publication
Featured research published by Delphine Bernhard.
Workshop on Innovative Use of NLP for Building Educational Applications | 2008
Delphine Bernhard; Iryna Gurevych
Information overload is a well-known problem which can be particularly detrimental to learners. In this paper, we propose a method to support learners in the information seeking process by answering their questions through the retrieval of question paraphrases and their corresponding answers from social Q&A sites. Given the novelty of this kind of data, it is crucial to gain a better understanding of how questions in social Q&A sites can be automatically analysed and retrieved. We discuss and evaluate several pre-processing strategies and question similarity metrics, using a new question paraphrase corpus collected from the WikiAnswers Q&A site. The results show that viable performance levels of more than 80% accuracy can be obtained for the task of question paraphrase retrieval.
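The retrieval setup described in the abstract can be sketched as a similarity ranking over a stored question collection. The pre-processing and metric below (lowercased whitespace tokenisation, bag-of-words cosine) are illustrative stand-ins, not the paper's evaluated configurations:

```python
import math
from collections import Counter

def tokens(text):
    """Very light pre-processing: lowercase and split on whitespace."""
    return Counter(text.lower().split())

def cosine(q1, q2):
    """Cosine similarity between two bag-of-words question vectors."""
    a, b = tokens(q1), tokens(q2)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve_paraphrase(query, collection):
    """Return the stored question most similar to the user's query;
    its associated answer would then be shown to the learner."""
    return max(collection, key=lambda q: cosine(query, q))

# Toy question collection (hypothetical data)
questions = [
    "how do plants make food",
    "what is the capital of france",
    "how can i learn python quickly",
]
best = retrieve_paraphrase("how do plants produce their food", questions)
```

In the paper's setting, several such similarity metrics and pre-processing strategies are compared; swapping `cosine` for another metric only changes the `key` function.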
Cross-Language Evaluation Forum | 2008
Delphine Bernhard
This paper describes a system for unsupervised morpheme analysis and the results it obtained at Morpho Challenge 2007. The system takes a plain list of words as input and returns a list of labelled morphemic segments for each word. Morphemic segments are obtained by an unsupervised learning process which can directly be applied to different natural languages. Results obtained at competition 1 (evaluation of the morpheme analyses) are better in English, Finnish and German than in Turkish. For information retrieval (competition 2), the best results are obtained when indexing is performed using Okapi (BM25) weighting for all morphemes minus those belonging to an automatic stop list made of the most common morphemes.
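The indexing variant that worked best, Okapi BM25 over all morphemes minus an automatic stop list of the most common ones, can be sketched as follows. The parameter values (`k1`, `b`) and the stop-list cutoff are conventional defaults chosen for illustration, not the challenge submission's settings:

```python
import math
from collections import Counter

def bm25_scores(query, docs, stop_top_n=1, k1=1.2, b=0.75):
    """Score each document (a list of morphemic segments) against a query of
    morphemes, after dropping the most frequent morphemes as an automatic
    stop list."""
    # Automatic stop list: the n most common morphemes in the collection
    freq = Counter(m for d in docs for m in d)
    stop = {m for m, _ in freq.most_common(stop_top_n)}
    docs = [[m for m in d if m not in stop] for d in docs]
    avgdl = sum(len(d) for d in docs) / len(docs)
    n = len(docs)
    scores = []
    for d in docs:
        tf_d = Counter(d)
        s = 0.0
        for m in set(query) - stop:
            if tf_d[m] == 0:
                continue
            df = sum(1 for doc in docs if m in doc)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            s += idf * tf_d[m] * (k1 + 1) / \
                 (tf_d[m] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```

With morpheme-segmented documents such as `[["walk","ing","s"], ["talk","ing","s"], ["walk","er","s"]]`, the ubiquitous plural morpheme `s` lands on the stop list and no longer dilutes the ranking.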
Cross-Language Evaluation Forum | 2009
Delphine Bernhard
This paper investigates a novel approach to unsupervised morphology induction relying on community detection in networks. In a first step, morphological transformation rules are automatically acquired based on graphical similarities between words. These rules encode substring substitutions for transforming one word form into another. The transformation rules are then applied to the construction of a lexical network. The nodes of the network stand for words while edges represent transformation rules. In the next step, a clustering algorithm is applied to the network to detect families of morphologically related words. Finally, morpheme analyses are produced based on the transformation rules and the word families obtained after clustering. While still in its preliminary development stages, this method obtained encouraging results at Morpho Challenge 2009, which demonstrate the viability of the approach.
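The pipeline above (rules from graphical similarity, a lexical network, then clustering into families) can be illustrated with a minimal sketch. Here transformation rules are restricted to suffix substitutions over a shared prefix, and connected components stand in for the paper's community detection step; both simplifications are assumptions for brevity:

```python
from collections import defaultdict

def suffix_rule(w1, w2, min_stem=3):
    """Derive a substring-substitution rule (suffix1 -> suffix2) if the two
    words share a sufficiently long common prefix (graphical similarity)."""
    i = 0
    while i < min(len(w1), len(w2)) and w1[i] == w2[i]:
        i += 1
    return (w1[i:], w2[i:]) if i >= min_stem else None

def word_families(words):
    """Build a lexical network whose edges are rule applications, then return
    its connected components as morphological families (a simple stand-in for
    community detection)."""
    words = list(words)
    adj = defaultdict(set)
    for a in words:
        for b in words:
            if a < b and suffix_rule(a, b) is not None:
                adj[a].add(b)
                adj[b].add(a)
    seen, families = set(), []
    for w in words:
        if w in seen:
            continue
        stack, comp = [w], set()
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        families.append(comp)
    return families
```

On `["walk", "walks", "walking", "talked", "talks", "moon"]` this yields three families, grouping the `walk` forms together and the `talk` forms together while leaving `moon` isolated.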
String Processing and Information Retrieval | 2011
Anne Garcia-Fernandez; Anne-Laure Ligozat; Marco Dinarelli; Delphine Bernhard
Automatically determining the publication date of a document is a complex task, since a document may contain only a few intratextual hints about its publication date. Yet, it has many important applications. Indeed, the amount of digitized historical documents is constantly increasing, but their publication dates are not always properly identified via OCR acquisition. Accurate knowledge about publication dates is crucial for many applications, e.g. studying the evolution of document topics over a certain period of time. In this article, we present a method for automatically determining the publication dates of documents, which was evaluated on a French newspaper corpus in the context of the DEFT 2011 evaluation campaign. Our system is based on a combination of different individual systems, relying both on supervised and unsupervised learning, and uses several external resources, e.g. Wikipedia, Google Books Ngrams, and etymological background knowledge about the French language. Our system detects the correct year of publication in 10% of the cases for 300-word excerpts and in 14% of the cases for 500-word excerpts, which is very promising given the complexity of the task.
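One of the unsupervised ingredients mentioned above, dating an excerpt against a year-indexed unigram resource such as Google Books Ngrams, can be sketched as scoring each candidate year by the summed per-year frequencies of the excerpt's tokens. The frequency table below is hypothetical toy data, and a real system would combine several such scorers:

```python
from collections import defaultdict

def score_years(excerpt, year_unigrams):
    """Score each candidate year by summing, over the excerpt's tokens, that
    token's relative frequency in the year (toy stand-in for a resource such
    as Google Books Ngrams)."""
    scores = defaultdict(float)
    for token in excerpt.lower().split():
        for year, freq in year_unigrams.get(token, {}).items():
            scores[year] += freq
    return dict(scores)

def predict_year(excerpt, year_unigrams):
    """Return the highest-scoring year, or None for an empty excerpt."""
    scores = score_years(excerpt, year_unigrams)
    return max(scores, key=scores.get) if scores else None

# Hypothetical per-year frequencies for a few date-revealing tokens
year_unigrams = {
    "telegraphe": {1850: 0.9, 1900: 0.4, 1950: 0.1},
    "automobile": {1850: 0.0, 1900: 0.7, 1950: 0.8},
    "television": {1850: 0.0, 1900: 0.1, 1950: 0.9},
}
```

Tokens whose usage is concentrated in a period (here `television`) pull the prediction toward that period, which is why longer excerpts, with more such tokens, are easier to date.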
Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR) | 2014
Laetitia Brouwers; Delphine Bernhard; Anne-Laure Ligozat; Thomas François
This paper presents a method for the syntactic simplification of French texts. Syntactic simplification aims at making texts easier to understand by simplifying complex syntactic structures that hinder reading. Our approach is based on the study of two parallel corpora (encyclopaedia articles and tales). It aims to identify the linguistic phenomena involved in the manual simplification of French texts and organise them within a typology. We then propose a syntactic simplification system that relies on this typology to generate simplified sentences. The module starts by generating all possible variants before selecting the best subset. The evaluation shows that about 80% of the simplified sentences produced by our system are accurate.
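The generate-then-select step described above can be sketched as follows. The rules here are plain string rewrites and the selection criterion is simply the shortest output; both are toy assumptions, since the actual system uses a typology of syntactic phenomena and a proper selection module:

```python
from itertools import combinations

def apply_rules(sentence, rules):
    """Generate all variants obtainable by applying any subset of the
    simplification rules (each rule is a (find, replacement) string pair)."""
    variants = {sentence}
    for r in range(1, len(rules) + 1):
        for subset in combinations(rules, r):
            s = sentence
            for find, repl in subset:
                s = s.replace(find, repl)
            variants.add(s)
    return variants

def simplify(sentence, rules):
    """Select the best variant; as a toy criterion we prefer the shortest."""
    return min(apply_rules(sentence, rules), key=len)

# Hypothetical rules: drop a relative clause, replace a heavy adverb
rules = [
    (", which was built in 1889,", ""),
    ("nevertheless", "still"),
]
out = simplify("The tower, which was built in 1889, is nevertheless popular.",
               rules)
```

Enumerating subsets before selecting mirrors the module's design: ranking complete candidate sentences avoids committing early to a rule application that would block a better overall simplification.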
Cross-Language Evaluation Forum | 2009
Elisabeth Wolf; Delphine Bernhard; Iryna Gurevych
The objective of our experiments in the monolingual robust word sense disambiguation (WSD) track at CLEF 2009 is twofold. On the one hand, we intend to increase the precision of WSD by a heuristic-based combination of the annotations of the two WSD systems. For this, we provide an extrinsic evaluation on different levels of word sense accuracy. On the other hand, we aim at combining an often used probabilistic model, namely the Divergence From Randomness BM25 model, with a monolingual translation-based model. Our best performing system with and without utilizing word senses ranked 1st overall in the monolingual task. However, we could not observe any improvement by applying the sense annotations compared to the retrieval settings based on tokens or lemmas only.
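The heuristic-based combination of two WSD systems' annotations could take a form like the following sketch: keep a sense when both systems agree, otherwise back off to the more confident annotation. The data layout (token ids mapped to sense/confidence pairs) and the agreement heuristic are assumptions for illustration, not the paper's exact rules:

```python
def combine_senses(ann1, ann2):
    """Combine two WSD systems' annotations. Each annotation maps a token id
    to a (sense, confidence) pair. Agreement wins outright; disagreements
    back off to the more confident system."""
    combined = {}
    for tok in set(ann1) | set(ann2):
        s1, s2 = ann1.get(tok), ann2.get(tok)
        if s1 and s2:
            if s1[0] == s2[0]:
                combined[tok] = s1[0]            # both systems agree
            else:
                combined[tok] = max(s1, s2, key=lambda x: x[1])[0]
        else:
            combined[tok] = (s1 or s2)[0]        # only one system annotated
    return combined
```

The resulting token-to-sense mapping can then be fed into the retrieval model in place of tokens or lemmas, which is exactly the comparison the track evaluates.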
Biomedical Informatics Insights | 2012
Alexander Pak; Delphine Bernhard; Patrick Paroubek; Cyril Grouin
In this paper, we present the system we have developed for participating in the second task of the i2b2/VA 2011 challenge dedicated to emotion detection in clinical records. On the official evaluation, we ranked 6th out of 26 participants. Our best configuration, based upon a combination of both a machine-learning based approach and manually-defined transducers, obtained a 0.5383 global F-measure, while the distribution of the other participants’ results is characterized by mean = 0.4875, stdev = 0.0742, min = 0.2967, max = 0.6139, and median = 0.5027. The combination of machine learning and transducers is achieved by computing the union of results from both approaches, each using a hierarchy of sentiment specific classifiers.
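The union-based combination is simple to state in code. Representing each system's output as a set of (record_id, emotion) pairs is an assumption about the data layout, and the micro-averaged F-measure below is a standard definition included to show how such a combination is scored:

```python
def combine_by_union(ml_predictions, transducer_predictions):
    """Combine the two systems by taking the union of their predicted
    (record_id, emotion) pairs."""
    return set(ml_predictions) | set(transducer_predictions)

def f_measure(predicted, gold):
    """Micro-averaged F1 over predicted vs. gold (record_id, emotion) pairs."""
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Taking the union trades precision for recall: every pair found by either system is kept, which pays off when the two approaches make largely complementary errors.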
International Conference on Computational Linguistics | 2010
Zhemin Zhu; Delphine Bernhard; Iryna Gurevych
International Joint Conference on Natural Language Processing | 2009
Delphine Bernhard; Iryna Gurevych
Journal of the American Medical Informatics Association | 2011
Anne-Lyse Minard; Anne-Laure Ligozat; Asma Ben Abacha; Delphine Bernhard; Bruno Cartoni; Louise Deléger; Brigitte Grau; Sophie Rosset; Pierre Zweigenbaum; Cyril Grouin