Thierry Poibeau | Researchain

Publication

Featured research published by Thierry Poibeau.

Computational Linguistics in the Netherlands | 2000

Proper Name Extraction from Non-Journalistic Texts

Thierry Poibeau; Leila Kosseim

This paper discusses the influence of the corpus on the automatic identification of proper names in texts. Techniques developed for the newswire genre are generally not sufficient to deal with larger corpora containing texts that do not follow strict writing constraints (for example, e-mail messages, transcriptions of oral conversations, etc.). After a brief review of the research performed on news texts, we present some of the problems involved in the analysis of two different corpora: e-mails and hand-transcribed telephone conversations. Once the sources of errors have been presented, we describe an approach to adapt a proper name extraction system developed for newspaper texts to the analysis of e-mail

Multi-source, Multilingual Information Extraction and Summarization | 2013

Automatic Text Summarization: Past, Present and Future

Horacio Saggion; Thierry Poibeau

Automatic text summarization, the computer-based production of condensed versions of documents, is an important technology for the information society. Without summaries, it would be practically impossible for human beings to access the ever-growing mass of information available online. Although research in text summarization is over 50 years old, further effort is still needed given the insufficient quality of automatic summaries and the number of interesting summarization topics being proposed in different contexts by end users (“domain-specific summaries”, “opinion-oriented summaries”, “update summaries”, etc.). This paper gives a short overview of summarization methods and evaluation.

International Conference on Computational Linguistics | 2004

Event-based information extraction for the biomedical domain: the Caderige project

Erick Alphonse; Sophie Aubin; Philippe Bessières; Gilles Bisson; Thierry Hamon; Sandrine Lagarrigue; Adeline Nazarenko; Alain-Pierre Manine; Claire Nédellec; Mohamed Ould Abdel Vetah; Thierry Poibeau; Davy Weissenbacher

This paper gives an overview of the Caderige project. This project involves teams from different areas (biology, machine learning, natural language processing) in order to develop high-level analysis tools for extracting structured information from biological bibliographical databases, especially Medline. The paper describes the approach and compares it to the state of the art.

Conference of the European Chapter of the Association for Computational Linguistics | 2003

The multilingual named entity recognition framework

Thierry Poibeau

This paper presents a multilingual system designed to recognize named entities in a wide variety of languages (currently more than 12 languages are covered). The system includes original strategies for handling a wide variety of character encodings, as well as analysis strategies and algorithms to process these languages.

Archive | 2012

Multi-source, Multilingual Information Extraction and Summarization

Thierry Poibeau; Horacio Saggion; Jakub Piskorski; Roman Yangarber

Information extraction (IE) and text summarization (TS) are powerful technologies for finding relevant pieces of information in text and presenting them to the user in condensed form. The ongoing information explosion makes IE and TS critical for successful functioning within the information society. These technologies face particular challenges due to the inherent multi-source nature of the information explosion. The technologies must now handle not isolated texts or individual narratives, but rather large-scale repositories and streams, generally in multiple languages, containing a multiplicity of perspectives, opinions, or commentaries on particular topics, entities or events. There is thus a need to adapt existing techniques and develop new ones to deal with these challenges. This volume contains a selection of papers that present a variety of methodologies for content identification and extraction, as well as for content fusion and regeneration. The chapters cover various aspects of the challenges, depending on the nature of the information sought (names vs. events) and the nature of the sources (news streams vs. image captions vs. scientific research papers, etc.). This volume aims to offer a broad and representative sample of studies from this very active research field.

LREC 2008 Workshop on Sentiment Analysis: Emotion, Metaphor, Ontology and Terminology | 2011

Sentiment Analysis Using Automatically Labelled Financial News Items

Michel Généreux; Thierry Poibeau; Moshe Koppel

Given a corpus of financial news items labelled according to the market reaction following their publication, we investigate ‘contemporaneous’ and forward-looking stock price movements. Our approach is to provide a pool of relevant textual features to a machine learning algorithm to detect substantial stock price variations. Our two working hypotheses are that the market reaction to a news item is a good indicator for labelling financial news items, and that a machine learning algorithm can be trained on those news items to build models that detect price movements effectively.
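
The labelling idea in this abstract, where a news item is tagged according to the market reaction that follows it, might be sketched as below. This is a minimal illustration only: the function name, the 2% threshold, and the three label names are assumptions made for the example and are not taken from the paper.

```python
def label_by_market_reaction(price_before, price_after, threshold=0.02):
    """Label a news item from the relative price change around its publication.

    threshold is a hypothetical cut-off for a "substantial" variation;
    the label names are likewise illustrative.
    """
    change = (price_after - price_before) / price_before
    if change >= threshold:
        return "up"
    if change <= -threshold:
        return "down"
    return "flat"


# Example: a 5% rise after publication is labelled "up".
print(label_by_market_reaction(100.0, 105.0))
```

Items labelled this way could then serve as training data for a text classifier, which is the second hypothesis the abstract states.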

International Conference on Computational Linguistics | 2002

Generating extraction patterns from a large semantic network and an untagged corpus

Thierry Poibeau; Dominique Dutoit

This paper presents a module dedicated to the elaboration of linguistic resources for a versatile Information Extraction (IE) system. In order to decrease the time spent on the elaboration of resources for the IE system and to guide the end user in a new domain, we suggest using a machine learning system that helps define new templates and associated resources. This knowledge is automatically derived from the text collection, in interaction with a large semantic network.

International Conference on Computational Linguistics | 2002

Inferring knowledge from a large semantic network

Dominique Dutoit; Thierry Poibeau

In this paper, we present a rich semantic network based on a differential analysis. We then detail the implemented measures, which take into account common and differential features between words. In the final section, we describe some industrial applications.

Joint Conference on Lexical and Computational Semantics | 2015

Combining Open Source Annotators for Entity Linking through Weighted Voting

Pablo Ruiz; Thierry Poibeau

An English entity linking (EL) workflow is presented, which combines the annotations of five public open source EL services. The annotations are combined through a weighted voting scheme inspired by the ROVER method, which had not been previously tested on EL outputs. The combined results improved over each individual system's results, as evaluated on four different golden sets.
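
A generic weighted vote over annotator outputs might look like the sketch below. It is not the scheme from the paper: the data layout (spans as character offsets), the annotator names, the weights, and the acceptance threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict


def weighted_vote(outputs, weights, threshold=0.0):
    """Combine entity-linking annotations by weighted voting.

    outputs: dict mapping annotator name -> list of (start, end, entity)
    weights: dict mapping annotator name -> numeric weight (hypothetical)
    Returns a dict mapping (start, end) -> winning entity.
    """
    # Sum the weights of the annotators that proposed each entity per span.
    scores = defaultdict(lambda: defaultdict(float))
    for annotator, annotations in outputs.items():
        for start, end, entity in annotations:
            scores[(start, end)][entity] += weights[annotator]

    # Keep the best-scoring entity per span if it reaches the threshold.
    combined = {}
    for span, candidates in scores.items():
        entity, score = max(candidates.items(), key=lambda kv: kv[1])
        if score >= threshold:
            combined[span] = entity
    return combined


# Hypothetical outputs from three annotators on the same text.
outputs = {
    "a": [(0, 5, "Paris")],
    "b": [(0, 5, "Paris_Hilton")],
    "c": [(0, 5, "Paris"), (10, 15, "France")],
}
weights = {"a": 0.5, "b": 0.6, "c": 0.3}
result = weighted_vote(outputs, weights, threshold=0.5)
```

In this example two lower-weighted annotators jointly outvote the single highest-weighted one on the first span, while the second span is dropped because only one low-weight annotator proposed it.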

Cognitive Processing | 2012

Gestalt compositionality and instruction-based meaning construction

Gilles Col; Jeanne Aptekman; Stéphanie Girault; Thierry Poibeau

We would like to propose a new model of meaning construction based on language comprehension considered as a dynamic process during which the meaning of each linguistic unit and the global meaning of the sentence are determined simultaneously. This model, which may be called “gestalt compositionality,” is radically opposed to the classic compositional mechanism advocated by linguistic formalism based on the primacy of syntax. The process considers the syntactic structure of an utterance as the product of meaning construction rather than its source. The comprehension of an utterance is consequently directly based on the interaction between the different basic components of this utterance: lexical units, grammatical markers, positional relations between units, and more generally, basic “constructions” in the sense of Construction Grammar. Thus, meaning is really the result of a gestalt compositional process inasmuch as the contribution of each basic component depends on the contribution of the other components present in the utterance. We present a first attempt at modeling based on French and English examples.
