Nicolas Hernandez | Researchain

Publication

Featured research published by Nicolas Hernandez.

Conference of the European Chapter of the Association for Computational Linguistics | 2009

What's in a Message?

Stergos D. Afantenos; Nicolas Hernandez

In this paper we present the first step in a larger series of experiments for the induction of predicate/argument structures. The structures that we are inducing are very similar to the conceptual structures used in Frame Semantics (such as FrameNet). These structures are called messages, and they were previously used in the context of a multi-document summarization system for evolving events. The series of experiments that we propose consists of two stages. In the first stage we try to extract a representative vocabulary of words. This vocabulary is then used in the second stage, during which we apply various clustering approaches to it in order to identify the clusters of predicates and arguments (or frames and semantic roles, to use the jargon of Frame Semantics). This paper presents in detail and evaluates the first stage.

Polibits | 2011

Detecting Derivatives using Specific and Invariant Descriptors

Fabien Poulard; Nicolas Hernandez; Béatrice Daille

This paper explores the detection of derivation links between texts (otherwise called plagiarism, near-duplication, revision, etc.) at the document level. We evaluate the use of textual elements implementing the ideas of specificity and invariance, as well as their combination, to characterize derivatives. We built a French press corpus based on Wikinews revisions to run this evaluation. We obtain performance similar to that of the state-of-the-art method (n-gram overlap) while reducing the signature size and therefore the processing cost. To ensure the verifiability and reproducibility of our results, we make our code and our corpus available to the community.
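The n-gram overlap baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the whitespace tokenization, the trigram size, and the function names `word_ngrams` and `containment` are all assumptions made for illustration.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams of a text (naive whitespace tokenization, assumed)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def containment(candidate, source, n=3):
    """Share of the candidate's n-grams that also occur in the source.

    A high value suggests the candidate may derive from the source.
    Any decision threshold would have to be tuned on a corpus.
    """
    cand = word_ngrams(candidate, n)
    src = word_ngrams(source, n)
    if not cand:
        return 0.0  # candidate shorter than n words: nothing to compare
    return len(cand & src) / len(cand)


source = "the cat sat on the mat today"
candidate = "the cat sat on the sofa"
print(containment(candidate, source))
```

The signature here is the full set of n-grams, which is what makes the baseline costly; the paper's point is that smaller signatures built from specific and invariant elements can reach similar performance.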

International Conference on Computational Linguistics | 2017

Dialogue Act Taxonomy Interoperability Using a Meta-Model

Soufian Salim; Nicolas Hernandez; Emmanuel Morin

Dialogue act taxonomies, such as those of DAMSL, DiAML or the HCRC dialogue structure, can be incorporated into a larger meta-model by breaking down their labels into primitive functional features. Doing so enables the re-exploitation of annotated data for automatic dialogue act recognition tasks across taxonomies, i.e. it gives us the means to make a classifier learn from data annotated according to taxonomies different from the target taxonomy. We propose a meta-model covering several well-known taxonomies of dialogue acts, and we demonstrate its usefulness for the task of cross-taxonomy dialogue act recognition.

Linguistic Annotation Workshop | 2014

Exploiting the Human Computational Effort Dedicated to Message Reply Formatting for Training Discursive Email Segmenters

Nicolas Hernandez; Soufian Salim

In the context of multi-domain and multimodal online asynchronous discussion analysis, we propose an innovative strategy for manual annotation of dialog act (DA) segments. The process aims at supporting the analysis of messages in terms of DA. Our objective is to train a sequence labelling system to detect the segment boundaries. The originality of the proposed approach is to avoid manually annotating the training data and instead exploit the human computational efforts dedicated to message reply formatting when the writer replies to a message by inserting his response just after the quoted text appropriate to his intervention. We describe the approach, propose a new electronic mail corpus and report the evaluation of segmentation models we built.
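The idea of reading segment boundaries off reply formatting can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' system: it assumes quoted lines start with `>`, that quoted lines match the original message exactly after trimming, and that each line of the original is unique. The function name `boundaries_from_reply` is hypothetical.

```python
def boundaries_from_reply(original_lines, reply_lines, quote_prefix=">"):
    """Infer segment boundaries in the original message from an interleaved reply.

    Returns the sorted indices of original lines after which the replier
    inserted a response; each such index is treated as the end of a segment.
    """
    # Assumption: lines are unique, so the last index wins on duplicates.
    index = {line.strip(): i for i, line in enumerate(original_lines)}
    boundaries = set()
    last_quoted = None
    for line in reply_lines:
        stripped = line.strip()
        if stripped.startswith(quote_prefix):
            content = stripped[len(quote_prefix):].strip()
            if content and content in index:
                last_quoted = index[content]
        elif stripped and last_quoted is not None:
            # First new text after a quoted block: the block ends a segment.
            boundaries.add(last_quoted)
            last_quoted = None
    return sorted(boundaries)


original = ["Hi team,", "Can we meet Monday?", "Also, who has the report?", "Thanks"]
reply = [
    "> Hi team,",
    "> Can we meet Monday?",
    "Monday works for me.",
    "> Also, who has the report?",
    "I have it.",
]
print(boundaries_from_reply(original, reply))
```

Boundaries collected this way over many message pairs could serve as labels for a sequence labelling model, which is the use the abstract describes.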

Document numérique | 2010

Évaluation de descripteurs statistiques et linguistiques pour la détection de dérivation de texte (Evaluation of statistical and linguistic descriptors for text derivation detection)

Fabien Poulard; Nicolas Hernandez; Stergos D. Afantenos; Béatrice Daille

In this article, we address the problem of detecting derivation and co-derivation relations between pairs of French press articles. We adopt the signature-based framework widely used in the literature and experiment with several types of descriptors selected for their singularity: hapax trigrams, named entities, nominal compounds, and discourse connectives. We evaluate these approaches in terms of their implementation cost and their ability to predict these types of relations on the PIITHIE corpus. We show that it is possible to maintain a level of performance comparable to the state-of-the-art approach while greatly reducing the size of the document model and therefore the implementation cost.

LREC 2010 Workshop 'New Challenges for NLP Frameworks' | 2010