Yann Mathet

University of Caen Lower Normandy

Publication

Featured research published by Yann Mathet.

Document Engineering | 2012

The Glozz platform: a corpus annotation and mining tool

Antoine Widlöcher; Yann Mathet

Corpus linguistics and Natural Language Processing make it necessary to produce and share reference annotations to which linguistic and computational models can be compared. Creating such resources requires a formal framework supporting description of heterogeneous linguistic objects and structures, appropriate representation formats, and adequate manual annotation tools, making it possible to locate, identify and describe linguistic phenomena in textual documents. The Glozz platform addresses all these needs, and provides a highly versatile corpus annotation tool with advanced visualization, querying and evaluation possibilities.

Intelligent Information Systems | 2003

Passage Extraction in Geographical Documents

Frédérik Bilhaut; Thierry Charnois; Patrice Enjalbert; Yann Mathet

This paper presents a project whose aim is to retrieve information in geographical documents. It relies on the generic structure of geographical information, which relates phenomena (for example of a sociological or economic nature) to localisations in space and time. The system includes semantic analysers of spatial and temporal expressions, a term extractor (for phenomena), and a discourse analysis module linking the three components together, mostly relying on Charolles’ discourse universes model. Documents are processed off-line and the results are stored as XML markup, ready for queries combining the three components of geographical information. Ranked lists of dynamically-bounded passages are returned as answers.

Computational Linguistics | 2015

The unified and holistic method gamma (γ) for inter-annotator agreement measure and alignment

Yann Mathet; Antoine Widlöcher; Jean-Philippe Métivier

Agreement measures have been widely used in computational linguistics for more than 15 years to check the reliability of annotation processes. Although considerable effort has been made concerning categorization, fewer studies address unitizing, and when both paradigms are combined even fewer methods are available and discussed. The aim of this article is threefold. First, we advocate that to deal with unitizing, alignment and agreement measures should be considered as a unified process, because a relevant measure should rely on an alignment of the units from different annotators, and this alignment should be computed according to the principles of the measure. Second, we propose the new versatile measure γ, which fulfills this requirement and copes with both paradigms, and we introduce its implementation. Third, we show that this new method performs as well as, or even better than, other more specialized methods devoted to categorization or segmentation, while combining the two paradigms at the same time.

Computational Linguistics | 2017

The Agreement Measure γcat a Complement to γ Focused on Categorization of a Continuum

Yann Mathet

Agreement on unitizing, where several annotators freely put units of various sizes and categories on a continuum, is difficult to assess because of the simultaneous discrepancies in positioning and categorizing. The recent agreement measure γ offers an overall solution that takes into account both positions and categories. In this article, I propose the additional coefficient γcat, which complements γ by assessing the agreement on categorization of a continuum, putting aside positional discrepancies. When applied to pure categorization (with predefined units), γcat behaves the same way as the famous dedicated Krippendorff's α, even with missing values, which proves its consistency. A variation of γcat is also proposed that provides an in-depth assessment of categorizing for each individual category. The entire family of γ coefficients is implemented in free software.

Document Engineering | 2015

Combining Advanced Information Retrieval and Text-Mining for Digital Humanities

Antoine Widlöcher; Nicolas Bechet; Jean-Marc Lecarpentier; Yann Mathet; Julia Roger

Digital Humanities make more and more structured and richly annotated corpora available. Most of this data relies on well-known and established standards, such as TEI, which especially enable scientists to edit and publish their work. However, one of the remaining problems is to give adequate access to this rich data, in order to produce higher-order knowledge. In this paper, we present an integrated environment combining an advanced search engine and text-mining techniques for hermeneutics in Digital Humanities. Relying on semantic web technologies, the search engine uses full text as well as complex embedding structures and offers a single interface to access rich and heterogeneous data and meta-data. Text-mining possibilities enable scholars to exhibit regularities in corpora. Results obtained on the Cartesian corpus illustrate these principles and tools.

North American Chapter of the Association for Computational Linguistics | 2003

Geographic reference analysis for geographic document querying

Frédérik Bilhaut; Thierry Charnois; Patrice Enjalbert; Yann Mathet

Actes de la 10e Conférence Traitement Automatique du Langage Naturel (TALN'03) | 2003

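Indexation discursive pour la navigation intradocumentaire : cadres temporels et spatiaux dans l'information géographique [Discourse indexing for intra-document navigation: temporal and spatial frames in geographical information]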

Frédérik Bilhaut; Lydia-Mai Ho-Dac; Andrée Borillo; Thierry Charnois; Patrice Enjalbert; Anne Le Draoulec; Yann Mathet; Hélène Miguet; Marie-Paule Péry-Woodley; Laure Sarda

Traitement Automatique des Langues Naturelles 2009 | 2008

ANNODIS: une approche outillée de l'annotation de structures discursives [ANNODIS: a tool-supported approach to the annotation of discourse structures]

Marie-Paule Péry-Woodley; Nicholas Asher; Patrice Enjalbert; Farah Benamara; Myriam Bras; Cécile Fabre; Stéphane Ferrari; Lydia-Mai Ho-Dac; A. Le Draoulec; Yann Mathet

Atelier Défi Fouille de Textes (DEFT'07) dans le cadre de la plate-forme AFIA 2007 (Association Française pour l'Intelligence Artificielle) | 2007