Matthieu Constant
University of Marne-la-Vallée
Publication
Featured research published by Matthieu Constant.
meeting of the association for computational linguistics | 2014
Marie Candito; Matthieu Constant
In this paper, we investigate various strategies to predict both syntactic dependency parsing and contiguous multiword expression (MWE) recognition, testing them on the dependency version of French Treebank (Abeillé and Barrier, 2004), as instantiated in the SPMRL Shared Task (Seddah et al., 2013). Our work focuses on using an alternative representation of syntactically regular MWEs, which captures their syntactic internal structure. We obtain a system with comparable performance to that of previous works on this dataset, but which predicts both syntactic dependencies and the internal structure of MWEs. This can be useful for capturing the various degrees of semantic compositionality of MWEs.
meeting of the association for computational linguistics | 2016
Matthieu Constant; Joakim Nivre
We present a transition-based system that jointly predicts the syntactic structure and lexical units of a sentence by building two structures over the input words: a syntactic dependency tree and a ...
meeting of the association for computational linguistics | 2006
Olivier Blanc; Matthieu Constant
We present Outilex, a generalist linguistic platform for text processing. The platform includes several modules implementing the main operations for text processing and is designed to use large-coverage Language Resources. These resources (dictionaries, grammars, annotated texts) are formatted into XML, in accordance with current standards. Evaluations on efficiency are given.
ACM Transactions on Speech and Language Processing | 2013
Matthieu Constant; Joseph Le Roux; Anthony Sigogne
The integration of compounds in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly preidentified. This article evaluates two empirical strategies to incorporate such multiword units in a real PCFG-LA parsing context: (1) the use of a grammar including compound recognition, thanks to specialized annotation schemes for compounds; (2) the use of a state-of-the-art discriminative compound prerecognizer integrating endogenous and exogenous features. We show how these two strategies can be combined with word lattices representing possible lexical analyses generated by the recognizer. The proposed systems display significant gains in terms of multiword recognition and often in terms of standard parsing accuracy. Moreover, we show through an Oracle analysis that this combined strategy opens promising new research directions.
international conference natural language processing | 2008
Matthieu Constant; Patrick Watrin
This paper details a network infrastructure for representing and sharing multiword units. It enables connecting local networks describing linguistic semi-fixed components in the form of local grammars.
international conference natural language processing | 2002
Matthieu Constant
This paper analyses French locative prepositional phrases containing a location proper name Npr (e.g. Mediterranee) and its associated classifier Nc (e.g. mer). The (Nc, Npr) pairs are formally described with the aid of elementary sentences. We study their syntactic properties within adverbial support verb constructions and encode them in a Lexicon-Grammar Matrix. From this matrix, we build grammars in the form of graphs and evaluate their application to a journalistic corpus.
international multiconference on computer science and information technology | 2009
Anthony Sigogne; Matthieu Constant
This paper addresses the problem of clustering dynamic collections of web documents. We present an iterative algorithm based on fine-grained keyword extraction (simple words, compound words and proper nouns). Each new document inserted in the collection is either assigned to an existing class containing documents on the same topic, or assigned to a new class. After each step, when necessary, classes are refined using statistical techniques. The implementation of this algorithm was successfully integrated in an application used for Information Intelligence.
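The assignment step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Jaccard similarity measure, the `threshold` value, and the names `TopicClass` and `assign` are all assumptions made for illustration. Keyword extraction is assumed to have already happened, and the statistical refinement step is omitted.

```python
from dataclasses import dataclass, field


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two keyword sets, in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class TopicClass:
    """Hypothetical container for one class of same-topic documents."""
    keywords: set[str] = field(default_factory=set)
    doc_ids: list[str] = field(default_factory=list)


def assign(classes: list[TopicClass], doc_id: str, keywords: set[str],
           threshold: float = 0.3) -> int:
    """Add a document to the most similar class, or open a new class.

    Returns the index of the class that received the document.
    The threshold is an assumed value, not one taken from the paper.
    """
    best_index, best_score = -1, 0.0
    for i, topic in enumerate(classes):
        score = jaccard(topic.keywords, keywords)
        if score > best_score:
            best_index, best_score = i, score

    # No class is similar enough: start a new one for this document.
    if best_index == -1 or best_score < threshold:
        classes.append(TopicClass())
        best_index = len(classes) - 1

    classes[best_index].keywords |= keywords
    classes[best_index].doc_ids.append(doc_id)
    return best_index


classes: list[TopicClass] = []
assign(classes, "d1", {"parsing", "treebank", "french"})
assign(classes, "d2", {"parsing", "treebank", "dependency"})
assign(classes, "d3", {"football", "league"})
```

Here `d1` and `d2` share two of four distinct keywords and end up in the same class, while `d3` shares none and opens a second class.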
international conference on implementation and application of automata | 2007
Olivier Blanc; Matthieu Constant; Patrick Watrin
Language is full of multiword unit expressions that form basic semantic units. The identification of these structures limits the combinatorial complexity induced by lexical ambiguity. In this paper, we detail an experiment that largely integrates these notions in a finite-state procedure of segmentation into super-chunks, preliminary to a parser. We show that the chunker, developed for French, reaches 92.9% precision and 98.7% recall.
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017) | 2017
Hazem Al Saied; Matthieu Constant; Marie Candito
We describe the ATILF-LLF system built for the MWE 2017 Shared Task on automatic identification of verbal multiword expressions. We participated in the closed track only, for all the 18 available languages. Our system is a robust greedy transition-based system, in which MWEs are identified through a MERGE transition. The system was meant to accommodate the variety of linguistic resources provided for each language, in terms of accompanying morphological and syntactic information. Using per-MWE F-score, the system was ranked first for all but two languages (Hungarian and Romanian).
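To make the idea of a greedy transition-based identifier with a MERGE transition concrete, here is a toy sketch. It is not the ATILF-LLF system: the abstract names only MERGE, so the SHIFT and REDUCE transitions, the stack-and-buffer layout, and the function names are assumptions. A two-word lexicon lookup stands in for the trained classifier that would normally choose each transition, so only contiguous two-word expressions are found.

```python
def parse(tokens, choose):
    """Greedy loop: apply one transition at a time until nothing is left.

    Stack elements are tuples of token indices; a tuple longer than one
    index is a unit built by MERGE and is reported as an MWE on REDUCE.
    """
    stack = []
    buffer = list(range(len(tokens)))
    mwes = []
    while buffer or stack:
        action = choose(stack, buffer, tokens)
        if action == "SHIFT" and buffer:
            stack.append((buffer.pop(0),))
        elif action == "MERGE" and len(stack) >= 2:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif stack:
            # REDUCE, also used as the fallback for an inapplicable action.
            unit = stack.pop()
            if len(unit) > 1:
                mwes.append(unit)
        else:
            # Empty stack: SHIFT is the only transition that can apply.
            stack.append((buffer.pop(0),))
    return mwes


def make_toy_policy(lexicon):
    """Hypothetical stand-in for a classifier, driven by two-word entries."""
    def choose(stack, buffer, tokens):
        if len(stack) >= 2:
            pair = tuple(tokens[i] for i in stack[-2] + stack[-1])
            if pair in lexicon:
                return "MERGE"
        if len(stack) == 1 and buffer and len(stack[-1]) == 1:
            candidate = (tokens[stack[-1][0]], tokens[buffer[0]])
            if candidate in lexicon:
                return "SHIFT"
        if stack:
            return "REDUCE"
        return "SHIFT"
    return choose


tokens = "the meeting took place yesterday".split()
policy = make_toy_policy({("took", "place")})
print(parse(tokens, policy))  # prints [(2, 3)]
```

A real system would replace `make_toy_policy` with a model scoring transitions from features of the stack and buffer, and would need further machinery for discontinuous expressions.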
north american chapter of the association for computational linguistics | 2016
Matthieu Constant; Joseph Le Roux; Nadi Tomeh
We explore the consequences of representing token segmentations as hierarchical structures (trees) for the task of Multiword Expression (MWE) recognition, in isolation or in combination with dependency parsing. We propose a novel representation of token segmentation as trees on tokens, resembling dependency trees. Given this new representation, we present and evaluate two different architectures to combine MWE recognition and dependency parsing in the easy-first framework: a pipeline and a joint system, both taking advantage of lexical and syntactic dimensions. We experimentally validate that MWE recognition significantly helps syntactic parsing.