Guillaume Jacquet
Xerox
Publication
Featured research published by Guillaume Jacquet.
Content-Based Multimedia Indexing | 2011
Gabriela Csurka; Stéphane Clinchant; Guillaume Jacquet
The aim of this paper is to explore different strategies for medical image modality classification and retrieval. First, we analyze how current state-of-the-art image representations (bags of visual words and Fisher Vectors) perform when used for medical modality classification. Then we integrate these representations into a content-based image retrieval system and test them on a medical image retrieval task. Finally, in both cases, we explore how performance can be improved by combining visual with textual information. To show the performance of the different systems, we compare our approaches to the systems that participated in the Medical Task of the latest ImageCLEF Challenge [16].
Meeting of the Association for Computational Linguistics | 2009
Julien Ah-Pine; Guillaume Jacquet
We propose a system which builds, in a semi-supervised manner, a resource that aims to help a NER system annotate corpus-specific named entities. This system is based on a distributional approach which uses syntactic dependencies to measure similarities between named entities. The specificity of the presented method, however, is that it combines a clique-based approach with a clustering technique, which amounts to a soft clustering method. Our experiments show that the resource constructed with this clique-based clustering system improves the performance of different NER systems.
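As a rough illustration of the clique-based idea mentioned in the abstract above, the sketch below enumerates maximal cliques in a small similarity graph and records which cliques each entity belongs to. An entity that sits in several cliques receives several memberships, which is what makes the grouping "soft". Everything here is an assumption made for illustration: the entity names, the similarity links, and the function names are hypothetical, and the basic Bron-Kerbosch recursion is a standard algorithm used as a stand-in, not necessarily the procedure used in the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical similarity links between named entities (illustrative only).
EDGES = [
    ("Washington", "Lincoln"), ("Washington", "Jefferson"), ("Lincoln", "Jefferson"),
    ("Washington", "Oregon"), ("Washington", "Nevada"), ("Oregon", "Nevada"),
]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from pairs of similar entities."""
    graph: Dict[str, Set[str]] = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def maximal_cliques(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        # r: current clique, p: candidates to extend it, x: already explored.
        if not p and not x:
            cliques.append(set(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def soft_memberships(cliques: List[Set[str]]) -> Dict[str, List[int]]:
    """Map each entity to the indices of all cliques that contain it."""
    membership: Dict[str, List[int]] = {}
    for idx, clique in enumerate(cliques):
        for entity in sorted(clique):
            membership.setdefault(entity, []).append(idx)
    return membership


cliques = maximal_cliques(build_graph(EDGES))
memberships = soft_memberships(cliques)
```

In this toy graph, "Washington" ends up in both cliques (one grouping it with person names, one with place names), while every other entity belongs to exactly one.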
Sprachwissenschaft | 2016
Maud Ehrmann; Guillaume Jacquet; Ralf Steinberger
Since 2004, the European Commission's Joint Research Centre (JRC) has been analysing the online version of printed media in over twenty languages and has automatically recognised and compiled large amounts of named entities (persons and organisations) and their many name variants. The collected variants not only include standard spellings in various countries, languages and scripts, but also frequently found spelling mistakes or lesser-used name forms, all occurring in real-life text (e.g. Benjamin/Binyamin/Bibi/Benyamin/Biniamin/Netanyahu/Netanjahu/Neanyahou/Netahny). This entity name variant data, known as JRC-Names, has been available for public download since 2011. In this article, we report on our efforts to render JRC-Names as Linked Data (LD), using the lexicon model for ontologies, lemon. Besides adhering to Semantic Web standards, this new release goes beyond the initial one in that it includes titles found next to the names, as well as the date ranges in which the titles and the name variants were found. It also establishes links to existing datasets, such as DBpedia and Talk-Of-Europe. As a multilingual linguistic linked dataset, JRC-Names can help bridge the gap between structured data and natural languages, thus supporting large-scale data integration, e.g. cross-lingual mapping, and web-based content processing, e.g. entity linking. JRC-Names is publicly available through the dataset catalogue of the European Union's Open Data Portal.
Language and Technology Conference | 2009
Caroline Brun; Maud Ehrmann; Guillaume Jacquet
This paper is an extended version of [1], describing our participation in the Metonymy Resolution task (task #8) at SemEval 2007. In order to perform named entity metonymy resolution on location names and company names, as required for this task, we developed a hybrid system that combines a robust parser extracting deep syntactic relations with an unsupervised distributional approach, which also relies on the relations extracted by the parser.
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017) | 2017
Sophie Chesney; Guillaume Jacquet; Ralf Steinberger; Jakub Piskorski
This paper describes an approach for the classification of millions of existing multiword entities (MWEntities), such as organisation or event names, into thirteen category types, based only on the tokens they contain. In order to classify our very large in-house collection of multilingual MWEntities into an application-oriented set of entity categories, we trained and tested distantly-supervised classifiers in 43 languages based on MWEntities extracted from BabelNet. The best-performing classifier was the multi-class SVM using a TF.IDF-weighted data representation. Interestingly, one unique classifier trained on a mix of all languages consistently performed better than classifiers trained for individual languages, reaching an averaged F1-value of 88.8%. In this paper, we present the training and test data, including a human evaluation of its accuracy, describe the methods used to train the classifiers, and discuss the results.
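To make the token-based setup in the abstract above more concrete, here is a minimal sketch of classifying multiword entity names using only TF-IDF weights over their tokens. It is not the paper's pipeline: the training names and the two labels are invented for illustration (the paper uses thirteen categories and data extracted from BabelNet), all function names are hypothetical, and a simple nearest-centroid rule with cosine similarity is swapped in for the multi-class SVM so that the example runs with the standard library alone.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical training data: (entity name, category). Illustrative only.
TRAIN: List[Tuple[str, str]] = [
    ("University of Oxford", "education"),
    ("University of Vienna", "education"),
    ("Technical University of Munich", "education"),
    ("Bank of England", "finance"),
    ("National Bank of Poland", "finance"),
    ("Central Bank of Ireland", "finance"),
]


def tokenize(name: str) -> List[str]:
    """Lowercase and split on whitespace; the tokens are the only features."""
    return name.lower().split()


def fit_idf(names: List[str]) -> Dict[str, float]:
    """Smoothed inverse document frequency for each token seen in training."""
    n = len(names)
    df = Counter(tok for name in names for tok in set(tokenize(name)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def vectorize(name: str, idf: Dict[str, float]) -> Dict[str, float]:
    """L2-normalised TF-IDF vector; tokens unseen in training are dropped."""
    tf = Counter(tokenize(name))
    vec = {tok: count * idf[tok] for tok, count in tf.items() if tok in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {tok: w / norm for tok, w in vec.items()} if norm else {}


def fit_centroids(train: List[Tuple[str, str]], idf: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Sum the TF-IDF vectors of each category (stand-in for SVM training)."""
    sums: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for name, label in train:
        for tok, w in vectorize(name, idf).items():
            sums[label][tok] += w
    return {label: dict(vec) for label, vec in sums.items()}


def predict(name: str, idf: Dict[str, float], centroids: Dict[str, Dict[str, float]]) -> str:
    """Return the category whose centroid is most cosine-similar to the name."""
    vec = vectorize(name, idf)

    def score(label: str) -> float:
        centroid = centroids[label]
        norm = math.sqrt(sum(w * w for w in centroid.values()))
        return sum(w * centroid.get(tok, 0.0) for tok, w in vec.items()) / norm

    return max(sorted(centroids), key=score)


idf = fit_idf([name for name, _ in TRAIN])
centroids = fit_centroids(TRAIN, idf)
```

With this toy data, `predict("University of Warsaw", idf, centroids)` picks the education category because "university" is both frequent in that class and down-weighted less than the ubiquitous "of". A name made up entirely of unseen tokens scores zero for every category, so the sketch falls back to the alphabetically first label; a real system would need an explicit fallback.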
Archive | 2007
Caroline Brun; Maud Ehrmann; Guillaume Jacquet
Archive | 2008
Julien Ah-Pine; Guillaume Jacquet
Cross-Language Evaluation Forum | 2010
Stéphane Clinchant; Gabriela Csurka; Julien Ah-Pine; Guillaume Jacquet; Florent Perronnin; Jorge Sánchez; Keyvan Minoukadeh
Archive | 2010
Ágnes Sándor; Guillaume Jacquet
Archive | 2013
Caroline Hagège; Guillaume Jacquet