Guillaume Wisniewski
University of Paris
Publications
Featured research published by Guillaume Wisniewski.
Machine Learning and Data Mining in Pattern Recognition | 2007
Guillaume Wisniewski; Francis Maes; Ludovic Denoyer; Patrick Gallinari
We address the problem of automatically learning to map heterogeneous semi-structured documents onto a mediated target XML schema. We adopt a machine learning approach in which the mapping between input and target documents is learned from a training corpus of documents. We first introduce a general stochastic model of semi-structured document generation and transformation. This model relies on the concept of a meta-document, a latent variable that provides a link between input and target documents. It allows us to learn the correspondences when the input documents are expressed in a large variety of schemas. We then detail an instance of the general model for the particular task of HTML-to-XML conversion. This instance is tested on three different corpora using two different inference methods: a dynamic programming method and an approximate LaSO-based method.
Revue des Sciences et Technologies de l'Information - Série Document Numérique | 2007
Guillaume Wisniewski; Francis Maes; Ludovic Denoyer; Patrick Gallinari
The development of content management systems has profoundly changed the nature of the web: more and more documents are generated automatically, and their layout reflects their logical structure. In this work, we show that the information contained in the layout is sufficient to infer a semantically rich structure, which opens the way to many applications. Moving from layout information to a semantic structure faces two main obstacles: the heterogeneity of the data and the implicit nature of the structure of web documents. We describe a stochastic model capable of learning to transform semi-structured documents into a schema defined a priori, and we present a particular instance of this model adapted to the transformation of heterogeneous HTML documents into XML.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2004
Ludovic Denoyer; Guillaume Wisniewski; Patrick Gallinari
CORIA 2005 - 2ème Conférence en Recherche d'Informations et Applications | 2005
Guillaume Wisniewski; Ludovic Denoyer; Patrick Gallinari
ECML'05 Workshop on Relational Machine Learning | 2005
Patrick Gallinari; Guillaume Wisniewski; Francis Maes; Ludovic Denoyer
Archive | 2007
Guillaume Wisniewski; Francis Maes; Ludovic Denoyer; Patrick Gallinari
Extraction et Gestion des Connaissances (EGC) | 2007
Guillaume Wisniewski; Patrick Gallinari
Archive | 2006
Guillaume Wisniewski; Ludovic Denoyer; Francis Maes; Patrick Gallinari
3ème Conférence en Recherche d'Information et Applications (CORIA'06) | 2006
Guillaume Wisniewski; Ludovic Denoyer; Francis Maes; Patrick Gallinari
Archive | 2005
Ludovic Denoyer; Guillaume Wisniewski; Patrick Gallinari