Aurélien Lemay | Researchain

Publication

Featured research published by Aurélien Lemay.

fundamentals of computation theory | 2009

Equivalence of deterministic nested word to word transducers

Slawomir Staworko; Grégoire Laurence; Aurélien Lemay; Joachim Niehren

We study the equivalence problem of deterministic nested word to word transducers and show it to be surprisingly robust. Modulo polynomial time reductions, it can be identified with four equivalence problems for diverse classes of deterministic non-copying order-preserving transducers. In particular, we present polynomial time back-and-forth reductions to the morphism equivalence problem on context-free languages, which is known to be solvable in polynomial time.

symposium on principles of database systems | 2010

A learning algorithm for top-down XML transformations

Aurélien Lemay; Sebastian Maneth; Joachim Niehren

The classical result that any regular language can be learned from examples is generalized from strings to trees and from languages to translations: it is shown that for any deterministic top-down tree transformation there exists a sample set of polynomial size (with respect to the minimal transducer) from which the translation can be inferred. Until now, similar results were known only for string transducers and for simple relabeling tree transducers. Learning of deterministic top-down tree transducers (dtops) is far more involved because a dtop can copy, delete, and permute its input subtrees. Thus, complex dependencies of labeled input to output paths need to be maintained by the algorithm. First, a Myhill-Nerode theorem is presented for dtops, which is interesting on its own. This theorem is then used to construct a learning algorithm for dtops. Finally, it is shown how our result can be applied to XML transformations (e.g. XSLT programs). For this, a new DTD-based encoding of unranked trees by ranked ones is presented. Over such encodings, dtops can realize many practically interesting XML transformations which cannot be realized on first-child/next-sibling encodings.
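The abstract above contrasts its DTD-based encoding with first-child/next-sibling encodings. As a point of reference, here is a minimal sketch of the standard first-child/next-sibling encoding only; the tree representation (a `(label, children)` pair) and the function name `fcns` are assumptions made for illustration, and the DTD-based encoding from the paper is not reproduced here because the abstract does not describe it.

```python
def fcns(forest):
    """Encode a list of unranked trees as one binary tree.

    Each unranked tree is a pair (label, children). The result is either
    None (empty forest) or a triple (label, first_child, next_sibling).
    """
    if not forest:
        return None
    (label, children), *siblings = forest
    # Left branch: the encoding of the children; right branch: the siblings.
    return (label, fcns(children), fcns(siblings))


# An unranked node "a" with three leaf children.
doc = ("a", [("b", []), ("c", []), ("d", [])])
encoded = fcns([doc])
```

In this encoding the siblings `b`, `c`, `d` end up nested along right branches, so a top-down transducer reaches `d` only by passing through `b` and `c`.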

extending database technology | 2015

Learning Path Queries on Graph Databases

Angela Bonifati; Radu Ciucanu; Aurélien Lemay

We investigate the problem of learning graph queries by exploiting user examples. The input consists of a graph database in which the user has labeled a few nodes as positive or negative examples, depending on whether or not she would like the nodes as part of the query result. Our goal is to handle such examples to find a query whose output is what the user expects. This kind of scenario is pivotal in several application settings where unfamiliar users need to be assisted to specify their queries. In this paper, we focus on path queries defined by regular expressions: we identify fundamental difficulties of our problem setting, formalize what it means to be learnable, and prove that the class of queries under study enjoys this property. We additionally investigate an interactive scenario where we start with an empty set of examples and identify the informative nodes, i.e., those that contribute to the learning process. Then, we ask the user to label these nodes and iterate the learning process until she is satisfied with the learned query. Finally, we present an experimental study on both real and synthetic datasets devoted to gauging the effectiveness of our learning algorithm and the improvement of the interactive approach.
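To make the setting concrete, the following sketch evaluates a path query on a small edge-labeled graph and checks it against labeled examples. It is not the learning algorithm from the paper. It assumes that a node is selected when some path starting at it spells a word accepted by the query, and that the regular expression is already given as a deterministic automaton; the function names, the graph, and the query `bus* cinema` are all hypothetical.

```python
from collections import deque


def select_nodes(edges, dfa, start, accepting):
    """Return the nodes having an outgoing path whose label word the DFA accepts."""
    adj, nodes = {}, set()
    for src, label, dst in edges:
        adj.setdefault(src, []).append((label, dst))
        nodes.update((src, dst))
    selected = set()
    for origin in nodes:
        # Breadth-first search over (graph node, automaton state) pairs.
        seen = {(origin, start)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accepting:
                selected.add(origin)
                break
            for label, nxt in adj.get(node, []):
                nstate = dfa.get((state, label))
                if nstate is not None and (nxt, nstate) not in seen:
                    seen.add((nxt, nstate))
                    queue.append((nxt, nstate))
    return selected


def consistent(selected, positives, negatives):
    """A query fits the examples if it selects every positive and no negative."""
    return positives <= selected and not (negatives & selected)


# Hypothetical graph and the query bus* cinema as a two-state DFA.
EDGES = [("n1", "bus", "n2"), ("n2", "bus", "n3"),
         ("n3", "cinema", "c1"), ("n4", "tram", "n3")]
DFA = {(0, "bus"): 0, (0, "cinema"): 1}
result = select_nodes(EDGES, DFA, start=0, accepting={1})
```

A learner in this setting would search for a query for which `consistent` holds on the user's labels; the sketch only covers the checking step.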

international colloquium on grammatical inference | 2006

Learning n-ary node selecting tree transducers from completely annotated examples

Aurélien Lemay; Joachim Niehren; Rémi Gilleron

We present the first algorithm for learning n-ary node selection queries in trees from completely annotated examples by methods of grammatical inference. We propose to represent n-ary queries by deterministic n-ary node selecting tree transducers (n-NSTTs). These are tree automata that capture the class of monadic second-order definable n-ary queries. We show that n-NSTTs defining polynomially bounded n-ary queries can be learned with polynomial time and data. An application in Web information extraction yields encouraging results.

very large data bases | 2016

Generating flexible workloads for graph databases

Guillaume Bagan; Angela Bonifati; Radu Ciucanu; George H. L. Fletcher; Aurélien Lemay; Nicky Advokaat

Graph data management tools are nowadays evolving at a great pace. Key drivers of progress in the design and study of data-intensive systems are solutions for synthetic generation of data and workloads, for use in empirical studies. Current graph generators, however, provide limited or no support for workload generation or are limited to fixed use-cases. Towards addressing these limitations, we demonstrate gMark, the first domain- and query language-independent framework for synthetic graph and query workload generation. Its novel features are: (i) fine-grained control of graph instance and query workload generation via expressive user-defined schemas; (ii) the support of expressive graph query languages, including recursion among other features; and (iii) selectivity estimation of the generated queries. During the demonstration, we will showcase the highly tunable generation of graphs and queries through various user-defined schemas and targeted selectivities, and the variety of supported practical graph query languages. We will also show a performance comparison of four state-of-the-art graph database engines, which helps us understand their current strengths and desirable future extensions.

very large data bases | 2014

A Paradigm for Learning Queries on Big Data

Angela Bonifati; Radu Ciucanu; Aurélien Lemay; Slawomir Staworko

Specifying a database query using a formal query language is typically a challenging task for non-expert users. In the context of big data, this problem becomes even harder as it requires the users to deal with large database instances that are difficult to visualize. Such instances usually lack a schema to help the users specify their queries, or have an incomplete schema as they come from disparate data sources. In this paper, we propose a novel paradigm for interactive learning of queries on big data, without assuming any knowledge of the database schema. The paradigm can be applied to different database models and a class of queries adequate to the database model. In particular, in this paper we present two instantiations that validated the proposed paradigm for learning relational join queries and for learning path queries on graph databases. Finally, we discuss the challenges of employing the paradigm for further data models and for learning cross-model schema mappings.

language and automata theory and applications | 2011

Normalization of sequential top-down tree-to-word transducers

Grégoire Laurence; Aurélien Lemay; Joachim Niehren; Slawomir Staworko; Marc Tommasi

We study normalization of deterministic sequential top-down tree-to-word transducers (stws), which capture the class of deterministic top-down nested-word to word transducers. We identify the subclass of earliest stws (estws) that yield unique normal forms when minimized. The main result of this paper is an effective normalization procedure for stws. It consists of two stages: we first convert a given stw to an equivalent estw, and then we minimize the estw.
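As an illustration of the objects involved, the sketch below runs a sequential top-down tree-to-word transducer on a tree. It assumes rules of the form q(f(x1, ..., xk)) -> u0 q1(x1) u1 ... qk(xk) uk, where each child is visited exactly once and in order; the encoding of rules as `(words, states)` pairs and the two example transducers are hypothetical. The pair `LATE`/`EARLY` only conveys the intuition behind "earliest" (producing output as soon as it is determined); it is not the normalization procedure of the paper.

```python
def run(rules, state, tree):
    """Apply the rule for (state, label) and recurse into the children in order."""
    label, children = tree
    words, states = rules[(state, label)]  # len(words) == len(states) + 1
    out = [words[0]]
    for child, child_state, word in zip(children, states, words[1:]):
        out.append(run(rules, child_state, child))
        out.append(word)
    return "".join(out)


# Two transducers defining the same transformation on trees f(a) and f(b).
# LATE emits the shared prefix "x" only at the leaves.
LATE = {("q", "f"): (["", ""], ["p"]),
        ("p", "a"): (["xa"], []),
        ("p", "b"): (["xb"], [])}
# EARLY emits "x" already at the root, before reading the child.
EARLY = {("q", "f"): (["x", ""], ["p"]),
         ("p", "a"): (["a"], []),
         ("p", "b"): (["b"], [])}

tree_a = ("f", [("a", [])])
tree_b = ("f", [("b", [])])
```

Both transducers map `tree_a` to `xa` and `tree_b` to `xb`; they differ only in where the output is produced, which is the kind of variation a normal form has to eliminate.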

language and automata theory and applications | 2014

Learning Sequential Tree-to-Word Transducers

Grégoire Laurence; Aurélien Lemay; Joachim Niehren; Slawomir Staworko; Marc Tommasi

We study the problem of learning sequential top-down tree-to-word transducers (stws). First, we present a Myhill-Nerode characterization of the corresponding class of sequential tree-to-word transformations ($\mathcal{STW}$). Next, we investigate what learning of stws means, identify fundamental obstacles, and propose a learning model with abstain. Finally, we present a polynomial learning algorithm.

international colloquium on grammatical inference | 2008

Schema-Guided Induction of Monadic Queries

Jérôme Champavère; Rémi Gilleron; Aurélien Lemay; Joachim Niehren

developments in language theory | 2012

Learning rational functions

Adrien Boiret; Aurélien Lemay; Joachim Niehren
