Gemma C. Garriga

French Institute for Research in Computer Science and Automation

Publication

Featured research published by Gemma C. Garriga.

Annals of Mathematics and Artificial Intelligence | 2013

Mining closed patterns in relational, graph and network data

Gemma C. Garriga; Roni Khardon; Luc De Raedt

Recent theoretical insights have led to the introduction of efficient algorithms for mining closed item-sets. This paper investigates potential generalizations of this paradigm to mine closed patterns in relational, graph and network databases. Several semantics and associated definitions for closed patterns in relational data have been introduced in previous work, but the differences among these and the implications of the choice of semantics were not clear. The paper investigates these implications in the context of generalizing the LCM algorithm, an algorithm for enumerating closed item-sets. LCM is attractive since its run time is linear in the number of closed patterns and since it does not need to store the output patterns in order to avoid duplicates, further reducing memory signature and run time. Our investigation shows that the choice of semantics has a dramatic effect on the properties of closed patterns and, as a result, in some settings a generalization of the LCM algorithm is not possible. On the other hand, we provide a full generalization of LCM for the semantic setting that has been previously used by the Claudien system.
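To make the notion of a closed item-set concrete, here is a minimal Python sketch of the standard closure operator: the closure of an item-set is the intersection of all transactions that contain it, and an item-set is closed when it equals its own closure. This is a brute-force illustration, not the LCM algorithm discussed in the abstract. The function names, the toy transactions, and the convention for unsupported item-sets are assumptions made for the example.

```python
from typing import FrozenSet, List


def closure(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Return the intersection of all transactions that contain `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # Simplifying assumption: an item-set with no support is returned unchanged.
        return frozenset(itemset)
    result = set(covering[0])
    for t in covering[1:]:
        result &= t
    return frozenset(result)


def is_closed(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> bool:
    """An item-set is closed when no item can be added without losing support."""
    return closure(itemset, transactions) == frozenset(itemset)


# Hypothetical toy database of four transactions.
transactions = [frozenset("abc"), frozenset("abd"), frozenset("ab"), frozenset("cd")]

closure_of_a = closure(frozenset("a"), transactions)
ab_closed = is_closed(frozenset("ab"), transactions)
a_closed = is_closed(frozenset("a"), transactions)
```

In this toy database every transaction containing `a` also contains `b`, so `{a}` is not closed while `{a, b}` is.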

international conference on data mining | 2011

A Fixed Parameter Tractable Integer Program for Finding the Maximum Order Preserving Submatrix

Jens Humrich; Thomas Gärtner; Gemma C. Garriga

Order-preserving submatrices are an important tool for the analysis of gene expression data. As finding large order-preserving submatrices is a computationally hard problem, previous work has investigated both exact but exponential-time and polynomial-time but inexact algorithms for finding them. In this paper, we propose a novel exact algorithm to find maximum order-preserving submatrices which is fixed parameter tractable with respect to the number of columns of the provided gene expression data. In particular, our algorithm is based on solving a sequence of mixed integer linear programs, and it exhibits better guarantees as well as better runtime performance compared to the state-of-the-art exact algorithms. Our empirical study on benchmark datasets shows a large improvement in computational speed.
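As an illustration of the property being searched for, the following Python sketch checks whether a chosen set of rows and an ordering of columns form an order-preserving submatrix, meaning that every selected row is strictly increasing along the given column order. It only verifies a candidate; it is not the mixed integer programming method of the paper. The function name, the use of strict inequality, and the toy matrix are assumptions made for the example.

```python
from typing import List, Sequence


def is_order_preserving(
    matrix: Sequence[Sequence[float]],
    rows: Sequence[int],
    column_order: Sequence[int],
) -> bool:
    """True if each selected row strictly increases along `column_order`."""
    for r in rows:
        values = [matrix[r][c] for c in column_order]
        # Any adjacent pair that fails to increase breaks the shared order.
        if any(a >= b for a, b in zip(values, values[1:])):
            return False
    return True


# Hypothetical expression matrix: rows are genes, columns are conditions.
expression: List[List[float]] = [
    [1.0, 5.0, 3.0],
    [0.2, 0.9, 0.4],
    [7.0, 2.0, 4.0],
]

two_rows_ok = is_order_preserving(expression, rows=[0, 1], column_order=[0, 2, 1])
three_rows_ok = is_order_preserving(expression, rows=[0, 1, 2], column_order=[0, 2, 1])
```

Rows 0 and 1 agree on the column order 0, 2, 1, while row 2 ranks the same columns in the opposite direction and so cannot join the submatrix.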

Archive | 2018

Consumer Journey Analytics in the Context of Data Privacy and Ethics

Andreas E. Braun; Gemma C. Garriga

By Big Data Analytics we understand new technologies and methods that go beyond how data and analytics were previously handled. On the data side, extremely large data sets can be stored and processed, even in real time, and at reasonable cost. On the analytical side, methods are no longer limited to hard‐coded (business) rules or statistics, but leverage Artificial Intelligence (AI) and particularly Machine Learning (ML). In this paper we argue that Big Data’s quickest business wins and first tangible impact are in the domain of customer/consumer analytics, summarized as Digital Consumer Journey Analytics. Such journeys are constructed from people’s movement and navigational patterns in both the virtual and physical world; while individual data points are at first not very expressive, the picture created by continuous collection of ubiquitous data and their history makes it possible to unveil almost any identity profile. In this increasingly digital environment, staying in the relevant set of consumers is of utmost importance for businesses. Big Data Analytics can support this by, e.g., driving product and service design and customer experience improvements. However, there are increasing limitations to the possibilities of Big Data Analytics in consumer businesses, in particular Data Privacy and Data Ethics. Businesses have to deal with a growing appetite for legislation and prudential regulation, e.g., the EU GDPR. The challenge is to make data‐driven offerings trusted in the digital age. In this paper we illustrate the journey in an insurance business to grow successful Big Data Use Cases in Consumer Analytics and discuss Privacy by Design (PbD) and Privacy Enhancing Technologies (PET) as means to build trusted data‐driven products and services.

international conference on machine learning and applications | 2013

Learning from Multiple Graphs Using a Sigmoid Kernel

Thomas Ricatte; Gemma C. Garriga; Rémi Gilleron; Marc Tommasi

This paper studies the problem of learning from a set of input graphs, each of them representing a different relation over the same set of nodes. Our goal is to merge those input graphs by embedding them into a Euclidean space related to the commute time distance in the original graphs. This is done with the help of a small number of labeled nodes. Our algorithm outputs a combined kernel that can be used for different graph learning tasks. We consider two combination methods: the (classical) linear combination and the sigmoid combination. We compare the combination methods on node classification tasks using different semi-supervised graph learning algorithms. We note that the sigmoid combination method exhibits very positive results.

Archive | 2013

Towards Other Structured Data

Gemma C. Garriga

Typically, research on pattern discovery has progressed from mining itemsets to mining sequences, and from sequences to mining other structures such as trees, lattices or graphs; a non-exhaustive set of references is [136, 6, 84, 13, 78, 83, 100, 134, 129, 140, 23, 24]. Last but not least, we raise the question of how to extend the results obtained with sequences to other structured data.

Archive | 2013

Horn Axiomatizations for Sequences

Gemma C. Garriga

This chapter focuses on the study of association rules for ordered data by using the system of closed sets of sequences, defined by operator (triangle). Our contribution is a notion of deterministic association rules with order, where a set of sequences always implies another sequence in the data. The central advantage of dealing with deterministic rules is that, because they always hold, they do not require selecting, with little or no formal guidance, one single measure of strength of implication. Moreover, since they are pure standard implications, they can be studied in purely logical terms. Indeed, in the second chapter we already mentioned that the set of deterministic association rules derived from classical lattice-theoretic methods axiomatizes the minimal Horn upper bound of a binary relation [14]. On the basis of this formalization, the main result of this chapter is a similar characterization of the implications with order as the empirical Horn approximation of the input set of sequences. To allow for this characterization, we will require the definition of certain background Horn conditions to ensure the consistency of the theory. As a consequence of this main result, we can also prove the isomorphism of the lattice of closed sets of sequences and the classical binary lattice when the background Horn conditions hold. Finally, we discuss the computation of all these rules in practice.
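One possible reading of a deterministic rule with order can be sketched in Python: a rule from a set of antecedent sequences to a consequent sequence holds when every input sequence that contains all antecedents as subsequences also contains the consequent. This sketch assumes sequences of single items and uses plain subsequence containment; the chapter's actual definitions may differ. All names and the toy data are hypothetical.

```python
from typing import Iterable, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if `pattern` occurs in `sequence` in order, gaps allowed."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so matches must appear in order.
    return all(item in it for item in pattern)


def rule_holds(
    antecedents: Iterable[Sequence[str]],
    consequent: Sequence[str],
    data: Iterable[Sequence[str]],
) -> bool:
    """Check the rule on every input sequence that matches all antecedents."""
    antecedents = list(antecedents)
    return all(
        is_subsequence(consequent, s)
        for s in data
        if all(is_subsequence(a, s) for a in antecedents)
    )


# Hypothetical input sequences over single-character items.
data = ["abcd", "acbd", "abd"]

rule_ab_bd = rule_holds(["ab"], "bd", data)  # every sequence with a..b also has b..d
rule_ac_cb = rule_holds(["ac"], "cb", data)  # "abcd" has a..c but not c..b
```

A rule whose antecedents match no input sequence holds vacuously under this definition, which is a property of the sketch rather than a claim about the chapter.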

Archive | 2013

Lattice Theory for Sequences

Gemma C. Garriga

The goal of this chapter is to use FCA theory to formalize a new closure system that characterizes sequential data. Since we are not dealing with the classical unordered context of the preliminaries, setting all the conditions for the new Galois connection is not a trivial task. To start with, it departs from the unordered case in the very definition of intersection; whereas we saw in the last chapter that the intersection of two itemsets is another itemset, the intersection of two or more sequences is not necessarily a single sequence. Let us consider the following definition.
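The contrast with itemsets can be seen in a small Python sketch. It assumes that the intersection of two sequences is the set of their maximal common subsequences, which is one plausible reading, since the definition itself is not reproduced here. The enumeration is brute force and exponential, so it is only suitable for very short sequences. Function names and inputs are hypothetical.

```python
from itertools import combinations
from typing import Sequence, Set, Tuple


def subsequences(seq: Sequence[str]) -> Set[Tuple[str, ...]]:
    """All non-empty subsequences of `seq` (order kept, gaps allowed)."""
    result = set()
    for k in range(1, len(seq) + 1):
        for idx in combinations(range(len(seq)), k):
            result.add(tuple(seq[i] for i in idx))
    return result


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if `pattern` occurs in `sequence` in order, gaps allowed."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def intersect(s1: Sequence[str], s2: Sequence[str]) -> Set[Tuple[str, ...]]:
    """Maximal common subsequences of `s1` and `s2`."""
    common = subsequences(s1) & subsequences(s2)
    return {
        p for p in common
        if not any(p != q and is_subsequence(p, q) for q in common)
    }


# Treated as itemsets, the two inputs intersect in a single set.
itemset_intersection = set("abc") & set("acb")

# Treated as sequences, they share two incomparable maximal subsequences.
sequence_intersection = intersect("abc", "acb")
```

For the inputs `abc` and `acb`, the itemset intersection is the single set {a, b, c}, whereas the sequence intersection contains both `ab` and `ac`, neither of which is a subsequence of the other.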

Archive | 2013

Transformations on General Partial Orders

Gemma C. Garriga

Next we consider repetitions of items in the input sequential data; as a consequence, the final closed partial orders are not necessarily injective. As we will see, dealing with general partial orders makes the proper formalization with category theory somewhat more difficult. To start with, we are forced to drop the injectivity of the morphisms in the general category of graphs, and this allows for many different ways of mapping a partial order over a sequence. A further inconvenience is that Theorem 5.1 of chapter 5 holds in only one direction for this new problem. Indeed, the maximal paths of the final closed partial orders do not necessarily coincide exactly with the intersections of the compatible input sequential data. Here we will try to formally justify the construction of our final closed partial orders for a set of data as the colimit transformation on path-preserving edges. Colimits also naturally generalize the coproduct transformation of chapter 5, but as we shall see, proving that our final structure has the property of being maximally specific is still an unsolved combinatorial problem.

Archive | 2013

Transformations on Injective Partial Orders

Gemma C. Garriga

As mentioned in the first chapter, as an alternative to mining plain sequential patterns from input sequences, some approaches aim to describe portions of the data by means of compatible partial orders, i.e. collections of events occurring frequently together in the input sequences. The complexity of managing these structures and the combinatorial explosion involved in tackling all the cases make this an algorithmically challenging problem.

international joint conference on artificial intelligence | 2007