Guillaume Raschia | Researchain

Publication

Featured researches published by Guillaume Raschia.

Fuzzy Sets and Systems | 2002

SAINTETIQ: a fuzzy set-based approach to database summarization

Guillaume Raschia; Noureddine Mouaddib

In this paper, a new approach to database summarization is introduced through our model named SAINTETIQ. Based on a hierarchical conceptual clustering algorithm, SAINTETIQ incrementally builds a summary hierarchy from database records. Furthermore, the fuzzy set-based representation of data makes it possible to handle vague, uncertain or imprecise information, and improves the accuracy and robustness of the summary construction process. Finally, background knowledge provides a user-defined vocabulary that synthesizes the summary descriptions and makes them highly intelligible.

Extending Database Technology | 2008

Summary management in P2P systems

Rabab Hayek; Guillaume Raschia; Patrick Valduriez; Noureddine Mouaddib

Sharing huge, massively distributed databases in P2P systems is inherently difficult. As the amount of stored data increases, data localization techniques are no longer sufficient. A practical approach is to rely on compact database summaries rather than raw database records, whose access is costly in large P2P systems. In this paper, we consider summaries that are synthetic, multidimensional views with two main virtues. First, they can be directly queried and used to approximately answer a query without exploring the original data. Second, as semantic indexes, they support locating relevant nodes based on data content. Our main contribution is to define a summary model for P2P systems, and the appropriate algorithms for summary management. Our performance evaluation shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.

IEEE International Conference on Fuzzy Systems | 2003

A fuzzy linguistic summarization technique for TV recommender systems

Antoine Pigeau; Guillaume Raschia; Marc Gelgon; Noureddine Mouaddib; Regis Saint-Paul

The increasing number of satellite and cable television channels is resulting in a soaring number of broadcast programs available to viewers. To alleviate this problem, Personal Video Recorders (multimedia platforms which record TV programs on a hard disk) should integrate a recommender system, whose purpose is to filter programs according to their relevance. These systems are based on a user profile, acting as a representative for the user's interests. An important research issue resides in going beyond explicitly user-defined profiles. This paper presents a TV recommender system using a fuzzy linguistic summarization technique, which enables automatic learning of the user profile. The logical architecture of the recommender system based on the SAINTETIQ model, as well as the main ideas of the filtering task, are introduced in this communication.

IEEE International Conference on Fuzzy Systems | 2001

Using fuzzy labels as background knowledge for linguistic summarization of databases

Guillaume Raschia; Noureddine Mouaddib

In this paper, some important features of a new approach to data summarization are introduced. Our model named SAINTETIQ produces summaries of groups of database records with different granularities. A summary is represented on each attribute by fuzzy sets associated to linguistic descriptors. One major feature of the SAINTETIQ system is the intensive use of background knowledge (BK) in the summarization process. BK is built a priori on each attribute. It supports both a translation step of descriptions of database tuples into a user-defined vocabulary, and a generalization step providing synthetic intents of summaries. Furthermore, the fuzzy set-based representation of summaries allows the system to improve robustness and accuracy of summary descriptions.
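The translation step described in this abstract can be illustrated with a minimal sketch. It assumes that the background knowledge on an attribute is a set of trapezoidal fuzzy sets, each tied to a linguistic label; the attribute (age), the labels, and the breakpoints below are hypothetical and are not taken from the paper.

```python
def trapezoid(a, b, c, d):
    """Membership function rising on [a, b], equal to 1 on [b, c], falling on [c, d]."""
    def mu(x):
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu

# Hypothetical background knowledge on an "age" attribute (labels and bounds are assumptions).
AGE_BK = {
    "young": trapezoid(0, 0, 25, 35),
    "adult": trapezoid(25, 35, 55, 65),
    "old": trapezoid(55, 65, 120, 120),
}

def translate(bk, value):
    """Rewrite a raw attribute value as linguistic labels with their membership degrees."""
    degrees = {label: mu(value) for label, mu in bk.items()}
    return {label: deg for label, deg in degrees.items() if deg > 0.0}

print(translate(AGE_BK, 30))  # a value on the boundary between two labels
print(translate(AGE_BK, 40))  # a value fully covered by one label
```

A value of 30 belongs partly to "young" and partly to "adult", which is the kind of gradual boundary the fuzzy representation is meant to capture.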

Flexible Query Answering Systems | 2004

Querying the SaintEtiQ Summaries – A First Attempt

W. Amenel Voglozin; Guillaume Raschia; Laurent Ughetto; Noureddine Mouaddib

For some years, data summarization techniques have been developed to handle the growth of databases. However these techniques are usually not provided with tools for end-users to efficiently use the produced summaries. This paper presents a first attempt to develop a querying tool for the SaintEtiQ summarization model. The proposed search algorithm takes advantage of the hierarchical structure of the SaintEtiQ summaries to efficiently answer questions such as “how are, on some attributes, the tuples which have specific characteristics?” Moreover, this algorithm can be seen both as a boolean querying mechanism over a hierarchy of summaries, and as a flexible querying mechanism over the underlying relational tuples.

EDBT/ICDT Workshops | 2013

Anonymizing sequential releases under arbitrary updates

Adeel Anjum; Guillaume Raschia

In today's global information society, governments, companies, public and private institutions and even individuals have to cope with growing demands for personal data publication from scientists, statisticians, journalists and many other data consumers. Current research on privacy-preserving data publishing by sanitization focuses on static datasets, which have no updates. In real life, however, data sources are dynamic and the updates to these datasets are usually arbitrary. Applying any popular static privacy-preserving technique then inevitably leads to information disclosure. Among the few works in the literature that relate to serial data publication, none focuses on arbitrary updates, i.e. any consistent insert/update/delete sequence, especially in the presence of auxiliary knowledge that tracks updates of individuals. In this communication, we first highlight the invalidation of existing algorithms and present an extension of the m-invariance generalization model coined τ-safety. Then we formally state the problem of privacy-preserving dataset publication of sequential releases in the presence of arbitrary updates and chainability-based background knowledge. We also propose an approximate algorithm, and we show that our approach to τ-safety not only prevents any privacy breach but also achieves a high utility of the anonymous releases.

Conference on Information and Knowledge Management | 2009

Time sequence summarization to scale up chronology-dependent applications

Quang-Khai Pham; Guillaume Raschia; Noureddine Mouaddib; Regis Saint-Paul; Boualem Benatallah

In this paper, we present the concept of Time Sequence Summarization to support chronology-dependent applications on massive data sources. Time sequence summarization takes as input a time sequence of events that are chronologically ordered. Each event is described by a set of descriptors. Time sequence summarization produces a concise time sequence that can be substituted for the original time sequence in chronology-dependent applications. We propose an algorithm that achieves time sequence summarization based on a generalization, grouping and concept formation process. Generalization expresses event descriptors at higher levels of abstraction using taxonomies, while grouping gathers similar events. Concept formation is responsible for reducing the size of the input time sequence of events by representing each group created by one concept. The process is performed in such a way that the overall chronology of events is preserved. The algorithm computes the summary incrementally and has reduced algorithmic complexity. The resulting output is a concise representation, yet informative enough to directly support chronology-dependent applications. We validate our approach by summarizing one year of financial news provided by Reuters.
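The generalization, grouping and concept formation steps named in this abstract can be sketched as follows. This is a deliberately simplified illustration and not the authors' algorithm: it assumes a flat one-level taxonomy, merges only adjacent events whose generalized descriptors are identical (which trivially preserves chronology), and uses made-up event data.

```python
from itertools import groupby

# Hypothetical one-level taxonomy mapping descriptors to more abstract concepts.
TAXONOMY = {
    "earthquake": "disaster",
    "flood": "disaster",
    "merger": "business",
    "ipo": "business",
}

def generalize(descriptors, taxonomy):
    """Generalization: rewrite each descriptor at a higher level of abstraction."""
    return frozenset(taxonomy.get(d, d) for d in descriptors)

def summarize(events, taxonomy):
    """Summarize a chronologically ordered list of (timestamp, descriptors) events.

    Grouping: adjacent events with equal generalized descriptors are gathered.
    Concept formation: each group becomes one (first_timestamp, concept, size) entry.
    """
    generalized = [(t, generalize(ds, taxonomy)) for t, ds in events]
    summary = []
    for concept, group in groupby(generalized, key=lambda event: event[1]):
        members = list(group)
        summary.append((members[0][0], tuple(sorted(concept)), len(members)))
    return summary

# Made-up input sequence, already ordered by timestamp.
EVENTS = [
    (1, {"earthquake"}),
    (2, {"flood"}),
    (3, {"merger"}),
    (4, {"ipo"}),
    (5, {"flood"}),
]

for entry in summarize(EVENTS, TAXONOMY):
    print(entry)
```

Because only neighbouring events are merged, the last "flood" event stays separate from the first two disaster events, so the order of concepts in the summary mirrors the order of events in the input.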

Intelligent Information Systems | 2006

Querying a summary of database

W. A. Voglozin; Guillaume Raschia; Laurent Ughetto; Noureddine Mouaddib

For some years, data summarization techniques have been developed to handle the growth of databases. However these techniques are usually not provided with tools for end-users to efficiently use the produced summaries. This paper presents a first attempt to develop a querying tool for the SAINTETIQ summarization model. The proposed search algorithm takes advantage of the hierarchical structure of the SAINTETIQ summaries to efficiently answer questions such as “how are, on some attributes, the tuples which have specific characteristics?” Moreover, this algorithm can be seen both as a boolean querying mechanism over a hierarchy of summaries, and as a flexible querying mechanism over the underlying relational tuples.

The first computers | 2017

BangA: An Efficient and Flexible Generalization-Based Algorithm for Privacy Preserving Data Publication

Adeel Anjum; Guillaume Raschia

Privacy-Preserving Data Publishing (PPDP) has become a critical issue for companies and organizations that would release their data. k-Anonymization was proposed as a first generalization model to guarantee against identity disclosure of individual records in a data set. Point access methods (PAMs) are not well studied for the problem of data anonymization. In this article, we propose yet another approximation algorithm for anonymization, coined BangA, that combines useful features from Point Access Methods (PAMs) and clustering. Hence, it achieves fast computation and scalability as a PAM, and very high quality thanks to its density-based clustering step. Extensive experiments show the efficiency and effectiveness of our approach. Furthermore, we provide guidelines for extending BangA to achieve a relaxed form of differential privacy which provides stronger privacy guarantees as compared to traditional privacy definitions.
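As background for the k-anonymization model mentioned in this abstract, the sketch below checks whether a table is k-anonymous after a simple generalization of its quasi-identifiers. It does not implement BangA, point access methods, or clustering; the records, the age-bucketing rule, and the function names are hypothetical and serve only to illustrate the definition.

```python
from collections import Counter

def generalize_age(age, width):
    """Replace an exact age with the [low, high] interval of the given width containing it."""
    low = (age // width) * width
    return (low, low + width - 1)

def anonymize(rows, width):
    """Generalize the age quasi-identifier of each (age, zip_code) row."""
    return [(generalize_age(age, width), zip_code) for age, zip_code in rows]

def is_k_anonymous(rows, k):
    """A table is k-anonymous if every quasi-identifier combination occurs at least k times."""
    counts = Counter(rows)
    return all(count >= k for count in counts.values())

# Made-up quasi-identifier values: (age, zip code).
RAW = [(23, "44000"), (27, "44000"), (31, "44300"), (38, "44300")]

print(is_k_anonymous(RAW, 2))                 # raw rows are all distinct
print(is_k_anonymous(anonymize(RAW, 10), 2))  # rows after generalizing ages to decades
```

Generalizing ages to ten-year intervals makes each record indistinguishable from at least one other on its quasi-identifiers, at the price of less precise data; that trade-off between privacy and utility is what anonymization algorithms try to optimize.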

IEEE International Conference on Fuzzy Systems | 2002

Prototyping and browsing image databases using linguistic summaries

Regis Saint-Paul; Guillaume Raschia; Noureddine Mouaddib

A new approach for the summarization of image databases is described. A linguistic description of each image is first generated from its low-level features such as color. An original incremental summary process, called SAINTETIQ, is then applied to the images, leading to a comprehensive, hierarchically organized set of summaries of parts of the database. The summary process is driven by data and relies on background knowledge represented by a fuzzy relational thesaurus. The thesaurus provides both a proximity measure between terms of the color domain, and a way to produce generalized descriptions of summaries. Finally, this article points out usages of the summaries. A new method to browse image databases is presented, as well as a way to produce relevant prototypes from groups of images. Those tasks take advantage of the linguistic descriptions of summaries provided by SAINTETIQ.
