Alain Casali | Researchain

Publication

Featured research published by Alain Casali.

knowledge discovery and data mining | 2003

Extracting semantics from data cubes using cube transversals and closures

Alain Casali; Rosine Cicchetti; Lotfi Lakhal

In this paper we propose a lattice-based approach for extracting semantics from datacubes: borders of version spaces for supervised classification, the closed cube lattice to summarize the semantics of datacubes w.r.t. COUNT, SUM, and the covering graph of the quotient cube as a visualization tool for minimal multidimensional associations. To this end, we introduce two novel concepts: cube transversals and cube closures over the cube lattice of a categorical database relation. We propose a levelwise merging algorithm for mining minimal cube transversals with a single database scan. We introduce the cube connection, show that it is a Galois connection and derive a closure operator over the cube lattice. Using cube transversals and closures, we define a new characterization of boundary sets which provide a condensed representation of version spaces used to enhance supervised classification. The algorithm designed for computing such borders improves the complexity of previous proposals. We also introduce the concept of closed cube lattice and show that it is isomorphic to, on the one hand, the Galois lattice and, on the other hand, the quotient cube w.r.t. COUNT, SUM. Proposed in [16], the quotient cube is a succinct summary of a datacube preserving the Rollup/Drilldown semantics. We show that the quotient cube w.r.t. COUNT, SUM and the closed cube lattice have similar expressive power but the latter has the smallest possible size. Finally, we focus on the multidimensional association issue and introduce the covering graph of the quotient cube, which provides the user with a visualization tool for minimal multidimensional associations.

data warehousing and knowledge discovery | 2005

Essential patterns: a perfect cover of frequent patterns

Alain Casali; Rosine Cicchetti; Lotfi Lakhal

The extraction of frequent patterns often yields extremely voluminous results which are difficult to handle. Computing a concise representation or cover of the frequent pattern set is thus an interesting alternative investigated by various approaches. The work presented in this article follows this trend. We introduce the concept of essential pattern and propose a new cover based on this concept. Such a cover makes it possible to decide whether a pattern is frequent or not, to compute its frequency and, in contrast with related work, to infer its disjunction and negation frequencies. We propose a levelwise algorithm for computing the essential patterns, with a pruning step which uses the maximal frequent patterns. Experiments show that when the number of frequent patterns is very high (strongly correlated data), the defined cover is significantly smaller than the cover considered until now as minimal: the frequent closed patterns.
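As a rough illustration of the three frequencies mentioned in this abstract, the following minimal sketch computes conjunction, disjunction and negation frequencies on a toy transaction set, and checks that a conjunctive frequency can be recovered from disjunctive ones by inclusion-exclusion. The data and the function names (`conj_freq`, `disj_freq`, `neg_freq`, `conj_from_disj`) are hypothetical, and the negation frequency is assumed here to count the transactions containing none of the pattern's items. This is not the essential-pattern algorithm of the paper, only a standard counting identity relating these frequencies.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b"},
    {"d"},
]


def conj_freq(pattern, transactions=TRANSACTIONS):
    """Count transactions containing every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)


def disj_freq(pattern, transactions=TRANSACTIONS):
    """Count transactions containing at least one item of the pattern."""
    return sum(1 for t in transactions if pattern & t)


def neg_freq(pattern, transactions=TRANSACTIONS):
    """Count transactions containing none of the items (assumed definition)."""
    return len(transactions) - disj_freq(pattern, transactions)


def conj_from_disj(pattern, transactions=TRANSACTIONS):
    """Recover the conjunctive frequency from disjunctive frequencies only.

    Inclusion-exclusion: the sum over every non-empty subset Y of the
    pattern of (-1)^(|Y|+1) * disj_freq(Y).
    """
    items = sorted(pattern)
    total = 0
    for size in range(1, len(items) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(items, size):
            total += sign * disj_freq(frozenset(subset), transactions)
    return total


if __name__ == "__main__":
    p = frozenset({"a", "b"})
    print(conj_freq(p), disj_freq(p), neg_freq(p), conj_from_disj(p))
```

The point of the sketch is only that disjunctive counts carry enough information to rebuild conjunctive ones, which is the kind of inference a cover of this sort relies on.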

data warehousing and knowledge discovery | 2007

Emerging cubes for trends analysis in OLAP databases

Sébastien Nedjar; Alain Casali; Rosine Cicchetti; Lotfi Lakhal

In various approaches, data cubes are pre-computed in order to efficiently answer OLAP queries. Such cubes are also successfully used for multidimensional analysis of data streams. In this paper, we address the issue of performing cube comparisons in order to exhibit trend reversals between two cubes. Mining such trend changes provides users with novel and especially interesting knowledge. To capture it, we introduce the concept of emerging cube. Moreover, we provide a condensed representation of emerging cubes which avoids computing the two underlying cubes. Finally, we study an algorithmic way to achieve our representation using cube maximals and cube transversals.

New Trends in Data Warehousing and Data Analysis | 2009

Closed Cube Lattices

Alain Casali; Sébastien Nedjar; Rosine Cicchetti; Lotfi Lakhal

In this paper we propose a lattice-based approach intended for summarizing Data Cubes. To this end, we introduce a novel concept: the cube closure over the cube lattice (multidimensional search space) of a categorical database relation. We introduce the cube connection, show that it is a Galois connection and derive a closure operator over the cube lattice. We introduce the concept of Closed Cube lattice, which is a cover for the Data Cube, and show that it is isomorphic to, on the one hand, the Galois (concept) lattice and, on the other hand, the Quotient Cube. Proposed by Lakshmanan et al., the Quotient Cube is a succinct summary of a Data Cube preserving the Rollup/Drilldown semantics. We show that the Quotient Cube, provided with a closure-based characterization, can be derived from the Closed Cube. Thus these two structures have similar expressive power but the Closed Cube is smaller. Finally, we perform some experiments in order to measure the benefit of our approach.

Information Systems | 2009

Emerging Cubes: Borders, size estimations and lossless reductions

Sébastien Nedjar; Alain Casali; Rosine Cicchetti; Lotfi Lakhal

Discovering trend reversals between two data cubes provides users with novel and interesting knowledge when the real world context fluctuates: What is new? Which trends appear or emerge? Which tendencies are immersing or disappear? With the concept of Emerging Cube, we capture such trend reversals by enforcing an emergence constraint. We summarize the classical borders for the Emerging Cube and introduce a new one which optimizes both storage space and computation time, provides a simple characterization of the size of Emerging Cubes, as well as classification and cube navigation tools. We soundly state the connection between the classical and proposed borders by using cube transversals. Knowing the size of Emerging Cubes without computing them is of great interest, in particular for best adjusting the underlying emergence constraint. We address this issue by studying an upper bound and characterizing the exact size of Emerging Cubes. We propose two strategies for quickly estimating their size: one based on analytical estimation, without database access, and one based on probabilistic counting using the proposed borders as the input of the near-optimal algorithm HyperLogLog. Due to the efficiency of the estimation algorithm, several iterations can be performed to best calibrate the emergence constraint. Moreover, we propose reduced and lossless representations of the Emerging Cube by using the concept of cube closure. Finally, we perform experiments for different data distributions in order to measure, on the one hand, the size of the introduced condensed and concise representations and, on the other hand, the performance (accuracy and computation time) of the proposed estimation method.

International Journal of Data Warehousing and Mining | 2009

Lossless Reduction of Datacubes using Partitions

Alain Casali; Sébastien Nedjar; Rosine Cicchetti; Lotfi Lakhal; Noël Novelli

Datacubes are especially useful for efficiently answering queries on data warehouses. Nevertheless, the amount of generated aggregated data is huge with respect to the initial data, which is itself very large. Recent research has addressed the issue of summarizing Datacubes in order to reduce their size. The approach presented in this paper follows a similar trend. We propose a concise representation, called Partition Cube, based on the concept of partition, and we give a new algorithm to compute it. We propose a Relational Partition Cube, a novel ROLAP cubing solution for managing Partition Cubes using relational technology. Analytical evaluations show that the storage space of Partition Cubes is smaller than that of Datacubes. To confirm the analytical comparison, experiments are performed to compare our approach with Datacubes and with two of the best reduction methods, the Quotient Cube and the Closed Cube.

database and expert systems applications | 2007

Convex cube: towards a unified structure for multidimensional databases

Alain Casali; Sébastien Nedjar; Rosine Cicchetti; Lotfi Lakhal

In various approaches, data cubes are pre-computed in order to efficiently answer OLAP queries. Such cubes are also successfully used for multidimensional analysis of data streams. The notion of data cube has been explored in various ways: iceberg cubes, range cubes, differential cubes or emerging cubes. In this paper, we introduce the concept of convex cube, which captures all the tuples satisfying a combination of monotone and/or antimonotone constraints. It can be represented in a very compact way in order to optimize both computation time and required storage space. The convex cube is not an additional structure appended to the list of cube variants; rather, we propose it as a unifying structure that we use to characterize, in a simple, sound and homogeneous way, the other cube types mentioned above.

IEEE Transactions on Semiconductor Manufacturing | 2012

Discovering Correlated Parameters in Semiconductor Manufacturing Processes: A Data Mining Approach

Alain Casali; Christian Ernst

Data mining tools are becoming more and more popular in the semiconductor manufacturing industry, especially in yield-oriented enhancement techniques. This is because conventional approaches fail to extract hidden relationships between numerous complex process control parameters. In order to highlight correlations between such parameters, we propose in this paper a complete knowledge discovery in databases (KDD) model. The mining heart of the model uses a new method derived from association rules programming, and is based on two concepts: decision correlation rules and contingency vectors. The first concept results from a cross fertilization between correlation and decision rules. It enables relevant links to be highlighted between sets of values of a relation and the values of sets of targets belonging to the same relation. Decision correlation rules are built on the twofold basis of the chi-squared measure and of the support of the extracted values. Due to the very nature of the problem, levelwise algorithms only allow extraction of results with long execution times and huge memory occupation. To offset these two problems, we propose an algorithm based both on the lectic order and contingency vectors, an alternate representation of contingency tables. This algorithm is the basis of our KDD model software, called MineCor. An overall presentation of its other functions, of some significant experimental results, and of the associated performance is provided and discussed.
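For readers unfamiliar with the chi-squared measure on which decision correlation rules are built, here is a minimal sketch of the textbook Pearson chi-squared statistic computed from a contingency table. The function name `chi_squared` and the sample tables are hypothetical; the sketch works on a plain contingency table, not on the contingency vectors or the lectic-order algorithm used by MineCor, and it assumes every row and column total is non-zero.

```python
def chi_squared(table):
    """Pearson chi-squared statistic of a contingency table.

    `table` is a list of rows of observed counts. Every row and column
    total is assumed to be non-zero, so expected counts are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    # Hypothetical counts: rows = parameter value present/absent,
    # columns = target value present/absent.
    print(chi_squared([[10, 20], [30, 40]]))
    # A table whose rows are proportional is perfectly independent.
    print(chi_squared([[10, 20], [20, 40]]))
```

A larger statistic indicates a stronger departure from independence; a rule miner would typically compare it with a threshold chosen for the relevant degrees of freedom.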

data warehousing and knowledge discovery | 2004

Mining Borders of the Difference of Two Datacubes

Alain Casali

In this paper we use the novel concept of minimal cube transversals on the cube lattice of a categorical database relation for mining the borders of the difference of two datacubes. The problem of finding cube transversals is a sub-problem of hypergraph transversal discovery since there exists an order-embedding from the cube lattice to the power set lattice of binary attributes. Based on this result, we propose a levelwise algorithm and an optimization which uses the frequency of the disjunction for mining minimal cube transversals. Using cube transversals, we introduce a new OLAP functionality: discovering the difference of two uni-compatible datacubes or the most frequent elements in the difference. Finally, we propose a merging algorithm for mining the boundary sets of the difference without computing the two related datacubes. Provided with such a difference of two datacubes capturing similar information but computed at different dates, a user can focus on what is new or, more generally, on how the previously observed trends evolve.
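To make the underlying hypergraph problem concrete, the following minimal sketch enumerates the minimal transversals (minimal hitting sets) of a small hypergraph by exploring candidates level by level, from the smallest sets to the largest. The function name `minimal_transversals` and the example hypergraph are hypothetical. This is a brute-force illustration of the general problem, without the disjunction-frequency optimization or the cube-lattice setting described in the abstract.

```python
from itertools import combinations


def minimal_transversals(vertices, edges):
    """Enumerate minimal transversals of a hypergraph, level by level.

    A transversal is a vertex set that intersects every edge; it is
    minimal if no proper subset is also a transversal.
    """
    edges = [frozenset(e) for e in edges]
    ordered = sorted(vertices)
    found = []
    # Level k examines all candidate sets of exactly k vertices.
    for size in range(0, len(ordered) + 1):
        for candidate in combinations(ordered, size):
            cand = frozenset(candidate)
            # Prune: a superset of a known transversal is not minimal.
            if any(known <= cand for known in found):
                continue
            if all(cand & edge for edge in edges):
                found.append(cand)
    return found


if __name__ == "__main__":
    # Hypothetical hypergraph: a path 1-2-3-4 seen as three edges.
    result = minimal_transversals({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}])
    print([sorted(t) for t in result])
```

Because candidates are visited by increasing size, any transversal found at a given level is guaranteed minimal once the supersets of earlier results have been pruned. The enumeration is exponential in the number of vertices, so it is only suitable for toy inputs.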

availability reliability and security | 2013

Extracting Correlated Patterns on Multicore Architectures

Alain Casali; Christian Ernst

In this paper, we present a new approach to the discovery of correlated patterns, based on the use of multicore architectures. Our work rests on a full KDD system and allows one to extract, from any database, Decision Correlation Rules based on the Chi-squared criterion that include a target column. To achieve this objective, we use a levelwise algorithm as well as contingency vectors, an alternate and more powerful representation of contingency tables, in order to prune the search space. The goal is to parallelize the processing associated with the extraction of relevant rules. The parallelization invokes the PPL (Parallel Patterns Library), which allows simultaneous access to all available cores/processors on modern computers. Finally, we present first results on the performance gains achieved.
