Michel Sala

Centre national de la recherche scientifique

Publication

Featured research published by Michel Sala.

Database and Expert Systems Applications | 2011

Reduce, You Say: What NoSQL Can Do for Data Aggregation and BI in Large Repositories

Laurent Bonnet; Anne Laurent; Michel Sala; Bénédicte Laurent; Nicolas Sicard

Data aggregation is one of the key features used in databases, especially for Business Intelligence (e.g., ETL, OLAP) and analytics/data mining. In SQL databases, aggregation is used to prepare and visualize data for deeper analyses. However, these operations are often infeasible on very large volumes of data because of their memory and time consumption. In this paper, we show how NoSQL databases such as MongoDB and its key-value stores, thanks to the native MapReduce algorithm, can provide an efficient framework to aggregate large volumes of data. We provide basic material about the MapReduce algorithm and the different NoSQL databases (read intensive vs. write intensive). We investigate how to efficiently model the data framework for BI and analytics. For this purpose, we focus on read-intensive NoSQL databases using MongoDB and show how NoSQL and MapReduce can help handle large volumes of data.
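To illustrate the kind of aggregation the abstract describes, here is a minimal sketch of the MapReduce pattern in plain Python. It does not use MongoDB or reproduce the authors' implementation; the `records` data, the field names `region` and `amount`, and every function name are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sales records; the field names are illustrative assumptions.
records = [
    {"region": "north", "amount": 120.0},
    {"region": "south", "amount": 80.0},
    {"region": "north", "amount": 30.0},
    {"region": "south", "amount": 20.0},
    {"region": "east", "amount": 50.0},
]


def map_phase(record):
    """Emit one (key, value) pair per record: the region and its amount."""
    yield record["region"], record["amount"]


def shuffle(pairs):
    """Group all emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Fold the values of one key into a single aggregate (here, a sum)."""
    return key, reduce(lambda a, b: a + b, values)


def map_reduce(data):
    """Run map, shuffle, and reduce in sequence over an in-memory dataset."""
    pairs = (pair for record in data for pair in map_phase(record))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


totals = map_reduce(records)
print(totals)
```

Because each key is reduced independently, the reduce step can in principle be distributed across machines, which is what makes the pattern attractive for large repositories.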

Engineering Applications of Artificial Intelligence | 2015

Spatio-temporal data classification through multidimensional sequential patterns: Application to crop mapping in complex landscape

Yoann Pitarch; Dino Ienco; Elodie Vintrou; Agnès Bégué; Anne Laurent; Pascal Poncelet; Michel Sala; Maguelonne Teisseire

The main use of satellite imagery concerns the processing of the spectral and spatial dimensions of the data. However, to extract useful information, the temporal dimension also has to be accounted for, which increases the complexity of the problem. For this reason, there is a need for suitable data mining techniques for this source of data. In this work, we developed a data mining methodology to extract multidimensional sequential patterns to characterize temporal behaviors. We then used the extracted multidimensional sequences to build a classifier, and show how the patterns help to distinguish between the classes. We evaluated our technique using a real-world dataset containing information about land use in Mali (West Africa) to automatically recognize whether an area is cultivated or not.

Flexible Query Answering Systems | 2013

M2LFGP: Mining Gradual Patterns over Fuzzy Multiple Levels

Yogi Satrya Aryadinata; Arnaud Castelltort; Anne Laurent; Michel Sala

Data are often described at several levels of granularity. For instance, data concerning purchased fruits can be categorized according to criteria such as size, weight, color, etc. When dealing with data from the real world, such categories can hardly be defined in a crisp manner. For instance, some fruits may belong to both the small and the medium-sized categories. Data mining methods have been proposed to deal with such data, in order to benefit from the several levels when extracting relevant patterns. The challenge is to discover patterns that are not too general, since overly general patterns carry no relevant novel information, while remaining typical, since overly detailed data do not embed general and representative information. In this paper, we focus on the extraction of gradual patterns in the context of hierarchical data. Gradual patterns describe covariation of attributes, such as "the bigger, the more expensive". As our proposal increases the number of combinations to be considered, since all levels must be explored, we propose a parallel implementation in order to decrease the execution time.
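A gradual pattern such as "the bigger, the more expensive" can be scored by how many pairs of objects respect the covariation. The sketch below uses the fraction of concordant pairs as the support, which is one common definition in the gradual-pattern literature and not necessarily the one used in this paper. It ignores the fuzzy and multi-level aspects; the `fruits` data and the function name `gradual_support` are hypothetical.

```python
from itertools import combinations

# Hypothetical fruit data; values are illustrative assumptions.
fruits = [
    {"size": 4.0, "price": 1.0},
    {"size": 6.0, "price": 1.5},
    {"size": 8.0, "price": 1.2},
    {"size": 10.0, "price": 3.0},
]


def gradual_support(rows, attr_a, attr_b):
    """Fraction of object pairs in which attr_a and attr_b vary in the same direction."""
    pairs = list(combinations(rows, 2))
    concordant = sum(
        1
        for x, y in pairs
        if (x[attr_a] - y[attr_a]) * (x[attr_b] - y[attr_b]) > 0
    )
    return concordant / len(pairs)


support = gradual_support(fruits, "size", "price")
print(f"support of 'the bigger, the more expensive': {support:.3f}")
```

Every pair of objects is evaluated independently, so the pair loop is a natural candidate for the kind of parallel computation the abstract mentions.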

Revue des Sciences et Technologies de l'Information - Série Document Numérique | 2009

Indexation de co-occurrences dans des corpus de documents structurés et production de cartes sémantiques interactives

Pierre Pompidor; Boris Carbonneill; Michel Sala

Abstract: Faced with the problem of indexing very large corporate document collections, we developed a simple but efficient method (in both computation time and storage volume) that filters, for each document, its most representative co-occurrences. The choice of a co-occurrence context has two motivations. On the one hand, queries on specialized corpora compiled by experts rely on a few precisely chosen terms, and indexing the associations between these terms allows the construction of semantic maps for navigating the concepts of the corpus. To this end, we take the structure of the documents into account by validating the content of paragraphs against the content of their titles. Our method relies on successive tf.idf measures computed in the context of a single document rather than a corpus, applied to the content of paragraphs into which the hierarchy of titles introducing them is progressively integrated. On the other hand, we simultaneously exploit a control ontology and user queries containing the previously selected terms to validate, using Bayes' theorem, the semantic associations thus determined, which finally enable the production of semantic maps.
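The idea of computing tf.idf within a single document, with paragraphs playing the role usually held by documents in a corpus, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: it uses the standard tf.idf formula with a natural logarithm, omits the progressive integration of titles and the Bayesian validation step, and the `paragraphs` data and function name are hypothetical.

```python
import math
from collections import Counter

# Hypothetical paragraphs of one document, already lowercased and tokenizable on spaces.
paragraphs = [
    "index corpus index terms",
    "semantic map corpus",
    "semantic map navigation map",
]


def tf_idf_per_paragraph(paragraphs):
    """Score each term of each paragraph, using the paragraphs of one document as the collection."""
    tokenized = [p.split() for p in paragraphs]
    n = len(tokenized)
    # Number of paragraphs in which each term appears at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return scores


scores = tf_idf_per_paragraph(paragraphs)
for i, paragraph_scores in enumerate(scores):
    best = max(paragraph_scores, key=paragraph_scores.get)
    print(f"paragraph {i}: most discriminating term = {best}")
```

Terms that appear in every paragraph of the document score zero, so the terms that rank highest are those that characterize one paragraph against the rest of the same document.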

Database and Expert Systems Applications | 1997

On expert system for the revision of an experimental sciences researcher's knowledge

Michel Sala

In this article, we propose a learning environment that allows an experimental sciences researcher to revise his knowledge of the domain. To reach this goal, we propose an architecture and an expert system dedicated to the domain of biology. First, we describe the different modules used: calculation tools (which handle the data), the notion of inference (which modifies the knowledge of the researcher), and the databases of the field (which contain additional information). In a second step, we validate our world by presenting an example in immunogenetics. We end this article by detailing the architecture of the application SIGALE (System-IG Alignment Learning Environment).

ICWI | 2003