Michel Sala
Centre national de la recherche scientifique
Publications
Featured research published by Michel Sala.
Database and Expert Systems Applications | 2011
Laurent Bonnet; Anne Laurent; Michel Sala; Bénédicte Laurent; Nicolas Sicard
Data aggregation is one of the key features used in databases, especially for Business Intelligence (e.g., ETL, OLAP) and analytics/data mining. In SQL databases, aggregation is used to prepare and visualize data for deeper analyses. However, these operations are often impossible on very large volumes of data because of their memory and time consumption. In this paper, we show how NoSQL databases such as MongoDB and its key-value stores, thanks to the native MapReduce algorithm, can provide an efficient framework to aggregate large volumes of data. We provide basic material about the MapReduce algorithm and the different NoSQL databases (read intensive vs. write intensive). We investigate how to efficiently model the data framework for BI and analytics. For this purpose, we focus on read-intensive NoSQL databases using MongoDB, and we show how NoSQL and MapReduce can help handle large volumes of data.
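As a rough illustration of the map, shuffle, and reduce steps this abstract refers to, here is a minimal in-memory sketch in plain Python. It does not use MongoDB and is not the authors' implementation; the records, the field names `region` and `amount`, and the helper functions are all hypothetical.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sales records; the field names are illustrative assumptions.
records = [
    {"region": "north", "amount": 120.0},
    {"region": "south", "amount": 80.0},
    {"region": "north", "amount": 30.0},
    {"region": "south", "amount": 20.0},
    {"region": "east", "amount": 50.0},
]

def map_phase(record):
    """Emit one (key, value) pair per input record."""
    return (record["region"], record["amount"])

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Fold all values for one key into a single aggregate (here, a sum)."""
    return key, reduce(lambda a, b: a + b, values)

def aggregate(rows):
    """Run map, shuffle, and reduce over the rows and return totals per key."""
    groups = shuffle(map_phase(r) for r in rows)
    return dict(reduce_phase(k, v) for k, v in groups.items())

totals = aggregate(records)
```

In a distributed store the map and reduce functions would run close to the data on each node; the sketch only shows the shape of the computation.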
Engineering Applications of Artificial Intelligence | 2015
Yoann Pitarch; Dino Ienco; Elodie Vintrou; Agnès Bégué; Anne Laurent; Pascal Poncelet; Michel Sala; Maguelonne Teisseire
The main use of satellite imagery concerns the processing of the spectral and spatial dimensions of the data. However, to extract useful information, the temporal dimension also has to be accounted for, which increases the complexity of the problem. For this reason, there is a need for suitable data mining techniques for this source of data. In this work, we developed a data mining methodology to extract multidimensional sequential patterns that characterize temporal behaviors. We then used the extracted multidimensional sequences to build a classifier, and showed how the patterns help to distinguish between the classes. We evaluated our technique using a real-world dataset containing information about land use in Mali (West Africa) to automatically recognize whether an area is cultivated or not.
Flexible Query Answering Systems | 2013
Yogi Satrya Aryadinata; Arnaud Castelltort; Anne Laurent; Michel Sala
Data are often described at several levels of granularity. For instance, data concerning purchased fruits can be categorized according to criteria such as size, weight, color, etc. When dealing with data from the real world, such categories can hardly be defined in a crisp manner. For instance, some fruits may belong to both the small and the medium-sized categories. Data mining methods have been proposed to deal with such data, in order to benefit from the several levels when extracting relevant patterns. The challenge is to discover patterns that are not too general, as they would not contain relevant novel information, while remaining typical, as detailed data do not embed general and representative information. In this paper, we focus on the extraction of gradual patterns in the context of hierarchical data. Gradual patterns describe covariation of attributes, such as "the bigger, the more expensive". As our proposal increases the number of combinations to be considered, since all levels must be explored, we propose to parallelize the computation in order to decrease the execution time.
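To make the notion of a gradual pattern concrete, the following sketch scores "the bigger, the more expensive" as the fraction of object pairs whose size and price vary in the same direction. This concordant-pair measure is only one of several support definitions found in the gradual pattern literature and is an assumption here, not necessarily the one used in the paper; the fruit data and the function name `gradual_support` are hypothetical, and neither the hierarchy levels nor the parallel computation is modelled.

```python
from itertools import combinations

# Hypothetical fruit records; the values are illustrative assumptions.
fruits = [
    {"size": 1.0, "price": 0.5},
    {"size": 2.0, "price": 0.9},
    {"size": 3.0, "price": 1.4},
    {"size": 4.0, "price": 1.2},
]

def gradual_support(rows, attr_a, attr_b):
    """Fraction of row pairs in which attr_a and attr_b vary the same way."""
    pairs = list(combinations(rows, 2))
    concordant = sum(
        1
        for x, y in pairs
        # A positive product means both differences share the same sign.
        if (x[attr_a] - y[attr_a]) * (x[attr_b] - y[attr_b]) > 0
    )
    return concordant / len(pairs)

support = gradual_support(fruits, "size", "price")
```

Here five of the six pairs are concordant; only the two largest fruits break the pattern.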
Revue des Sciences et Technologies de l'Information - Série Document Numérique | 2009
Pierre Pompidor; Boris Carbonneill; Michel Sala
Abstract. Faced with the problem of indexing very large corporate document collections, we developed a simple but efficient method (in computation time and data volume) that filters, for each document, its most representative co-occurrences. The choice of a co-occurrence context has two motivations. On the one hand, queries over specialized corpora written by experts rely on a few precisely chosen terms, and indexing the associations between these terms makes it possible to build semantic maps for navigating the concepts of the corpus. To this end, we take the structure of documents into account by validating paragraph contents against the contents of their headings. Our method relies on successive tf.idf measures computed in the context of a document rather than a corpus, over paragraph contents into which the hierarchy of headings introducing them is progressively integrated. On the other hand, we simultaneously exploit a control ontology and user queries containing the previously discriminated terms in order to validate, through Bayes' theorem, the semantic associations thus determined, which ultimately enable the production of semantic maps.
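A minimal sketch of a tf.idf measure computed within a single document, treating each paragraph as the unit over which inverse frequency is counted. It illustrates only that one idea; heading integration, the ontology, and the Bayesian validation step are not modelled. The sample paragraphs, whitespace tokenization, and the function name `paragraph_tfidf` are assumptions.

```python
import math
from collections import Counter

# Hypothetical paragraphs of a single document.
paragraphs = [
    "database index query index",
    "query optimizer plan",
    "index storage page",
]

def paragraph_tfidf(paras):
    """Score each term per paragraph, with idf counted over paragraphs."""
    tokenized = [p.split() for p in paras]
    n = len(tokenized)
    # Number of paragraphs in which each term appears at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores

scores = paragraph_tfidf(paragraphs)
```

Terms confined to one paragraph, such as "optimizer", score higher there than terms shared across paragraphs, such as "query".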
Database and Expert Systems Applications | 1997
Michel Sala
In this article, we propose a learning environment that may allow an experimental sciences researcher to revise his knowledge of the domain. To reach this goal, we propose an architecture and an expert system dedicated to the domain of biology. First, we describe the different modules used: calculation tools (which handle the data), the notion of inference (which modifies the knowledge of the researcher), and the databases of the field (which contain additional information). In a second step, we validate our world by presenting an example in immunogenetics. We end this article with the details of the architecture of the application SIGALE (System-IG Alignment Learning Environment).
ICWI | 2003
Michel Sala; Pierre Pompidor; Danièle Hérin
FUTURE COMPUTING | 2012
Siwipa Pruitikanee; Lisa Di Jorio; Anne Laurent; Michel Sala
International Conference on Web Engineering | 2004
Michel Sala; Gaël Isoird
International Journal of Emerging Sciences | 2014
Arnaud Castelltort; C. Fauvet; J. Guidoni; Anne Laurent; Michel Sala
Knowledge Discovery and Data Mining | 2013
Yogi Satrya Aryadinata; Anne Laurent; Michel Sala