Gilles Nachouki
IRIN
Publication
Featured research published by Gilles Nachouki.
Information Fusion | 2008
Gilles Nachouki; Mohamed Quafafou
This paper describes a new approach to heterogeneous data source fusion. Data sources are either static or active: static data sources can be structured or semi-structured, whereas active sources are services. In order to develop data source fusion systems in dynamic contexts, we need to study all issues raised by the matching paradigms. This challenging problem becomes crucial with the dominating role of the internet. Classical approaches to data integration, based on schema mediation, are not suitable for the World Wide Web (WWW) environment, where data is frequently modified or deleted. Therefore, we develop a loosely integrated approach that takes into consideration both conflict management and semantic rules, which must be enriched in order to integrate new data sources. Moreover, we introduce an XML-based Multi-data source Fusion Language (MFL) that aims to define and retrieve conflicting data from multiple data sources. The system developed according to this approach is called MDSManager (Multi-Data Source Manager). The benefit of the proposed framework is shown through a real-world application based on web data source fusion, dedicated to tracking online market indices. Finally, we give an evaluation of our MFL language. The results show that our language significantly improves on the XQuery language, especially in expressive power and performance.
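The conflict-management idea in the abstract above can be illustrated with a minimal Python sketch. It is not MFL and not MDSManager: the `Quote` record, the source names, and the "most recent value wins" rule are all assumptions chosen for illustration, since the abstract does not say how conflicts are resolved. The sketch only shows what it means for several web sources to report conflicting values for the same market index and for a fusion step to both pick a value and flag the conflict.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Quote:
    source: str          # hypothetical source name
    index: str           # market index symbol
    value: float
    timestamp: datetime


def fuse_latest(quotes):
    """Group quotes by index, keep the most recent value, and flag conflicts.

    The "latest wins" rule is an assumed resolution policy, not the paper's.
    """
    fused = {}
    for q in quotes:
        entry = fused.setdefault(q.index, {"chosen": q, "values": set()})
        entry["values"].add(q.value)
        if q.timestamp > entry["chosen"].timestamp:
            entry["chosen"] = q
    return {
        index: {
            "value": e["chosen"].value,
            "source": e["chosen"].source,
            "conflict": len(e["values"]) > 1,  # sources disagreed on the value
        }
        for index, e in fused.items()
    }
```

Calling `fuse_latest` on quotes from two hypothetical sites that disagree on one index returns a single value per index, together with the source it came from and a `conflict` flag that a caller could use to apply further semantic rules.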
web intelligence | 2005
Gilles Nachouki; Mohamed Quafafou; Marie-Pierre Chastang
In this paper, we present the design of MDSManager, a system based on a multidatasource approach to data integration. MDSManager uses a multidatasource language called EXQ (extended XQuery). EXQ is designed to access and interconnect multiple conflicting static data sources (including databases, XML, and HTML) and/or active data sources (including distinct services such as Java classes, C programs, and Web services).
management of emergent digital ecosystems | 2009
Anis Ismail; Mohamed Quafafou; Gilles Nachouki; Mohammad Hajjar
Peer-to-peer (P2P) data-sharing systems now generate a significant portion of internet traffic. P2P systems have emerged as a popular way to share huge volumes of data. Requirements for widely distributed information systems supporting virtual organizations have given rise to a new category of P2P systems called schema-based. In such systems each peer is a database management system in itself, exposing its own schema. In such settings, the main objective is efficient search across peer databases, processing each incoming query without overly consuming bandwidth. The usability of these systems depends on effective techniques to find and retrieve data; however, efficient and effective routing of content-based queries is an emerging problem in P2P networks. In this paper, we propose an architecture based on (super-)peers, and we focus on query routing. Our approach groups together (super-)peers that have similar interests in order to route queries efficiently. In such groups, called knowledge-super-peers (KSP), super-peers submit queries that are often processed by members of the same group. A KSP is a specific super-peer which contains knowledge about: 1. its super-peers and 2. the other super-peers. Knowledge is extracted by using data mining techniques (e.g. decision tree algorithms) starting from the queries of peers that transit the network. The advantage of this distributed knowledge is that it avoids performing semantic mapping between heterogeneous data sources owned by (super-)peers each time the system decides to route a query to other (super-)peers. The set of KSPs improves the robustness of the query routing mechanism and the scalability of the P2P network. Compared with a baseline approach, our proposal shows better performance with respect to important criteria such as response time, precision and recall.
databases in networked information systems | 2003
Gilles Nachouki; Marie-Pierre Chastang
This paper describes the design of a system which facilitates accessing and interconnecting heterogeneous data sources. Data sources can be static or active: static data sources include structured or semistructured data like databases, XML and HTML documents; active data sources include services which are localised on one or several servers, including web services. The main originality of this work is to achieve interoperability between active and/or static data sources based on the XQuery language. As an example of using our approach, we give a scenario for analyzing log files based on the OLAP (On-Line Analytical Processing) literature.
international conference on internet and web applications and services | 2010
Anis Ismail; Mohamed Quafafou; Gilles Nachouki; Mohammad Hajjar
In traditional P2P networks, such as Gnutella, peers propagate query messages towards the resource holders by flooding them through the network. However, this is a costly operation since it consumes node and link resources excessively and often unnecessarily. There is no reason, for example, for a peer to receive a query message if the peer has no matching resource or is not on the path to a peer holding a matching resource. However, how to quickly discover the right resource in a large-scale P2P network without generating too much network traffic and in the minimum possible time remains highly challenging. In this paper, we propose a new peer-to-peer (P2P) search method that exploits data mining concepts (decision trees) to improve search performance for information retrieval in P2P networks. We use a PDMS system, which aims to combine a Super-Peer (SP) based network with the capability of managing a data model attached to the peers in the form of relational, XML, or object schemas. Each SP is connected to a Global-Knowledge-Super-Peer (GKSP) that operates with an index (a decision tree) to predict the relevant domains (super-peers) to answer a given query. Compared with a super-peer-based approach, our proposed architecture shows the effect of data mining, with better performance with respect to response time, number of messages, precision and recall.
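The idea of using a decision tree as a routing index, described in the abstract above, can be sketched in a few lines of Python. This is a generic ID3-style tree over categorical query attributes, not the authors' GKSP implementation; the attribute names (`topic`, `format`), the super-peer names, and the shape of the query log are assumptions made for illustration. The tree is trained on past queries labelled with the super-peer that answered them, and a new query is then sent to one predicted super-peer instead of being flooded.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def build_tree(rows, labels, features):
    """ID3-style tree over categorical features; leaves are super-peer names."""
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    def split_entropy(feature):
        # Weighted entropy of the labels after splitting on this feature.
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    remaining = [f for f in features if f != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        branches[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    default = Counter(labels).most_common(1)[0][0]
    return {"feature": best, "branches": branches, "default": default}


def route(tree, query):
    """Walk the tree; fall back to the majority super-peer on unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(query.get(tree["feature"]), tree["default"])
    return tree
```

With a hypothetical log in which genomics queries were answered by `SP_bio` and finance queries by `SP_market`, `route(tree, {"topic": "finance", "format": "xml"})` selects a single super-peer, so the query generates one routing decision rather than a message to every neighbour.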
international symposium on computer and information sciences | 2003
Gilles Nachouki
This paper describes the design of a system which facilitates accessing and interconnecting heterogeneous data sources. Data sources can be static or active: static data sources include structured or semistructured data like databases, XML and HTML documents; active data sources include services which are localised on one or several servers, including web services. The main originality of this work is to achieve interoperability between active and/or static data sources based on the XQuery language.
european conference on principles of data mining and knowledge discovery | 2000
Tudor Teusan; Gilles Nachouki; Henri Briand; Jacques Philippe
In this paper we propose an approach for mining association rules in large, dense databases. To find such rules, frequent itemsets must first be discovered. As finding all the frequent itemsets is very time-consuming for dense databases, we propose an algorithm that is able to quickly discover an image of the complete set containing all the frequent itemsets. We define what an image is, and we present a genetic algorithm for discovering such an image. To monitor the discovery process we introduce the notion of dynamics of the algorithm. To measure the performance of our frequent itemset discovery algorithm, we introduce the notion of efficiency of the discovery process.
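To make the notion of a frequent itemset concrete, here is a small Python sketch of exhaustive, level-wise enumeration. It is deliberately not the genetic algorithm the abstract proposes, whose details are not given here; it is the plain baseline that such an algorithm tries to avoid, and the function names and the `min_support` parameter are illustrative assumptions. In a dense database most candidate itemsets pass the support threshold, so the number of levels and candidates grows quickly, which is the cost the abstract refers to.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise enumeration: only frequent itemsets are extended further."""
    items = sorted(set().union(*transactions))
    frequent = {}
    level = [frozenset([i]) for i in items]
    while level:
        kept = {s: support(s, transactions) for s in level}
        kept = {s: sup for s, sup in kept.items() if sup >= min_support}
        frequent.update(kept)
        # Candidates of size k+1 are unions of two frequent k-itemsets.
        size = len(next(iter(kept))) + 1 if kept else 0
        level = list({a | b for a, b in combinations(kept, 2) if len(a | b) == size})
    return frequent
```

The function returns a dictionary mapping each frequent itemset (as a `frozenset`) to its support, from which association rules could then be derived.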
management of emergent digital ecosystems | 2009
Anis Ismail; Mohamed Quafafou; Gilles Nachouki; Mohammad Hajjar
Data mining has been used to extract hidden information from large databases. In a peer-to-peer context, a challenging problem is how to find the appropriate peer to deal with a given query without overly consuming bandwidth. Different methods have proposed query routing strategies that take into account the P2P network at hand. We consider an unstructured P2P system based on an organization of peers around super-peers that are connected to a meta-super-peer according to their semantic domains. This paper integrates decision trees in P2P architectures for predicting query-suitable super-peers, each representing a community of peers in which at least one peer is able to answer the given query. By analyzing the query log file, we construct a predictive model that avoids flooding queries in the P2P network by predicting the appropriate super-peer, and hence the peer, to answer the query. A challenging problem in a schema-based peer-to-peer (P2P) system is how to locate peers that are relevant with respect to a given query. In this paper, we propose an architecture based on (super-)peers, and we focus on query routing. Our approach groups together (super-)peers that have similar interests in order to route queries efficiently. In such groups, called Meta-Super-Peers (MSP), super-peers submit queries that are often processed by members of the same group. An MSP is a specific super-peer which contains knowledge about: 1. its super-peers and 2. the other MSPs. Knowledge is extracted by using data mining techniques (e.g. decision tree algorithms) starting from the queries of peers that transit the network. The advantage of this distributed knowledge is that it avoids performing semantic mapping between heterogeneous data sources owned by (super-)peers each time the system decides to route a query to other (super-)peers. The set of MSPs improves the robustness of the query routing mechanism and the scalability of the P2P network. Compared with a baseline approach, our proposed architecture shows the effect of data mining, with better performance with respect to response time and precision.
database and expert systems applications | 1996
Abdellatif Saoudi; Gilles Nachouki; Henri Briand
The authors describe a federated information system that uses a terminological system for the integration of heterogeneous databases. They present how this system is used for specifying correspondences between knowledge belonging to local schemas. These correspondences are specified as a set of assertions that cover the inter- and intra-schema constraints. They also present an algorithm for checking consistency among specified correspondences.
Advanced Query Processing (1) | 2013
Gilles Nachouki; Mohamed Quafafou; Omar Boucelma; François-Marie Colonna
Over the last twenty years, information integration has received considerable effort from both industry and academia. Approaches to information integration developed so far can be categorized as follows: (1) first-generation approaches, which require the definition of a global schema and a semantic integration performed upfront (before query execution); (2) second-generation approaches, well illustrated by the dataspace management concept, which promote pay-as-you-go data integration. The first category has led to well-known mediation approaches such as GAV (Global as View), LAV (Local as View), GLAV (Generalized Local As View), BAV (Both As View), and BGLAV (BYU Global-Local-as-View). Approaches pertaining to the second category are geared towards the development of dataspace management systems and are currently gaining a lot of attention. In this chapter we are interested in exploiting both types of approaches in querying conflicting data spread over multiple web sources. To this end, we first show how an XML-based BGLAV approach can handle these conflicting data sources, then we describe how the same problem can be addressed by using the Multi Fusion Approach (MFA), an approach pertaining to second-generation techniques. Both BGLAV and MFA are illustrated using genomic data sources accessible through the Web.