Sadok Ben Yahia | Researchain

Publication

Featured research published by Sadok Ben Yahia.

Data mining and computational intelligence | 2001

Discovering knowledge from fuzzy concept lattice

Sadok Ben Yahia; Ali Jaoua

Since its inception, association rule mining has become one of the core data mining tasks and has attracted tremendous interest among researchers and practitioners. Many efficient algorithms have been proposed in the literature, e.g., Apriori, Partition, and DIC, for mining association rules in the context of market basket analysis. They are all based on apriori methods, i.e., pruning the itemset lattice, and require multiple database accesses. However, research so far has mainly focused on mining over binary data, i.e., either an item is present in a transaction or not. Little attention has been paid to mining over data where the quantity of items is considered. In this paper, we propose to address the problem of mining fuzzy association rules by considering the quantity of items in the transactions. After the fuzzification of the transaction database, we apply a new efficient algorithm, called FARD (Fuzzy Association Rule Discovery), for mining fuzzy association rules. FARD is based on the pruning of the fuzzy concept lattice and can be applied equally to classical or fuzzy databases, scanning the database only once.
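To make the idea of fuzzifying item quantities concrete, here is a minimal sketch. It does not reproduce FARD, whose details the abstract does not give. It assumes a simple linear membership function with hypothetical thresholds `low_max` and `high_min`, and a commonly used definition of fuzzy support (the average, over transactions, of the minimum membership degree among the items of the itemset). All function names are illustrative.

```python
from typing import Dict, Iterable, List

# A fuzzy transaction maps a fuzzy item such as "bread.high" to a degree in [0, 1].
FuzzyTransaction = Dict[str, float]


def fuzzify_quantity(item: str, quantity: float,
                     low_max: float = 1.0, high_min: float = 5.0) -> FuzzyTransaction:
    """Split a quantity into 'low' and 'high' degrees (thresholds are assumptions)."""
    if quantity <= low_max:
        high = 0.0
    elif quantity >= high_min:
        high = 1.0
    else:
        high = (quantity - low_max) / (high_min - low_max)
    return {f"{item}.low": 1.0 - high, f"{item}.high": high}


def fuzzify_transaction(quantities: Dict[str, float]) -> FuzzyTransaction:
    """Fuzzify every purchased item of one transaction."""
    fuzzy: FuzzyTransaction = {}
    for item, quantity in quantities.items():
        fuzzy.update(fuzzify_quantity(item, quantity))
    return fuzzy


def fuzzy_support(db: List[FuzzyTransaction], itemset: Iterable[str]) -> float:
    """Average over transactions of the minimum degree among the (non-empty) itemset."""
    items = list(itemset)
    if not db or not items:
        return 0.0
    total = sum(min(t.get(i, 0.0) for i in items) for t in db)
    return total / len(db)


def fuzzy_confidence(db: List[FuzzyTransaction],
                     premise: Iterable[str], conclusion: Iterable[str]) -> float:
    """Support of premise and conclusion together, divided by support of the premise."""
    premise, conclusion = set(premise), set(conclusion)
    premise_support = fuzzy_support(db, premise)
    if premise_support == 0.0:
        return 0.0
    return fuzzy_support(db, premise | conclusion) / premise_support


# Hypothetical purchase quantities, one dictionary per transaction.
raw = [{"bread": 5, "milk": 3}, {"bread": 3, "milk": 3}, {"bread": 1, "milk": 5}, {"bread": 6}]
db = [fuzzify_transaction(t) for t in raw]
print(fuzzy_support(db, {"bread.high"}))
print(fuzzy_confidence(db, {"bread.high"}, {"milk.high"}))
```

Other t-norms (for example the product) could replace `min`; the choice here is only one common convention.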

Fuzzy Sets and Systems | 2011

Extracting compact and information lossless sets of fuzzy association rules

Sarra Ayouni; Sadok Ben Yahia; Anne Laurent

Applying the classical association rule extraction framework to fuzzy datasets leads to unmanageably large association rule sets. Moreover, the discretization operation leads to information loss and hampers an efficient exploitation of the mined knowledge. To overcome this drawback, this paper proposes the extraction and the exploitation of compact and informative generic bases of fuzzy association rules. The presented approach relies on the extension, within the fuzzy context, of the notions of closure and Galois connection, which we introduce in this paper. In order to select, without loss of information, a generic subset of all fuzzy association rules (FARs), we define three fuzzy generic bases from which the remaining (redundant) FARs are generated. Each generic basis constitutes a compact nucleus of fuzzy association rules, from which it is possible to informatively derive all the remaining rules. In order to ensure a sound and complete derivation process, we introduce an axiomatic system allowing the complete derivation of all the redundant rules. The results obtained from experiments carried out on benchmark datasets are very encouraging. They highlight a substantial reduction in the number of extracted fuzzy association rules without information loss.

international conference on data mining | 2010

Bridging Folksonomies and Domain Ontologies: Getting Out Non-taxonomic Relations

Chiraz Trabelsi; Aicha Ben Jrad; Sadok Ben Yahia

Social bookmarking tools are rapidly emerging on the Web, as witnessed by the overwhelming number of participants. In such spaces, users annotate resources by means of any keyword or tag that they find relevant, giving rise to lightweight conceptual structures, aka folksonomies. In this respect, it goes without saying that ontologies can help improve information retrieval metrics. In this paper, we introduce a novel approach for ontology learning from a folksonomy, which provides shared vocabularies and semantic relations between tags. The main thrust of the introduced approach is its focus on the discovery of non-taxonomic relationships. The latter are often neglected, even though they are of paramount importance from a semantic point of view. The discovery process relies heavily on triadic concepts to discover and select related tags and to extract and label non-taxonomic relationships between related tags, and on external sources for tag filtering and non-taxonomic relationship extraction. In addition, we discuss a new approach to automatically evaluate the obtained relations against the WordNet repository and present promising results for a real-world folksonomy.

international conference on tools with artificial intelligence | 2012

Ranking and Selecting Association Rules Based on Dominance Relationship

Slim Bouker; Rabie Saidi; Sadok Ben Yahia; Engelbert Mephu Nguifo

The huge number of association rules represents the main obstacle that a decision maker faces. In order to overcome this obstacle, an efficient selection of rules has to be performed. Since selection is necessarily based on evaluation, many interestingness measures have been proposed. However, the abundance of these measures has given rise to a new problem, namely the heterogeneity of the evaluation results, which creates confusion for the decision maker. In this respect, we propose a novel approach to discover interesting association rules without favoring or excluding any measure by adopting the notion of dominance between association rules. Our approach bypasses the problem of measure heterogeneity and unveils a compromise between their evaluations. Interestingly enough, the proposed approach also avoids another non-trivial problem, namely the specification of threshold values.

International Journal on Artificial Intelligence Tools | 2014

Mining Undominated Association Rules Through Interestingness Measures

Slim Bouker; Rabie Saidi; Sadok Ben Yahia; Engelbert Mephu Nguifo

The increasing growth of databases raises an urgent need for more accurate methods to better understand the stored data. In this scope, association rules have been extensively used for the analysis and the comprehension of huge amounts of data. However, the number of generated rules is too large to be efficiently analyzed and explored in any further process. In order to overcome this obstacle, an efficient selection of rules has to be performed. Since selection is necessarily based on evaluation, many interestingness measures have been proposed. However, the abundance of these measures has given rise to a new problem, namely the heterogeneity of the evaluation results, which creates confusion for the decision maker. In this respect, we propose a novel approach to discover interesting association rules without favoring or excluding any measure by adopting the notion of dominance between association rules. Our approach bypasses the problem of measure heterogeneity and unveils a compromise between their evaluations. Interestingly enough, the proposed approach also avoids another non-trivial problem, namely the specification of threshold values. Extensive experiments carried out on benchmark datasets show the benefits of the introduced approach.
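The notion of dominance between rules can be illustrated with a small sketch. It assumes the usual Pareto-style definition (a rule dominates another if it is at least as good on every measure and strictly better on at least one) and that higher values are better for every measure. The abstract does not spell out the paper's exact definition, so treat the functions `dominates` and `undominated` and the sample values below as illustrative assumptions only.

```python
from typing import Dict, List, Tuple

# Each rule is scored by a tuple of interestingness measures,
# e.g. (support, confidence, lift); higher is assumed to be better.
Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def undominated(rules: Dict[str, Scores]) -> List[str]:
    """Return the names of the rules that no other rule dominates."""
    return [
        name
        for name, scores in rules.items()
        if not any(dominates(other, scores)
                   for other_name, other in rules.items() if other_name != name)
    ]


# Hypothetical rules with made-up measure values.
rules = {
    "r1": (0.4, 0.9, 1.2),
    "r2": (0.3, 0.8, 1.1),  # worse than r1 on every measure
    "r3": (0.5, 0.7, 1.0),  # better support than r1, worse confidence
}
print(undominated(rules))
```

No threshold appears anywhere in the selection: a rule is kept or discarded purely by comparison with the other rules, which is the property the abstract highlights. The pairwise comparison is quadratic in the number of rules, so this sketch is only suitable for small rule sets.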

pacific asia workshop on intelligence and security informatics | 2010

MAD-IDS: novel intrusion detection system using mobile agents and data mining approaches

Imen Brahmi; Sadok Ben Yahia; Pascal Poncelet

Intrusion detection has been investigated for many years and the field has reached maturity. Nevertheless, there are still important challenges, e.g., how an Intrusion Detection System (IDS) can detect distributed attacks. To tackle this problem, we propose a novel distributed IDS, based on the desirable features provided by the mobile agent methodology and the high accuracy offered by data mining techniques.

International Journal of Foundations of Computer Science | 2008

Succinct Minimal Generators: Theoretical Foundations and Applications

Tarek Hamrouni; Sadok Ben Yahia; Engelbert Mephu Nguifo

In data mining applications, large contexts are handled, which usually results in a considerably large set of frequent itemsets, even for high values of the minimum support threshold. An interesting solution then consists in applying an appropriate closure operator that structures frequent itemsets into equivalence classes, such that two itemsets belong to the same class if they appear in the same sets of objects. Among equivalent itemsets, minimal elements (w.r.t. the number of items) are called minimal generators (MGs), while their associated closure is called a closed itemset (CI) and is the largest one within the corresponding equivalence class. Thus, the pairs composed of MGs and their associated CIs make it easier to localize each itemset, since it is necessarily encompassed by an MG and a CI. In addition, they offer informative implication/association rules, with minimal premises and maximal conclusions, which losslessly represent the entire rule set. These important concepts, MG and CI, were hence at the origin of various works. Nevertheless, the inherent absence of a unique MG associated with a given CI leads to an intra-class combinatorial redundancy that makes exhaustive storage and use impractical. This motivated an in-depth study towards a lossless reduction of this redundancy. This study was started by Dong et al., who introduced the succinct system of minimal generators (SSMG) as an attempt to eliminate the redundancy within this set. In this paper, we give a thorough study of the SSMG as formerly defined by Dong et al. This system will be shown to suffer from some flaws. As a remedy, we introduce a new lossless reduction of the MG set that overcomes its limitations. The new SSMG will then be incorporated into the framework of generic bases of association rules. This makes it possible to maintain only succinct and informative rules. After that, we give a thorough formal study of the related inference mechanisms that derive all redundant association rules, starting from the maintained ones. Finally, an experimental evaluation shows the utility of our approach in eliminating a significant proportion of redundant information.
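The relationship between closed itemsets and minimal generators can be sketched on a toy context. The code below is a brute-force illustration, not the SSMG construction studied in the paper. It assumes the standard closure operator (the intersection of all transactions containing an itemset), uses the usual inclusion-based notion of minimality, ignores the empty itemset, and runs on a made-up five-transaction database. All names are illustrative.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

Context = Dict[int, Set[str]]  # transaction id -> set of items


def extent(db: Context, itemset: FrozenSet[str]) -> FrozenSet[int]:
    """Ids of the transactions that contain every item of `itemset`."""
    return frozenset(tid for tid, items in db.items() if itemset <= items)


def closure(db: Context, itemset: FrozenSet[str]) -> FrozenSet[str]:
    """Largest itemset shared by all transactions containing `itemset`."""
    tids = extent(db, itemset)
    if not tids:  # no supporting transaction: the closure is the whole item set
        return frozenset().union(*db.values())
    return frozenset.intersection(*(frozenset(db[t]) for t in tids))


def equivalence_classes(db: Context, min_support: int) -> Dict[FrozenSet[str], List[FrozenSet[str]]]:
    """Group frequent non-empty itemsets by their closure (exhaustive enumeration)."""
    items = sorted(set().union(*db.values()))
    classes: Dict[FrozenSet[str], List[FrozenSet[str]]] = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            if len(extent(db, itemset)) >= min_support:
                classes.setdefault(closure(db, itemset), []).append(itemset)
    return classes


def minimal_generators(members: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Members of a class that have no proper subset in the same class."""
    return [g for g in members if not any(other < g for other in members)]


# Hypothetical toy context.
db = {1: {"a", "c", "d"}, 2: {"b", "c", "e"}, 3: {"a", "b", "c", "e"},
      4: {"b", "e"}, 5: {"a", "b", "c", "e"}}

for closed, members in equivalence_classes(db, min_support=2).items():
    gens = minimal_generators(members)
    print(sorted(closed), "<-", [sorted(g) for g in gens])
```

On this context the closed itemset {b, e} has two minimal generators, {b} and {e}, which is a small instance of the non-uniqueness that the abstract identifies as the source of intra-class redundancy. The enumeration is exponential in the number of items and is meant only to make the definitions tangible.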

agents and data mining interaction | 2011

Towards a multiagent-based distributed intrusion detection system using data mining approaches

Imen Brahmi; Sadok Ben Yahia; Hamed Aouadi; Pascal Poncelet

A system that monitors the events occurring in a computer system or a network and analyzes them for signs of intrusion is known as an Intrusion Detection System (IDS). An IDS needs to be accurate, adaptive, and extensible. Although many established techniques and commercial products exist, their effectiveness leaves room for improvement. A great deal of research has been carried out on intrusion detection in a distributed environment to mitigate the drawbacks of centralized approaches. However, distributed IDSs suffer from a number of drawbacks, e.g., high rates of false positives, low efficiency, etc. In this paper, we propose a distributed IDS that integrates the desirable features provided by the multi-agent methodology with the high accuracy of data mining techniques. The proposed system relies on a set of intelligent agents that collect and analyze the network connections, and data mining techniques are shown to be useful for detecting intrusions. Experiments showed superior performance of our distributed IDS compared to the centralized one.

international conference on formal concept analysis | 2007

About the lossless reduction of the minimal generator family of a context

Tarek Hamrouni; Petko Valtchev; Sadok Ben Yahia; Engelbert Mephu Nguifo

Minimal generators (MGs), aka minimal keys, play an important role in many theoretical and practical problem settings involving closure systems that originate in graph theory, relational database design, data mining, etc. As minima of the equivalence classes associated with closures, MGs underlie many compressed representations: for instance, they form premises in canonical implication/association rules, with closures as conclusions, that losslessly represent the entire rule family of a closure system. However, MGs often show an intra-class combinatorial redundancy that makes exhaustive storage and use impractical. In this respect, the succinct system of minimal generators (SSMG) recently introduced by Dong et al. is a first step towards a lossless reduction of this redundancy. However, as shown elsewhere, some of the claims about the SSMG, e.g., its invariant size and lossless nature, do not hold. As a remedy, we propose here a new succinct family which restores losslessness by adding a few further elements to the SSMG core, while theoretically grounding the whole. Means of computing the new family are presented together with empirical evidence about its relative size w.r.t. the entire MG family and similar structures from the literature.

knowledge and systems engineering | 2014

Mining Frequent Itemsets in Evidential Database

Ahmed Samet; Eric Lefevre; Sadok Ben Yahia

Mining frequent patterns is widely used to discover knowledge from a database. It was originally applied to the Market Basket Analysis (MBA) problem, which involves Boolean databases. In those databases, only the existence of an article (item) in a transaction is defined. However, in real-world applications, the gathered information generally suffers from imperfections. In fact, a piece of information may contain two types of imperfection: imprecision and uncertainty. Recently, a new type of database representing and integrating those two types of imperfection was introduced: the evidential database. Only a few works have tackled those databases from a data mining point of view. In this work, we discuss the support of evidential itemsets. We improve the complexity of state-of-the-art methods for support estimation. We also introduce a new support measure that combines speed and precision. The proposed methods are tested on several constructed evidential databases, showing a performance improvement.
