Minyar Sassi Hidri | Researchain

Publication

Featured research published by Minyar Sassi Hidri.

Computer Software and Applications Conference | 2015

Dynamic Data Replication-Driven Model in Data Grids

Rahma Souli-Jbali; Minyar Sassi Hidri; Rahma Ben Ayed

Today, the data grid looks more and more like the hardware and software solution of the future, offering infinite computing and storage capacity. To make the best use of the available resources, new replication and data movement solutions suited to this type of architecture need to be designed. The policies for data movement, placement and management must therefore be adapted to the needs of applications and the capabilities of the underlying platform in order to optimize execution time. The work presented in this paper is a solution to this problem.

Database and Expert Systems Applications | 2013

A Parallel Comparator of Documents

Sonia Alouane Ksouri; Minyar Sassi Hidri; Kamel Barkaoui

Clustering documents, sentences and words are well-studied problems. Most existing algorithms cluster documents, sentences and words separately rather than simultaneously. However, when analyzing large textual corpora, the amount of data that can be processed on a single machine is usually limited by the available main memory, and the computational workload grows with the amount of data to be analyzed. In this paper we present a parallel fuzzy triadic similarity measure, called PFT-Sim, that calculates fuzzy memberships in the context of document co-clustering on a parallel programming architecture. It simultaneously computes fuzzy co-similarity matrices between documents and sentences and between sentences and words. Each one is built on the basis of the others. The PFT-Sim model provides a parallel data analysis strategy and divides the similarity computation into parallel sub-tasks to tackle efficiency and scalability problems.

International Journal of Service Science, Management, Engineering, and Technology | 2015

Hybrid Segmentation Prototype for Arabic Text-Based Documents: Towards Plagiarism Detection

Sonia Alouane-Ksouri; Minyar Sassi Hidri

The contribution of this work relates to the field of Arabic text-based document analysis for the detection of plagiarism. This analysis will be carried out according to the triadic computation model of document similarity. The authors propose a hybrid segmentation prototype for Arabic text-based documents that links different processing steps in order to generate the similarity rate between the documents of an Arabic corpus. It involves two segmentation systems and a morphological analysis in order to obtain a matrix representation adapted to the triadic similarity computation according to three abstraction levels: documents, sentences and words.

International Journal of Fuzzy System Applications | 2016

No-FSQL: A Graph-based Fuzzy NoSQL Querying Model

Minyar Sassi Hidri; Ines Benali-Sougui; Amel Grissa-Touzi

NoSQL (Not only SQL) is an efficient database model for storing and manipulating huge quantities of precise data. However, most NoSQL databases scale well as data grows and are often flexible enough to accommodate imprecise and ambiguous data. This comprehensive hands-on guide presents fundamental concepts and practical solutions for using fuzziness with NoSQL to deal with fuzzy databases (FDB). In this paper, the authors present a graph-based fuzzy NoSQL model that extends the NoSQL model to deal with large fuzzy databases. The authors consider the Cypher declarative query language, proposed for Neo4j, which is the current leader in this market, for querying fuzzy databases.

IEEE International Conference on Fuzzy Systems | 2016

Sampling-based consensus fuzzy clustering on Big Data

Mohamed Ali Zoghlami; Minyar Sassi Hidri; Rahma Ben Ayed

Many companies spend vast amounts of resources to collect, transform and store the massive amounts of data that flow through their business processes. When it comes to analysis and machine learning tasks such as clustering on this data, time and compute speed determine how much data can be analyzed. Moreover, most Big Data clustering algorithms do not look at a complete, large dataset. Instead, they look at a subsample and work on approximations. However, working on samples can leave out useful data that could be a source of value. In this paper, we use sampling combined with a consensus strategy to break the whole Big Data down into small subsets; basic partitions are then generated locally from them using parallel processing. For the sampling part, we propose a partial data clustering (PDC) across different nodes to classify the current sub-samples of partial data access (PDA), merged with the optimal prototypes generated by the last PDC and condensed into weighted points. For the consensus part, we apply a split-and-merge fuzzy clustering to equivalently transform the consensus clustering problem into an optimization clustering one. Extensive experiments on several datasets demonstrate the ability to handle massive data, and the consensus computation makes the proposed classifier a promising candidate for Big Data clustering.

International Journal of Intelligent Engineering Informatics | 2015

A cloud-based optimal fuzzy clustering of distributed data

Rahma Souli-Jbali; Minyar Sassi Hidri

Cloud computing is an infrastructure that allows the storage of large datasets. It provides substantial parallel computing power, which permits faster computation on distributed data. The contribution of this paper is the development of a cloud-based fuzzy clustering algorithm for distributed datasets that detects the optimal partition from a global view. The proposed algorithm meets the confidentiality constraint, which prohibits the sharing of data between different resources, while guaranteeing the anonymity of the data located on the cloud servers. A series of experiments was conducted to evaluate the efficiency of the proposed algorithm. The results show the performance of the proposed algorithm in terms of both quality and response time.
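
The abstract does not say which fuzzy clustering algorithm the paper builds on. As a generic illustration only, the sketch below shows the membership update of standard fuzzy c-means, which is an assumption here and not confirmed by the source. The function name, the fuzzifier default m=2.0 and the sample points are all hypothetical.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update (assumed algorithm, not the paper's).

    u[i][j] = 1 / sum_k (d(x_i, c_j) / d(x_i, c_k)) ** (2 / (m - 1))
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers:
            # share full membership equally among those centers.
            zeros = [d == 0.0 for d in dists]
            row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
        else:
            row = [1.0 / sum((dj / dk) ** exponent for dk in dists)
                   for dj in dists]
        memberships.append(row)
    return memberships

# Hypothetical one-dimensional data with two cluster centers.
u = fcm_memberships([(0.0,), (1.0,), (3.0,)], [(0.0,), (4.0,)])
```

Each row of `u` holds one point's degrees of membership in every cluster, and each row sums to 1. This is what distinguishes a fuzzy partition from a crisp one.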

Procedia Computer Science | 2018

An Ontology-driven MapReduce Framework for Association Rules Mining in Massive Data

Rania Mkhinini Gahar; Olfa Arfaoui; Minyar Sassi Hidri; Nejib Ben Hadj-Alouane

To be competitive, companies need to be able to take advantage of the huge amounts of data, also called the Big Data deluge, to predict what might happen in the future. Predictive analytics therefore plays an important role in extracting useful information that may extend the business strategy and so yield competitive advantages. Predictive analytics involves data mining algorithms that discover knowledge from huge volumes of data. In this context, Association Rules (ARs) mining is considered one of the most widespread data mining techniques. It is based primarily on the frequent itemset mining process. However, when it comes to Big Data, ARs mining algorithms produce a huge number of ARs, many of which are redundant and useless. To overcome this drawback, we propose an ontology-driven MapReduce framework for ARs mining in massive data. Ontologies make it possible to filter the generated ARs and keep only the useful ones. The filtering is carried out by a semantic pruning phase introduced in the MapReduce jobs to eliminate useless candidates from the computation of the Maximal Frequent Itemsets (MFI). This may allow a quantitative and especially qualitative reduction in the number of MFI and, subsequently, of ARs. Extensive experiments on several datasets demonstrate the ability to handle massive data for mining ARs.
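
For readers unfamiliar with association rules, the sketch below computes the two standard measures, support and confidence, on a toy transaction list. The `semantic_prune` helper is purely hypothetical: it assumes an ontology that maps each item to a concept and drops candidates whose items share a concept. The abstract does not describe the paper's actual pruning criterion or its MapReduce jobs, so none of this should be read as the authors' method.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent | consequent) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / base

def semantic_prune(candidates, concept_of):
    """Hypothetical ontology filter (an assumption, not the paper's rule):
    keep only candidates whose items map to pairwise distinct concepts."""
    kept = []
    for candidate in candidates:
        concepts = [concept_of[item] for item in candidate]
        if len(set(concepts)) == len(concepts):
            kept.append(frozenset(candidate))
    return kept

# Toy data; item names and the concept mapping are illustrative only.
transactions = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "butter"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"bread"}),
]
concept_of = {"milk": "dairy", "butter": "dairy", "bread": "bakery"}
candidates = semantic_prune([{"milk", "butter"}, {"milk", "bread"}], concept_of)
```

Pruning candidates before counting their support is what reduces both the work and the number of resulting rules. In a MapReduce setting the counting itself would be distributed across mappers, which this single-process sketch does not attempt.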

Archive | 2018

Towards a New Extracting and Querying Approach of Fuzzy Summaries

Ines Benali-Sougui; Minyar Sassi Hidri; Amel Grissa-Touzi

The diversification of database applications has highlighted the limitations of relational database management systems (RDBMS), particularly at the modeling level. In the real world, we increasingly face situations where applications need to handle imprecise data and to offer flexible querying to their users. Several theoretical solutions have been proposed. However, the practical impact of this work has remained negligible, with the exception of a few research prototypes based on the formal model GEFRED. In this chapter, the authors propose a new approach for exploiting fuzzy relational databases (FRDB) described by the GEFRED model. This approach consists of 1) a new technique for extracting fuzzy data summaries, Fuzzy SAINTETIQ, based on the classification of fuzzy data and formal concept analysis; 2) an approach for evaluating flexible queries in the context of FDB, based on the set of fuzzy summaries generated by the Fuzzy SAINTETIQ system; 3) an approach for repairing and substituting unanswered queries.

International Journal on Artificial Intelligence Tools | 2017

Consensus-Driven Cluster Analysis: Top-Down and Bottom-Up Based Split-and-Merge Classifiers

Mohamed Ali Zoghlami; Minyar Sassi Hidri; Rahma Ben Ayed

Consensus clustering is used in data analysis to generate stable results out of a set of partitions delivered by stochastic methods. Typically, the goal is to search for the so-called median (or consensus) partition, i.e. the partition that is most similar, on average, to all the input partitions. In this paper we address the problem of combining multiple fuzzy clusterings without access to the underlying features of the data, relying instead on inter-cluster similarity. We are concerned with top-down and bottom-up consensus-driven fuzzy clustering that splits and merges the worst clusters. The objective is to reconcile a structure developed for patterns in some dataset with the structural findings already available for other related ones. The proposed classifiers consider dispersion and dissimilarity between the partitions as well as the corresponding fuzzy proximity matrices. Several illustrative numerical examples, using both synthetic data and data from available machine learning repositories, are also included. The experimental component of the study shows the efficiency of the proposed classifiers in terms of quality and runtime.
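
The definition of the median partition can be made concrete with a small sketch. It makes two simplifying assumptions that are not in the paper: partitions are crisp label vectors rather than fuzzy ones, and similarity is measured with the Rand index. It also searches only among the input partitions (a "best-of-k" approximation), whereas the paper's split-and-merge classifiers build a new partition. The function names are hypothetical.

```python
from itertools import combinations

def rand_index(a, b):
    """Share of point pairs on which two crisp partitions agree
    (both group the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum(1 for i, j in pairs if (a[i] == a[j]) == (b[i] == b[j]))
    return agree / len(pairs)

def best_of_k_median(partitions):
    """Index of the input partition with the highest mean similarity
    to all the other input partitions."""
    def mean_similarity(k):
        others = [p for idx, p in enumerate(partitions) if idx != k]
        return sum(rand_index(partitions[k], p) for p in others) / len(others)
    return max(range(len(partitions)), key=mean_similarity)

# Three hypothetical partitions of four points. The first two are the
# same grouping under different label names; the third disagrees.
partitions = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
winner = best_of_k_median(partitions)
```

Only the label vectors are compared, never the original feature values. This matches the abstract's setting of combining clusterings without access to the underlying features. The sketch needs at least two points and at least two partitions.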

Fuzzy Sets and Systems | 2017

Speeding up the large-scale consensus fuzzy clustering for handling Big Data

Minyar Sassi Hidri; Mohamed Ali Zoghlami; Rahma Ben Ayed

Massive data can create a real competitive advantage for companies; it is used to respond better to customers, to follow consumer behavior, to anticipate developments, etc. However, it has its own deficiencies. This volume of data not only requires large storage spaces but also makes analysis, processing and retrieval operations very difficult and hugely time-consuming. One way to overcome these problems is to cluster the data into a compact format that is still an informative version of the entire data. Many clustering algorithms have been proposed. However, they scale poorly in terms of computation time as the size of the data grows. In this paper, we make full use of consensus clustering to handle Big Data clustering. We use sampling combined with a split-and-merge strategy to fragment data into small subsets; basic partitions are then generated locally from them using RHadoop's parallel processing MapReduce model, and a consensus tendency is then followed to obtain the final result. A scalability analysis is conducted to demonstrate the performance of the proposed clustering models by increasing both the number of computing nodes used and the sample size while satisfying the volume and velocity dimensions.