Chiraz Latiri
Tunis University
Publication
Featured research published by Chiraz Latiri.
Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2017
Mohamed Chebel; Chiraz Latiri; Eric Gaussier
In this paper, we propose to complement the context vectors used in bilingual lexicon extraction from comparable corpora with concept vectors, which aim at capturing all the words related to the concepts associated with a given word. This allows one to rely on a representation that is less sparse, especially in specialized domains where the use of a general bilingual lexicon leaves many words untranslated. The concept vectors we consider are based on closed concept mining, as developed in Formal Concept Analysis (FCA). The results obtained on two different comparable corpora show that enriching context vectors with concept vectors leads to lexicons of higher quality, especially in specialized domains.
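As a rough illustration of the closure idea behind closed concepts in FCA, the following Python sketch treats each document as an object and each word as an attribute, computes the closure of a word, and adds the resulting words to a context vector. The toy corpus `DOCS`, the function names, and the additive `weight` are assumptions made for this example; the abstract does not specify how the authors build or weight their concept vectors.

```python
from collections import Counter

# Toy formal context (hypothetical data): each document is a set of words.
DOCS = {
    "d1": {"tumor", "biopsy", "cell"},
    "d2": {"tumor", "biopsy", "scan"},
    "d3": {"cell", "scan"},
}


def extent(words, docs):
    """Return the ids of the documents that contain every word in `words`."""
    return {d for d, ws in docs.items() if words <= ws}


def intent(doc_ids, docs):
    """Return the words shared by every document in `doc_ids`."""
    if not doc_ids:
        # FCA convention: the intent of the empty object set is all attributes.
        return set().union(*docs.values())
    return set.intersection(*(docs[d] for d in doc_ids))


def closure(words, docs):
    """FCA closure operator: the intent of the extent of `words`."""
    return intent(extent(words, docs), docs)


def enrich(context_vec, word, docs, weight=1.0):
    """Add the words in the closure of {word} to a copy of `context_vec`.

    The additive weighting is an assumption for illustration only.
    """
    enriched = Counter(context_vec)
    for w in closure({word}, docs) - {word}:
        enriched[w] += weight
    return enriched
```

In this toy context, `closure({"tumor"}, DOCS)` yields `{"tumor", "biopsy"}`, because every document containing "tumor" also contains "biopsy"; `enrich` then adds "biopsy" to the context vector of "tumor" even if the two words never co-occurred in a context window.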
Procedia Computer Science | 2015
Mohamed Chebel; Chiraz Latiri; Eric Gaussier
To address multilingual document classification in an efficient and effective manner, we claim that a synergy between classical IR techniques, such as the vector model, and advanced data mining methods, especially Formal Concept Analysis, is particularly appropriate. In this paper, we propose a new statistical approach for extracting inter-language clusters from multilingual documents based on Closed Concept Mining and the vector model. Formal Concept Analysis techniques are applied to extract Closed Concepts from comparable corpora; these Closed Concepts and vector models are then exploited in the clustering and alignment of multilingual documents. An experimental evaluation is conducted on the French-English bilingual document collection of CLEF 2003. The results confirm that the synergy between Formal Concept Analysis and the vector model is fruitful for extracting bilingual classes of documents, with an interesting comparability score.
Database and Expert Systems Applications | 2016
Soumaya Guesmi; Chiraz Trabelsi; Chiraz Latiri
In this paper, we introduce a community detection approach for heterogeneous multi-relational networks that incorporates the multiple types of objects and relationships derived from bibliographic networks. The proposed approach first constructs the relational context family (RCF) to represent the different objects and relations in multi-relational bibliographic networks using Relational Concept Analysis (RCA) methods, and then explores this RCF for community detection. Experiments performed on a dataset of academic publications from the Computer Science domain confirm the effectiveness of our proposal and open promising directions.
Procedia Computer Science | 2016
Soumaya Guesmi; Chiraz Trabelsi; Chiraz Latiri
Community detection in multi-relational bibliographic networks is an important issue. There has been a surge of interest in community detection that focuses on analyzing the linkage or topological structure of these networks. However, the communities identified by these approaches commonly reflect the strength of connections between network nodes and neglect the topics of interest or the venues, i.e., conferences or journals, shared by the community members, i.e., authors. To tackle this drawback, we present in this paper a new approach called CoMRing for community detection in heterogeneous multi-relational networks, which incorporates the multiple types of objects and relationships derived from bibliographic networks. We first propose to construct the Concept Lattice Family (CLF) to model the different objects and relations in multi-relational bibliographic networks using Relational Concept Analysis (RCA) methods. We then introduce a new method, called QueryExploration, that explores this CLF for community detection. Experiments carried out on real datasets confirm the effectiveness of our proposal and open promising directions.
Database and Expert Systems Applications | 2015
Mohamed Chebel; Chiraz Latiri; Eric Gaussier
The scarcity of bilingual and multilingual parallel corpora has prompted many researchers to stress the need for new methods to enhance the quality of comparable corpora. In this paper, we highlight the interest and usefulness of Formal Concept Analysis in multilingual document clustering to improve corpora comparability. We propose a statistical approach for clustering multilingual documents based on multilingual Closed Concept Mining, which partitions documents belonging to one or more collections and written in more than one language into a set of classes. Experimental evaluation was conducted on two collections and showed a significant improvement in the comparability of the generated classes.
Database and Expert Systems Applications | 2018
Sourour Belhaj Rhouma; Chiraz Latiri; Catherine Berrut
Recently, several works on bilingual lexicon extraction from comparable corpora have been proposed. This paper presents how to combine different methods for bilingual lexicon extraction based on standard context vectors and advanced text mining methods. In this respect, we focus on combining bilingual lexicons based on context vectors, association rules and contextual meta-rules. The combination of lexicons leads to a less sparse representation, making it possible to extract the most effective translations from these lexicons and create an optimal bilingual lexicon. An experimental validation conducted on two language pairs from the CLEF 2003 evaluation campaign shows that the combination of the models gives a significant improvement compared to the standard approach.
Cross Language Evaluation Forum | 2018
Malek Hajjem; Jean Valére Cossu; Chiraz Latiri; Eric SanJuan
The MC2 lab mainly focuses on developing processing methods and resources to mine the social media (SM) sphere surrounding cultural events such as festivals, music, books, movies and museums. Following previous editions (CMC 2016 and MC2 2017), the 2018 edition focused on argumentative mining and multilingual cross SM search. Most public microblogs about cultural events such as festivals are promotional announcements by organizers or artists; very few are personal and argumentative, and the challenge is to find them before they become viral. We report the main lessons learned from this 2018 CLEF task.
Procedia Computer Science | 2018
Asma Ouertatani; Ghada Gasmi; Chiraz Latiri
With the explosive growth of social media (e.g., blogs, microblogs, Twitter, and postings on social network sites) on the web, individuals and organizations are increasingly using this content to find and monitor opinionated data and to distill the information it contains: a view or judgment formed about something or someone. Several researchers have focused on the identification of opinion and defined it formally. In this paper, we attempt to define and characterize an opinion together with its associated prevailing argument components, which we call an argued opinion. We conducted experiments using different classification models.
Procedia Computer Science | 2018
Mohamed Ettaleb; Chiraz Latiri; Patrice Bellot
Most queries submitted to search engines are composed of keywords, but keywords are not always enough for users to express their needs. Through verbose natural language queries, users can express complex or highly specific information needs. However, it is difficult for search engines to deal with this type of query. Moreover, the emergence of social media allows users to get opinions, suggestions or recommendations from other users about complex information needs. In order to improve the understanding of user needs, tasks such as the CLEF Social Book Search Suggestion Track were proposed from 2011 to 2016. The aim is to investigate techniques to support users in searching for books in catalogs of professional metadata and complementary social media. In this context, we introduce in this paper a statistical approach to deal with long verbose queries in Social Information Retrieval (SIR), taking Social Book Search (SBS) as a case study. First, a morphosyntactic analysis is introduced to reduce verbose queries; the second step expands the reduced queries using association rule mining combined with pseudo-relevance feedback. Experiments on the SBS 2014 and 2016 collections show significant improvement in retrieval performance.
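The two-step pipeline described in this abstract (reduce the verbose query, then expand it with association rules mined from feedback documents) can be sketched in Python as follows. This is only an illustration under stated assumptions: a small hypothetical stopword list stands in for the morphosyntactic analysis, the rules are limited to one-term antecedents and consequents, and `FEEDBACK` is invented toy data standing in for the top-ranked documents of a first retrieval pass. The thresholds and function names are likewise hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations

# Hypothetical stopword list, used here in place of morphosyntactic analysis.
STOPWORDS = {"i", "am", "for", "a", "the", "about", "looking", "that", "is", "and", "of"}

# Toy pseudo-relevance feedback set: tokenized top-ranked documents (invented).
FEEDBACK = [
    ["dragon", "fantasy", "novel"],
    ["dragon", "fantasy", "magic"],
    ["dragon", "history"],
]


def reduce_query(query):
    """Keep only the non-stopword terms of a verbose query, in order."""
    return [t for t in query.lower().split() if t not in STOPWORDS]


def mine_rules(docs, min_support=2, min_confidence=0.6):
    """Mine one-to-one association rules a -> b from the feedback documents.

    Support is the number of documents containing both terms; confidence is
    that count divided by the number of documents containing `a`.
    """
    term_count = Counter()
    pair_count = Counter()
    for doc in docs:
        terms = set(doc)
        term_count.update(terms)
        pair_count.update(permutations(sorted(terms), 2))
    rules = {}
    for (a, b), n in pair_count.items():
        if n >= min_support and n / term_count[a] >= min_confidence:
            rules.setdefault(a, set()).add(b)
    return rules


def expand_query(terms, rules):
    """Append the consequents of every rule whose antecedent is a query term."""
    expanded = list(terms)
    for t in terms:
        for consequent in sorted(rules.get(t, ())):
            if consequent not in expanded:
                expanded.append(consequent)
    return expanded
```

With the toy data above, `reduce_query("I am looking for a novel about dragon")` returns `["novel", "dragon"]`, and expanding that list with `mine_rules(FEEDBACK)` appends "fantasy", because the rule dragon -> fantasy has support 2 and confidence 2/3.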
Procedia Computer Science | 2017
Malek Hajjem; Chiraz Latiri
Twitter is a social networking and micro-blogging service where users post millions of short messages every day. Building multilingual corpora from this microblog content can be useful for several computational tasks such as opinion mining. However, gathering Twitter data raises the problem of irrelevant data being included. Recent work in the literature has shown that topic models such as Latent Dirichlet Allocation (LDA) are not consistent when applied to short texts like tweets. In order to prune irrelevant tweets, we investigate in this paper a novel method to improve topics learned from Twitter content without modifying the basic machinery of LDA. The method is based on a pooling process that combines an Information Retrieval (IR) approach with LDA. This is achieved through an aggregation strategy based on an IR task that retrieves similar tweets into the same cluster. The result of tweet pooling is then used as input to a basic LDA to overcome the sparsity problem of Twitter content. Empirical results show that tweet aggregation based on IR and LDA leads to an improvement in a variety of topic coherence measures, in comparison with an unmodified LDA baseline and a variety of pooling schemes.
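The pooling step can be illustrated with a minimal Python sketch that groups similar tweets into longer pseudo-documents before topic modeling. It assumes a greedy single-pass strategy with cosine similarity over raw term counts; the abstract does not say which retrieval model or aggregation rule the authors use, so the `threshold` value and the function names are hypothetical. The LDA step itself is left out: the pooled strings would simply be passed to an unmodified LDA implementation.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def pool_tweets(tweets, threshold=0.3):
    """Greedily merge each tweet into its most similar pool.

    A tweet starts a new pool when no existing pool reaches `threshold`.
    Returns one concatenated pseudo-document per pool.
    """
    pools = []  # each entry: (term counts of the pool, list of member tweets)
    for tweet in tweets:
        vec = Counter(tweet.lower().split())
        best, best_sim = None, threshold
        for pool in pools:
            sim = cosine(vec, pool[0])
            if sim >= best_sim:
                best, best_sim = pool, sim
        if best is None:
            pools.append((vec, [tweet]))
        else:
            best[0].update(vec)  # grow the pool's term counts
            best[1].append(tweet)
    return [" ".join(members) for _, members in pools]
```

For example, pooling the three tweets "jazz festival tonight", "great jazz festival lineup" and "museum opens new exhibit" produces two pseudo-documents: the two jazz tweets are merged, and the museum tweet stays on its own.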