Souheyl Mallat
University of Monastir
Publication
Featured research published by Souheyl Mallat.
international conference on computational collective intelligence | 2015
Emna Hkiri; Souheyl Mallat; Mohsen Maraoui; Mounir Zrigui
Event named entity recognition (NER) differs from most past research on NER in Arabic texts. Most named entity recognition efforts have focused on specific domains and general classes, especially the categories Organization, Location, and Person. In this work, we build a system for the annotation and recognition of event named entities. To reach our goal, we combine linguistic resources and tools. Our method is fully automatic and aims to improve the performance of our machine translation system.
conference on intelligent text processing and computational linguistics | 2015
Souheyl Mallat; Emna Hkiri; Mohsen Maraoui; Mounir Zrigui
In this paper, we present our method of lexical enrichment applied to a semantic network in the context of query disambiguation. This network represents the list of relevant sentences in French (denoted listRSF) that respond to a given Arabic query. In the first step, we generate the semantic network covering the content of listRSF. The generation of the network is based on our approach to semantic and conceptual indexing. In the second step, we apply contextual enrichment to this network using the association rules model. The evaluation of our method shows the impact of this model on the enrichment of the semantic network: it increases the F-measure from 71% to 81% in terms of listRSF coverage.
ieee international conference on fuzzy systems | 2016
Souheyl Mallat; Mohsen Maraoui; Emna Hkiri; Mounir Zrigui
In this paper, we present a statistical approach to semantic indexing for multilingual text documents based on the conceptual network formalism. We propose to use this formalism as an indexing language to represent the descriptive concepts and their weights. These concepts represent the content of the document. Our contribution has two steps. In the first step, we extract index terms using the multilingual lexical resource EuroWordNet (EWN). In the second step, we move from the representation of index terms to the representation of index concepts through the conceptual network formalism. The network is generated using the EWN resource and the association rules model, in an attempt to discover the non-taxonomic (contextual) relations between the concepts of a document. These are latent relations, buried in the text and carried by the semantic context of the co-occurrence of concepts in the document. The proposed approach can be applied to several languages because it builds on a linguistic and statistical process. The approach is validated by a set of experiments and by comparison with other indexing methods on a corpus from the ad hoc task of the TREC 2001 and 2002 evaluation campaigns. We show that the proposed indexing approach provides encouraging results.
international conference on information and communication technology | 2015
Emna Hkiri; Souheyl Mallat; Mounir Zrigui
Named entity recognition (NER) is the problem of identifying (locating and categorizing) atomic entities in a given text that fall into predefined categories or classes. In this work, we developed a bilingual Arabic-English lexicon of named entities (NE) to improve the performance of Arabic rule-based systems. To reach our goal, we followed several steps, starting with the pre-editing of the DBpedia linked data entities and the parallel corpus, and then applying our automatic model for the detection, extraction, and translation of Arabic-English named entities. Our approach is fully automatic and hybrid: it combines linguistic and statistical methods.
international conference on information and communication technology | 2015
Souheyl Mallat; Houssem Abdellaoui; Mohsen Maraoui; Mounir Zrigui
In this paper, we propose a method to improve the performance of information retrieval systems (IRS) by increasing the selectivity of relevant documents on the web. A significant number of relevant documents on the web are not returned by an IRS (specifically a search engine) because of the richness of the Arabic natural language. As a result, the search engine does not reach high performance and does not meet the needs of users. To remedy this problem, we propose a query enrichment method. This method relies on several steps. First, we identify the significant terms (simple and compound) present in the query. Then, we generate a descriptive list and assign it to each term that has been identified as significant in the query. A descriptive list is a set of linguistic knowledge of different types (morphological, syntactic, and semantic). In this paper, we are interested in the statistical treatment, based on the similarity method. This method applies Salton's TF-IDF and TF-IEF weighting functions to the list generated in the previous step. The TF-IDF function identifies relevant documents, while the role of TF-IEF is to identify the relevant sentence. The terms with high weights (terms that may be correlated with the context of the response) are incorporated into the original query. The method is applied to a corpus of documents belonging to a closed domain.
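The weighting step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `tf_idf_scores` and `expand_query`, the `top_k` parameter, and the choice of a plain logarithmic IDF are all assumptions made for illustration, and only the TF-IDF side is covered (TF-IEF is not shown).

```python
import math
from collections import Counter


def tf_idf_scores(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores


def expand_query(query, docs, top_k=2):
    """Append the top_k highest-weighted corpus terms not already in the query.

    Hypothetical helper: the abstract only says that high-weight terms are
    incorporated into the original query, not how many or how they are ranked.
    """
    best = {}
    for doc_scores in tf_idf_scores(docs):
        for term, weight in doc_scores.items():
            if term not in query:
                best[term] = max(best.get(term, 0.0), weight)
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return list(query) + ranked[:top_k]
```

For example, `expand_query(["a"], [["a", "b", "b"], ["a", "c"]], top_k=1)` adds `"b"`, the term with the highest weight, to the query.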
International Journal on Semantic Web and Information Systems | 2015
Souheyl Mallat; Emna Hkiri; Mohsen Maraoui; Mounir Zrigui
In this paper, the authors propose a formalism for representing a knowledge base (KB) as a network. The objective is to achieve high coverage of this base. This type of network is similar to a semantic network, with the difference that the arcs are weighted by a value indicating the semantic proximity between the concepts. This semantic proximity covers taxonomic relations, synonyms, and non-taxonomic (contextual) relations. The latter are discovered using the association rules model. This model is based on (i) an indexing method, (ii) the French lexical database EuroWordNet (EWNF), and (iii) the Apriori algorithm. The contextual relations are the latent relations buried in the KB, carried by the semantic context. The evaluation of our representation formalism shows a better result, about 80% coverage of the KB.
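As a rough illustration of how association rules can yield weighted contextual arcs between concepts, the sketch below mines rules over concept pairs only, with Apriori-style pruning of infrequent single concepts. It is an assumption-laden simplification, not the authors' model: the function name `contextual_arcs`, the thresholds, and the use of rule confidence as the arc weight are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def contextual_arcs(transactions, min_support=0.5, min_confidence=0.6):
    """Return {(a, b): confidence} for rules a -> b over concept pairs.

    Each transaction is the collection of concepts found in one text unit
    (for instance a sentence). The confidence of a -> b serves as arc weight.
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    item_count = Counter(c for s in sets for c in s)
    # Apriori pruning: only frequent single concepts can form frequent pairs.
    frequent = {c for c, k in item_count.items() if k / n >= min_support}
    pair_count = Counter(
        pair for s in sets for pair in combinations(sorted(s & frequent), 2)
    )
    arcs = {}
    for (a, b), k in pair_count.items():
        if k / n < min_support:
            continue
        # A frequent pair yields up to two directed rules.
        for x, y in ((a, b), (b, a)):
            confidence = k / item_count[x]
            if confidence >= min_confidence:
                arcs[(x, y)] = confidence
    return arcs
```

With the transactions `[["bank", "money"], ["bank", "money", "loan"], ["bank", "river"], ["money", "loan"]]`, the pair (`bank`, `loan`) is dropped for low support, while `loan -> money` is kept with confidence 1.0 because every unit containing `loan` also contains `money`.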
international conference on information and communication technology | 2013
Souheyl Mallat; Anis Zouaghi; Emna Hkiri; Mounir Zrigui
In this paper, we present an automatic translation system for Arabic queries in information retrieval. This system implements a method for the lexical disambiguation of terms in a query. To choose the best sense, each term of the query is projected onto the French EuroWordNet, and we extract its related concepts to form a semantic network. In a second step, we use the same type of network but integrate LSI (latent semantic indexing) to extract hidden contextual links between the concepts of the list (listSRF). This list is extracted through automatic alignment with the Mkalign tool from the knowledge base defined by the Le Monde Diplomatique parallel corpus. This tool allows us to map the relevant Arabic sentences identified in our previous work to French sentences in order to build listSRF. This list forms the basis of the lexical disambiguation method. Finally, we propose a mechanism for selecting the best sense of an ambiguous query term: the network corresponding to each word in the query is matched against the listSRF network to extract the adequate sense with the highest degree of similarity. An evaluation and a comparison are conducted to measure the quality of our translation system.
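The final sense-selection step, matching each candidate sense network against the listSRF network and keeping the most similar one, can be sketched as follows. The abstract does not name the similarity measure, so Jaccard overlap between concept sets is used here purely as an assumption; `jaccard`, `best_sense`, and the flat set representation of a network are likewise hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of concepts."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_sense(sense_networks, reference_concepts):
    """Pick the sense whose concept set is most similar to the reference set.

    sense_networks maps a sense label to the concepts linked to that sense;
    reference_concepts stands in for the concepts of the listSRF network.
    """
    return max(
        sense_networks,
        key=lambda sense: jaccard(sense_networks[sense], reference_concepts),
    )
```

For instance, given the senses `{"avocat_fruit": {"fruit", "arbre", "manger"}, "avocat_juriste": {"tribunal", "droit", "juge"}}` and the reference concepts `{"tribunal", "juge", "proces"}`, the function selects `"avocat_juriste"`.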
International Journal of Information Retrieval Research archive | 2013
Souheyl Mallat; Anis Zouaghi; Emna Hkiri; Mounir Zrigui
computer and information technology | 2014
Souheyl Mallat; Mohamed Achraf Ben Mohamed; Emna Hkiri; Anis Zouaghi; Mounir Zrigui
The International Arab Journal of Information Technology | 2015
Mohamed Achraf Ben Mohamed; Souheyl Mallat; Mohamed Amine Nahdi; Mounir Zrigui