Mariem Ellouze | Researchain

Publication

Featured research published by Mariem Ellouze.

international conference natural language processing | 2009

An Arabic question-answering system for factoid questions

Wissal Brini; Mariem Ellouze; Slim Mesfar; Lamia Hadrich Belguith

In this paper, we propose an Arabic Question-Answering (Q-A) system called QASAL («Question-Answering System for Arabic Language»). QASAL takes as input a natural language question written in Modern Standard Arabic (MSA) and generates as output the most efficient and appropriate answer. The proposed system is composed of three modules: a question analysis module, a passage retrieval module, and an answer extraction module. To implement these three modules, we use the NooJ platform, a linguistic development environment.

conference on intelligent text processing and computational linguistics | 2015

Arabic Transliteration of Romanized Tunisian Dialect Text: A Preliminary Investigation

Abir Masmoudi; Nizar Habash; Mariem Ellouze; Yannick Estève; Lamia Hadrich Belguith

In this paper, we describe the process of converting Tunisian Dialect text that is written in Latin script (also called Arabizi) into Arabic script following the CODA orthography convention for Dialectal Arabic. Our input consists of messages and comments taken from SMS, social networks and broadcast videos. The language used in social media and SMS messaging is characterized by the use of informal and non-standard vocabulary such as repeated letters for emphasis, typos, non-standard abbreviations, and nonlinguistic content, such as emoticons. There is a high degree of variation in spelling in Arabic dialects due to the lack of widely supported orthographic standards in both Arabic and Latin scripts. In the context of natural language processing, transliterating from Arabizi to Arabic script is a necessary step since most recently available tools for processing Arabic Dialects expect Arabic script input.
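
The abstract mentions repeated letters used for emphasis as one source of noise in Arabizi input. As a rough illustration only, the sketch below collapses such runs before any transliteration step. The function name `collapse_repeats`, the threshold of three repetitions, and the sample strings are assumptions made for this example; the abstract does not describe the authors' actual preprocessing.

```python
import re

# Assumption: a run of three or more identical word characters is treated as
# emphatic lengthening. Doubled characters (e.g. "ss") are left untouched
# because they can be legitimate spellings.
_REPEAT_RUN = re.compile(r"(\w)\1{2,}")


def collapse_repeats(text: str, keep: int = 1) -> str:
    """Shorten every run of 3+ identical word characters to `keep` copies."""
    return _REPEAT_RUN.sub(lambda m: m.group(1) * keep, text)


if __name__ == "__main__":
    # Hypothetical Arabizi-style inputs, for illustration only.
    print(collapse_repeats("barchaaaa"))
    print(collapse_repeats("yeeees", keep=2))
```

Because digits stand for Arabic letters in Arabizi, the pattern uses `\w` so that lengthened digits are collapsed as well, while punctuation runs are left as they are.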

international conference natural language processing | 2002

Relevant Information Extraction Driven with Rhetorical Schemas to Summarize Scientific Papers

Mariem Ellouze; Abdelmajid Ben Hamadou

Automatic summaries are often subject to several criticisms (e.g., lack of cohesion and coherence). In this paper, we propose an approach that uses coherent Summary-Schemas (templates) conceived from the rhetorical structure of scientific papers including their abstracts. The Summary-Schemas embed rhetorical roles specified by signatures (sets of positional, structural, linguistic and thematic features) that guide the search for appropriate sentences in the source text.

applications of natural language to data bases | 2014

Fine-Grained POS Tagging of Spoken Tunisian Dialect Corpora

Rahma Boujelbane; Mariem Mallek; Mariem Ellouze; Lamia Hadrich Belguith

Arabic Dialects (AD) have recently begun to receive more attention from the speech science and technology communities. The use of dialects in language technologies will help improve the development process and the usability of applications such as speech recognition, speech comprehension, or speech synthesis. However, AD suffer from a lack of resources compared to Modern Standard Arabic (MSA). This paper deals with the problem of tagging an AD: the Tunisian Dialect (TD). We present a method for building a fine-grained part-of-speech (POS) tagger for the TD. This method consists of adapting an MSA POS tagger by generating a TD training corpus from an MSA corpus using a bilingual MSA-TD lexicon. The evaluation of the TD tagger on a corpus of text transcriptions achieved an accuracy of 78.5%.
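
The idea of deriving a TD training corpus from a tagged MSA corpus through a bilingual lexicon can be sketched very simply as word-for-word substitution that carries the POS tag over unchanged. This is a deliberately simplified illustration: the function `convert_corpus`, the tag names, and the two lexicon pairs are assumptions made for the example and are not taken from the paper, whose conversion procedure is not detailed in the abstract.

```python
from typing import Dict, List, Tuple

TaggedSentence = List[Tuple[str, str]]  # (word, POS tag) pairs


def convert_corpus(
    msa_corpus: List[TaggedSentence], lexicon: Dict[str, str]
) -> List[TaggedSentence]:
    """Replace each MSA word with its lexicon equivalent, keeping the tag.

    Words missing from the lexicon are copied through unchanged.
    """
    return [
        [(lexicon.get(word, word), tag) for word, tag in sentence]
        for sentence in msa_corpus
    ]


if __name__ == "__main__":
    # Illustrative MSA -> TD pairs (hypothetical lexicon entries).
    toy_lexicon = {"ماذا": "شنوة", "الآن": "توة"}
    # Hypothetical tag names.
    msa = [[("ماذا", "INTERROG"), ("الآن", "ADV")]]
    print(convert_corpus(msa, toy_lexicon))
```

The resulting tagged sentences could then serve as training data for any supervised tagger; which tagger the authors adapted is not stated in the abstract.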

Archive | 2016

White-Rot Fungi and their Enzymes as a Biotechnological Tool for Xenobiotic Bioremediation

Mariem Ellouze; Sami Sayadi

A huge amount of hazardous organopollutants, often persistent and toxic, is produced annually worldwide and may contaminate soil, water, groundwater, and air. Originating from various sources such as wastewater, landfill leachates, and solid residues, xenobiotics include phenols, plastics, hydrocarbons, paints, dyes, pesticides and insecticides, paper and pulp mills, and pharmaceuticals. Among biological processes for the degradation of xenobiotics, fungal ones, being eco-friendly and low-cost, have been investigated extensively because most basidiomycetes are more tolerant to high concentrations of pollutants. Fungal bioremediation is a promising technology that uses the metabolic potential of fungi to remove or reduce xenobiotics. Basidiomycetes are unique microorganisms that show high capacities for degrading a wide range of toxic xenobiotics. They act via extracellular ligninolytic enzymes, including laccase, manganese peroxidase, and lignin peroxidase. Their capacity to remove xenobiotic substances and produce polymeric products makes them a useful tool for bioremediation purposes. During fungal remediation, they utilize hazardous compounds, even insoluble ones, as a nutrient source and convert them to simple fragmented forms. The aim of this chapter is to elucidate the ability of basidiomycetes to degrade xenobiotics. It gives an overview of the importance of extracellular enzymes for the efficient bioremediation of a large variety of xenobiotics.

International Conference on Advanced Intelligent Systems and Informatics | 2016

Arabic Fine-Grained Opinion Categorization Using Discriminative Machine Learning Technique

Imen Touati; Marwa Graja; Mariem Ellouze; Lamia Hadrich Belguith

This paper presents an approach to fine-grained opinion categorization in Arabic news articles. This approach is based on lexical semantic analysis. We propose to categorize every opinion expression using a typology of four top-level semantic categories: reporting, judgment, advice and sentiment. Each word or opinion expression is annotated with a semantic representation that takes into consideration the specificities of the Arabic language. To the best of our knowledge, there is no annotated Arabic opinion corpus with the proposed semantic representation. The categorization task is treated as a classification problem. We therefore use Conditional Random Fields (CRF) as a discriminative model, which we consider a useful contribution given the lack of similar fine-grained opinion categorization performed with CRF. The obtained results show that the integration of CRF models is important for opinion classification of the Arabic language.

language resources and evaluation | 2018

Automatic Speech Recognition System for Tunisian Dialect

Abir Masmoudi; Fethi Bougares; Mariem Ellouze; Yannick Estève; Lamia Hadrich Belguith

Although Modern Standard Arabic is taught in schools and used in written communication and TV/radio broadcasts, all informal communication is typically carried out in dialectal Arabic. In this work, we focus on the design of speech tools and resources required for the development of an Automatic Speech Recognition system for the Tunisian dialect. The development of such a system faces the challenges of the lack of annotated resources and tools, apart from the lack of standardization at all linguistic levels (phonological, morphological, syntactic and lexical) together with the mispronunciation dictionary needed for ASR development. In this paper, we present a historical overview of the Tunisian dialect and its linguistic characteristics. We also describe and evaluate our rule-based phonetic tool. Next, we go deeper into the details of Tunisian dialect corpus creation. This corpus is finally approved and used to build the first ASR system for Tunisian dialect with a Word Error Rate of 22.6%.
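
For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that conventional definition; the function name `word_error_rate` and the whitespace tokenization are assumptions for this example, and it is not the authors' scoring setup.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Applied to phoneme sequences instead of words, the same computation gives a phone error rate. Note that the ratio can exceed 1.0 when the hypothesis contains many insertions.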

Proceedings of the 2nd Mediterranean Conference on Pattern Recognition and Artificial Intelligence | 2018

Opinion Target Extraction from Arabic News Articles Using Shallow Features

Imen Touati; Marwa Graja; Mariem Ellouze; Lamia Hadrich Belguith

Target identification is one of the important tasks related to opinion mining. However, few works in this field deal with the Arabic language because of the lack of annotated corpora. In this paper, we investigate the problem of opinion target identification in Arabic news articles using Conditional Random Fields (CRF) as a discriminative framework. The opinion target recognition task consists of determining the terms that form the target span. To the best of our knowledge, there is no similar work in this field for the Arabic language, and especially for news articles. Experiments show that excellent results can be achieved by considering the semantic correlation between words, without relying on deep syntactic features. Our proposed method identifies the opinion target with a 95% F-measure for a given opinion word, using bi-gram features, words in context and other features.
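
The F-measure reported above is the harmonic mean of precision and recall. As a small illustration, the sketch below scores predicted target spans against gold spans under exact-match comparison. The exact-match criterion, the `(start, end)` span encoding, and the function name are assumptions for this example; the abstract does not state how spans were matched in the authors' evaluation.

```python
from typing import Set, Tuple

Span = Tuple[int, int]  # (start token index, end token index), hypothetical encoding


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over sets of spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold_spans = {(0, 2), (5, 6), (9, 11)}
    predicted_spans = {(0, 2), (5, 6), (7, 8), (12, 13)}
    print(precision_recall_f1(gold_spans, predicted_spans))
```

In this toy case two of four predictions are correct and two of three gold spans are found, so precision is 1/2, recall is 2/3, and F1 is 4/7.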

conference of the international speech communication association | 2016

Conditional Random Fields for the Tunisian Dialect Grapheme-to-Phoneme Conversion

Abir Masmoudi; Mariem Ellouze; Fethi Bougares; Yannick Estève; Lamia Hadrich Belguith

Conditional Random Fields (CRFs) represent an effective approach for monotone string-to-string translation tasks. In this work, we apply the CRF model to perform grapheme-to-phoneme (G2P) conversion for the Tunisian Dialect. This choice is motivated by the fact that CRFs give a long-term prediction and assume relaxed state independence conditions compared to HMMs [7]. The CRF model needs to be trained on a 1-to-1 alignment between graphemes and phonemes. Alignments are generated using the Joint-Multigram Model (JMM) and the GIZA++ toolkit. We trained a CRF model for each generated alignment. We then compared our models to state-of-the-art G2P systems based on the Sequitur G2P and Phonetisaurus toolkits. We also investigated the CRF prediction quality with different training sizes. Our results show that CRFs perform slightly better using the JMM alignment and outperform both the Sequitur and Phonetisaurus systems across the different training sizes. In the end, our system achieves a phone error rate of 14.09%.

Proceedings of the Mediterranean Conference on Pattern Recognition and Artificial Intelligence | 2016

Towards Arabic semantic opinion mining: identifying opinion, polarity and intensity

Imen Touati; Marwa Graja; Mariem Ellouze; Lamia Hadrich Belguith

Arabic opinion mining is a challenging task because Arabic is a morphologically and semantically rich language. In this paper, we are interested in analyzing opinions in Arabic news articles. We propose to use a machine learning technique to classify opinions or sentiments at the expression level. Our approach involves determining the semantic category of the expression. It also includes classifying the opinion expression as positive or negative and classifying its intensity as high, medium or low. Our method relies on a wide range of features used in the literature, such as n-grams, morphological features and stylistic features. In addition, we propose new features inspired by contextual and semantic information, as well as others specific to the Arabic language. In the same context, we aim to contribute to Arabic opinion mining by proposing the use of Conditional Random Fields as a discriminative model. We carry out many experiments combining different sets of features to find the combination that yields the best results. We evaluate our method at the expression level using a corpus of Arabic news articles. Our method reaches 84.93% for contextual polarity classification and 87.54% for semantic opinion expression categorization.