Maher Jaoua | Researchain

Publication

Featured research published by Maher Jaoua.

International Conference on Computational Linguistics | 2003

Automatic text summarization of scientific articles based on classification of extract's population

Maher Jaoua; Abdelmajid Ben Hamadou

We propose in this paper a summarization method that creates indicative summaries from scientific papers. Unlike conventional methods that extract important sentences, our method treats the extract as the minimal unit of extraction and proceeds in two steps: generation and classification. The first step combines sentences from the text to produce a population of extracts. The second step evaluates each extract against global criteria in order to select the best one; these criteria are defined over the whole extract rather than over individual sentences. We have developed a prototype summarization system for French, called ExtraGen, that implements a genetic algorithm simulating the generation and classification mechanism.
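The generation and classification steps described in this abstract can be illustrated with a minimal sketch. It is not the ExtraGen implementation: the fitness function (distinct words covered within a word budget), the crossover and mutation operators, and all function and parameter names are assumptions chosen only to show how a whole extract, rather than a single sentence, can be scored and evolved.

```python
import random

def fitness(extract, sentences, max_words):
    """Score a whole extract (tuple of sentence indices) with a global criterion.

    Assumed criterion: number of distinct words covered, or 0 if the
    extract is empty or exceeds the word budget.
    """
    words = [w.lower() for i in extract for w in sentences[i].split()]
    if not words or len(words) > max_words:
        return 0.0
    return float(len(set(words)))

def generate_population(n_sentences, size, rng):
    """Generation step: combine sentences at random into candidate extracts."""
    population = []
    for _ in range(size):
        k = rng.randint(1, n_sentences)
        population.append(tuple(sorted(rng.sample(range(n_sentences), k))))
    return population

def crossover(a, b, rng):
    """Build a child extract from the sentences of two parent extracts."""
    pool = sorted(set(a) | set(b))
    k = rng.randint(1, len(pool))
    return tuple(sorted(rng.sample(pool, k)))

def mutate(extract, n_sentences, rng):
    """Toggle one random sentence; keep the original if the result is empty."""
    chosen = set(extract) ^ {rng.randrange(n_sentences)}
    return tuple(sorted(chosen)) or extract

def best_extract(sentences, max_words=20, pop_size=20, generations=30, seed=0):
    """Evolve a population of extracts and return the highest-scoring one."""
    rng = random.Random(seed)
    n = len(sentences)

    def score(extract):
        return fitness(extract, sentences, max_words)

    population = generate_population(n, pop_size, rng)
    for _ in range(generations):
        # Classification step: rank whole extracts and keep the better half.
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), n, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score)

if __name__ == "__main__":
    toy_sentences = [
        "We propose a summarization method",
        "The method builds a population of extracts",
        "Each extract is scored as a whole",
        "The best extract becomes the summary",
    ]
    print(best_extract(toy_sentences, max_words=14))
```

Because the better half of each generation is carried over unchanged, the best score never decreases from one generation to the next. A realistic system would replace the toy fitness with criteria that reflect summary quality.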

International Conference on Computational Linguistics | 2013

Orthographic transcription for spoken Tunisian Arabic

Inès Zribi; Marwa Graja; Mariem Ellouze Khmekhem; Maher Jaoua; Lamia Hadrich Belguith

Transcribing spoken Arabic dialects is an important task for building speech corpora, and transcribing speech data requires a well-defined orthography and annotation scheme. In this paper, we present OTTA, Orthographic Transcription for Tunisian Arabic. This convention adopts rules from the standard Arabic transcription conventions and defines additional conventions that preserve the particularities of the Tunisian dialect.

SLSP'13 Proceedings of the First International Conference on Statistical Language and Speech Processing | 2013

Discriminative framework for spoken Tunisian dialect understanding

Marwa Graja; Maher Jaoua; Lamia Hadrich Belguith

In this paper, we evaluate the performance of a discriminative model for semantically labeling spoken Tunisian dialect turns that are not segmented into utterances. The discriminative algorithm is based on Conditional Random Fields (CRF). We assess the performance of the CRF model for concept labeling on raw Tunisian dialect data that have not been analyzed in advance, and compare it with its performance on data subjected to different levels of preprocessing, up to fully processed data. The CRF model proved able to improve the accuracy of the labeling task for spoken language understanding of unsegmented, unprocessed speech in Tunisian dialect.

IEEE Transactions on Audio, Speech, and Language Processing | 2015

Statistical framework with knowledge base integration for robust speech understanding of the Tunisian dialect

Marwa Graja; Maher Jaoua; L. Hadrich Belguith

In this paper, we propose a hybrid method for understanding the spoken Tunisian dialect within a limited task. This method couples a discriminative statistical method with a domain ontology. The statistical method is based on conditional random field (CRF) models learned from a small corpus to perform the conceptual labeling task. These models are able to detect the semantic dependencies between words, while the domain ontology adds prior knowledge about the task. Our experiments are based on a real spoken Tunisian dialect corpus. The results show that integrating the domain ontology improves the performance of CRF models for speech understanding. Our method can be applied to under-resourced languages and Arabic dialects to overcome the lack of linguistic resources.

International Conference on Neural Information Processing | 2011

Towards Understanding Spoken Tunisian Dialect

Marwa Graja; Maher Jaoua; Lamia Hadrich Belguith

This paper presents a method for semantic interpretation designed for the Tunisian dialect. Our method is based on lexical semantics to overcome the lack of resources for the studied dialect. The method is ontology-based, which allows ontological concepts to be exploited for semantic annotation and ontological relations for interpretation. This combination reduces inaccuracies and increases the rate of comprehension. This paper also details the process of building the ontology used for the annotation and interpretation of Tunisian dialect utterances in the context of speech understanding in dialogue systems.

Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres | 2017

Machine Learning Approach to Evaluate MultiLingual Summaries

Samira Ellouze; Maher Jaoua; Lamia Hadrich Belguith

The present paper introduces a new MultiLing text summary evaluation method. This method relies on a machine learning approach that combines multiple features to build models that predict the human score (overall responsiveness) of a new summary. We tried several single and “ensemble learning” classifiers to build the best model. We tested our method in summary-level evaluation, where the quality of each text summary is evaluated separately. The correlation between the built models and the human score is better than the correlation between the baselines and the manual score.

Applications of Natural Language to Data Bases | 2016

Automatic Evaluation of a Summary’s Linguistic Quality

Samira Ellouze; Maher Jaoua; Lamia Hadrich Belguith

Evaluating a summary’s linguistic quality is a difficult task because several linguistic aspects (e.g. grammaticality, coherence) must be verified to ensure that the summary is well formed. In this paper, we report the results of combining “Adapted ROUGE” scores and linguistic quality features to assess linguistic quality. We build and evaluate models that predict the manual linguistic quality score using linear regression. We construct models for evaluating the quality of each text summary (summary-level evaluation) and of each summarizing system (system-level evaluation). We assess the performance of a summarizing system using the quality of a set of summaries generated by that system. All models are evaluated using the Pearson correlation and the root mean squared error.
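The two evaluation measures named in this abstract, the Pearson correlation and the root mean squared error, can be computed as in the following minimal sketch. The function names and the toy scores are illustrative assumptions, not values or code from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rmse(predicted, actual):
    """Root mean squared error between predicted and manual scores."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))

if __name__ == "__main__":
    # Hypothetical predicted and manual linguistic quality scores.
    predicted_scores = [2.5, 3.0, 4.1, 1.8]
    manual_scores = [2.0, 3.0, 4.0, 2.0]
    print(pearson(predicted_scores, manual_scores))
    print(rmse(predicted_scores, manual_scores))
```

A higher Pearson correlation indicates that the predicted scores rank summaries more like the human judges do, while a lower RMSE indicates that the predicted values are closer to the manual ones in absolute terms.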

Document numérique | 2012

Évaluation de l'impact de l'intégration des étapes de filtrage et de compression dans le processus d'automatisation du résumé

Maher Jaoua; Fatma Kallel Jaoua; Lamia Hadrich Belguith; Abdelmajid Ben Hamadou

In this article, we evaluate the impact of integrating compression and filtering steps into the automatic summarization chain. This evaluation is based on a number of experiments that we conducted on sub-corpora distributed at the DUC-TAC conferences. To carry out these experiments, we adopted an extraction method that treats the summarization process as an optimization problem in which the goal is to determine the best partition satisfying predetermined criteria. The results obtained show the importance of integrating the filtering and compression steps.

Applications of Natural Language to Data Bases | 2016

PIRAT: A Personalized Information Retrieval System in Arabic Texts Based on a Hybrid Representation of a User Profile

Houssem Safi; Maher Jaoua; Lamia Hadrich Belguith

The work presented in this paper aims at developing a Personalized Information Retrieval system in Arabic Texts (“PIRAT”) based on the user’s preferences and interests. To this end, we propose a user model and a personalized document-query matching method. The proposed user model is based on a hybrid representation of the user profile. In this approach, we introduce an algorithm that automatically builds a hierarchical user profile representing the user's implicit personal interests and domain. The profile represents these interests and the domain as a conceptual network of nodes linked through relationships that respect the linking topology defined in domain hierarchies and ontologies (hyperonymy, hyponymy, and synonymy). We then address the problem of unavailable language resources by building (i) a large Arabic text corpus named “WCAT” and (ii) our own corpus of Arabic queries named “AQC2”, in order to evaluate the proposed PIRAT system and the AXON system. The results of this evaluation are promising.

International Conference on Computer and Electrical Engineering | 2009

Experimentation of Two Compression Strategies for Multi-document Summarization

Jaoua Kallel Fatma; Lamia Hadrich Belguith; Maher Jaoua; Abdelmajid Ben Hamadou

In this paper, we compare two strategies for integrating a compression module into the automatic summarization chain. The first strategy, which we call pre-compression, applies sentence compression in the first stage of summarization by producing all reduced forms of the original sentences. The second strategy, called post-compression, reduces the sentences of the extract in order to generate the final extract. The experimental results are presented on a document set taken from the DUC’04 evaluation conference.