Publication


Featured research published by Lamia Hadrich Belguith.


international conference on communications | 2013

Arabic WordNet semantic relations enrichment through morpho-lexical patterns

Mohamed Mahdi Boudabous; Nouha Chaâben Kammoun; Nacef Khedher; Lamia Hadrich Belguith; Fatiha Sadat

The Arabic WordNet (AWN) ontology is one of the most interesting lexical resources for Modern Standard Arabic. Although its development is based on Princeton WordNet, it suffers from some weaknesses, such as the absence of some words and of some semantic relations between synsets. In this paper, we propose a linguistic method based on morpho-lexical patterns to add semantic relations between synsets in order to improve AWN performance. This method relies on two steps: morpho-lexical pattern definition and semantic relation enrichment. We will take advantage of the defined patterns to propose a hybrid method for building an Arabic ontology based on Wikipedia.


international conference natural language processing | 2009

An Arabic question-answering system for factoid questions

Wissal Brini; Mariem Ellouze; Slim Mesfar; Lamia Hadrich Belguith

In this paper, we propose an Arabic Question-Answering (Q-A) system called QASAL (Question-Answering System for Arabic Language). QASAL accepts as input a natural language question written in Modern Standard Arabic (MSA) and generates as output the most efficient and appropriate answer. The proposed system is composed of three modules: a question analysis module, a passage retrieval module, and an answer extraction module. To implement these three modules, we use the NooJ platform, a linguistic development environment.
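The three-module layout described in this abstract can be sketched roughly as follows. This is a toy illustration in plain Python on English text, not QASAL itself and not the NooJ platform: the function names, the stopword list, the keyword-overlap retrieval, and the year regex are all assumptions made for the example.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"when", "who", "was", "is", "the", "did", "what"}

def analyze_question(question):
    """Question analysis: guess the expected answer type and extract keywords."""
    q = question.lower()
    if q.startswith("when"):
        answer_type = "DATE"
    elif q.startswith("who"):
        answer_type = "PERSON"
    else:
        answer_type = "OTHER"
    keywords = [w for w in re.findall(r"\w+", q) if w not in STOPWORDS]
    return answer_type, keywords

def retrieve_passage(keywords, passages):
    """Passage retrieval: pick the passage sharing the most keywords."""
    def overlap(passage):
        words = set(re.findall(r"\w+", passage.lower()))
        return sum(1 for k in keywords if k in words)
    return max(passages, key=overlap)

def extract_answer(answer_type, passage):
    """Answer extraction: only a four-digit year is handled in this sketch."""
    if answer_type == "DATE":
        match = re.search(r"\b\d{4}\b", passage)
        return match.group(0) if match else None
    return None

def answer(question, passages):
    """Chain the three modules in the order the abstract lists them."""
    answer_type, keywords = analyze_question(question)
    passage = retrieve_passage(keywords, passages)
    return extract_answer(answer_type, passage)

# Made-up passages, used only to exercise the pipeline.
passages = ["The bridge is made of steel.", "The library opened in 1999."]
print(answer("When was the library opened?", passages))
```

A real system for MSA would replace each step with Arabic-aware linguistic resources; the sketch only shows how the modules hand data to one another.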


international conference on computational linguistics | 2013

Orthographic transcription for spoken Tunisian Arabic

Inès Zribi; Marwa Graja; Mariem Ellouze Khmekhem; Maher Jaoua; Lamia Hadrich Belguith

Transcribing spoken Arabic dialects is an important task for building speech corpora. It is therefore necessary to follow a definite orthography and a definite annotation scheme when transcribing speech data. In this paper, we present OTTA, Orthographic Transcription for Tunisian Arabic. This convention relies on rules based on standard Arabic transcription conventions, and we define a set of additional conventions that preserve the particularities of the Tunisian dialect.


SLSP'13 Proceedings of the First international conference on Statistical Language and Speech Processing | 2013

Discriminative framework for spoken Tunisian dialect understanding

Marwa Graja; Maher Jaoua; Lamia Hadrich Belguith

In this paper, we evaluate the performance of a discriminative model for semantically labeling spoken Tunisian dialect turns that are not segmented into utterances. We evaluate a discriminative algorithm based on Conditional Random Fields (CRF). We check the performance of the CRF model for concept labeling on raw Tunisian dialect data that have not been analyzed in advance. We compare its performance across different levels of data preprocessing, up to well-treated data. The CRF model showed the ability to improve the accuracy of the labeling task for spoken language understanding of unsegmented and untreated speech in Tunisian dialect.


international conference natural language processing | 2010

Digital learning for summarizing Arabic documents

Mohamed Mahdi Boudabous; Mohamed Hédi Maaloul; Lamia Hadrich Belguith

In this paper, we present an automatic summarization method for Arabic documents. This method is based on a numerical approach that uses a semi-supervised learning technique. The proposed method consists of two phases: a learning phase and a use phase. The learning phase is based on the Support Vector Machine (SVM) algorithm. To evaluate our method, we conducted a comparative study of the results generated by our system, AIS (Arabic Intelligent Summarizer), and those produced by a human expert. The obtained results are very encouraging, and we plan to extend our evaluation to a larger corpus to confirm the performance of our system.


ACM Transactions on Asian Language Information Processing | 2014

Splitting Arabic Texts into Elementary Discourse Units

Iskandar Keskes; Farah Benamara Zitoune; Lamia Hadrich Belguith

In this article, we propose the first work that investigates the feasibility of Arabic discourse segmentation into elementary discourse units within the segmented discourse representation theory framework. We first describe our annotation scheme that defines a set of principles to guide the segmentation process. Two corpora have been annotated according to this scheme: elementary school textbooks and newspaper documents extracted from the syntactically annotated Arabic Treebank. Then, we propose a multiclass supervised learning approach that predicts nested units. Our approach uses a combination of punctuation, morphological, lexical, and shallow syntactic features. We investigate how each feature contributes to the learning process. We show that an extensive morphological analysis is crucial to achieve good results in both corpora. In addition, we show that adding chunks does not boost the performance of our system.


conference on intelligent text processing and computational linguistics | 2015

Arabic Transliteration of Romanized Tunisian Dialect Text: A Preliminary Investigation

Abir Masmoudi; Nizar Habash; Mariem Ellouze; Yannick Estève; Lamia Hadrich Belguith

In this paper, we describe the process of converting Tunisian Dialect text that is written in Latin script (also called Arabizi) into Arabic script following the CODA orthography convention for Dialectal Arabic. Our input consists of messages and comments taken from SMS, social networks and broadcast videos. The language used in social media and SMS messaging is characterized by the use of informal and non-standard vocabulary such as repeated letters for emphasis, typos, non-standard abbreviations, and nonlinguistic content, such as emoticons. There is a high degree of variation in spelling in Arabic dialects due to the lack of widely supported orthographic standards in both Arabic and Latin scripts. In the context of natural language processing, transliterating from Arabizi to Arabic script is a necessary step since most recently available tools for processing Arabic Dialects expect Arabic script input.


applications of natural language to data bases | 2010

An automatic definition extraction in Arabic language

Omar Trigui; Lamia Hadrich Belguith; Paolo Rosso

During the last few years, much research has focused on automatic definition extraction in the context of question answering systems. Although this research has been conducted for different languages, none has been proposed for Arabic. In this paper, we tackle automatic definition extraction in the context of question answering systems. We propose a pattern-based method to automatically identify a definition answer to a definition question. The proposed method is implemented in an Arabic definitional question answering system. We tested this system using a set of 50 definition questions and a corpus of 2000 snippets collected from the Web. The obtained results are very encouraging: 94% of the definition questions have complete definitions among their first 5 answers.
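As a rough illustration of pattern-based definition extraction, the sketch below matches two surface patterns against a list of snippets and returns up to five candidate definitions. The patterns, the function names, and the English example snippets are hypothetical; the paper's actual Arabic patterns are not given in this abstract.

```python
import re

def build_patterns(term):
    """Build two illustrative surface patterns for a given term (assumed, not the paper's)."""
    t = re.escape(term)
    return [
        # "<term> is a/an/the <definition>."
        re.compile(rf"\b{t}\s+is\s+(?:a|an|the)\s+([^.]+)\.", re.IGNORECASE),
        # "<term>, a/an/the <definition>,"
        re.compile(rf"\b{t}\s*,\s*(?:a|an|the)\s+([^,]+),", re.IGNORECASE),
    ]

def extract_definitions(term, snippets, top_k=5):
    """Return at most top_k candidate definitions, one per matching snippet."""
    patterns = build_patterns(term)
    found = []
    for snippet in snippets:
        for pattern in patterns:
            match = pattern.search(snippet)
            if match:
                found.append(match.group(1).strip())
                break  # keep only the first pattern that matches this snippet
    return found[:top_k]

# Made-up snippets, used only to exercise the patterns.
snippets = [
    "Chunking is a shallow syntactic parsing task.",
    "Many applications rely on chunking.",
    "Chunking, a task of interest to many applications, is hard for Arabic.",
]
print(extract_definitions("chunking", snippets))
```

The `top_k=5` default mirrors the "first 5 answers" cutoff used in the reported evaluation; ranking the candidates is left out of this sketch.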


applications of natural language to data bases | 2014

Fine-Grained POS Tagging of Spoken Tunisian Dialect Corpora

Rahma Boujelbane; Mariem Mallek; Mariem Ellouze; Lamia Hadrich Belguith

Arabic Dialects (AD) have recently begun to receive more attention from the speech science and technology communities. The use of dialects in language technologies will help improve the development process and the usability of applications such as speech recognition, speech comprehension, or speech synthesis. However, AD face a lack of resources compared to Modern Standard Arabic (MSA). This paper deals with the problem of tagging an AD: the Tunisian Dialect (TD). We present a method for building a fine-grained part-of-speech (POS) tagger for the TD. This method consists of adapting an MSA POS tagger by generating a TD training corpus from an MSA corpus using a bilingual MSA-TD lexicon. The evaluation of the TD tagger on a corpus of text transcriptions achieved an accuracy of 78.5%.
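One simple way to read the corpus-generation step is as a word-level substitution that keeps each POS tag in place. The sketch below assumes exactly that, which is a simplification: the abstract does not say how multi-word entries, ambiguity, or morphology are handled. The tokens and the lexicon entry are placeholders, not real MSA or TD words.

```python
def convert_tagged_corpus(msa_corpus, lexicon):
    """Map each MSA word to its TD equivalent when the lexicon has one, keeping the tag.

    msa_corpus: list of sentences, each a list of (word, tag) pairs.
    lexicon: dict from MSA word to TD word (hypothetical one-to-one mapping).
    """
    td_corpus = []
    for sentence in msa_corpus:
        td_sentence = []
        for word, tag in sentence:
            # Words missing from the lexicon are copied unchanged.
            td_word = lexicon.get(word, word)
            td_sentence.append((td_word, tag))
        td_corpus.append(td_sentence)
    return td_corpus

# Placeholder tokens standing in for real words.
msa_corpus = [[("msa_w1", "NOUN"), ("msa_w2", "VERB")]]
lexicon = {"msa_w1": "td_w1"}
print(convert_tagged_corpus(msa_corpus, lexicon))
```

The resulting TD corpus could then be used to train any standard POS tagger, which is the adaptation idea the abstract describes.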


acs/ieee international conference on computer systems and applications | 2014

Chunking Arabic texts using Conditional Random Fields

Nabil Khoufi; Chafik Aloulou; Lamia Hadrich Belguith

Chunking, or shallow syntactic parsing, is proving to be a task of interest to many natural language processing applications. The task is harder for Arabic because its specific features make it quite different from, and even more ambiguous than, other natural languages when processed. In this paper, we present a method for chunking Arabic texts based on supervised learning. We use the Conditional Random Fields algorithm and the Penn Arabic Treebank to train the model. For the experiments, we use more than 10,100 sentences as training data and 2,524 sentences for testing. The evaluation of the method consists of calculating the accuracy of the generated model, and the results are very encouraging.
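Chunking is commonly cast as a sequence labeling problem by encoding chunks as per-token IOB tags, which is the form a CRF can be trained on. The sketch below shows that encoding and a token-level accuracy measure. Both are assumptions about the setup: the abstract does not state the tag scheme or whether accuracy is measured per token. The English sentence is a made-up example.

```python
def chunks_to_iob(chunks):
    """Encode (label, tokens) chunks as (token, IOB tag) pairs."""
    tagged = []
    for label, tokens in chunks:
        for i, token in enumerate(tokens):
            if label == "O":
                tagged.append((token, "O"))  # token outside any chunk
            else:
                prefix = "B" if i == 0 else "I"  # B opens a chunk, I continues it
                tagged.append((token, f"{prefix}-{label}"))
    return tagged

def token_accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)

chunks = [("NP", ["the", "student"]), ("VP", ["reads"]),
          ("NP", ["a", "book"]), ("O", ["."])]
gold = [tag for _, tag in chunks_to_iob(chunks)]
# A hypothetical model output with one wrong tag (second token).
predicted = ["B-NP", "B-NP", "B-VP", "B-NP", "I-NP", "O"]
print(gold)
print(token_accuracy(gold, predicted))
```

With the sentence counts reported above, training data makes up roughly 80% of the annotated sentences and test data roughly 20%.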

Collaboration


Dive into Lamia Hadrich Belguith's collaborations.

Top Co-Authors

Fatiha Sadat

Université du Québec à Montréal
