Publication


Featured research published by Abdelati Hawwari.


Workshop on Computational Approaches to Code Switching | 2014

Overview for the First Shared Task on Language Identification in Code-Switched Data

Thamar Solorio; Elizabeth Blair; Suraj Maharjan; Steven Bethard; Mona T. Diab; Mahmoud Ghoneim; Abdelati Hawwari; Fahad AlGhamdi; Julia Hirschberg; Alison Chang; Pascale Fung

We present an overview of the first shared task on language identification in code-switched data. The shared task included code-switched data from four language pairs: Modern Standard Arabic-Dialectal Arabic (MSA-DA), Mandarin-English (MAN-EN), Nepali-English (NEP-EN), and Spanish-English (SPA-EN). A total of seven teams participated in the task and submitted 42 system runs. The evaluation showed that language identification at the token level is more difficult when the languages present are closely related, as in the case of MSA-DA, where the prediction performance was the lowest among all language pairs. In contrast, the language pairs with the highest F-measure were SPA-EN and NEP-EN. The task made evident that language identification in code-switched data is still far from solved and warrants further research.


Empirical Methods in Natural Language Processing | 2014

A Framework for the Classification and Annotation of Multiword Expressions in Dialectal Arabic

Abdelati Hawwari; Mohammed Attia; Mona T. Diab

In this paper we describe a framework for classifying and annotating Egyptian Arabic Multiword Expressions (EMWE) in a specialized computational lexical resource. The framework is intended to encompass comprehensive linguistic information for each MWE, including: (a) phonological and orthographic information; (b) POS tags; (c) structural information for the phrase structure of the expression; (d) lexicographic classification; (e) semantic classification covering semantic fields and semantic relations; (f) degree of idiomaticity, for which we adopt a three-level rating scale; (g) pragmatic information in the form of usage labels; (h) Modern Standard Arabic equivalents and English translations, thereby rendering our resource a three-way (Egyptian Arabic, Modern Standard Arabic, and English) repository for MWEs.


Meeting of the Association for Computational Linguistics | 2015

A Pilot Study on Arabic Multi-Genre Corpus Diacritization

Houda Bouamor; Wajdi Zaghouani; Mona T. Diab; Ossama Obeid; Kemal Oflazer; Mahmoud Ghoneim; Abdelati Hawwari

Arabic script writing is typically underspecified for short vowels and other markup, referred to as diacritics. Apart from the lexical ambiguity found in words, similar to that exhibited in other languages, the lack of diacritics in written Arabic script adds another layer of ambiguity which is an artifact of the orthography. Diacritization of written text has a significant impact on Arabic NLP applications. In this paper, we present a pilot study on building a diacritized multi-genre corpus in Arabic. We annotate a sample of non-diacritized words extracted from five text genres. We explore different annotation strategies: Basic, where we present only the bare undiacritized forms to the annotators; Intermediate (Basic forms plus their POS tags); and Advanced (automatically diacritized words). We present the impact of the annotation strategy on annotation quality. Moreover, we study different diacritization schemes in the process.


Language, Culture, Computation (3) | 2014

Arabic Multiword Expressions

Kfir Bar; Mona T. Diab; Abdelati Hawwari

In this work we address the problem of automatic multiword expression identification and classification in Arabic running text. We propose a supervised machine learning approach using a relatively small manually annotated dataset augmented with increasing amounts of automatically tagged data, labeled using a deterministic pattern-matching algorithm. In particular, in this chapter, we show the impact of explicitly modeling morpho-syntactic features calculated on the detection task. Moreover, we present the first work to address the problem of handling gapped verb-noun constructions in running text. We show that using the syntactic construction classes as labels improves identification results for verb-noun and verb-particle constructions. Our best identification algorithm yields an F-measure of 61.4%, which is a significant improvement over our baseline of 48.8%.


Workshop on Computational Approaches to Code Switching | 2016

Part of Speech Tagging for Code Switched Data

Fahad AlGhamdi; Giovanni Molina; Mona T. Diab; Thamar Solorio; Abdelati Hawwari; Victor Soto; Julia Hirschberg

We address the problem of Part of Speech (POS) tagging in the context of linguistic code switching (CS). CS is the phenomenon where a speaker switches between two languages or variants of the same language within or across utterances, known as intra-sentential or inter-sentential CS, respectively. Processing CS data is especially challenging in intra-sentential data given state-of-the-art monolingual NLP technology, since such technology is geared toward the processing of one language at a time. In this paper we explore multiple strategies for applying state-of-the-art POS taggers to CS data. We investigate the landscape in two CS language pairs, Spanish-English and Modern Standard Arabic-Arabic dialects. We compare the use of two POS taggers vs. a unified tagger trained on CS data. Our results show that applying a machine learning framework using two state-of-the-art POS taggers achieves better performance compared to all other approaches that we investigate.


Proceedings of the Third Arabic Natural Language Processing Workshop | 2017

A Layered Language Model based Hybrid Approach to Automatic Full Diacritization of Arabic

Mohamed Al-Badrashiny; Abdelati Hawwari; Mona T. Diab

In this paper we present a system for automatic Arabic text diacritization using three levels of analysis granularity in a layered back-off manner. We build and exploit diacritized language models (LMs) for each of three different levels of granularity: surface form, morphologically segmented form (prefix/stem/suffix), and character level. For each of the passes, we use Viterbi search to pick the most probable diacritization per word in the input. We start with the surface form LM, followed by the morphological level, and finally we leverage the character level LM. Our system outperforms all of the published systems evaluated against the same training and test data. It achieves a 10.87% WER for complete full diacritization including lexical and syntactic diacritization, and 3.0% WER for lexical diacritization, ignoring syntactic diacritization.


International Journal of Speech Technology | 2016

AMPN: a semantic resource for Arabic morphological patterns

Wajdi Zaghouani; Abdelati Hawwari; Mona T. Diab; Tim O'Gorman; Ahmed Badran

In this paper, we present a pilot Arabic morphological Pattern Net study based on a lexical semantic resource. During this study, a limited number of Arabic Morphological Patterns have been selected in order to analyze the structure and the behavior of the verbs in the Arabic PropBank, which is a semantically annotated corpus of newswire text from the Annahar Journal. Our goal is twofold: (a) to study whether there is a direct relationship between morphological patterns and verbal semantic roles; and (b) to verify that this direct relationship is a pervasive component of Arabic verb morphology. The approach to building our morphological Patterns database is based on linguistic generalization of the semantic roles of the verbal predicates. The results obtained show a promising outcome for a future, more comprehensive study.


Proceedings of the Twelfth Meeting of the Special Interest Group on Computational Morphology and Phonology | 2012

A Morphological Analyzer for Egyptian Arabic

Nizar Habash; Ramy Eskander; Abdelati Hawwari


Language Resources and Evaluation | 2014

Tharwa: A Large Scale Dialectal Arabic - Standard Arabic - English Lexicon

Mona T. Diab; Mohamed Al-Badrashiny; Maryam Aminian; Mohammed Attia; Heba Elfardy; Nizar Habash; Abdelati Hawwari; Wael Salloum; Pradeep Dasigi; Ramy Eskander


Meeting of the Association for Computational Linguistics | 2012

Building an Arabic Multiword Expressions Repository

Abdelati Hawwari; Kfir Bar; Mona T. Diab

Collaboration


Dive into Abdelati Hawwari's collaborations.

Top Co-Authors

Mona T. Diab, George Washington University

Mahmoud Ghoneim, George Washington University

Wajdi Zaghouani, Carnegie Mellon University

Houda Bouamor, Carnegie Mellon University

Ossama Obeid, Carnegie Mellon University

Kemal Oflazer, Carnegie Mellon University

Thamar Solorio, University of Alabama at Birmingham