Gilbert Badaro
American University of Beirut
Publication
Featured research published by Gilbert Badaro.
empirical methods in natural language processing | 2014
Gilbert Badaro; Ramy Baly; Hazem M. Hajj; Nizar Habash; Wassim El-Hajj
Most opinion mining methods in English rely successfully on sentiment lexicons, such as English SentiWordnet (ESWN). While there have been efforts towards building Arabic sentiment lexicons, they suffer from many deficiencies: limited size, an unclear usability plan given Arabic’s rich morphology, or a lack of public availability. In this paper, we address all of these issues and produce the first publicly available large-scale Standard Arabic sentiment lexicon (ArSenL) using a combination of existing resources: ESWN, Arabic WordNet, and the Standard Arabic Morphological Analyzer (SAMA). We compare and combine two methods of constructing this lexicon with an eye on insights for Arabic dialects and other low-resource languages. We also present an extrinsic evaluation in terms of subjectivity and sentiment analysis.
meeting of the association for computational linguistics | 2015
Gilbert Badaro; Ramy Baly; Rana Akel; Linda Fayad; Jeffrey Khairallah; Hazem M. Hajj; Khaled Bashir Shaban; Wassim El-Hajj
Most advanced mobile applications require server-based processing and communication. This often causes additional energy consumption on the already energy-limited mobile devices. In this work, we provide a method to address these limitations on the mobile device for opinion mining in Arabic. Instead of relying on compute-intensive NLP processing, the method uses an Arabic lexical resource stored on the device. Text is stemmed, and the words are then matched to ArSenL, a lexicon we developed. ArSenL is the first publicly available large-scale Standard Arabic sentiment lexicon, developed using a combination of English SentiWordnet (ESWN), Arabic WordNet, and the Arabic Morphological Analyzer (AraMorph). The scores from the matched stems are then processed through a classifier to determine the polarity. The method was tested on a published set of Arabic tweets, and an average accuracy of 67% was achieved. The developed mobile application is also made publicly available. The application takes as input a topic of interest and retrieves the latest Arabic tweets related to this topic. It then displays the tweets superimposed with colors representing sentiment labels: positive, negative, or neutral. The application also provides visual summaries of searched topics and a history showing how the sentiments for a certain topic have evolved.
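The lexicon-matching pipeline described in this abstract (stem the text, look up each stem's sentiment scores, then decide the polarity) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the lexicon entries and scores are invented English placeholders rather than ArSenL data, `toy_stem` stands in for a real Arabic stemmer, and a simple margin rule replaces the classifier used in the paper.

```python
from typing import Dict, Tuple

# Hypothetical toy lexicon: stem -> (positive score, negative score).
# Entries and scores are invented for illustration; they are NOT taken from ArSenL.
TOY_LEXICON: Dict[str, Tuple[float, float]] = {
    "good": (0.75, 0.0),
    "happy": (0.875, 0.0),
    "bad": (0.0, 0.625),
    "sad": (0.0, 0.75),
}


def toy_stem(token: str) -> str:
    """Placeholder stemmer (assumption): lowercase and strip a trailing 's'."""
    token = token.lower()
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token


def score_text(text: str) -> Tuple[float, float]:
    """Sum the positive and negative scores of every stem found in the lexicon."""
    pos = 0.0
    neg = 0.0
    for token in text.split():
        entry = TOY_LEXICON.get(toy_stem(token))
        if entry is not None:
            pos += entry[0]
            neg += entry[1]
    return pos, neg


def classify(text: str, margin: float = 0.1) -> str:
    """Margin rule standing in for the paper's classifier (assumption)."""
    pos, neg = score_text(text)
    if pos - neg > margin:
        return "positive"
    if neg - pos > margin:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sample in ["good happy day", "bad sad news", "the weather report"]:
        print(sample, "->", classify(sample))
```

Because the lookup is a dictionary access per token, this kind of scoring can run entirely on the device, which is the property the abstract emphasizes.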
acm transactions on asian and low resource language information processing | 2017
Ahmad Al-Sallab; Ramy Baly; Hazem M. Hajj; Khaled Bashir Shaban; Wassim El-Hajj; Gilbert Badaro
While research on English opinion mining has already achieved significant progress and success, work on Arabic opinion mining is still lagging. This is mainly due to the relative recency of research efforts in developing natural language processing (NLP) methods for Arabic, the difficulty of handling its morphological complexity, and the lack of large-scale opinion resources for Arabic. To close this gap, we examine the class of models used for English that do not require extensive use of NLP or opinion resources. In particular, we consider the Recursive Auto Encoder (RAE). However, RAE models are not as successful in Arabic as they are in English, due to their limitations in handling the morphological complexity of Arabic, providing more complete and comprehensive input features for the auto encoder, and performing semantic composition following the natural way constituents are combined to express the overall meaning. In this article, we propose A Recursive Deep Learning Model for Opinion Mining in Arabic (AROMA) that addresses these limitations. AROMA was evaluated on three Arabic corpora representing different genres and writing styles. Results show that AROMA achieved significant performance improvements compared to the baseline RAE. It also outperformed several well-known approaches in the literature.
Proceedings of the Third Arabic Natural Language Processing Workshop | 2017
Ramy Baly; Gilbert Badaro; Georges El-Khoury; Rawan Moukalled; Rita Aoun; Hazem M. Hajj; Wassim El-Hajj; Nizar Habash; Khaled Bashir Shaban
Opinion mining in Arabic is a challenging task given the rich morphology of the language. The task becomes more challenging when it is applied to Twitter data, which contains additional sources of noise, such as the use of unstandardized dialectal variations, nonconformity to grammatical rules, the use of Arabizi and code-switching, and the use of non-text objects such as images and URLs to express opinion. In this paper, we perform an analytical study to observe how such linguistic phenomena vary across different Arab regions. This study of Arabic Twitter characterization aims at providing a better understanding of Arabic tweets and fostering advanced research on the topic. Furthermore, we explore the performance of the two schools of machine learning on Arabic Twitter, namely the feature engineering approach and the deep learning approach. We consider models that have achieved state-of-the-art performance for opinion mining in English. Results highlight the advantages of using deep learning-based models, and confirm the importance of using morphological abstractions to address Arabic’s complex morphology.
social network mining and analysis | 2014
Gilbert Badaro; Hazem M. Hajj; Ali Haddad; Wassim El-Hajj; Khaled Bashir Shaban
Recommender systems face performance challenges when dealing with sparse data. This paper addresses these challenges and proposes the use of Harmonic Analysis. The method provides a novel approach to the user-item matrix and extracts the interplay between users and items at multiple resolution levels. New affinity matrices are defined to measure similarities among users, among items, and across items and users. Furthermore, the similarities are assessed at multiple levels of granularity, allowing individual and group level similarities. These affinity matrices thus produce multiresolution groupings of items and users, and in turn lead to higher accuracy in matching similar context for ratings, and more accurate prediction of new ratings. Evaluation results show the superiority of the approach compared to state-of-the-art solutions.
international conference on data mining | 2014
Gilbert Badaro; Hazem M. Hajj; Ali Haddad; Wassim El-Hajj; Khaled Bashir Shaban
Recommender systems provide recommendations on a variety of personal activities or relevant items of interest. They can play a significant role in e-commerce and in daily personal decisions. However, existing recommender systems still face challenges in dealing with sparse data while achieving high accuracy and reasonable performance. Missing ratings lead to inaccuracies when trying to match items or users for rating prediction. In this paper, we propose to address these challenges with the use of Harmonic Analysis. The paper extends our previous work and provides a comprehensive coverage of the method with additional experiments. The method provides a novel multiresolution approach to the user-item matrix and extracts the interplay between users and items at multiple resolution levels. New affinity matrices are defined to measure similarities among users, among items, and across items and users. Furthermore, the similarities are assessed at multiple levels of granularity, allowing individual and group level similarities. These affinity matrices thus produce multiresolution groupings of items and users, and in turn lead to higher accuracy in matching similar context for ratings, and more accurate prediction of new ratings. The evaluation of the system shows the superiority of the solution compared to state-of-the-art solutions for user-based collaborative filtering and item-based collaborative filtering.
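For context, the user-based collaborative filtering baseline that this abstract compares against can be sketched in a few lines of Python. This is not the paper's harmonic-analysis method: the sketch uses plain cosine similarity as the user-user affinity and a similarity-weighted average for prediction, and the function names and the sparse dictionary representation are assumptions made for illustration.

```python
import math
from typing import Dict, Optional

# Sparse ratings (assumed representation): user -> {item: rating}.
# Missing ratings are simply absent from the inner dictionary.
Ratings = Dict[str, Dict[str, float]]


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings: Ratings, user: str, item: str) -> Optional[float]:
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user with nonzero similarity rated the item,
    which is the sparse-data failure case the abstract refers to.
    """
    num = 0.0
    den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den > 0 else None


if __name__ == "__main__":
    data: Ratings = {
        "u1": {"a": 5, "b": 3},
        "u2": {"a": 5, "b": 3, "c": 4},
        "u3": {"a": 1, "c": 2},
    }
    print(predict(data, "u1", "c"))  # pulled toward u2's rating, the closer user
    print(predict(data, "u1", "z"))  # nobody rated "z"
```

With few overlapping ratings, many similarities collapse to zero and predictions become unavailable or unreliable, which is the gap the multiresolution affinity matrices are meant to address.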
meeting of the association for computational linguistics | 2015
Ahmad A. Al Sallab; Hazem M. Hajj; Gilbert Badaro; Ramy Baly; Wassim El Hajj; Khaled Bashir Shaban
international conference on wireless communications and mobile computing | 2013
Gilbert Badaro; Hazem M. Hajj; Wassim El-Hajj; Lama Nachman
meeting of the association for computational linguistics | 2017
Ramy Baly; Gilbert Badaro; Ali Hamdi; Rawan Moukalled; Rita Aoun; Georges El-Khoury; Ahmad A. Al Sallab; Hazem M. Hajj; Nizar Habash; Khaled Bashir Shaban; Wassim El-Hajj
north american chapter of the association for computational linguistics | 2018
Gilbert Badaro; Obeida El Jundi; Alaa Khaddaj; Alaa Maarouf; Raslan Kain; Hazem M. Hajj; Wassim El-Hajj