Ayah Zirikly
George Washington University
Publication
Featured research published by Ayah Zirikly.
North American Chapter of the Association for Computational Linguistics | 2015
Ayah Zirikly; Mona T. Diab
The majority of research on Arabic Named Entity Recognition (NER) addresses the task for the newswire genre, where the language used is Modern Standard Arabic (MSA); however, the need to study this task in social media is becoming more vital. Social media is characterized by the use of both MSA and Dialectal Arabic (DA), often with code switching between the two language varieties. Despite some common characteristics between MSA and DA, there are significant differences between them, which result in poor performance when MSA-targeting systems are applied for NER in DA. Additionally, most NER systems rely primarily on gazetteers, which can be more challenging in a social media processing context due to an inherent low coverage. In this paper, we present a gazetteer-free NER system for Dialectal data that yields an F1 score of 72.68%, which is an absolute improvement of 2-3% over a comparable state-of-the-art gazetteer-based DA-NER system.
Empirical Methods in Natural Language Processing | 2014
Ayah Zirikly; Mona T. Diab
To date, the majority of research on Arabic Named Entity Recognition (NER) addresses the task for Modern Standard Arabic (MSA) and mainly focuses on the newswire genre. Despite some common characteristics between MSA and Dialectal Arabic (DA), the significant differences between the two language varieties hinder such MSA-specific systems from solving NER for Dialectal Arabic. In this paper, we present an NER system for DA, specifically focusing on the Egyptian Dialect (EGY). Our system delivers ≈ 16% improvement in F1-score over state-of-the-art features.
International Joint Conference on Natural Language Processing | 2015
Ayah Zirikly; Masato Hagiwara
We propose an approach to cross-lingual named entity recognition model transfer without the use of parallel corpora. In addition to global de-lexicalized features, we introduce multilingual gazetteers that are generated using graph propagation, and cross-lingual word representation mappings without the use of parallel data. We target the e-commerce domain, which is challenging due to its unstructured and noisy nature. The experiments have shown that our approaches beat the strong MT baseline, where the English model is transferred to two languages: Spanish and Chinese.
North American Chapter of the Association for Computational Linguistics | 2016
Ayah Zirikly; Varun Kumar; Philip Resnik
Suicide is the third leading cause of death for young people, and in an average U.S. high school classroom, 30% have experienced a long period of feeling hopeless, 20% have been bullied, 16.7% have seriously considered suicide, and 6.7% of students have actually made a suicide attempt. The 2016 ACL Workshop on Computational Linguistics and Clinical Psychology (CLPsych) included a shared task focusing on classification of posts to ReachOut, an online information and support service that provides help to teens and young adults (aged 15-24) who are struggling with mental health issues. The primary goal of the shared task is to identify posts that require urgent attention and review from the ReachOut team (i.e., moderators).
Applications of Natural Language to Data Bases | 2013
Ayah Zirikly; Mona T. Diab
Identifying the different aliases used by or for an entity is emerging as a significant problem in reliable Information Extraction systems, especially with the proliferation of social media and their ever growing impact on different aspects of modern life such as politics, finance, security, etc. In this paper, we address the novel problem of Named Entity Aliasing Resolution (NEAR). We attempt to solve the NEAR problem in a language-independent setting by extracting the different aliases and variants of person named entities. We generate feature vectors for the named entities by building co-occurrence models that use different weighting schemes. The aliasing resolution process applies unsupervised machine learning techniques over the vector space models in order to produce groups of entities along with their aliases. We test our approach on two languages: Arabic and English. We study the impact of varying the level of morphological preprocessing of the words, as well as the part of speech tags surrounding the person named entities, and the named entities’ distribution in the data set. We create novel evaluation data sets for both languages. NEAR yields better overall performance in Arabic than in English for comparable amounts of data, effectively using the POS tag information to improve performance. Our approach achieves an $F_{\beta=1}$ score of 67.85% and 70.03% for raw English and Arabic data sets, respectively.
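The pipeline described in this abstract (co-occurrence vectors for person names, followed by unsupervised grouping in the vector space) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: raw counts stand in for the unspecified weighting schemes, a greedy cosine-threshold grouping stands in for the unspecified unsupervised technique, and the function names, `window`, and `threshold` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, entities, window=2):
    """Count context words within `window` tokens of each entity mention.

    Raw counts are an assumed stand-in for the paper's weighting schemes.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok not in entities:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                # Skip the mention itself and other entity mentions.
                if j != i and tokens[j] not in entities:
                    vectors[tok][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_aliases(vectors, threshold=0.5):
    """Greedy grouping (hypothetical): a name joins the first group that
    already contains a member whose similarity reaches `threshold`."""
    groups = []
    for name in sorted(vectors):
        for group in groups:
            if any(cosine(vectors[name], vectors[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


if __name__ == "__main__":
    # Toy, made-up data: two names share contexts, one does not.
    sentences = [
        ["president", "obama", "signed", "the", "bill"],
        ["president", "barack", "signed", "the", "law"],
        ["singer", "adele", "released", "an", "album"],
    ]
    entities = {"obama", "barack", "adele"}
    print(group_aliases(cooccurrence_vectors(sentences, entities)))
```

On the toy data, the two names that occur in identical contexts fall into one group and the third stays alone. A closer approximation of the abstract would swap in a proper weighting (for example TF-IDF or PMI) and a standard clustering algorithm.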
Archive | 2014
Ayah Zirikly; Mona T. Diab
International Conference on Computational Linguistics | 2016
Ayah Zirikly; Bart Desmet; Mona T. Diab
arXiv: Computation and Language | 2018
Denis Newman-Griffis; Ayah Zirikly
arXiv: Computation and Language | 2018
Sean MacAvaney; Bart Desmet; Arman Cohan; Luca Soldaini; Andrew Yates; Ayah Zirikly; Nazli Goharian
Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic | 2018
Han-Chin Shing; Suraj Nair; Ayah Zirikly; Meir Friedenberg; Hal Daumé; Philip Resnik