
Publications


Featured research published by Hassan Sajjad.


Conference of the European Chapter of the Association for Computational Linguistics | 2014

Integrating an Unsupervised Transliteration Model into Statistical Machine Translation

Nadir Durrani; Hassan Sajjad; Hieu Hoang; Philipp Koehn

We investigate three methods for integrating an unsupervised transliteration model into an end-to-end SMT system. We induce a transliteration model from parallel data and use it to translate OOV words. Our approach is fully unsupervised and language independent. Across the integration methods, we observed improvements of 0.23–0.75 BLEU points (0.41 on average) over 7 language pairs. We also show that our mined transliteration corpora provide better rule coverage and translation quality than the gold-standard transliteration corpora.
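
One of the integration points the abstract describes is replacing OOV words that the decoder passes through untranslated. A minimal sketch of that idea, assuming a hypothetical transliterate() helper and a toy character table in place of the induced character-level model:

# A minimal sketch of post-decoding OOV replacement; the helper names and
# the character table below are illustrative, not the paper's code.
def transliterate(word):
    # toy character mapping; the real model is induced from parallel data
    table = {"ق": "q", "ر": "r", "ط": "t", "ب": "b", "ة": "a"}
    return "".join(table.get(ch, "") for ch in word)

def replace_oovs(decoder_output, oov_words):
    # swap each OOV token the decoder copied through for its transliteration
    return " ".join(transliterate(tok) if tok in oov_words else tok
                    for tok in decoder_output.split())

print(replace_oovs("he lives in قرطبة", {"قرطبة"}))  # -> "he lives in qrtba"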


Social Network Analysis and Mining | 2015

Bridging social media via distant supervision

Walid Magdy; Hassan Sajjad; Tarek El-Ganainy; Fabrizio Sebastiani

Microblog classification has received a lot of attention in recent years. Different classification tasks have been investigated, most of them focusing on classifying microblogs into a small number of classes (five or less) using a training set of manually annotated tweets. Unfortunately, labelling data is tedious and expensive, and finding tweets that cover all the classes of interest is not always straightforward, especially when some of the classes do not frequently arise in practice. In this paper, we study an approach to tweet classification based on distant supervision, whereby we automatically transfer labels from one social medium to another for a single-label multi-class classification task. In particular, we apply YouTube video classes to tweets linking to these videos. This provides, for free, a virtually unlimited number of labelled instances that can be used as training data. The classification experiments we have run show that training a tweet classifier via these automatically labelled data achieves substantially better performance than training the same classifier with a limited amount of manually labelled data; this is advantageous, given that the automatically labelled data come at no cost. Further investigation of our approach shows its robustness when applied with different numbers of classes and across different languages.
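
The labelling step itself is simple enough to sketch. The snippet below illustrates the transfer, assuming a made-up video-to-category mapping in place of the YouTube metadata lookup:

# A minimal sketch of the distant-supervision step; the mapping and helper
# names are hypothetical stand-ins for the YouTube metadata lookup.
import re

video_category = {"dQw4w9WgXcQ": "Music"}  # toy video-id -> class mapping

def distant_label(tweet):
    # transfer the linked video's class to the tweet, or None if no link
    m = re.search(r"youtube\.com/watch\?v=([\w-]+)", tweet)
    return video_category.get(m.group(1)) if m else None

tweets = ["love this track youtube.com/watch?v=dQw4w9WgXcQ", "no video here"]
training_data = [(t, distant_label(t)) for t in tweets if distant_label(t)]
print(training_data)  # labelled instances obtained without manual annotation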


Empirical Methods in Natural Language Processing | 2014

Verifiably Effective Arabic Dialect Identification

Kareem Darwish; Hassan Sajjad; Hamdy Mubarak

Several recent papers on Arabic dialect identification have hinted that using a word unigram model is sufficient and effective for the task. However, most previous work was done on a standard, fairly homogeneous dataset of dialectal user comments. In this paper, we show that training on the standard dataset does not generalize, because a unigram model may be tuned to topics in the comments and does not capture the distinguishing features of dialects. We show that effective dialect identification requires accounting for the distinguishing lexical, morphological, and phonological phenomena of dialects, and that doing so improves dialect detection accuracy by nearly 10% absolute.
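
A minimal sketch of the feature idea, with invented marker lists; the paper's actual lexical, morphological, and phonological features differ:

# Augmenting a word-unigram model with dialect-specific lexical cues.
EGYPTIAN_MARKERS = {"ازيك", "دلوقتي"}   # toy Egyptian lexical cues
LEVANTINE_MARKERS = {"هلق", "شو"}       # toy Levantine lexical cues

def features(sentence):
    toks = sentence.split()
    feats = {f"uni={t}": 1 for t in toks}                    # unigram baseline
    feats["egy"] = sum(t in EGYPTIAN_MARKERS for t in toks)  # dialect cues
    feats["lev"] = sum(t in LEVANTINE_MARKERS for t in toks)
    return feats  # feed to any linear classifier, e.g. logistic regression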


Empirical Methods in Natural Language Processing | 2015

How to Avoid Unwanted Pregnancies: Domain Adaptation using Neural Network Models

Shafiq R. Joty; Hassan Sajjad; Nadir Durrani; Kamla Al-Mannai; Ahmed Abdelali; Stephan Vogel

We present novel models for domain adaptation based on the neural network joint model (NNJM). Our models minimize the cross entropy by regularizing the loss function with respect to the in-domain model. Domain adaptation is carried out by assigning higher weights to out-domain sequences that are similar to the in-domain data. In an alternative model, we take a more restrictive approach by additionally penalizing sequences similar to the out-domain data. Our models achieve better perplexities than the baseline NNJM models and give improvements of up to 0.5 and 0.6 BLEU points on the Arabic-to-English and English-to-German language pairs, on a standard task of translating TED talks.
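
The instance-weighting intuition can be sketched as follows, with placeholder in- and out-domain scorers; this is not the authors' implementation:

# Out-domain sequences that the in-domain model likes get more weight.
# logp_in / logp_out are hypothetical log-probability functions.
import math

def instance_weight(seq, logp_in, logp_out):
    return math.exp(logp_in(seq) - logp_out(seq))

def weighted_loss(batch, model_logprob, logp_in, logp_out):
    # cross entropy with each sequence scaled by its domain-similarity weight
    return -sum(instance_weight(s, logp_in, logp_out) * model_logprob(s)
                for s in batch) / len(batch)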


Workshop on Statistical Machine Translation | 2015

How do Humans Evaluate Machine Translation

Francisco Guzmán; Ahmed Abdelali; Irina P. Temnikova; Hassan Sajjad; Stephan Vogel

In this paper, we take a closer look at the MT evaluation process from a glass-box perspective using eye-tracking. We analyze two aspects of the evaluation task: the background of evaluators (monolingual or bilingual) and the sources of information available to them, and we evaluate both using time and consistency as criteria. Our findings show that monolinguals are slower but more consistent than bilinguals, especially when only target-language information is available. When exposed to various sources of information, evaluators in general take more time, and in the case of monolinguals there is a drop in consistency. Our findings suggest that, for consistent and cost-effective MT evaluation, it is better to use monolinguals with only target-language information.


Computational Linguistics | 2017

Statistical models for unsupervised, semi-supervised, and supervised transliteration mining

Hassan Sajjad; Helmut Schmid; Alexander M. Fraser; Hinrich Schütze

We present a generative model that efficiently mines transliteration pairs in a consistent fashion in three different settings: unsupervised, semi-supervised, and supervised transliteration mining. The model interpolates two sub-models, one for the generation of transliteration pairs and one for the generation of non-transliteration pairs (i.e., noise). The model is trained on noisy unlabeled data using the EM algorithm. During training, the transliteration sub-model learns to generate transliteration pairs while the fixed non-transliteration model generates the noise pairs. After training, the unlabeled data are disambiguated based on the posterior probabilities of the two sub-models. We evaluate our transliteration mining system on data from a transliteration mining shared task and on parallel corpora. For three out of four language pairs, our system outperforms all semi-supervised and supervised systems that participated in the NEWS 2010 shared task. On word pairs extracted from parallel corpora with fewer than 2% transliteration pairs, our system achieves up to 86.7% F-measure with 77.9% precision and 97.8% recall.
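
The EM training loop for such a two-component mixture is compact enough to sketch; p_trans and p_noise below are placeholder sub-model scorers, and the re-training of the transliteration sub-model inside the M-step is omitted:

# Fit the interpolation weight lam between a transliteration sub-model and a
# fixed noise sub-model on unlabeled word pairs.
def em(pairs, p_trans, p_noise, lam=0.5, iters=20):
    for _ in range(iters):
        # E-step: posterior probability that each pair is a transliteration
        post = [lam * p_trans(x) / (lam * p_trans(x) + (1 - lam) * p_noise(x))
                for x in pairs]
        # M-step: update the interpolation weight (sub-model update omitted)
        lam = sum(post) / len(post)
    # pairs whose posterior clears a threshold are mined as transliterations
    return lam, post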


North American Chapter of the Association for Computational Linguistics | 2016

Eyes Don't Lie: Predicting Machine Translation Quality Using Eye Movement

Hassan Sajjad; Francisco Guzmán; Nadir Durrani; Ahmed Abdelali; Houda Bouamor; Irina P. Temnikova; Stephan Vogel

Poorly translated text is often disfluent and difficult to read. In contrast, well-formed translations require less time to process. In this paper, we model the differences in reading patterns of Machine Translation (MT) evaluators using novel features extracted from their gaze data, and we learn to predict the quality scores given by those evaluators. We test our predictions in a pairwise ranking scenario, measuring Kendall’s tau correlation with the judgments. We show that our features provide information beyond fluency, and can be combined with BLEU for better predictions. Furthermore, our results show that reading patterns can be used to build semi-automatic metrics that anticipate the scores given by the evaluators.
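
The final evaluation step is easy to reproduce with SciPy; the scores below are invented for illustration:

# Scoring a gaze-based quality model against human judgments.
from scipy.stats import kendalltau

human = [4, 2, 5, 3, 1]                # quality scores from evaluators
predicted = [3.8, 2.1, 4.6, 3.3, 1.4]  # model output from gaze features
tau, p = kendalltau(human, predicted)
print(f"Kendall's tau = {tau:.2f}")    # agreement with human rankings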


Computer Speech & Language | 2017

Domain adaptation using neural network joint model

Shafiq R. Joty; Nadir Durrani; Hassan Sajjad; Ahmed Abdelali

Two sets of novel extensions of the NNJM model are proposed. The NDAM models, which regularize the loss function with respect to an in-domain model, give an improvement of up to +0.4 BLEU points. The NFM models, which fuse in- and out-domain NNJM models, give an improvement of up to +0.9 BLEU points and also beat state-of-the-art phrase-table adaptation methods; the gains from NNJM and phrase-table adaptation were found to be additive. We explore neural joint models for the task of domain adaptation in machine translation in two ways: (i) we apply state-of-the-art domain adaptation techniques, such as mixture modelling and data selection, using the recently proposed Neural Network Joint Model (NNJM) (Devlin et al., 2014); (ii) we propose two novel approaches to perform adaptation through instance weighting and weight readjustment in the NNJM framework. In our first approach, we propose a pair of models called Neural Domain Adaptation Models (NDAM) that minimize the cross entropy by regularizing the loss function with respect to the in-domain (and optionally the out-domain) model. In the second approach, we present a set of Neural Fusion Models (NFM) that combine the in- and the out-domain models by readjusting their parameters based on the in-domain data. We evaluated our models on the standard task of translating English-to-German and Arabic-to-English TED talks. The NDAM models achieved better perplexities and modest BLEU improvements compared to the baseline NNJM, trained either on in-domain data or on a concatenation of in- and out-domain data. The NFM models, on the other hand, obtained significant improvements of up to +0.9 and +0.7 BLEU points, respectively. We also demonstrate improvements over existing adaptation methods such as instance weighting, phrase-table fill-up, and linear and log-linear interpolation.
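
Viewed purely at the scoring level, the fusion intuition resembles a log-linear combination of the two models; note the actual NFM readjusts network parameters rather than mixing scores:

# logp_in / logp_out are hypothetical in- and out-domain NNJM log-probs.
def fused_logprob(seq, logp_in, logp_out, alpha=0.7):
    # alpha is tuned on in-domain data; alpha -> 1 trusts the in-domain model
    return alpha * logp_in(seq) + (1 - alpha) * logp_out(seq)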


Empirical Methods in Natural Language Processing | 2015

An Unsupervised Method for Discovering Lexical Variations in Roman Urdu Informal Text

Abdul Rafae; Abdul Qayyum; Muhammad Moeenuddin; Asim Karim; Hassan Sajjad; Faisal Kamiran

We present an unsupervised method to find lexical variations in Roman Urdu informal text. Our method includes a phonetic algorithm, UrduPhone; a feature-based similarity function; and a clustering algorithm, Lex-C. UrduPhone encodes Roman Urdu strings into their phonetic equivalent representations, which produces an initial grouping of different spelling variations of a word. The similarity function incorporates word features and their context. Lex-C is a variant of the k-medoids clustering algorithm that groups lexical variations; it incorporates a similarity threshold to balance the number of clusters against their maximum similarity. We test our system on two datasets, of SMS messages and blogs, and show an F-measure gain of up to 12% over baseline systems.
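
The phonetic-grouping step can be sketched with an invented, Soundex-like scheme; the real UrduPhone encoding is tailored to Roman Urdu and more elaborate:

# Map spelling variants of a word onto one key by collapsing confusable
# consonants and dropping interior vowels (toy sound classes, for illustration).
COLLAPSE = {"z": "j", "w": "v", "q": "k", "c": "k"}

def encode(word):
    collapsed = "".join(COLLAPSE.get(ch, ch) for ch in word.lower())
    # keep the first consonant class, drop later vowels so variants merge
    return collapsed[0] + "".join(c for c in collapsed[1:] if c not in "aeiou")

# spelling variants of the same word should share one key
print({w: encode(w) for w in ["zindagi", "jindagi", "zindagee"]})  # all 'jndg'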


Meeting of the Association for Computational Linguistics | 2013

Translating Dialectal Arabic to English

Hassan Sajjad; Kareem Darwish; Yonatan Belinkov

Collaboration


Dive into Hassan Sajjad's collaborations.

Top Co-Authors

Stephan Vogel

Qatar Computing Research Institute

Francisco Guzmán

Qatar Computing Research Institute

Fabrizio Sebastiani

Qatar Computing Research Institute
