Publication


Featured research published by Avneesh Saluja.


Meeting of the Association for Computational Linguistics | 2014

Graph-based Semi-Supervised Learning of Translation Models from Monolingual Data

Avneesh Saluja; Hany Hassan; Kristina Toutanova; Chris Quirk

Statistical phrase-based translation learns translation rules from bilingual corpora, and has traditionally only used monolingual evidence to construct features that rescore existing translation candidates. In this work, we present a semi-supervised graph-based approach for generating new translation rules that leverages bilingual and monolingual data. The proposed technique first constructs phrase graphs using both source and target language monolingual corpora. Next, graph propagation identifies translations of phrases that were not observed in the bilingual corpus, assuming that similar phrases have similar translations. We report results on a large Arabic-English system and a medium-sized Urdu-English system. Our proposed approach significantly improves the performance of competitive phrase-based systems, leading to consistent improvements between 1 and 4 BLEU points on standard evaluation sets.
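
The propagation idea in the abstract above (similar phrases have similar translations) can be illustrated with a minimal sketch of generic label propagation. This is not the authors' algorithm: the function name `propagate`, the toy English-French phrases, and the similarity weights are all made up for illustration, and the update rule is a plain weighted average over neighbors.

```python
from collections import defaultdict


def propagate(edges, seeds, iterations=10):
    """Generic label propagation over a phrase-similarity graph (illustrative only).

    edges: dict mapping phrase -> {neighbor phrase: similarity weight}
    seeds: dict mapping labeled phrase -> {translation: probability}
    Returns a dict mapping every phrase in `edges` to a translation distribution.
    """
    dists = {node: dict(seeds.get(node, {})) for node in edges}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in edges.items():
            if node in seeds:
                # Phrases seen in the bilingual data keep their distribution.
                updated[node] = dict(seeds[node])
                continue
            # Unlabeled phrases take a similarity-weighted average of neighbors.
            scores = defaultdict(float)
            for neighbor, weight in neighbors.items():
                for translation, prob in dists.get(neighbor, {}).items():
                    scores[translation] += weight * prob
            total = sum(scores.values())
            updated[node] = (
                {t: s / total for t, s in scores.items()} if total else {}
            )
        dists = updated
    return dists


# Hypothetical toy graph: "big house" has no known translation.
edges = {
    "big house": {"large house": 0.9, "small house": 0.1},
    "large house": {"big house": 0.9},
    "small house": {"big house": 0.1},
}
seeds = {
    "large house": {"grande maison": 1.0},
    "small house": {"petite maison": 1.0},
}
result = propagate(edges, seeds)
```

In this toy graph the unlabeled phrase inherits most of its probability mass from its closest labeled neighbor, which is the intuition the abstract describes.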


Empirical Methods in Natural Language Processing | 2014

Latent-Variable Synchronous CFGs for Hierarchical Translation

Avneesh Saluja; Chris Dyer; Shay B. Cohen

Data-driven refinement of non-terminal categories has been demonstrated to be a reliable technique for improving monolingual parsing with PCFGs. In this paper, we extend these techniques to learn latent refinements of single-category synchronous grammars, so as to improve translation performance. We compare two estimators for this latent-variable model: one based on EM and the other a spectral algorithm based on the method of moments. We evaluate their performance on a Chinese–English translation task. The results indicate that we can achieve significant gains over the baseline with both approaches, but in particular the moments-based estimator is both faster and performs better than EM.


Machine Translation | 2014

Online discriminative learning for machine translation with binary-valued feedback

Avneesh Saluja; Ying Zhang

Viewing machine translation (MT) as a structured classification problem has provided a gateway for a host of structured prediction techniques to enter the field. In particular, large-margin methods for discriminative training of feature weights, such as the structured perceptron or MIRA, have started to match or exceed the performance of existing methods such as MERT. One issue with these problems in general is the difficulty in obtaining fully structured labels, e.g. in MT, obtaining reference translations or parallel sentence corpora for arbitrary language pairs. Another issue, more specific to the translation domain, is the difficulty in online training and updating of MT systems, since existing methods often require bilingual knowledge to correct translation outputs online. The problem is an important one, especially with the usage of MT in the mobile domain: in the process of translating user inputs, these systems can also receive feedback from the user on the quality of the translations produced. We propose a solution to these two problems, by demonstrating a principled way to incorporate binary-labeled feedback (i.e. feedback on whether a translation hypothesis is a “good” or understandable one or not), a form of supervision that can be easily integrated in an online and monolingual manner, into an MT framework. Experimental results on Chinese–English and Arabic–English corpora for both sparse and dense feature sets show marked improvements by incorporating binary feedback on unseen test data, with gains in some cases exceeding 5.5 BLEU points. Experiments with human evaluators providing feedback present reasonable correspondence with the larger-scale, synthetic experiments and underline the relative ease by which binary feedback for translation hypotheses can be collected, in comparison to parallel data.


Empirical Methods in Natural Language Processing | 2014

Language Modeling with Power Low Rank Ensembles

Ankur P. Parikh; Avneesh Saluja; Chris Dyer; Eric P. Xing

We present power low rank ensembles (PLRE), a flexible framework for n-gram language modeling where ensembles of low rank matrices and tensors are used to obtain smoothed probability estimates of words in context. Our method can be understood as a generalization of n-gram modeling to non-integer n, and includes standard techniques such as absolute discounting and Kneser-Ney smoothing as special cases. PLRE training is efficient and our approach outperforms state-of-the-art modified Kneser-Ney baselines in terms of perplexity on large corpora as well as on BLEU score in a downstream machine translation task.
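
For readers unfamiliar with absolute discounting, which the abstract above names as a special case, here is a minimal sketch of an interpolated bigram model with a fixed discount. It is not PLRE itself; the function name `absolute_discounting`, the discount value 0.75, and the toy sentence are assumptions chosen for illustration.

```python
from collections import Counter


def absolute_discounting(tokens, discount=0.75):
    """Build an interpolated bigram model with absolute discounting (sketch).

    P(w | v) = max(c(v, w) - d, 0) / c(v) + (d * N(v) / c(v)) * P_unigram(w)
    where N(v) is the number of distinct words observed after v.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigrams = Counter(tokens)
    context_counts = Counter(tokens[:-1])
    # Iterating a Counter yields its keys, so this counts distinct followers.
    followers = Counter(context for context, _ in bigrams)
    total = len(tokens)

    def prob(word, context):
        p_unigram = unigrams[word] / total
        c_context = context_counts[context]
        if c_context == 0:
            # Unseen context: fall back to the unigram estimate.
            return p_unigram
        discounted = max(bigrams[(context, word)] - discount, 0.0) / c_context
        backoff_mass = discount * followers[context] / c_context
        return discounted + backoff_mass * p_unigram

    return prob


# Hypothetical toy corpus.
tokens = "the cat sat on the mat the cat ran".split()
prob = absolute_discounting(tokens)
vocabulary = set(tokens)
mass_after_the = sum(prob(word, "the") for word in vocabulary)
```

The mass subtracted from each observed bigram is redistributed through the unigram distribution, so the probabilities for a given context still sum to one.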


International Conference on Artificial Intelligence and Statistics | 2012

Age-Layered Expectation Maximization for Parameter Learning in Bayesian Networks

Avneesh Saluja; Priya Krishnan Sundararajan; Ole J. Mengshoel


Graph-Based Methods for Natural Language Processing | 2013

Graph-Based Unsupervised Learning of Word Similarities Using Heterogeneous Feature Types

Avneesh Saluja; Jiri Navratil


The Association for Computational Linguistics | 2014

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Avneesh Saluja; Chris Dyer; Shay B. Cohen


North American Chapter of the Association for Computational Linguistics | 2018

Using Aspect Extraction Approaches to Generate Review Summaries and User Profiles

Christopher Mitcheltree; Veronica Wharton; Avneesh Saluja


arXiv: Computation and Language | 2018

Paraphrase-Supervised Models of Compositionality

Avneesh Saluja; Chris Dyer; Jean-David Ruvini


International Conference on Data Mining | 2013

Infinite Mixed Membership Matrix Factorization

Avneesh Saluja; Mahdi Pakdaman; Dongzhen Piao; Ankur P. Parikh

Collaboration


Dive into Avneesh Saluja's collaborations.

Top Co-Authors

Chris Dyer

Carnegie Mellon University

Ankur P. Parikh

Carnegie Mellon University

Shay B. Cohen

Carnegie Mellon University

Dongzhen Piao

Carnegie Mellon University

Eric P. Xing

Carnegie Mellon University

Frank Mokaya

Carnegie Mellon University
