Milos Stanojevic
University of Amsterdam
Publications
Featured research published by Milos Stanojevic.
workshop on statistical machine translation | 2015
Milos Stanojevic; Amir Kamran; Philipp Koehn; Ondřej Bojar
This paper presents the results of the WMT15 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT15 Shared Translation Task. We collected scores of 46 metrics from 11 research groups. In addition, we computed scores of 7 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric's scores correlate with the WMT15 official manual ranking of systems) and in terms of segment-level correlation (how often a metric agrees with humans in comparing two translations of a particular sentence).
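The two evaluation modes above are easy to state concretely. Below is a minimal sketch of both, assuming SciPy is available; all scores are invented, and the official segment-level variant handles ties with extra care that this sketch ignores.

```python
# Sketch of the two correlation modes used in WMT metrics tasks.
# All numbers are illustrative, not shared task data.
from scipy.stats import pearsonr

# System level: one metric score and one human score per MT system.
metric_sys = [0.31, 0.42, 0.38, 0.55]   # metric's corpus-level scores
human_sys  = [0.20, 0.45, 0.35, 0.60]   # official manual evaluation scores
print("system-level Pearson r:", pearsonr(metric_sys, human_sys)[0])

# Segment level: Kendall-tau-style agreement with pairwise human
# preferences between two translations of the same source sentence.
def segment_tau(pairs):
    """pairs: (metric_score_a, metric_score_b, human_prefers_a) tuples."""
    concordant = sum((a > b) == prefers_a for a, b, prefers_a in pairs)
    discordant = len(pairs) - concordant
    return (concordant - discordant) / (concordant + discordant)

pairs = [(0.7, 0.5, True), (0.3, 0.6, False), (0.8, 0.4, False)]
print("segment-level tau:", segment_tau(pairs))
```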
workshop on statistical machine translation | 2014
Milos Stanojevic; Khalil Sima'an
We present the UvA-ILLC submission of the BEER metric to WMT 14 metrics task. BEER is a sentence level metric that can incorporate a large number of features combined in a linear model. Novel contributions are (1) efficient tuning of a large number of features for maximizing correlation with human system ranking, and (2) novel features that give smoother sentence level scores.
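At its core, a metric of this kind scores a sentence with a dot product between a feature vector and learned weights. A minimal sketch, with invented feature names and weights (BEER's real features include character n-gram matches and reordering statistics):

```python
# Linear sentence-level metric in the spirit of BEER. Feature names and
# weights below are invented for illustration.
def linear_metric(features, weights):
    """Score a hypothesis as a weighted sum of dense features."""
    return sum(weights[name] * value for name, value in features.items())

features = {"char_ngram_precision": 0.62, "char_ngram_recall": 0.58,
            "reordering_score": 0.71}
weights  = {"char_ngram_precision": 0.4, "char_ngram_recall": 0.4,
            "reordering_score": 0.2}
print(linear_metric(features, weights))
```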
empirical methods in natural language processing | 2014
Milos Stanojevic; Khalil Sima'an
Sentence level evaluation in MT has turned out far more difficult than corpus level evaluation. Existing sentence level metrics employ a limited set of features, most of which are rather sparse at the sentence level, and their intricate models are rarely trained for ranking. This paper presents a simple linear model exploiting 33 relatively dense features, some of which are novel while others are known but seldom used, and trains it under a learning-to-rank framework. We evaluate our metric on the standard WMT12 data, showing that it outperforms the strong baseline METEOR. We also analyze the contribution of individual features and the choice of training data, language-pair vs. target-language data, providing new insights into this task.
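Training under learning-to-rank with pairwise human judgements reduces to binary classification on feature differences, a standard trick. A sketch under that reduction, assuming scikit-learn; the three placeholder features stand in for the paper's 33, and the exact training setup may differ:

```python
# Pairwise learning-to-rank reduced to binary classification.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each judgement: a human preferred translation A over translation B.
feats_a = np.array([[0.7, 0.6, 0.5], [0.4, 0.5, 0.6], [0.8, 0.7, 0.6]])
feats_b = np.array([[0.5, 0.5, 0.4], [0.6, 0.6, 0.7], [0.3, 0.4, 0.5]])

# Train on feature differences in both directions to balance the classes.
X = np.vstack([feats_a - feats_b, feats_b - feats_a])
y = np.array([1, 1, 1, 0, 0, 0])   # 1 = first translation preferred
model = LogisticRegression().fit(X, y)

# The learned coefficients double as the weights of the linear metric.
print("metric weights:", model.coef_[0])
```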
workshop on statistical machine translation | 2015
Milos Stanojevic; Khalil Sima'an
We describe the submissions of ILLC UvA to the metrics and tuning tasks on WMT15. Both submissions are based on the BEER evaluation metric originally presented at WMT14 (Stanojevic and Sima'an, 2014a). The main changes introduced this year are: (i) extending the learning-to-rank trained sentence level metric to the corpus level (but still decomposable to sentence level), (ii) incorporating syntactic ingredients based on dependency trees, and (iii) a technique for finding parameters of BEER that avoid "gaming of the metric" during tuning.
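Change (i) amounts to defining the corpus score so that it decomposes over sentences. One simple decomposition is a length-weighted average of sentence scores, sketched below; BEER's actual aggregation may differ in detail:

```python
# A corpus-level score that remains decomposable to sentence level:
# a length-weighted average of sentence-level scores. Illustrative only.
def corpus_score(sentence_scores, sentence_lengths):
    total = sum(s * l for s, l in zip(sentence_scores, sentence_lengths))
    return total / sum(sentence_lengths)

print(corpus_score([0.8, 0.6, 0.9], [10, 25, 7]))
```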
empirical methods in natural language processing | 2015
Milos Stanojevic; Khalil Sima'an
We present a novel approach for unsupervised induction of a Reordering Grammar using a modified form of permutation trees (Zhang and Gildea, 2007), which we apply to preordering in phrase-based machine translation. Unlike previous approaches, we induce in one step both the hierarchical structure and the transduction function over it from word-aligned parallel corpora. Furthermore, our model (1) handles non-ITG reordering patterns (up to 5-ary branching), (2) is learned from all derivations by treating not only labeling but also bracketing as a latent variable, (3) is entirely unlexicalized at the level of reordering rules, and (4) requires no linguistic annotation. Our model is evaluated both for accuracy in predicting target order and for its impact on translation quality. We report significant performance gains over phrase reordering and over two known preordering baselines for English-Japanese.
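The ITG/non-ITG distinction in (1) can be made concrete: a permutation is representable with binary ITG-style splits iff it can be recursively divided into two contiguous blocks. A small self-contained check (illustrative, not the paper's induction code):

```python
# A permutation over 0..n-1 is ITG-representable iff it recursively
# splits into two blocks, where a block covers a contiguous value range.
# Patterns like [1, 3, 0, 2] ("2413") are not, which is why the model
# allows reordering operators with up to 5-ary branching.
def is_block(span):
    return max(span) - min(span) + 1 == len(span)

def is_itg(perm):
    if len(perm) <= 1:
        return True
    return any(is_block(perm[:k]) and is_block(perm[k:])
               and is_itg(perm[:k]) and is_itg(perm[k:])
               for k in range(1, len(perm)))

print(is_itg([2, 3, 0, 1, 4]))  # True: [2,3] [0,1] [4] nest as blocks
print(is_itg([1, 3, 0, 2]))     # False: the classic non-ITG pattern
```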
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers | 2016
Joachim Daiber; Milos Stanojevic; Wilker Aziz; Khalil Sima'an
We study the relationship between word order freedom and preordering in statistical machine translation. To assess word order freedom, we first introduce a novel entropy measure which quantifies how difficult it is to predict word order given a source sentence and its syntactic analysis. We then address preordering for two target languages at the far ends of the word order freedom spectrum, German and Japanese, and argue that for languages with more word order freedom, attempting to predict a unique word order given source clues only is less justified. Subsequently, we examine lattices of n-best word order predictions as a unified representation for languages from across this broad spectrum and present an effective solution to a resulting technical issue, namely how to select a suitable source word order from the lattice during training. Our experiments show that lattices are crucial for good empirical performance for languages with freer word order (English-German) and can provide additional improvements for fixed word order languages (English-Japanese).
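The entropy measure can be pictured over a model's distribution of candidate word orders for one source sentence: peaked for fixed-order targets, flat for free-order ones. A toy sketch with invented probabilities (the paper's measure conditions on the source sentence and its syntactic analysis):

```python
# Entropy of a distribution over predicted word orders. Probabilities
# are invented; low entropy ~ fixed order, high entropy ~ free order.
import math

def order_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

german_like   = [0.35, 0.30, 0.20, 0.15]   # flatter: freer word order
japanese_like = [0.90, 0.06, 0.03, 0.01]   # peaked: fixed word order
print(order_entropy(german_like), order_entropy(japanese_like))
```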
empirical methods in natural language processing | 2014
Milos Stanojevic; Khalil Sima'an
Automatically evaluating word order of MT system output at the sentence level is challenging. At the sentence level, n-gram counts are rather sparse, which makes it difficult to measure word order quality effectively using lexicalized units. Recent approaches abstract away from lexicalization by assigning a score to the permutation representing how word positions in system output move around relative to a reference translation. Metrics over permutations exist (e.g., Kendall tau or Spearman rho) and have been shown to be useful in earlier work. However, none of the existing metrics over permutations groups word positions recursively into larger phrase-like blocks, which makes it difficult to account for long-distance reordering phenomena. In this paper we explore novel metrics computed over Permutation Forests (PEFs), packed charts of Permutation Trees (PETs), which are tree decompositions of a permutation into primitive ordering units. We empirically compare the PEFs metric against five known reordering metrics on WMT13 data for ten language pairs. The PEFs metric shows better correlation with human ranking than the other metrics on almost all language pairs. None of the other metrics exhibits such stable behavior across language pairs.
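For contrast with PEFs, the flat permutation metrics mentioned above are simple to compute. A sketch of a Kendall-tau-style score between system and reference word order (an illustrative O(n^2) implementation; it has no notion of phrase-like blocks, which is exactly the limitation PEFs address):

```python
# Fraction of word-position pairs appearing in the same relative order
# as in the reference; perm[i] is the reference position of the i-th
# word in the system output.
def kendall_tau_score(perm):
    n = len(perm)
    concordant = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] < perm[j])
    return concordant / (n * (n - 1) / 2)

print(kendall_tau_score([0, 1, 2, 3]))  # 1.0: monotone, perfect order
print(kendall_tau_score([3, 2, 1, 0]))  # 0.0: fully inverted
print(kendall_tau_score([1, 0, 2, 3]))  # ~0.83: one local swap
```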
logical aspects of computational linguistics | 2016
Milos Stanojevic
Current chart-based parsers of Minimalist Grammars exhibit prohibitively high polynomial complexity that makes them unusable in practice. This paper presents a transition-based parser for Minimalist Grammars that approximately searches through the space of possible derivations by means of beam search, and does so very efficiently: the worst case complexity of building one derivation is O(n^2).
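The beam search itself is generic: expand every partial derivation with every applicable transition and keep only the top few. A runnable toy skeleton (the action set and scores below are stand-ins, not the paper's Minimalist Grammar transitions):

```python
# Generic beam search over parser transitions. Toy scores stand in for
# a real model; the paper applies this idea to Minimalist Grammars.
import heapq

def beam_search(actions, steps, score, beam_size=2):
    beam = [(0.0, [])]                      # (model score, derivation)
    for _ in range(steps):
        expanded = [(s + score(deriv, a), deriv + [a])
                    for s, deriv in beam for a in actions]
        beam = heapq.nlargest(beam_size, expanded, key=lambda x: x[0])
    return max(beam, key=lambda x: x[0])

def toy_score(deriv, action):
    # Invented scoring: mildly prefer alternating the two actions.
    return 1.0 if not deriv or deriv[-1] != action else 0.2

print(beam_search(["shift", "reduce"], steps=4, score=toy_score))
```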
meeting of the association for computational linguistics | 2017
Milos Stanojevic; Khalil Sima'an
international conference on computational linguistics | 2016
Joachim Daiber; Milos Stanojevic; Khalil Sima'an