
Publications


Featured research published by Markus Saers.


Empirical Methods in Natural Language Processing | 2007

Single Malt or Blended? A Study in Multilingual Parser Optimization

Johan Hall; Jens Nilsson; Joakim Nivre; Gülşen Eryiğit; Beáta Megyesi; Mattias Nilsson; Markus Saers

We describe a two-stage optimization of the MaltParser system for the ten languages in the multilingual track of the CoNLL 2007 shared task on dependency parsing. The first stage consists in tuning a single-parser system for each language by optimizing parameters of the parsing algorithm, the feature model, and the learning algorithm. The second stage consists in building an ensemble system that combines six different parsing strategies, extrapolating from the optimal parameter settings for each language. When evaluated on the official test sets, the ensemble system significantly outperformed the single-parser system and achieved the highest average labeled attachment score of all systems participating in the shared task.
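
The abstract does not spell out how the six parsing strategies are blended, so the following is only a generic illustration of one way to combine dependency parses, not necessarily the paper's combination scheme: a per-token vote over the heads proposed by the individual parsers.

    from collections import Counter

    def vote_heads(parses):
        """Combine dependency parses by majority vote on each token's head.

        parses: one head sequence per parser (a head index per token, 0 = root).
        Ties are broken arbitrarily here; a real blender would also have to
        enforce tree constraints on the combined output.
        """
        n_tokens = len(parses[0])
        return [Counter(parse[i] for parse in parses).most_common(1)[0][0]
                for i in range(n_tokens)]

    # Three hypothetical parsers disagree only on the head of the second token.
    print(vote_heads([[2, 0, 2], [2, 3, 2], [2, 0, 2]]))  # -> [2, 0, 2]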


International Conference on Parsing Technologies | 2009

Learning Stochastic Bracketing Inversion Transduction Grammars with a Cubic Time Biparsing Algorithm

Markus Saers; Joakim Nivre; Dekai Wu

We present a biparsing algorithm for Stochastic Bracketing Inversion Transduction Grammars that runs in O(bn³) time instead of O(n⁶). Transduction grammars learned via an EM estimation procedure based on this biparsing algorithm are evaluated directly on the translation task, by building a phrase-based statistical MT system on top of the alignments dictated by Viterbi parses under the induced bigrammars. Translation quality at different levels of pruning is compared, showing improvements over a conventional word aligner even at heavy pruning levels.
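
For context on the complexity claim, the unpruned Viterbi biparsing recursion for a stochastic bracketing ITG (a single nonterminal A with a straight rule A → [A A], an inverted rule A → ⟨A A⟩, and lexical rules A → e/f; singleton and empty alignments omitted for brevity) already makes the O(n⁶) cost visible: there are O(n⁴) bispans and each one maximizes over O(n²) pairs of split points. The pruning that brings this down to O(bn³) is the paper's contribution and is not sketched here.

    \[
    \delta(s,t,u,v) = \max_{s<S<t,\;u<U<v} \max\Big(
        p(A \to [A\,A])\,\delta(s,S,u,U)\,\delta(S,t,U,v),\;\;
        p(A \to \langle A\,A\rangle)\,\delta(s,S,U,v)\,\delta(S,t,u,U)
    \Big),
    \]

with base case \(\delta(s,s+1,u,u+1) = p(A \to e_{s+1}/f_{u+1})\).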


North American Chapter of the Association for Computational Linguistics | 2009

Improving Phrase-Based Translation via Word Alignments from Stochastic Inversion Transduction Grammars

Markus Saers; Dekai Wu

We argue that learning word alignments through a compositionally-structured, joint process yields higher phrase-based translation accuracy than the conventional heuristic of intersecting conditional models. Flawed word alignments can lead to flawed phrase translations that damage translation accuracy. Yet the IBM word alignments usually used today are known to be flawed, in large part because IBM models (1) model reordering by allowing unrestricted movement of words, rather than constrained movement of compositional units, and therefore must (2) attempt to compensate via directed, asymmetric distortion and fertility models. The conventional heuristics for attempting to recover from the resulting alignment errors involve estimating two directed models in opposite directions and then intersecting their alignments -- to make up for the fact that, in reality, word alignment is an inherently joint relation. A natural alternative is provided by Inversion Transduction Grammars, which estimate the joint word alignment relation directly, eliminating the need for any of the conventional heuristics. We show that this alignment ultimately produces superior translation accuracy on BLEU, NIST, and METEOR metrics over three distinct language pairs.
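
For concreteness, the "conventional heuristic" referred to above starts from two directed alignments, one per translation direction, and keeps only the links that both directed models agree on. A minimal sketch (function and variable names are illustrative, not from the paper):

    def intersect_alignments(src_to_tgt, tgt_to_src):
        """Symmetrize two directed word alignments by intersection.

        src_to_tgt: set of (i, j) links from the source-to-target model.
        tgt_to_src: set of (j, i) links from the target-to-source model.
        Returns only the (i, j) links proposed by both directed models.
        """
        reversed_links = {(i, j) for (j, i) in tgt_to_src}
        return src_to_tgt & reversed_links

    # Toy example: the two directions agree only on links (0, 0) and (2, 1).
    print(sorted(intersect_alignments({(0, 0), (1, 0), (2, 1)},
                                      {(0, 0), (1, 2), (2, 1)})))
    # -> [(0, 0), (2, 1)]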


Conference on Computational Natural Language Learning | 2008

Mixing and Blending Syntactic and Semantic Dependencies

Yvonne Samuelsson; Oscar Täckström; Sumithra Velupillai; Johan Eklund; Mark Fishel; Markus Saers

Our system for the CoNLL 2008 shared task uses a set of individual parsers, a set of stand-alone semantic role labellers, and a joint system for parsing and semantic role labelling, all blended together. The system achieved a macro averaged labelled F1-score of 79.79 (WSJ 80.92, Brown 70.49) for the overall task. The labelled attachment score for syntactic dependencies was 86.63 (WSJ 87.36, Brown 80.77) and the labelled F1-score for semantic dependencies was 72.94 (WSJ 74.47, Brown 60.18).
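
For reference, the reported overall score is consistent with the task's macro average being the mean of the syntactic and semantic scores quoted in the same sentence: (86.63 + 72.94) / 2 = 79.785 ≈ 79.79.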


Meeting of the Association for Computational Linguistics | 2014

XMEANT: Better semantic MT evaluation without reference translations

Chi-kiu Lo; Meriem Beloucif; Markus Saers; Dekai Wu

We introduce XMEANT—a new cross-lingual version of the semantic frame based MT evaluation metric MEANT—which can correlate even more closely with human adequacy judgments than monolingual MEANT and eliminates the need for expensive human references. Previous work established that MEANT reflects translation adequacy with state-of-the-art accuracy, and optimizing MT systems against MEANT robustly improves translation quality. However, to go beyond tuning weights in the loglinear SMT model, a cross-lingual objective function that can deeply integrate semantic frame criteria into the MT training pipeline is needed. We show that cross-lingual XMEANT outperforms monolingual MEANT by (1) replacing the monolingual context vector model in MEANT with simple translation probabilities, and (2) incorporating bracketing ITG constraints.
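
A minimal sketch of the first change described above, with illustrative names only: where monolingual MEANT compares a translation against a human reference using context-vector similarity, the cross-lingual variant scores output words directly against source words using a lexical translation probability table.

    def xlingual_lexical_sim(out_word, src_word, trans_prob):
        """Cross-lingual lexical similarity as a simple translation probability.

        trans_prob: dict mapping (src_word, out_word) -> p(out_word | src_word),
        assumed here to be estimated from a word-aligned parallel corpus.
        """
        return trans_prob.get((src_word, out_word), 0.0)

    table = {("gato", "cat"): 0.8, ("gato", "kitten"): 0.1}
    print(xlingual_lexical_sim("cat", "gato", table))  # -> 0.8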


SLSP'13: Proceedings of the First International Conference on Statistical Language and Speech Processing | 2013

Iterative rule segmentation under minimum description length for unsupervised transduction grammar induction

Markus Saers; Karteek Addanki; Dekai Wu

We argue that for purely incremental unsupervised learning of phrasal inversion transduction grammars, a minimum description length driven, iterative top-down rule segmentation approach that is the polar opposite of Saers, Addanki, and Wu's previous 2012 bottom-up iterative rule chunking model yields significantly better translation accuracy and grammar parsimony. We still aim for unsupervised bilingual grammar induction such that training and testing are optimized upon the same exact underlying model--a basic principle of machine learning and statistical prediction that has been unduly ignored in statistical machine translation models of late, where most decoders are badly mismatched to the training assumptions. Our novel approach learns phrasal translations by recursively subsegmenting the training corpus, as opposed to our previous model--where we start with a token-based transduction grammar and iteratively build larger chunks. Moreover, the rule segmentation decisions in our approach are driven by a minimum description length objective, whereas the rule chunking decisions were driven by a maximum likelihood objective. We demonstrate empirically how this trades off maximum likelihood against model size, aiming for a more parsimonious grammar that escapes the perfect overfitting to the training data that we start out with, and gradually generalizes to previously unseen sentence translations so long as the model shrinks enough to warrant a looser fit to the training data. Experimental results show that our approach produces a significantly smaller and better model than the chunking-based approach.
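
The objective referred to here, in its generic two-part form, charges a candidate grammar both for its own encoding and for encoding the data under it; a segmentation step is worthwhile only if it lowers the total. This is a sketch of the generic MDL criterion, not the paper's exact encoding scheme:

    \[
    \mathrm{DL}(\Phi, D) = \mathrm{DL}(\Phi) + \mathrm{DL}(D \mid \Phi),
    \qquad
    \mathrm{DL}(D \mid \Phi) = -\log_2 P(D \mid \Phi),
    \]

so a rule segmentation is accepted when the growth in \(\mathrm{DL}(D \mid \Phi)\) (a looser fit) is more than offset by the shrinkage in \(\mathrm{DL}(\Phi)\) (a smaller grammar).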


Language and Automata Theory and Applications | 2018

Handling Ties Correctly and Efficiently in Viterbi Training Using the Viterbi Semiring

Markus Saers; Dekai Wu

The handling of ties between equiprobable derivations during Viterbi training is often glossed over in research papers: whether ties are broken randomly when they occur, decided on an ad-hoc basis by the algorithm or implementation, or resolved by enumerating all equiprobable derivations with the counts uniformly distributed among them is left to the reader's imagination. The first option hurts rarely occurring rules, which run the risk of being randomly eliminated; the second suffers from algorithmic biases; and the last is correct but potentially very inefficient. We show that it is possible to Viterbi train correctly without enumerating all equiprobable best derivations. The method is analogous to expectation maximization, given that the automatic differentiation view is chosen over the reverse value/outside probability view, as the latter calculates the wrong quantity for reestimation under the Viterbi semiring. To get the automatic differentiation to work, we devise an unbiased subderivative for the max function.
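
A minimal sketch of the last point, in illustrative Python rather than the paper's implementation: when several inputs tie for the maximum, the unit of (sub)gradient is shared uniformly among them instead of being assigned arbitrarily to one of them, so tied best derivations contribute equally to the reestimated counts without having to be enumerated.

    def max_with_subgradient(values):
        """Return max(values) and a subderivative with respect to each input.

        The unit of gradient is split uniformly over all inputs that tie for
        the maximum, so no tied derivation is arbitrarily favoured.
        """
        m = max(values)
        n_ties = sum(1 for v in values if v == m)
        return m, [1.0 / n_ties if v == m else 0.0 for v in values]

    print(max_with_subgradient([0.2, 0.5, 0.5]))  # -> (0.5, [0.0, 0.5, 0.5])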


Proceedings of the Workshop on Multilingual and Cross-lingual Methods in NLP | 2016

Learning Translations for Tagged Words: Extending the Translation Lexicon of an ITG for Low Resource Languages

Markus Saers; Dekai Wu

We tackle the challenge of learning part-of-speech classified translations as part of an inversion transduction grammar, by learning translations for English words with known part-of-speech tags, both from existing translation lexica and from parallel corpora. When translating from a low resource language into English, we can expect to have rich resources for English, such as treebanks, and small amounts of bilingual resources, such as translation lexica and parallel corpora. We solve the problem of integrating these heterogeneous resources into a single model using stochastic Inversion Transduction Grammars, which we augment with wildcards to handle unknown translations.


International Conference on Computational Linguistics | 2014

Lexical Access Preference and Constraint Strategies for Improving Multiword Expression Association within Semantic MT Evaluation

Dekai Wu; Chi-kiu Lo; Markus Saers

We examine lexical access preferences and constraints in computing multiword expression associations from the standpoint of a high-impact extrinsic task-based performance measure, namely semantic machine translation evaluation. In automated MT evaluation metrics, machine translations are compared against human reference translations, which are almost never worded exactly the same way except in the most trivial of cases. Because of this, one of the most important factors in correctly predicting semantic translation adequacy is the accuracy of recognizing alternative lexical realizations of the same multiword expressions in semantic role fillers. Our results comparing bag-of-words, maximum alignment, and inversion transduction grammars indicate that cognitively motivated ITGs provide superior lexical access characteristics for multiword expression associations, leading to state-of-the-art improvements in correlation with human adequacy judgments.
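
As a rough illustration of two of the three lexical access strategies compared (the ITG-constrained strategy is omitted), given some word-level similarity function, a bag-of-words score averages similarity over all word pairs, while a maximum-alignment score lets each word greedily take its best counterpart. The names and the simple averaging are illustrative assumptions, not the metric's exact definitions:

    def bag_of_words_score(phrase_a, phrase_b, sim):
        """Average word similarity over all cross-phrase word pairs."""
        pairs = [(a, b) for a in phrase_a for b in phrase_b]
        return sum(sim(a, b) for a, b in pairs) / len(pairs)

    def max_alignment_score(phrase_a, phrase_b, sim):
        """Let each word in phrase_a greedily take its best match in phrase_b."""
        return sum(max(sim(a, b) for b in phrase_b) for a in phrase_a) / len(phrase_a)

    sim = lambda a, b: 1.0 if a == b else 0.0  # toy word similarity
    mt = ["kicked", "the", "bucket"]
    ref = ["the", "bucket", "list"]
    print(round(bag_of_words_score(mt, ref, sim), 3))   # -> 0.222
    print(round(max_alignment_score(mt, ref, sim), 3))  # -> 0.667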


Empirical Methods in Natural Language Processing | 2014

Ternary Segmentation for Improving Search in Top-down Induction of Segmental ITGs

Markus Saers; Dekai Wu

We show that there are situations where iteratively segmenting sentence pairs top-down will fail to reach valid segments and propose a method for alleviating the problem. Due to the enormity of the search space, error analysis has indicated that it is often impossible to get to a desired embedded segment purely through binary segmentation that divides existing segmental rules in half – the strategy typically employed by existing search procedures – as it requires two steps. We propose a new method to hypothesize ternary segmentations in a single step, making the embedded segments immediately discoverable.
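
A toy illustration of the problem and the remedy (the strings are invented, and the paper's rules are of course bilingual): to expose a segment embedded in the middle of an existing rule, binary segmentation needs two successive splits, while a single ternary split reaches it in one step.

    def binary_split(tokens, i):
        """Split a segment into two parts at position i."""
        return [tokens[:i], tokens[i:]]

    def ternary_split(tokens, i, j):
        """Split a segment into three parts at positions i < j,
        exposing tokens[i:j] as a segment in a single step."""
        return [tokens[:i], tokens[i:j], tokens[j:]]

    rule = ["a", "b", "c", "d", "e", "f"]
    # Binary search must first isolate ["c", "d", "e", "f"] and only in a
    # second step carve out the embedded segment ["c", "d"].
    print(binary_split(rule, 2))      # -> [['a', 'b'], ['c', 'd', 'e', 'f']]
    # A ternary hypothesis reaches the embedded segment immediately.
    print(ternary_split(rule, 2, 4))  # -> [['a', 'b'], ['c', 'd'], ['e', 'f']]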

Collaboration


Dive into Markus Saers's collaborations.

Top Co-Authors

Dekai Wu
Hong Kong University of Science and Technology

Karteek Addanki
Hong Kong University of Science and Technology

Chi-kiu Lo
Hong Kong University of Science and Technology

Meriem Beloucif
Hong Kong University of Science and Technology