Jörg Tiedemann
University of Zagreb
Publication
Featured research published by Jörg Tiedemann.
International Conference on Computational Linguistics | 2014
Marcos Zampieri; Liling Tan; Nikola Ljubešić; Jörg Tiedemann
This paper summarizes the methods, results and findings of the Discriminating between Similar Languages (DSL) shared task 2014. The shared task provided data from 13 different languages and varieties divided into 6 groups. Participants were required to train their systems to discriminate between languages on a training and development set containing 20,000 sentences from each language (closed submission) and/or any other dataset (open submission). One month later, a test set containing 1,000 unidentified instances per language was released for evaluation. The DSL shared task received 22 inscriptions and 8 final submissions. The best system obtained 95.7% average accuracy.
Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial) | 2017
Marcos Zampieri; Shervin Malmasi; Nikola Ljubešić; Preslav Nakov; Ahmed M. Ali; Jörg Tiedemann; Yves Scherrer; Noëmi Aepli
We present the results of the VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which we organized as part of the fourth edition of the VarDial workshop at EACL’2017. This year, we included four shared tasks: Discriminating between Similar Languages (DSL), Arabic Dialect Identification (ADI), German Dialect Identification (GDI), and Cross-lingual Dependency Parsing (CLP). A total of 19 teams submitted runs across the four tasks, and 15 of them wrote system description papers.
arXiv: Computation and Language | 2017
Jörg Tiedemann
This paper describes the submission from the University of Helsinki to the shared task on cross-lingual dependency parsing at VarDial 2017. We present work on annotation projection and treebank translation that gave good results for all three target languages in the test set. In particular, Slovak seems to work well with information coming from the Czech treebank, which is in line with related work. The attachment scores for cross-lingual models even surpass the fully supervised models trained on the target language treebank. Croatian is the most difficult language in the test set and the improvements over the baseline are rather modest. Norwegian works best with information coming from Swedish whereas Danish contributes surprisingly little.
The Prague Bulletin of Mathematical Linguistics | 2016
Robert Östling; Jörg Tiedemann
We present EFMARAL, a new system for efficient and accurate word alignment using a Bayesian model with Markov Chain Monte Carlo (MCMC) inference. Through careful selection of data structures and model architecture we are able to surpass the fast_align system, commonly used for performance-critical word alignment, both in computational efficiency and alignment accuracy. Our evaluation shows that a phrase-based statistical machine translation (SMT) system produces translations of higher quality when using word alignments from EFMARAL than from fast_align, and that translation quality is on par with what is obtained using GIZA++, a tool requiring orders of magnitude more processing time. More generally we hope to convince the reader that Monte Carlo sampling, rather than being viewed as a slow method of last resort, should actually be the method of choice for the SMT practitioner and others interested in word alignment.
Conference of the European Chapter of the Association for Computational Linguistics | 2017
Robert Östling; Jörg Tiedemann
Empirical Methods in Natural Language Processing | 2017
Jörg Tiedemann; Yves Scherrer
arXiv: Computation and Language | 2018
Jörg Tiedemann
Computational Linguistics in the Netherlands | 2017
Erik F. Tjong Kim Sang; Marcel Bollmann; Remko Boschker; Francisco Casacuberta; Feike Dietz; Stefanie Dipper; Miguel Domingo; Rob van der Goot; Marjo van Koppen; Nikola Ljubešić; Robert Östling; Florian Petran; Eva Pettersson; Yves Scherrer; Marijn Schraagen; Leen Sevens; Jörg Tiedemann; Tom Vanallemeersch; Kalliopi Zervanou
Proceedings of the Second Conference on Machine Translation | 2017
Robert Östling; Yves Scherrer; Jörg Tiedemann; Gongbo Tang; Tommi Nieminen
Language Resources and Evaluation | 2018
Pierre Lison; Jörg Tiedemann; Milen Kouylekov