Malte Nuhn
RWTH Aachen University
Publications
Featured research published by Malte Nuhn.
meeting of the association for computational linguistics | 2014
Malte Nuhn; Hermann Ney
This paper addresses the problem of EM-based decipherment for large vocabularies. Here, decipherment is essentially a tagging problem: every cipher token is tagged with some plaintext type. As with other tagging problems, this one can be treated as a Hidden Markov Model (HMM), only here the vocabularies are large, so the usual O(N·V²) exact EM approach is infeasible. When faced with this situation, many people turn to sampling. However, we propose to use a type of approximate EM and show that it works well. The basic idea is to collect fractional counts only over a small subset of links in the forward-backward lattice; the subset is different for each iteration of EM. One option is to use beam search to do the subsetting. The second method restricts, for each hypothesis, the successor words that are considered, by consulting pre-computed tables of likely n-grams and likely substitutions.
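A minimal sketch of the beam-based count collection idea is shown below. It assumes a bigram plaintext language model lm[(u, v)] and a substitution channel table p_sub[(plain, cipher)]; these names, the Viterbi-style hypothesis merging, and the local normalization are illustrative simplifications, not the exact pruned forward-backward procedure from the paper.

```python
# Sketch: collect fractional EM counts only over beam-surviving lattice links,
# instead of the full O(N * V^2) lattice. Interfaces are assumptions.
import math
from collections import defaultdict

def beam_em_counts(cipher, plain_vocab, lm, p_sub, beam_size=10):
    """One approximate EM iteration for a decipherment HMM."""
    beams = [{None: 0.0}]  # position 0: empty history, log-prob 0
    for c in cipher:
        scored = {}
        for prev, lp in beams[-1].items():
            for v in plain_vocab:
                score = lp + math.log(lm.get((prev, v), 1e-12)) \
                           + math.log(p_sub.get((v, c), 1e-12))
                # merge hypotheses ending in the same plaintext word (max only)
                if v not in scored or score > scored[v]:
                    scored[v] = score
        # beam pruning: keep only the best plaintext hypotheses at this position
        top = dict(sorted(scored.items(), key=lambda kv: -kv[1])[:beam_size])
        beams.append(top)
    # Accumulate locally normalized fractional counts from surviving links only.
    counts = defaultdict(float)
    for c, beam in zip(cipher, beams[1:]):
        m = max(beam.values())
        z = sum(math.exp(lp - m) for lp in beam.values())
        for v, lp in beam.items():
            counts[(v, c)] += math.exp(lp - m) / z
    return counts
```

A full implementation would also run a backward pass and renormalize the substitution table from these counts before the next EM iteration.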
empirical methods in natural language processing | 2014
Malte Nuhn; Kevin Knight
Manual analysis and decryption of enciphered documents is tedious and error-prone work. Often, even after spending large amounts of time on a particular cipher, no decipherment can be found. Automating the decryption of various types of ciphers makes it possible to sift through the large number of encrypted messages found in libraries and archives, and to focus human effort only on a small but potentially interesting subset of them. In this work, we train a classifier that is able to predict which encipherment method has been used to generate a given ciphertext. We are able to distinguish 50 different cipher types (specified by the American Cryptogram Association) with an accuracy of 58.5%. This is an 11.2% absolute improvement over the best previously published classifier.
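The sketch below illustrates the general setup of such a cipher-type classifier. The features (symbol count, index of coincidence, top-symbol frequency, length) and the random-forest learner are illustrative stand-ins, assuming scikit-learn is available; they are not the feature set or model used in the paper.

```python
# Sketch: predict the cipher type of a ciphertext from simple statistics.
from collections import Counter
from sklearn.ensemble import RandomForestClassifier  # assumption: scikit-learn installed

def features(ciphertext):
    """A few generic statistics that help separate cipher families."""
    n = len(ciphertext)
    freqs = Counter(ciphertext)
    counts = sorted(freqs.values(), reverse=True)
    # index of coincidence: roughly English-like for transposition, flat for polyalphabetic
    ic = sum(c * (c - 1) for c in counts) / (n * (n - 1)) if n > 1 else 0.0
    return [
        len(freqs),      # number of distinct cipher symbols
        ic,              # index of coincidence
        counts[0] / n,   # relative frequency of the most common symbol
        n,               # ciphertext length
    ]

def train_cipher_type_classifier(texts, labels):
    clf = RandomForestClassifier(n_estimators=100)
    clf.fit([features(t) for t in texts], labels)
    return clf

# usage (hypothetical data):
#   clf = train_cipher_type_classifier(train_texts, train_labels)
#   predicted_type = clf.predict([features(some_ciphertext)])[0]
```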
international conference on frontiers in handwriting recognition | 2014
Michal Kozielski; Malte Nuhn; Patrick Doetsch; Hermann Ney
We present a method for training an off-line handwriting recognition system in an unsupervised manner. For an isolated word recognition task, we are able to bootstrap the system without any annotated data. We then retrain the system using the best hypothesis from a previous recognition pass in an iterative fashion. Our approach relies only on a prior language model and does not depend on an explicit segmentation of words into characters. The resulting system shows promising performance on a standard dataset in comparison to a system trained in a supervised fashion on the same amount of training data.
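The iterative self-training loop described in the abstract can be sketched as follows. The recognizer and decoder interfaces (train_recognizer, decode_with_lm) are placeholders assumed for illustration, not the actual RWTH system components.

```python
# Sketch: unsupervised iterative retraining of a handwriting recognizer,
# using its own best hypotheses (constrained by a prior LM) as labels.
def unsupervised_training(images, language_model, bootstrap_model,
                          train_recognizer, decode_with_lm, iterations=5):
    model = bootstrap_model
    for _ in range(iterations):
        # 1) Recognize every unlabeled word image with the current model,
        #    guided by the prior language model.
        pseudo_labels = [decode_with_lm(model, img, language_model) for img in images]
        # 2) Retrain the recognizer on its own best hypotheses.
        model = train_recognizer(images, pseudo_labels)
    return model
```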
empirical methods in natural language processing | 2014
Malte Nuhn; Julian Schamper; Hermann Ney
In this paper, we present two improvements to the beam search approach for solving homophonic substitution ciphers presented in Nuhn et al. (2013): an improved rest cost estimation, together with an optimized strategy for obtaining the order in which the symbols of the cipher are deciphered, reduces the beam size needed to successfully decipher the Zodiac-408 cipher from several million down to less than one hundred. The search effort is reduced from several hours of computation time to just a few seconds on a single CPU. These improvements allow us to successfully decipher the second part of the famous Beale cipher (see (Ward et al., 1885) and e.g. (King, 1993)): with 182 different cipher symbols and a length of just 762 symbols, its decipherment is considerably more challenging than that of the previously deciphered Zodiac-408 cipher (length 408, 54 different symbols). To the best of our knowledge, this cipher has not been deciphered automatically before.
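A minimal sketch of the underlying beam search over partial symbol mappings is given below. The scoring function and rest cost are passed in as opaque callables and the enumeration order is taken as given; these are simplified placeholders in the spirit of Nuhn et al. (2013), not the improved rest cost estimation or symbol-ordering strategy contributed by this paper.

```python
# Sketch: beam search over partial cipher-symbol -> plaintext-letter mappings.
import heapq

def beam_search_decipher(cipher, symbol_order, plain_alphabet,
                         score_partial, rest_cost, beam_size=100):
    """symbol_order: order in which cipher symbols receive a plaintext letter.
    score_partial(mapping, cipher): LM score of the partially deciphered text.
    rest_cost(mapping, cipher): optimistic estimate for still-unmapped symbols."""
    beam = [{}]  # start with the empty mapping
    for symbol in symbol_order:
        candidates = []
        for mapping in beam:
            for letter in plain_alphabet:  # homophonic: letters may map from many symbols
                new_mapping = {**mapping, symbol: letter}
                priority = score_partial(new_mapping, cipher) + rest_cost(new_mapping, cipher)
                candidates.append((priority, new_mapping))
        # histogram pruning: keep only the highest-scoring partial mappings
        beam = [m for _, m in heapq.nlargest(beam_size, candidates, key=lambda x: x[0])]
    return max(beam, key=lambda m: score_partial(m, cipher))
```

A tighter rest cost estimate makes the priority a better predictor of the final score, which is what allows the beam to shrink from millions of hypotheses to under one hundred.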
international joint conference on natural language processing | 2015
Malte Nuhn; Julian Schamper; Hermann Ney
In this paper we present the UNRAVEL toolkit. It implements many of the recently published works on decipherment, including decipherment of deterministic ciphers such as the ZODIAC-408 cipher and part two of the BEALE ciphers, as well as decipherment of probabilistic ciphers and unsupervised training for machine translation. It also includes data and example configuration files so that the previously published experiments are easy to reproduce.
international conference on computational linguistics | 2012
Joern Wuebker; Matthias Huck; Stephan Peitz; Malte Nuhn; Markus Freitag; Jan-Thorsten Peter; Saab Mansour; Hermann Ney
meeting of the association for computational linguistics | 2012
Malte Nuhn; Arne Mauser; Hermann Ney
meeting of the association for computational linguistics | 2013
Malte Nuhn; Julian Schamper; Hermann Ney
meeting of the association for computational linguistics | 2013
Malte Nuhn; Hermann Ney
Archive | 2013
David Vilar; Daniel Stein; Matthias Huck; Joern Wuebker; Markus Freitag; Stephan Peitz; Malte Nuhn; Jan-Thorsten Peter