Malte Nuhn
RWTH Aachen University
Publications
Featured research published by Malte Nuhn.
meeting of the association for computational linguistics | 2014
Malte Nuhn; Hermann Ney
This paper addresses the problem of EM-based decipherment for large vocabularies. Here, decipherment is essentially a tagging problem: every cipher token is tagged with some plaintext type. As with other tagging problems, this one can be treated as a Hidden Markov Model (HMM), only here the vocabularies are large, so the usual O(N·V²) exact EM approach is infeasible. When faced with this situation, many people turn to sampling. However, we propose to use a type of approximate EM and show that it works well. The basic idea is to collect fractional counts only over a small subset of links in the forward-backward lattice; the subset is different for each iteration of EM. One option is to use beam search to do the subsetting. The second method restricts, for each hypothesis, the successor words that are considered, by consulting pre-computed tables of likely n-grams and likely substitutions.
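A minimal sketch of the beam-based count collection idea is shown below. It assumes a bigram plaintext language model lm[(u, v)] and a substitution channel table p_sub[(plain, cipher)]; these names, the Viterbi-style hypothesis merging, and the local normalization are illustrative simplifications, not the exact pruned forward-backward procedure from the paper.

```python
# Sketch: collect fractional EM counts only over beam-surviving lattice links,
# instead of the full O(N * V^2) lattice. Interfaces are assumptions.
import math
from collections import defaultdict

def beam_em_counts(cipher, plain_vocab, lm, p_sub, beam_size=10):
    """One approximate EM iteration for a decipherment HMM."""
    beams = [{None: 0.0}]  # position 0: empty history, log-prob 0
    for c in cipher:
        scored = {}
        for prev, lp in beams[-1].items():
            for v in plain_vocab:
                score = lp + math.log(lm.get((prev, v), 1e-12)) \
                           + math.log(p_sub.get((v, c), 1e-12))
                # merge hypotheses ending in the same plaintext word (max only)
                if v not in scored or score > scored[v]:
                    scored[v] = score
        # beam pruning: keep only the best plaintext hypotheses at this position
        top = dict(sorted(scored.items(), key=lambda kv: -kv[1])[:beam_size])
        beams.append(top)
    # Accumulate locally normalized fractional counts from surviving links only.
    counts = defaultdict(float)
    for c, beam in zip(cipher, beams[1:]):
        m = max(beam.values())
        z = sum(math.exp(lp - m) for lp in beam.values())
        for v, lp in beam.items():
            counts[(v, c)] += math.exp(lp - m) / z
    return counts
```

A full implementation would also run a backward pass and renormalize the substitution table from these counts before the next EM iteration.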
empirical methods in natural language processing | 2014
Malte Nuhn; Kevin Knight
Manual analysis and decryption of enciphered documents is tedious and error-prone work. Often, even after spending large amounts of time on a particular cipher, no decipherment can be found. Automating the decryption of various types of ciphers makes it possible to sift through the large number of encrypted messages found in libraries and archives, and to focus human effort only on a small but potentially interesting subset of them. In this work, we train a classifier that is able to predict which encipherment method has been used to generate a given ciphertext. We are able to distinguish 50 different cipher types (specified by the American Cryptogram Association) with an accuracy of 58.5%. This is an 11.2% absolute improvement over the best previously published classifier.
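The sketch below illustrates the general setup of such a cipher-type classifier. The features (symbol count, index of coincidence, top-symbol frequency, length) and the random-forest learner are illustrative stand-ins, assuming scikit-learn is available; they are not the feature set or model used in the paper.

```python
# Sketch: predict the cipher type of a ciphertext from simple statistics.
from collections import Counter
from sklearn.ensemble import RandomForestClassifier  # assumption: scikit-learn installed

def features(ciphertext):
    """A few generic statistics that help separate cipher families."""
    n = len(ciphertext)
    freqs = Counter(ciphertext)
    counts = sorted(freqs.values(), reverse=True)
    # index of coincidence: roughly English-like for transposition, flat for polyalphabetic
    ic = sum(c * (c - 1) for c in counts) / (n * (n - 1)) if n > 1 else 0.0
    return [
        len(freqs),      # number of distinct cipher symbols
        ic,              # index of coincidence
        counts[0] / n,   # relative frequency of the most common symbol
        n,               # ciphertext length
    ]

def train_cipher_type_classifier(texts, labels):
    clf = RandomForestClassifier(n_estimators=100)
    clf.fit([features(t) for t in texts], labels)
    return clf

# usage (hypothetical data):
#   clf = train_cipher_type_classifier(train_texts, train_labels)
#   predicted_type = clf.predict([features(some_ciphertext)])[0]
```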
international conference on frontiers in handwriting recognition | 2014
Michal Kozielski; Malte Nuhn; Patrick Doetsch; Hermann Ney
We present a method for training an off-line handwriting recognition system in an unsupervised manner. For an isolated word recognition task, we are able to bootstrap the system without any annotated data. We then retrain the system using the best hypothesis from a previous recognition pass in an iterative fashion. Our approach relies only on a prior language model and does not depend on an explicit segmentation of words into characters. The resulting system shows promising performance on a standard dataset in comparison to a system trained in a supervised fashion on the same amount of training data.
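The iterative self-training loop described in the abstract can be sketched as follows. The recognizer and decoder interfaces (train_recognizer, decode_with_lm) are placeholders assumed for illustration, not the actual RWTH system components.

```python
# Sketch: unsupervised iterative retraining of a handwriting recognizer,
# using its own best hypotheses (constrained by a prior LM) as labels.
def unsupervised_training(images, language_model, bootstrap_model,
                          train_recognizer, decode_with_lm, iterations=5):
    model = bootstrap_model
    for _ in range(iterations):
        # 1) Recognize every unlabeled word image with the current model,
        #    guided by the prior language model.
        pseudo_labels = [decode_with_lm(model, img, language_model) for img in images]
        # 2) Retrain the recognizer on its own best hypotheses.
        model = train_recognizer(images, pseudo_labels)
    return model
```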
empirical methods in natural language processing | 2014
Malte Nuhn; Julian Schamper; Hermann Ney
In this paper, we present two improvements to the beam search approach for solving homophonic substitution ciphers presented in Nuhn et al. (2013): an improved rest cost estimation, together with an optimized strategy for obtaining the order in which the symbols of the cipher are deciphered, reduces the beam size needed to successfully decipher the Zodiac-408 cipher from several million down to less than one hundred. The search effort is reduced from several hours of computation time to just a few seconds on a single CPU. These improvements allow us to successfully decipher the second part of the famous Beale cipher (see (Ward et al., 1885) and e.g. (King, 1993)): with 182 different cipher symbols and a length of just 762 symbols, its decipherment is considerably more challenging than that of the previously deciphered Zodiac-408 cipher (length 408, 54 different symbols). To the best of our knowledge, this cipher has not been deciphered automatically before.
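A minimal sketch of the underlying beam search over partial symbol mappings is given below. The scoring function and rest cost are passed in as opaque callables and the enumeration order is taken as given; these are simplified placeholders in the spirit of Nuhn et al. (2013), not the improved rest cost estimation or symbol-ordering strategy contributed by this paper.

```python
# Sketch: beam search over partial cipher-symbol -> plaintext-letter mappings.
import heapq

def beam_search_decipher(cipher, symbol_order, plain_alphabet,
                         score_partial, rest_cost, beam_size=100):
    """symbol_order: order in which cipher symbols receive a plaintext letter.
    score_partial(mapping, cipher): LM score of the partially deciphered text.
    rest_cost(mapping, cipher): optimistic estimate for still-unmapped symbols."""
    beam = [{}]  # start with the empty mapping
    for symbol in symbol_order:
        candidates = []
        for mapping in beam:
            for letter in plain_alphabet:  # homophonic: letters may map from many symbols
                new_mapping = {**mapping, symbol: letter}
                priority = score_partial(new_mapping, cipher) + rest_cost(new_mapping, cipher)
                candidates.append((priority, new_mapping))
        # histogram pruning: keep only the highest-scoring partial mappings
        beam = [m for _, m in heapq.nlargest(beam_size, candidates, key=lambda x: x[0])]
    return max(beam, key=lambda m: score_partial(m, cipher))
```

A tighter rest cost estimate makes the priority a better predictor of the final score, which is what allows the beam to shrink from millions of hypotheses to under one hundred.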
international joint conference on natural language processing | 2015
Malte Nuhn; Julian Schamper; Hermann Ney
In this paper we present the UNRAVEL toolkit. It implements many of the recently published works on decipherment, including decipherment of deterministic ciphers such as the ZODIAC-408 cipher and part two of the BEALE ciphers, as well as decipherment of probabilistic ciphers and unsupervised training for machine translation. It also includes data and example configuration files so that the previously published experiments are easy to reproduce.
international conference on computational linguistics | 2012
Joern Wuebker; Matthias Huck; Stephan Peitz; Malte Nuhn; Markus Freitag; Jan-Thorsten Peter; Saab Mansour; Hermann Ney
meeting of the association for computational linguistics | 2012
Malte Nuhn; Arne Mauser; Hermann Ney
meeting of the association for computational linguistics | 2013
Malte Nuhn; Julian Schamper; Hermann Ney
meeting of the association for computational linguistics | 2013
Malte Nuhn; Hermann Ney
Archive | 2013
David Vilar; Daniel Stein; Matthias Huck; Joern Wuebker; Markus Freitag; Stephan Peitz; Malte Nuhn; Jan-Thorsten Peter