
Publication


Featured research published by Andrew M. Finch.


Workshop on Statistical Machine Translation | 2008

Dynamic Model Interpolation for Statistical Machine Translation

Andrew M. Finch; Eiichiro Sumita

This paper presents a technique for class-dependent decoding for statistical machine translation (SMT). The approach differs from previous methods of class-dependent translation in that the class-dependent forms of all models are integrated directly into the decoding process. We employ probabilistic mixture weights between models that can change dynamically on a segment-by-segment basis depending on the characteristics of the source segment. The effectiveness of this approach is demonstrated by evaluating its performance on travel conversation data. We used the approach to tackle the translation of questions and declarative sentences using class-dependent models. To achieve this, our system integrated two sets of models specifically built to deal with sentences that fall into one of two classes of dialog sentence: questions and declarations, with a third set of models built to handle the general class. The technique was thoroughly evaluated on data from 17 language pairs using 6 machine translation evaluation metrics. We found the results were corpus-dependent, but in most cases our system was able to improve translation performance, and for some languages the improvements were substantial.
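The per-segment mixture described above can be sketched as follows. This is a minimal illustration, not the paper's actual system: the phrase tables, classes, and weight values are all hypothetical.

```python
def interpolate(models, weights):
    """Linearly interpolate phrase probabilities from several
    class-specific models using per-segment mixture weights."""
    combined = {}
    for model, weight in zip(models, weights):
        for phrase, prob in model.items():
            combined[phrase] = combined.get(phrase, 0.0) + weight * prob
    return combined

# Hypothetical class-specific phrase tables (question vs. declaration).
question_model = {"do you": 0.6, "you have": 0.4}
declaration_model = {"do you": 0.2, "you have": 0.9}

# Weights chosen dynamically per source segment, e.g. by a classifier
# that judges how question-like the segment is (values illustrative).
mixed = interpolate([question_model, declaration_model], (0.8, 0.2))
# "do you": 0.8 * 0.6 + 0.2 * 0.2 = 0.52
```

Because the weights are recomputed for each source segment, a question-like segment leans on the question models while a declarative one leans on the declaration models, with the general model available as a fallback class.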


Computer Speech & Language | 2007

Improving statistical machine translation using shallow linguistic knowledge

Young-Sook Hwang; Andrew M. Finch; Yutaka Sasaki

We describe methods for improving the performance of statistical machine translation (SMT) between four linguistically different languages, i.e., Chinese, English, Japanese, and Korean, by using morphosyntactic knowledge. For the purpose of reducing translation ambiguities and generating grammatically correct and fluent translation output, we address the use of shallow linguistic knowledge, that is: (1) enriching a word with its morphosyntactic features, (2) obtaining shallow linguistically-motivated phrase pairs, (3) iteratively refining word alignment using filtered phrase pairs, and (4) building a language model from morphosyntactically enriched words. Previous studies reported that the introduction of syntactic features into SMT models resulted in only a slight improvement in performance in spite of the heavy computational expense; however, this study demonstrates the effectiveness of morphosyntactic features when reliable, discriminative features are used. Our experimental results show that word representations that incorporate morphosyntactic features significantly improve the performance of the translation model and language model. Moreover, we show that refining the word alignment using fine-grained phrase pairs is effective in improving system performance.
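Step (1), enriching a word with its morphosyntactic features, can be illustrated with a small sketch; the tag set and the `word|tag` joint-token format here are assumptions for illustration, not necessarily the paper's exact representation.

```python
def enrich(words, tags):
    """Attach a morphosyntactic feature (here a POS tag) to each
    surface word; the enriched tokens can then be used to build the
    translation and language models."""
    return [f"{word}|{tag}" for word, tag in zip(words, tags)]

tokens = enrich(["the", "cat", "sleeps"], ["DT", "NN", "VBZ"])
# tokens == ['the|DT', 'cat|NN', 'sleeps|VBZ']
```

Treating `cat|NN` and `cat|VB` as distinct vocabulary items lets the translation and language models discriminate between readings that share a surface form.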


Conference of the European Chapter of the Association for Computational Linguistics | 2003

A corpus-centered approach to spoken language translation

Eiichiro Sumita; Yasuhiro Akiba; Takao Doi; Andrew M. Finch; Kenji Imamura; Michael J. Paul; Mitsuo Shimohata; Taro Watanabe

This paper reports the latest performance of components and features of a project named Corpus-Centered Computation (C3), which targets a translation technology suitable for spoken language translation. C3 places corpora at the center of the technology. Translation knowledge is extracted from corpora by both EBMT and SMT methods, translation quality is gauged by referring to corpora, the best translation among multiple-engine outputs is selected based on corpora, and the corpora themselves are paraphrased or filtered by automated processes.


Empirical Methods in Natural Language Processing | 2009

Bidirectional Phrase-based Statistical Machine Translation

Andrew M. Finch; Eiichiro Sumita

This paper investigates the effect of direction in phrase-based statistical machine translation decoding. We compare a typical phrase-based machine translation decoder using a left-to-right decoding strategy to a right-to-left decoder. We also investigate the effectiveness of a bidirectional decoding strategy that integrates both mono-directional approaches, with the aim of reducing the effects due to language specificity. Our experimental evaluation was extensive, based on 272 different language pairs, and gave the surprising result that for most of the language pairs, it was better to decode from right-to-left than from left-to-right. As expected, the relative performance of the left-to-right and right-to-left strategies proved to be highly language dependent. The bidirectional approach outperformed both the left-to-right strategy and the right-to-left strategy, showing consistent improvements that appeared to be unrelated to the specific languages used for translation. Bidirectional decoding gave rise to an improvement in performance over a left-to-right decoding strategy in terms of the BLEU score in 99% of our experiments.
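One simple way to combine two mono-directional decoders is to keep whichever hypothesis scores higher under the model; the stub decoders and scores below are purely illustrative stand-ins for full phrase-based search, and the paper's actual integration may differ.

```python
def bidirectional_decode(source, decode_l2r, decode_r2l):
    """Run both mono-directional decoders and keep the hypothesis
    with the higher model score."""
    hyp_l2r, score_l2r = decode_l2r(source)
    hyp_r2l, score_r2l = decode_r2l(source)
    return hyp_l2r if score_l2r >= score_r2l else hyp_r2l

# Stub decoders standing in for full phrase-based search
# (each returns a hypothesis string and its model log score).
left_to_right = lambda s: ("hypothesis-l2r", -3.2)
right_to_left = lambda s: ("hypothesis-r2l", -2.7)

best = bidirectional_decode("source segment", left_to_right, right_to_left)
# best == "hypothesis-r2l" since -2.7 > -3.2
```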


Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009) | 2009

Transliteration by Bidirectional Statistical Machine Translation

Andrew M. Finch; Eiichiro Sumita

The system presented in this paper uses phrase-based statistical machine translation (SMT) techniques to directly transliterate between all language pairs in this shared task. The technique makes no language-specific assumptions and uses no dictionaries or explicit phonetic information. The translation process transforms sequences of tokens in the source language directly into sequences of tokens in the target. All language pairs were transliterated by applying this technique in a single unified manner. The machine translation system comprised two phrase-based SMT decoders. The first generated the target from its first token to its last; the second generated the target from last to first. Our results show that if only one of these decoding strategies is to be chosen, the optimal choice depends on the languages involved, and that in general a combination of the two approaches is able to outperform either approach.
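The key preprocessing step that lets an SMT decoder transliterate directly is treating each grapheme as a token; a minimal sketch (the example word is arbitrary):

```python
def graphemes(word):
    """Split a word into single-character tokens so a phrase-based SMT
    decoder can 'translate' grapheme sequences directly, with no
    dictionary or phonetic model."""
    return list(word)

source = graphemes("finch")            # ['f', 'i', 'n', 'c', 'h']
# Reversing the token sequence turns last-to-first generation into an
# ordinary left-to-right decoding problem for the second decoder.
reversed_source = graphemes("finch")[::-1]
```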


Meeting of the Association for Computational Linguistics | 2015

Neural Network Transduction Models in Transliteration Generation

Andrew M. Finch; Lemao Liu; Xiaolin Wang; Eiichiro Sumita

In this paper we examine the effectiveness of neural network sequence-to-sequence transduction in the task of transliteration generation. In this year’s shared evaluation we submitted two systems into all tasks. The primary system was based on the system used for the NEWS 2012 workshop, but was augmented with an additional feature which was the generation probability from a neural network. The secondary system was the neural network model used on its own together with a simple beam search algorithm. Our results show that adding the neural network score as a feature into the phrase-based statistical machine transliteration system was able to increase the performance of the system. In addition, although the neural network alone was not able to match the performance of our primary system (which exploits it), it was able to deliver a respectable performance for most language pairs which is very promising considering the recency of this technique.
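Adding the neural network's generation probability "as a feature" means it enters the system's log-linear scoring function as one more weighted term; a sketch under that assumption, with all feature values and weights invented for illustration:

```python
import math

def loglinear_score(features, weights):
    """Log-linear model score: a weighted sum of feature values. The
    neural network's generation log probability is just one more
    (feature, weight) pair alongside the phrase-based features."""
    return sum(w * f for f, w in zip(features, weights))

# Illustrative feature values for one candidate transliteration.
baseline_feats = [math.log(0.3), math.log(0.5)]  # phrase-based features
nn_feat = math.log(0.8)                          # neural network score
score = loglinear_score(baseline_feats + [nn_feat], [1.0, 0.5, 0.7])
```

The feature weights would normally be tuned on held-out data, so the tuner decides how much to trust the neural score relative to the phrase-based features.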


International Conference on Acoustics, Speech, and Signal Processing | 2011

Unsupervised determination of efficient Korean LVCSR units using a Bayesian Dirichlet process model

Sakriani Sakti; Andrew M. Finch; Ryosuke Isotani; Hisashi Kawai; Satoshi Nakamura

Korean is an agglutinative language that does not have explicit word boundaries. It is also a highly inflective language that exhibits severe coarticulation effects. These characteristics pose a challenge in developing large-vocabulary continuous speech recognition (LVCSR) systems. Many existing Korean LVCSR systems attempt to overcome these difficulties by defining a set of “word” units using morphological analysis (rule-based) or statistical methods. These approaches usually require a great deal of linguistic knowledge or at least some explicit information about the statistical distribution of the units. However, exceptions or uncommon words (e.g., foreign proper nouns) still exist that cannot be covered by rules alone. In this paper, we investigate the use of an unsupervised, nonparametric Bayesian approach to automatically determining efficient units for a Korean LVCSR system. Specifically, we utilize a Dirichlet process model trained using Bayesian inference through block Gibbs sampling. Our approach provides a principled way of learning units without explicit linguistic knowledge or any static parameters. Experiments were conducted on a travel domain corpus, which includes many foreign words and proper nouns. In our experiments we compared our method to a set of state-of-the-art baseline systems that relied on either morphological analysis or segmentation heuristics. Our system was able to produce a considerably more compact set of “word” units than the best baseline system (the lexical dictionary was approximately half the size), with a 5.89% relative improvement in word error rate over that baseline.
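The Dirichlet process at the heart of such a model trades off reusing units already in the model against generating new ones from a base measure; a minimal sketch of that predictive probability, with a made-up unit inventory, a uniform base measure, and an arbitrary concentration parameter:

```python
from collections import Counter

def dp_pred(counts, total, unit, alpha, base):
    """Dirichlet-process predictive probability of a unit: frequently
    reused units score highly, while unseen units fall back to the
    base measure, which penalises long novel units."""
    return (counts[unit] + alpha * base(unit)) / (total + alpha)

def base(unit, n_symbols=40):
    # Uniform base measure over a 40-symbol inventory (illustrative).
    return (1.0 / n_symbols) ** len(unit)

counts = Counter({"seen-unit": 5, "other": 3})
total = sum(counts.values())
p_seen = dp_pred(counts, total, "seen-unit", alpha=1.0, base=base)
p_new = dp_pred(counts, total, "new", alpha=1.0, base=base)
# p_seen is large; p_new is tiny, discouraging arbitrary new units.
```

Block Gibbs sampling would repeatedly resegment each utterance by sampling from probabilities of this shape, so the unit inventory itself is learned rather than fixed in advance.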


Workshop on Statistical Machine Translation | 2009

NICT@WMT09: Model Adaptation and Transliteration for Spanish-English SMT

Michael Paul; Andrew M. Finch; Eiichiro Sumita

This paper describes the NICT statistical machine translation (SMT) system used for the WMT 2009 Shared Task (WMT09) evaluation. We participated in the Spanish-English translation task. The focus of this year's participation was to investigate model adaptation and transliteration techniques in order to improve the translation quality of the baseline phrase-based SMT system.


Meeting of the Association for Computational Linguistics | 2014

Empirical Study of Unsupervised Chinese Word Segmentation Methods for SMT on Large-scale Corpora

Xiaolin Wang; Masao Utiyama; Andrew M. Finch; Eiichiro Sumita

Unsupervised word segmentation (UWS) can provide domain-adaptive segmentation for statistical machine translation (SMT) without annotated data, and bilingual UWS can even optimize segmentation for alignment. Monolingual UWS approaches that explicitly model the probabilities of words through Dirichlet process (DP) models or Pitman-Yor process (PYP) models have achieved high accuracy, but their bilingual counterparts have only been carried out on small corpora such as the basic travel expression corpus (BTEC) due to the computational complexity. This paper proposes an efficient unified PYP-based monolingual and bilingual UWS method. Experimental results show that the proposed method is comparable to supervised segmenters on the in-domain NIST OpenMT corpus, and yields a 0.96 BLEU relative increase on the NTCIR PatentMT corpus, which is out-of-domain.
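The PYP differs from the DP by a discount parameter that produces power-law unit distributions; a sketch of the PYP predictive probability, with the seating arrangement, base measure, and parameter values all invented for illustration:

```python
def pyp_pred(counts, tables, unit, d, theta, base):
    """Pitman-Yor predictive probability: the discount d shifts mass
    from seen units toward the base measure, yielding the power-law
    behaviour that suits word frequency distributions."""
    n = sum(counts.values())
    t = sum(tables.values())
    seen = max(counts.get(unit, 0) - d * tables.get(unit, 0), 0.0)
    new = (theta + d * t) * base(unit)
    return (seen + new) / (n + theta)

uniform = lambda u: (1.0 / 20) ** len(u)  # 20-symbol base (illustrative)
counts, tables = {"word": 4}, {"word": 1}
p = pyp_pred(counts, tables, "word", d=0.5, theta=1.0, base=uniform)
# seen mass: 4 - 0.5 * 1 = 3.5; denominator: 4 + 1 = 5
```

With d = 0 this reduces to the DP predictive probability, which is why the two model families can share one unified inference scheme.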


ACM Transactions on Asian Language Information Processing | 2013

A Bayesian Alignment Approach to Transliteration Mining

Takaaki Fukunishi; Andrew M. Finch; Seiichi Yamamoto; Eiichiro Sumita

In this article we present a technique for mining transliteration pairs using a set of simple features derived from a many-to-many bilingual forced-alignment at the grapheme level to classify candidate transliteration word pairs as correct transliterations or not. We use a nonparametric Bayesian method for the alignment process, as this process rewards the reuse of parameters, resulting in compact models that align in a consistent manner and tend not to over-fit. Our approach uses the generative model resulting from aligning the training data to force-align the test data. We rely on the simple assumption that correct transliteration pairs would be well modeled and generated easily, whereas incorrect pairs---being more random in character---would be more costly to model and generate. Our generative model generates by concatenating bilingual grapheme sequence pairs. The many-to-many generation process is essential for handling many languages with non-Roman scripts, and it is hard to train well using maximum likelihood techniques, as these tend to over-fit the data. Our approach works on the principle that generation using only grapheme sequence pairs that are in the model results in a high probability derivation, whereas if the model is forced to introduce a new parameter in order to explain part of the candidate pair, the derivation probability is substantially reduced, and severely so if the new parameter corresponds to a sequence pair composed of a large number of graphemes. The features we extract from the alignment of the test data are not only based on the scores from the generative model, but also on the relative proportions of each sequence that are hard to generate. The features are used in conjunction with a support vector machine classifier trained on known positive examples together with synthetic negative examples to determine whether a candidate word pair is a correct transliteration pair.
In our experiments, we used all data tracks from the 2010 Named-Entity Workshop (NEWS'10) and used the performance of the best system for each language pair as a reference point. Our results show that the new features we propose are powerfully predictive, enabling our approach to achieve levels of performance on this task that are comparable to the state of the art.
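The pipeline above (alignment-derived features feeding a classifier) can be sketched as follows. Everything here is a hypothetical stand-in: the two features, the candidate pairs, and the linear weights are invented, and a simple linear rule replaces the trained SVM.

```python
def alignment_features(logprob, n_graphemes, n_new_graphemes):
    """Features from force-aligning a candidate pair: the length-
    normalised generative score, and the proportion of graphemes the
    model could only explain by introducing new sequence pairs."""
    n = max(n_graphemes, 1)
    return [logprob / n, n_new_graphemes / n]

def classify(feats, weights, bias):
    # Linear decision rule; the paper trains an SVM, whose decision
    # function has this shape when a linear kernel is used.
    return sum(w * f for w, f in zip(weights, feats)) + bias > 0

good = alignment_features(logprob=-2.0, n_graphemes=8, n_new_graphemes=0)
bad = alignment_features(logprob=-40.0, n_graphemes=8, n_new_graphemes=5)
weights, bias = [1.0, -10.0], 1.0
# good pair accepted, bad pair rejected (weights are illustrative)
```

The intuition matches the principle stated above: a pair the model generates cheaply yields a mild score and few unmodelled graphemes, while an incorrect pair forces expensive new parameters and is pushed to the reject side of the boundary.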

Collaboration


Dive into Andrew M. Finch's collaboration.

Top Co-Authors

Masao Utiyama
National Institute of Information and Communications Technology

Michael Paul
National Institute of Information and Communications Technology

Ye Kyaw Thu
National Institute of Information and Communications Technology

Keiji Yasuda
National Institute of Information and Communications Technology

Satoshi Nakamura
Nara Institute of Science and Technology

Chiori Hori
National Institute of Information and Communications Technology