Hiroshi Echizen-ya
Hokkai Gakuen University
Publication
Featured research published by Hiroshi Echizen-ya.
international conference on computational linguistics | 1996
Hiroshi Echizen-ya; Kenji Araki; Yoshio Momouchi; Koji Tochinai
We have proposed and evaluated a method of machine translation that acquires translation rules from translation examples using inductive learning. We confirmed that the method requires many translation examples. To resolve this problem, we applied genetic algorithms and evaluated the resulting method experimentally. We confirmed that the accuracy rate of translation increased from 52.8% to 61.9% by applying genetic algorithms.
international conference on computational linguistics | 2002
Hiroshi Echizen-ya; Yoshio Momouchi; Kenji Araki; Koji Tochinai
A number of machine translation systems based on learning algorithms have been presented. These methods acquire translation rules from pairs of similar sentences in bilingual text corpora. This means that it is difficult for the systems to acquire translation rules from sparse data. As a result, these methods require large amounts of training data in order to acquire high-quality translation rules. To overcome this problem, we propose a method of machine translation using Recursive Chain-link-type Learning. In our new method, the system can acquire many new high-quality translation rules from sparse translation examples based on already acquired translation rules. Therefore, acquisition of new translation rules results in the generation of more new translation rules. Such a process of acquisition of translation rules is like a linked chain. From the results of evaluation experiments, we confirmed the effectiveness of Recursive Chain-link-type Learning.
international conference on knowledge based and intelligent information and engineering systems | 2005
Hiroshi Echizen-ya; Kenji Araki; Yoshio Momouchi
This paper presents a new learning method for automatic acquisition of translation knowledge from parallel corpora. We apply this learning method to automatic extraction of bilingual word pairs from parallel corpora. In general, similarity measures are used to extract bilingual word pairs from parallel corpora. However, similarity measures are insufficient because of the sparse data problem. The essence of our learning method is this presumption: in local parts of bilingual sentence pairs, the equivalents of words that adjoin the source language words of bilingual word pairs also adjoin the target language words of bilingual word pairs. Such adjacent information is acquired automatically in our method. We applied our method to systems based on various similarity measures, thereby confirming the effectiveness of our method.
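The adjacency presumption in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `adjacent_candidates`, the dictionary `known_pairs`, and the one-word window are assumptions made here purely for illustration. Given an already known bilingual word pair, the sketch proposes as candidate equivalents of a source word only those target words that adjoin the known target word.

```python
def adjacent_candidates(src_tokens, tgt_tokens, known_pairs, src_word):
    """Propose target-side candidates for src_word using adjacency.

    known_pairs maps a source word to its known target equivalent
    (a hypothetical structure, assumed here for illustration).
    """
    candidates = []
    for i, tok in enumerate(src_tokens):
        if tok != src_word:
            continue
        # Source words immediately to the left and right of src_word.
        neighbors = [src_tokens[j] for j in (i - 1, i + 1)
                     if 0 <= j < len(src_tokens)]
        for neighbor in neighbors:
            equivalent = known_pairs.get(neighbor)
            if equivalent is None:
                continue
            # Target words adjoining the known equivalent become candidates.
            for k, tgt in enumerate(tgt_tokens):
                if tgt != equivalent:
                    continue
                for m in (k - 1, k + 1):
                    if 0 <= m < len(tgt_tokens) and tgt_tokens[m] not in candidates:
                        candidates.append(tgt_tokens[m])
    return candidates


src = ["the", "red", "car"]
tgt = ["la", "voiture", "rouge"]
result = adjacent_candidates(src, tgt, {"red": "rouge"}, "car")
```

In this toy sentence pair, knowing that "red" corresponds to "rouge" narrows the candidates for "car" to the single word next to "rouge", even though "car" occurs only once.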
workshop on statistical machine translation | 2014
Hiroshi Echizen-ya; Kenji Araki; Eduard H. Hovy
In this paper, we propose a new automatic evaluation metric for machine translation. Our metric is based on chunking between the reference and candidate translation. Moreover, we apply a prize based on sentence length to the metric, unlike the penalties in BLEU or NIST. We designate this metric as Automatic Evaluation of Machine Translation in which the Prize is Applied to a Chunk-based metric (APAC). Through meta-evaluation experiments and comparison with several metrics, we confirmed that our metric shows stable correlation with human judgment.
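For context on the penalties this abstract contrasts with, the following minimal sketch computes the standard BLEU brevity penalty. The abstract does not give the APAC prize formula, so it is not reproduced here. The function name `bleu_brevity_penalty` is chosen here for illustration.

```python
import math


def bleu_brevity_penalty(candidate_len, reference_len):
    """Standard BLEU brevity penalty: 1 if the candidate is longer than
    the reference, otherwise exp(1 - r / c)."""
    if candidate_len <= 0:
        return 0.0  # guard for empty candidates (a choice made in this sketch)
    if candidate_len > reference_len:
        return 1.0
    return math.exp(1.0 - reference_len / candidate_len)


short_bp = bleu_brevity_penalty(5, 10)
equal_bp = bleu_brevity_penalty(10, 10)
long_bp = bleu_brevity_penalty(12, 10)
```

The penalty only ever lowers the score of short candidates. It never rewards a candidate, which is the point of contrast with a length-based prize.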
Information Processing and Management | 2006
Hiroshi Echizen-ya; Kenji Araki; Yoshio Momouchi
In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. In cross-language information retrieval, the system must deal with various languages. Therefore, automatic extraction of bilingual word pairs from parallel corpora in various languages is important. However, previous work based on statistical methods is insufficient because of the sparse data problem. Our learning method automatically acquires rules that are effective in solving the sparse data problem, using only parallel corpora and no previously prepared bilingual resource (e.g., a bilingual dictionary or a machine translation system). We call this learning method Inductive Chain Learning (ICL). Moreover, the system using ICL can extract bilingual word pairs even from bilingual sentence pairs in which the grammatical structure of the source language differs from that of the target language. This is because the acquired rules carry the information needed to cope with the different word orders of the source and target languages in local parts of bilingual sentence pairs. Evaluation experiments demonstrated that the recall of systems based on several statistical approaches was improved through the use of ICL.
meeting of the association for computational linguistics | 2005
Hiroshi Echizen-ya; Kenji Araki; Yoshio Momouchi
In this paper, we propose a new learning method to solve the sparse data problem in automatic extraction of bilingual word pairs from parallel corpora in various languages. Our learning method automatically acquires rules that are effective in solving the sparse data problem, using only parallel corpora and no previously prepared bilingual resource (e.g., a bilingual dictionary or machine translation systems). We call this method Inductive Chain Learning (ICL). ICL can limit the search scope when deciding equivalents. Using ICL, the recall of three systems based on similarity measures improved by 8.0, 6.1, and 6.0 percentage points, respectively. In addition, the recall of GIZA++ improved by 6.6 percentage points using ICL.
knowledge discovery and data mining | 2005
Hiroshi Echizen-ya; Kenji Araki; Yoshio Momouchi
In this paper, we propose a new learning method for extraction of low-frequency bilingual word pairs from parallel corpora in various languages. It is important to extract low-frequency bilingual word pairs because the frequencies of many bilingual word pairs are very low when large-scale parallel corpora are unobtainable. We use the following inference to extract low-frequency bilingual word pairs: the word equivalents that adjoin the source language words of bilingual word pairs also adjoin the target language words of bilingual word pairs in local parts of bilingual sentence pairs. Evaluation experiments indicated that the extraction rate of our system was more than 8.0 percentage points higher than the extraction rate of the system based on the Dice coefficient. Moreover, the extraction rates of bilingual word pairs with frequencies of one and two improved by 11.0 and 6.6 percentage points, respectively, using AIL.
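The Dice coefficient baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the system evaluated in the paper: the function name `dice_scores`, the toy corpus, and counting frequencies per sentence pair are assumptions made here. For a source word s and target word t, the score is 2·f(s, t) / (f(s) + f(t)), where f counts the sentence pairs in which the words occur.

```python
from collections import Counter


def dice_scores(sentence_pairs):
    """Dice coefficient for every co-occurring (source, target) word pair.

    sentence_pairs is a list of (source_tokens, target_tokens) tuples.
    Frequencies are counted once per sentence pair (an assumption here).
    """
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for src_tokens, tgt_tokens in sentence_pairs:
        src_set, tgt_set = set(src_tokens), set(tgt_tokens)
        src_freq.update(src_set)
        tgt_freq.update(tgt_set)
        pair_freq.update((s, t) for s in src_set for t in tgt_set)
    return {
        (s, t): 2.0 * count / (src_freq[s] + tgt_freq[t])
        for (s, t), count in pair_freq.items()
    }


corpus = [
    (["red", "car"], ["voiture", "rouge"]),
    (["red", "house"], ["maison", "rouge"]),
]
scores = dice_scores(corpus)
single = dice_scores([(["a", "b"], ["x", "y"])])
```

With only one sentence pair, as in `single`, every co-occurring word pair receives the same score. This shows why a purely frequency-based measure struggles with low-frequency word pairs.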
mexican international conference on artificial intelligence | 2004
Hiroshi Echizen-ya; Kenji Araki; Yoshio Momouchi; Koji Tochinai
Numerous methods have been developed for generating a machine translation (MT) bilingual dictionary from a parallel text corpus. Such methods extract bilingual collocations from sentence pairs of source and target language sentences. Then those collocations are registered in an MT bilingual dictionary. Bilingual collocations are lexically corresponding pairs of parts extracted from sentence pairs. This paper describes a new method for automatic extraction of bilingual collocations from a parallel text corpus using no linguistic knowledge. We use Recursive Chain-link-type Learning (RCL), which is a learning algorithm, to extract bilingual collocations. Our method offers two main advantages. One benefit is that this RCL system requires no linguistic knowledge. The other advantage is that it can extract many bilingual collocations, even if the frequency of appearance of the bilingual collocations is very low. Experimental results verify that our system extracts bilingual collocations efficiently. The extraction rate of bilingual collocations was 74.9% for all bilingual collocations that corresponded to nouns in the parallel corpus.
conference on applied natural language processing | 1997
Hiroshi Echizen-ya; Kenji Araki; Yoshikazu Miyanaga; Koji Tochinai
We proposed a method of machine translation using inductive learning with genetic algorithms, and confirmed the effectiveness of applying genetic algorithms. However, the system based on this method produces many erroneous translation rules that cannot be completely removed from the dictionary. Therefore, we need to improve the way genetic algorithms are applied so that erroneous translation rules can be removed from the dictionary. In this paper, we describe this improvement in the selection process and the results of evaluation experiments.
meeting of the association for computational linguistics | 2010
Hiroshi Echizen-ya; Kenji Araki