Koji Tochinai | Researchain

Publication

Featured research published by Koji Tochinai.

International Conference on Computational Linguistics | 2002

Study of practical effectiveness for machine translation using recursive chain-link-type learning

Hiroshi Echizen-ya; Yoshio Momouchi; Kenji Araki; Koji Tochinai

A number of machine translation systems based on learning algorithms have been presented. These methods acquire translation rules from pairs of similar sentences in a bilingual text corpus, which makes it difficult for such systems to acquire translation rules from sparse data. As a result, they require large amounts of training data in order to acquire high-quality translation rules. To overcome this problem, we propose a machine translation method using Recursive Chain-link-type Learning. In our method, the system can acquire many new high-quality translation rules from sparse translation examples by building on already acquired translation rules. The acquisition of new translation rules therefore leads to the generation of further new translation rules, so the acquisition process resembles a linked chain. Evaluation experiments confirmed the effectiveness of Recursive Chain-link-type Learning.

Discovery Science | 2003

Bacterium Lingualis – The Web-Based Commonsensical Knowledge Discovery Method

Rafal Rzepka; Kenji Araki; Koji Tochinai

Bacterium Lingualis is a knowledge discovery method for commonsensical reasoning based on textual WWW resources. While developing a talking agent with no domain restriction, we realized that our system needs an unsupervised reinforcement learning algorithm that could speed up the discovery of language and commonsensical knowledge. In this paper we introduce our idea and the results of preliminary experiments.

International Conference on Computational Linguistics | 2002

A word segmentation method with dynamic adapting to text using inductive learning

Zhongjian Wang; Kenji Araki; Koji Tochinai

We have proposed a method of word segmentation for non-segmented languages using Inductive Learning. The method uses only the surface information of a text, so it has the advantage of not depending on any specific language. We assume that a character string appearing frequently in a text is likely to be a word. The method predicts unknown words by recursively extracting common character strings. With the proposed method, the segmentation results can adapt to different users and fields. To evaluate its effectiveness for Chinese word segmentation and its adaptability to different fields, we conducted an evaluation experiment with Chinese text from two fields.
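The frequency assumption in this abstract can be illustrated with a small sketch. It is not the authors' Inductive Learning procedure: it simply counts character n-grams, treats repeated ones as word candidates, and segments by greedy longest match. The function names, the length limits, and the `min_count` threshold are all hypothetical choices made for illustration.

```python
from collections import Counter


def frequent_substrings(text, min_len=2, max_len=4, min_count=2):
    """Count every character n-gram and keep those seen at least min_count times."""
    counts = Counter(
        text[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(text) - n + 1)
    )
    return {s: c for s, c in counts.items() if c >= min_count}


def segment(text, lexicon, max_len=4):
    """Greedy longest-match segmentation.

    Characters that match no lexicon entry become single-character tokens.
    """
    tokens, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in lexicon:
                tokens.append(piece)
                i += n
                break
    return tokens


if __name__ == "__main__":
    sample = "我爱北京北京很大"
    candidates = frequent_substrings(sample)
    print(candidates)
    print(segment(sample, candidates))
```

Because the candidates come only from surface strings in the text being processed, the same code would behave differently on text from another field, which is the kind of adaptability the abstract describes.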

Australian Joint Conference on Artificial Intelligence | 2002

Effectiveness for Machine Translation Method Using Inductive Learning on Number Representation

Masafumi Matsuhara; Kenji Araki; Koji Tochinai

In our proposed method, the source language is translated into the target language via Number Representation. A text in the source language is first translated into a number representation text, which is the number string corresponding to the original source language text. This is then translated into a number representation text for the target language, which is in turn translated into a text in the target language; that text is the final translation result. A number representation text is more abstract than the original text because it corresponds to several texts. Through Inductive Learning, the system based on our proposed method is able to acquire more translation rules on number representation than on the original text. Moreover, the system disambiguates number representation by its own adaptability. In the experiment, the correct translation rate of our proposed method is higher than that of the method without number representation. Thus, our proposed method is shown to be more effective for machine translation.

Computational Intelligence | 2009

Word Segmentation of Chinese Text with Multiple Hybrid Methods

Zhongjian Wang; Jun Xu; Kenji Araki; Koji Tochinai

To deal with unknown words and segmentation ambiguity, segmentation rules and a tri-gram model were used with the Inductive Learning method. The rules were used for elementary segmentation and to improve processing effectiveness in the following steps. They were acquired manually by analyzing a tagged corpus. The Inductive Learning method recursively recognized and extracted unknown words from the segmented text. The tri-gram model was used to deal with segmentation ambiguity by calculating a sentence probability and selecting the better segmentation candidate. Experimental results indicated that both unknown-word processing and segmentation errors were improved.
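The disambiguation step, choosing between segmentation candidates by sentence probability, can be sketched as follows. This is a minimal word tri-gram model with add-one smoothing, which is an assumption: the abstract does not state the smoothing method or the unit of the tri-grams, and the function names are hypothetical.

```python
import math
from collections import Counter


def train_trigrams(segmented_sentences):
    """Collect tri-gram and history counts from already segmented sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for words in segmented_sentences:
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        vocab.update(padded)
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            tri[(a, b, c)] += 1
            bi[(a, b)] += 1
    return tri, bi, len(vocab)


def log_prob(words, model):
    """Log probability of a segmented sentence with add-one smoothing."""
    tri, bi, vocab_size = model
    padded = ["<s>", "<s>"] + list(words) + ["</s>"]
    return sum(
        math.log((tri[(a, b, c)] + 1) / (bi[(a, b)] + vocab_size))
        for a, b, c in zip(padded, padded[1:], padded[2:])
    )


def best_candidate(candidates, model):
    """Return the segmentation candidate with the highest sentence probability."""
    return max(candidates, key=lambda words: log_prob(words, model))


if __name__ == "__main__":
    corpus = [["研究", "生命", "起源"], ["研究", "生命"]]
    model = train_trigrams(corpus)
    options = [["研究", "生命", "起源"], ["研究生", "命", "起源"]]
    print(best_candidate(options, model))
```

Working in log space avoids numerical underflow on longer sentences; the comparison between candidates is unaffected because the logarithm is monotonic.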

Archive | 2003

EBMT of POS-Tagged Sentences by Recursive Division Via Inductive Learning

Tantely Andriamanankasina; Kenji Araki; Koji Tochinai

We present an Example-Based Machine Translation approach which recursively divides the sentence to be translated, and translates each part separately. The sentence is divided according to the structure of similar examples extracted during the matching process. The approach is especially intended for languages where resources and tools are largely unavailable. POS taggers are the only tools utilized, and the bilingual corpus is the only resource employed. In addition, the translation system contains an analogy-based sub-sentential alignment module, which predicts word correspondences between new pairs of sentences. This module causes the corpus to grow because new examples can be appended automatically. Consequently, a relatively small initial corpus is sufficient for the translation system to start. The approach has been tested on a French-Japanese corpus of spoken language and produced promising results worthy of further investigation.

Mexican International Conference on Artificial Intelligence | 2004

Automatic Building of a Machine Translation Bilingual Dictionary Using Recursive Chain-Link-Type Learning from a Parallel Corpus

Hiroshi Echizen-ya; Kenji Araki; Yoshio Momouchi; Koji Tochinai

Numerous methods have been developed for generating a machine translation (MT) bilingual dictionary from a parallel text corpus. Such methods extract bilingual collocations from sentence pairs of source and target language sentences. Then those collocations are registered in an MT bilingual dictionary. Bilingual collocations are lexically corresponding pairs of parts extracted from sentence pairs. This paper describes a new method for automatic extraction of bilingual collocations from a parallel text corpus using no linguistic knowledge. We use Recursive Chain-link-type Learning (RCL), which is a learning algorithm, to extract bilingual collocations. Our method offers two main advantages. One benefit is that this RCL system requires no linguistic knowledge. The other advantage is that it can extract many bilingual collocations, even if the frequency of appearance of the bilingual collocations is very low. Experimental results verify that our system extracts bilingual collocations efficiently. The extraction rate of bilingual collocations was 74.9% for all bilingual collocations that corresponded to nouns in the parallel corpus.

Conference on Intelligent Text Processing and Computational Linguistics | 2004

Evaluation of Japanese Dialogue Processing Method Based on Similarity Measure Using tf·AoI

Yasutomo Kimura; Kenji Araki; Koji Tochinai

In this paper, we propose a Japanese dialogue processing method based on a similarity measure using tf·AoI (term frequency × Amount of Information). Spoken dialogue systems typically rely on keywords because a user utterance contains recognition errors, fillers, and noise. However, when a system uses keywords for robustness, it is difficult to capture detailed differences. Therefore, our method calculates the similarity between two sentences without deleting any word from the input sentence, using a weight that multiplies term frequency by amount of information (tf·AoI). We use 173 open data sets collected from 12,095 sentences in SLDB. In the experiment, our method achieved a correct response rate of 67.1%. This was 11.6 points higher than that of the matching rate measure between an input sentence and a comparison sentence, and 7.0 points higher than that of tf·idf.
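A weight of the form term frequency × amount of information can be sketched as below. The abstract does not give the exact formulas, so two things here are assumptions: the amount of information of a word is taken as -log2 of its relative frequency in a reference corpus, and the two weighted sentences are compared with cosine similarity. All function names are hypothetical.

```python
import math
from collections import Counter


def amount_of_information(corpus_sentences):
    """Assumed definition: AoI(w) = -log2 of w's relative frequency in the corpus."""
    counts = Counter(word for sentence in corpus_sentences for word in sentence)
    total = sum(counts.values())
    return {word: -math.log2(count / total) for word, count in counts.items()}


def weight_vector(sentence, aoi, default_aoi=0.0):
    """Weight every word of the sentence by tf x AoI, deleting no word."""
    tf = Counter(sentence)
    return {word: n * aoi.get(word, default_aoi) for word, n in tf.items()}


def similarity(sentence_a, sentence_b, aoi):
    """Cosine similarity between the two tf x AoI vectors (an assumed choice)."""
    u, v = weight_vector(sentence_a, aoi), weight_vector(sentence_b, aoi)
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    corpus = [
        ["the", "train", "is", "late"],
        ["the", "bus", "is", "here"],
        ["the", "hotel", "is", "full"],
    ]
    aoi = amount_of_information(corpus)
    query = ["the", "train"]
    print(similarity(query, ["train", "is"], aoi))
    print(similarity(query, ["the", "bus"], aoi))
```

In this toy corpus, sharing the rare word "train" contributes more to the score than sharing the frequent word "the", while frequent words still count for something rather than being discarded as in keyword matching.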

Australasian Joint Conference on Artificial Intelligence | 2003

Effectiveness of A Direct Speech Transform Method Using Inductive Learning from Laryngectomee Speech to Normal Speech

Koji Murakami; Kenji Araki; Makoto Hiroshige; Koji Tochinai

This paper proposes and evaluates a new direct speech transform method that works on waveforms, converting laryngectomee speech to normal speech. Almost no conventional speech recognition system or other speech processing system can handle laryngectomee speech with satisfactory results. One of the major causes is the difficulty of preparing corpora: it is very hard to record a large amount of clear and intelligible utterance data because the acoustic quality depends strongly on the individual condition of each speaker.

Meeting of the Association for Computational Linguistics | 2002

Evaluation of Direct Speech Translation Method Using Inductive Learning for Conversations in the Travel Domain

Koji Murakami; Makoto Hiroshige; Kenji Araki; Koji Tochinai

This paper evaluates a direct speech translation method that works on waveforms, using the Inductive Learning method, for short conversations. The method is able to work without conventional speech recognition and speech synthesis because syntactic expressions are not needed for translation in the proposed method. We focus only on the acoustic characteristics of the speech waveforms of the source and target languages, without obtaining character strings from utterances. This speech translation method can be utilized for any language because the system has no processing that depends on the individual characteristics of a specific language. It can therefore also handle the speech of a person with a disability whose speech conventional speech recognition systems cannot process, because we do not need to segment the speech into phonemes, syllables, or words to realize speech translation. Our method is realized by inductively learning translation rules that capture acoustic correspondences between two languages. In this paper, we deal with translation between Japanese and English.
