Takehiko Yoshimi
Ryukoku University
Publication
Featured research published by Takehiko Yoshimi.
International Conference on Acoustics, Speech, and Signal Processing | 2008
Takashi Shichiri; Hiroaki Nanjo; Takehiko Yoshimi
This paper addresses automatic speech recognition (ASR) oriented toward speech-based information retrieval (IR). Since the significance of words differs in IR, ASR performance for IR should be evaluated with a weighted word error rate (WWER), which assigns a different weight to each word recognition error from the viewpoint of IR, instead of the word error rate (WER), which treats all words uniformly. In this paper, we first discuss a method for automatically estimating word significance (weights); we then perform ASR in a Minimum Bayes-Risk framework using the estimated word significance, and show that the ASR approach that minimizes the WWER calculated from the estimated word weights is effective for speech-based IR.
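To make the WER/WWER distinction concrete, the following is a minimal sketch of how a weighted word error rate could be computed once word weights are available. The sentences, word weights, and the choice to charge insertions at the weight of the inserted hypothesis word are illustrative assumptions of this sketch, not the paper's estimation method.

```python
# Illustrative sketch of WER vs. weighted WER (WWER). The word weights below
# are made up for the example; the paper estimates them automatically for IR.

def weighted_edit_cost(ref, hyp, weight):
    """Minimum weighted edit cost between reference and hypothesis word lists.
    Substitutions/deletions cost the reference word's weight; insertions cost
    the hypothesis word's weight (default weight 1.0)."""
    n, m = len(ref), len(hyp)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + weight.get(ref[i - 1], 1.0)            # deletions
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + weight.get(hyp[j - 1], 1.0)            # insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if ref[i - 1] == hyp[j - 1] else weight.get(ref[i - 1], 1.0)
            d[i][j] = min(d[i - 1][j - 1] + sub,                        # match/substitution
                          d[i - 1][j] + weight.get(ref[i - 1], 1.0),    # deletion
                          d[i][j - 1] + weight.get(hyp[j - 1], 1.0))    # insertion
    return d[n][m]

ref = "find papers on speech retrieval".split()
hyp = "find paper on speech retrieval".split()

# Uniform weights reproduce the ordinary WER.
wer = weighted_edit_cost(ref, hyp, {}) / len(ref)

# IR-relevant content words get higher weights, so errors on them are penalised more.
w = {"papers": 3.0, "speech": 3.0, "retrieval": 3.0}
wwer = weighted_edit_cost(ref, hyp, w) / sum(w.get(t, 1.0) for t in ref)

print(f"WER  = {wer:.2f}")    # 0.20 (1 error / 5 words)
print(f"WWER = {wwer:.2f}")   # 0.27 (weighted error 3.0 / total weight 11.0)
```

With uniform weights the measure reduces to the ordinary WER; raising the weights of IR-relevant content words makes errors on those words dominate the score, which is the behaviour the WWER-oriented ASR is trying to optimize for.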
International Symposium on Emerging Technologies for Education | 2016
Katsunori Kotani; Takehiko Yoshimi
Previous research on the ease of listening comprehension (henceforth, listenability) has measured listenability on the basis of sentence properties such as the length of words/sentences and speech rate. Recent research has included features of listeners, which are required for the measurement of listenability for English learners because their listening proficiencies vary greatly from the beginner to the advanced level. Given the importance of listening proficiency as a listener feature, this study developed listenability measurement methods based on the costs of compiling listener features: expensive features extracted from test scores and inexpensive features extracted from learners’ experiences. The experimental results showed that inexpensive features made substantial contributions to the measurement of middle-range listenability.
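As a rough illustration of measuring listenability by regression over sentence and listener features, the sketch below compares a model that includes an "expensive" test-score feature with one restricted to "inexpensive" experience-based features. The feature names and toy data are assumptions of this example, not the study's data set.

```python
# Hedged sketch: listenability as a regression over sentence features plus
# listener features, contrasting expensive (test-score) and inexpensive
# (experience-based) listener features. All values are illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Columns: words_per_sentence, speech_rate_wpm, listening_test_score (expensive),
# years_of_study (inexpensive)
X = np.array([
    [12, 130, 400, 6],
    [20, 160, 400, 6],
    [ 9, 110, 650, 10],
    [25, 170, 650, 10],
    [15, 140, 300, 4],
    [22, 175, 300, 4],
])
y = np.array([4.0, 3.0, 5.0, 4.0, 3.0, 2.0])   # 5-point listenability ratings

full = LinearRegression().fit(X, y)
cheap_only = LinearRegression().fit(X[:, [0, 1, 3]], y)   # drop the test-score feature

print("R^2 with all features:         ", round(full.score(X, y), 2))
print("R^2 without test-score feature:", round(cheap_only.score(X[:, [0, 1, 3]], y), 2))
```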
International Conference on the Computer Processing of Oriental Languages | 2009
Katsunori Kotani; Takehiko Yoshimi; Takeshi Kutsumi; Ichiko Sata
Because human evaluation of machine translation is thorough but expensive, automatic evaluation is often used when developing a machine translation system. From the viewpoint of evaluation cost, there are two types of evaluation methods: one uses (multiple) reference translations, e.g., METEOR, and the other classifies a machine translation as either machine-like or human-like based on translation properties, i.e., a classification-based method. Previous studies showed that classification-based methods could perform evaluation properly. These studies constructed classifiers by learning linguistic properties of translations such as sentence length, syntactic complexity, and literalness of translation, and their classifiers achieved high classification accuracy. These previous studies, however, did not examine whether classification accuracy reflects translation quality. Hence, we investigated whether classification accuracy depends on translation quality. The experimental results showed that our method could correctly distinguish degrees of translation quality.
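The following sketch illustrates the general shape of a classification-based evaluator: a classifier is trained to separate human from machine translations on simple linguistic features, and its behaviour is read as a signal about translation quality. The features, toy data, and choice of classifier are placeholders, not the ones used in the paper.

```python
# Hedged sketch of a classification-based evaluator. Features and data are toy
# placeholders; the paper's feature set and learner differ.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Each row: [sentence_length, parse_depth, literal_translation_ratio]
features = [
    [18, 5, 0.40], [22, 6, 0.50], [15, 4, 0.30], [27, 7, 0.45],   # human translations
    [19, 3, 0.80], [24, 3, 0.85], [16, 2, 0.90], [28, 4, 0.75],   # machine translations
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]   # 0 = human-like, 1 = machine-like

# If the classifier separates the two classes easily, the MT output is clearly
# "machine-like"; near-chance accuracy suggests the output resembles human translation.
clf = LogisticRegression()
acc = cross_val_score(clf, features, labels, cv=4).mean()
print(f"mean cross-validated accuracy: {acc:.2f}")
```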
Hellenic Conference on Artificial Intelligence | 2010
Katsunori Kotani; Takehiko Yoshimi
Constructing a classifier that distinguishes machine translations from human translations is a promising approach to automatic evaluation of machine-translated sentences. Using this approach, we constructed a classifier using Support Vector Machines based on word-alignment distributions between source sentences and human or machine translations. This paper investigates the validity of the classification-based method by comparing it with well-known evaluation methods. The experimental results show that our classification-based method can accurately evaluate the fluency of machine translations.
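As a complement to the sketch above, the example below shows one plausible way to turn a word alignment between a source sentence and its translation into a small distribution feature vector for an SVM. The alignment data, feature design, and labels are illustrative assumptions of this sketch; in practice the alignments would come from an automatic word aligner.

```python
# Hedged sketch: word-alignment distributions as SVM features, in the spirit of
# the classifier described above. Alignments below are hand-made toy data.
from sklearn.svm import SVC

def alignment_histogram(alignment, num_source_words):
    """Fraction of source words aligned to 0, 1, or 2+ target words."""
    counts = [0] * num_source_words
    for src_idx, _tgt_idx in alignment:
        counts[src_idx] += 1
    n = float(num_source_words)
    return [sum(c == 0 for c in counts) / n,
            sum(c == 1 for c in counts) / n,
            sum(c >= 2 for c in counts) / n]

# (source_index, target_index) pairs for four toy sentence pairs
samples = [
    (alignment_histogram([(0, 0), (1, 1), (2, 2), (2, 3)], 4), 0),          # human
    (alignment_histogram([(0, 0), (1, 2), (3, 1), (3, 3)], 4), 0),          # human
    (alignment_histogram([(0, 0), (1, 1), (2, 2), (3, 3)], 4), 1),          # machine
    (alignment_histogram([(0, 0), (1, 1), (2, 2), (3, 3), (3, 4)], 4), 1),  # machine
]
X = [feats for feats, _ in samples]
y = [label for _, label in samples]

clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([alignment_histogram([(0, 0), (1, 1), (2, 3)], 4)]))
```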
International Conference of the Pacific Association for Computational Linguistics | 2017
Katsunori Kotani; Takehiko Yoshimi
Language teachers using listening materials on the Internet need to examine the ease of the listening materials (hereafter, listenability) they choose in order to maintain learners’ motivation for listening practice since the listenability of such materials is not controlled, unlike commercially available teaching materials. This study proposes to use a listenability index based on learners’ transcription performance. Transcription performance was determined using the normalized edit distance (hereafter, NED) from a learner’s transcription to a reference sentence. We examined the reliability and validity of NED as a dependent variable for listenability measurement using multiple regression analysis in an experiment comprising 50 learners of English as a foreign language. The results supported the reliability and validity of NED.
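A minimal sketch of the normalized edit distance (NED) between a learner's transcription and the reference sentence follows; it operates on words and normalizes by the longer of the two lengths, both of which are simplifying assumptions of this sketch rather than the paper's exact definition.

```python
# Minimal sketch of NED between a learner transcription and a reference.

def edit_distance(a, b):
    """Plain Levenshtein distance via dynamic programming (one rolling row)."""
    d = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, y in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1,          # deletion
                                   d[j - 1] + 1,      # insertion
                                   prev + (x != y))   # substitution (0 if equal)
    return d[len(b)]

def ned(transcription, reference):
    t, r = transcription.split(), reference.split()
    return edit_distance(t, r) / max(len(t), len(r))

reference = "the government announced a new policy on trade"
transcription = "the government announced new policy trade"
print(f"NED = {ned(transcription, reference):.2f}")   # 0.25: 2 missed words / 8
```

A lower NED indicates that the learner transcribed the sentence more faithfully, which is why it can stand in as a listenability signal for that sentence.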
Meeting of the Association for Computational Linguistics | 2015
Katsunori Kotani; Takehiko Yoshimi
In order to develop effective computer-assisted language teaching systems for learners of English as a foreign language, it is first necessary to identify gaps between learners and native speakers in the four basic linguistic skills (reading, writing, pronunciation, and listening). To identify these gaps, the accuracy and fluency in language use between learners and native speakers should be compared using a learner corpus. However, previous corpora have not included all necessary types of linguistic data. Therefore, in this study, we aimed to design and build a new corpus comprising all types of linguistic data necessary for comparing accuracy and fluency in basic linguistic skills between learners and native speakers.
International Conference of the Pacific Association for Computational Linguistics | 2015
Katsunori Kotani; Takehiko Yoshimi
The Internet serves as a source of authentic reading material, enabling learners to practice English in real contexts when learning English as a foreign language. An adaptive computer-assisted language learning and teaching system can assist in obtaining authentic materials such as news articles from the Internet. However, to match material level to a learner’s reading proficiency, the system must be equipped with a method to measure proficiency-based readability. Therefore, we developed a method for doing so. With our method, readability is measured through regression analysis using both learner and linguistic features as independent variables. Learner features account for learner reading proficiency, and linguistic features explain lexical, syntactic, and semantic difficulties of sentences. A cross validation test showed that readability measured with our method exhibited higher correlation (r = 0.60) than readability measured only with linguistic features (r = 0.46). A comparison of our method with the method without learner features showed a statistically significant difference. These results suggest the effectiveness of combined learner and linguistic features for measuring reading proficiency-based readability.
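The comparison described above can be pictured with the following sketch, which contrasts cross-validated Pearson correlations for a regression model with and without a learner feature. The feature names and synthetic data are illustrative assumptions, not the study's materials or its reported figures.

```python
# Hedged sketch: readability regression with and without a learner feature,
# compared via cross-validated predictions and Pearson correlation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 40
words_per_sentence = rng.uniform(8, 30, n)       # linguistic feature
rare_word_ratio = rng.uniform(0.0, 0.4, n)       # linguistic feature
reading_speed_wpm = rng.uniform(80, 250, n)      # learner feature
# Toy "true" readability driven by all three features, plus noise.
readability = (0.1 * words_per_sentence + 5 * rare_word_ratio
               - 0.01 * reading_speed_wpm + rng.normal(0, 0.3, n))

X_full = np.column_stack([words_per_sentence, rare_word_ratio, reading_speed_wpm])
X_ling = X_full[:, :2]

for name, X in [("linguistic + learner", X_full), ("linguistic only", X_ling)]:
    pred = cross_val_predict(LinearRegression(), X, readability, cv=5)
    r = np.corrcoef(pred, readability)[0, 1]
    print(f"{name:22s} r = {r:.2f}")
```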
International Symposium on Natural Language Processing | 2016
Katsunori Kotani; Takehiko Yoshimi
A design for a listening learner corpus as a language resource for computer-assisted language learning systems is proposed, and a pilot learner corpus compiled from 20 university learners of English as a foreign language is reported. The learners dictated a news report, and the corpus was annotated with part-of-speech tags and error tags for the dictation. The validity of the proposed corpus design was assessed on the basis of the pilot corpus by examining the distribution of errors and whether that distribution properly reflected the learners' listening ability. The validity was further assessed by developing a listenability measurement method that accounts for the ease of listening of a material for learners. The results suggested that the dictation-based corpus data are useful for assessing the ease of listening to English materials for learners, which could lead to the development of a computer-assisted language learning tool.
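Purely as an illustration of what a dictation record with part-of-speech and error annotation might look like, a hypothetical entry is sketched below; the tag names and error categories are assumptions of this example, not the corpus's published scheme.

```python
# Hypothetical dictation-corpus record; the annotation scheme shown here is an
# assumption for illustration, not the published corpus format.
record = {
    "learner_id": "L07",
    "reference": "The council approved the budget yesterday.",
    "dictation": "The counsel approved budget yesterday.",
    "tokens": [
        {"ref": "The",       "pos": "DT",  "error": None},
        {"ref": "council",   "pos": "NN",  "error": "substitution:counsel"},
        {"ref": "approved",  "pos": "VBD", "error": None},
        {"ref": "the",       "pos": "DT",  "error": "omission"},
        {"ref": "budget",    "pos": "NN",  "error": None},
        {"ref": "yesterday", "pos": "RB",  "error": None},
    ],
}

# A simple per-sentence error rate over the annotated tokens could then feed a
# listenability measurement model.
errors = sum(t["error"] is not None for t in record["tokens"])
print(f"error rate = {errors / len(record['tokens']):.2f}")
```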
International Conference on the Computer Processing of Oriental Languages | 2006
Takeshi Kutsumi; Takehiko Yoshimi; Katsunori Kotani; Ichiko Sata; Hitoshi Isahara
This paper presents a method for expanding bilingual dictionaries. New multi-word entries (MWEs) and their possible translations, previously unregistered in the dictionaries, are created by replacing one component of a registered MWE with semantically similar words; appropriate lexical entries are then selected from the resulting pairs of new MWEs and candidate translations according to a prioritizing method. In the proposed method, pairs of new nominal MWEs and their candidate translations are prioritized by consulting more than one thesaurus and by considering the number of original MWEs from which a single new MWE is created. As a result, pairs that would improve translation quality if registered in bilingual dictionaries are acquired with an accuracy of 55.0% for the top 500 prioritized pairs; this accuracy exceeds that of the baseline method.
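A heavily simplified sketch of the expansion idea follows: one component of a registered MWE is replaced with thesaurus neighbours, and the generated candidates are ranked by how many (registered MWE, thesaurus) combinations produce them. The toy dictionary entries, thesauri, and scoring are illustrative assumptions, not the paper's resources or its prioritization formula.

```python
# Hedged, heavily simplified sketch of MWE-based dictionary expansion.
from collections import Counter

registered = {("stock", "market"): "kabushiki shijou",
              ("stock", "exchange"): "shouken torihikijo"}

thesauri = [  # two toy "thesauri" giving near-synonyms of MWE components
    {"stock": ["share", "equity"]},
    {"stock": ["share"]},
]

candidates = Counter()
for (w1, w2), _translation in registered.items():
    for thesaurus in thesauri:
        for synonym in thesaurus.get(w1, []):
            # Each vote: a new MWE generated from a registered MWE via one thesaurus.
            candidates[(synonym, w2)] += 1

# Rank candidates by how many (registered MWE, thesaurus) pairs produced them:
# candidates generated more often are assumed to be more reliable entries.
for mwe, votes in candidates.most_common():
    print(mwe, votes)
```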
European Conference on Information Retrieval | 2005
Takeshi Kutsumi; Takehiko Yoshimi; Katsunori Kotani; Ichiko Sata; Hitoshi Isahara
Bilingual dictionaries are essential components of cross-lingual information retrieval applications. The automatic acquisition of proper names and their translations from bilingual corpora is especially important, because a significant portion of the entries not listed in the dictionaries would be proper names.
Collaboration
National Institute of Information and Communications Technology