Publication


Featured research published by Tomoyosi Akiba.


Journal of Information Processing | 2009

Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data

Tomoyosi Akiba; Kiyoaki Aikawa; Yoshiaki Itoh; Tatsuya Kawahara; Hiroaki Nanjo; Hiromitsu Nishizaki; Norihito Yasuda; Yoichi Yamashita; Katunobu Itou

The lecture is one of the most valuable genres of audiovisual data. Though spoken document processing is a promising technology for utilizing lectures in various ways, it is difficult to evaluate because the evaluation requires subjective judgment and/or the verification of large quantities of evaluation data. In this paper, a test collection for the evaluation of spoken lecture retrieval is reported. The test collection consists of the target spoken documents of about 2,700 lectures (604 hours) taken from the Corpus of Spontaneous Japanese (CSJ), 39 retrieval queries, the relevant passages in the target documents for each query, and the automatic transcription of the target speech data. This paper also reports the retrieval performance obtained on the constructed test collection by applying a standard spoken document retrieval (SDR) method, which serves as a baseline for forthcoming SDR studies using the test collection.


2016 International Conference On Advanced Informatics: Concepts, Theory And Application (ICAICTA) | 2016

Effects of class-based statistical machine translation on unknown names

Hitoshi Mizukami; Tomoyosi Akiba

This paper addresses the issue of dealing with named entities in statistical machine translation (SMT). Named entities (NEs) tend to be out-of-vocabulary (OOV) for the translation models, so the translation quality of sentences containing them is often degraded. To address this problem, we propose to take advantage of class-based translation models. We employ a named entity recognizer to replace the named entities in the training corpus with a unique class label, then train the translation and language models. The input sentence is also NE-replaced and translated into the target sentence, whose class labels are then converted back into their original names. The experimental evaluation revealed that the proposed approach was effective on OOV named entities. We also found that a combination of class-based and standard SMTs further improved performance.
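The replace-then-restore step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label `<NE>`, the function names, and the set-lookup "recognizer" are all assumptions made for illustration, and the sketch further assumes that class labels appear in the translated output in the same order as in the input.

```python
from typing import Iterable, List, Set, Tuple

NE_LABEL = "<NE>"  # hypothetical class label; the abstract only says "a unique class label"


def replace_named_entities(tokens: Iterable[str], entities: Set[str]) -> Tuple[List[str], List[str]]:
    """Replace every token found in `entities` with NE_LABEL.

    `entities` stands in for the output of a real named entity recognizer.
    Returns the NE-replaced tokens and the original names in order of appearance.
    """
    replaced: List[str] = []
    originals: List[str] = []
    for tok in tokens:
        if tok in entities:
            replaced.append(NE_LABEL)
            originals.append(tok)
        else:
            replaced.append(tok)
    return replaced, originals


def restore_named_entities(tokens: Iterable[str], originals: List[str]) -> List[str]:
    """Convert class labels back into the stored names, left to right.

    Assumes the labels keep their relative order through translation;
    a label with no stored name left is kept unchanged.
    """
    names = iter(originals)
    return [next(names, tok) if tok == NE_LABEL else tok for tok in tokens]
```

In a full pipeline the NE-replaced tokens would be passed to the translation system before restoration; that step is omitted here.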


Spoken Language Technology Workshop | 2012

Incorporating syllable duration into line-detection-based spoken term detection

Teppei Ohno; Tomoyosi Akiba

A conventional method for spoken term detection (STD) is to apply approximate string matching to subword sequences in a spoken document obtained by speech recognition. An STD method that considers string matching as line detection in a syllable distance plane has been proposed. While this method has demonstrated fast ordered-by-distance detection, it still suffers from the insertion and deletion errors introduced by speech recognition. In this work, we aim to improve detection performance by employing syllable-duration information. The proposed method enables robust detection by introducing a distance plane that uses frames as units instead of syllables. Our experimental evaluation showed that the incorporation of syllable duration achieved higher detection performance in high-recall regions.
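The conventional baseline mentioned at the start of this abstract, approximate string matching over subword sequences, can be sketched as follows. This is a generic edit-distance substring search with unit costs, not the line-detection method or the duration-aware method of the paper; the function name and the unit costs are assumptions.

```python
from typing import List, Sequence


def best_match_distance(query: Sequence[str], document: Sequence[str]) -> int:
    """Smallest edit distance between `query` and any contiguous part of `document`.

    Both arguments are sequences of subword units (for example syllables).
    Substitutions, insertions, and deletions each cost 1.
    """
    # prev[j]: best distance between the query prefix processed so far and
    # some part of the document ending at position j. The empty prefix
    # matches anywhere for free, which lets a match start at any position.
    prev: List[int] = [0] * (len(document) + 1)
    for i, q in enumerate(query, start=1):
        curr = [i] + [0] * len(document)
        for j, d in enumerate(document, start=1):
            substitution = prev[j - 1] + (0 if q == d else 1)
            deletion = prev[j] + 1        # query unit missing from the document
            insertion = curr[j - 1] + 1   # extra unit in the document
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return min(prev)
```

A detector built on this would report a hit wherever the distance falls below a chosen threshold; recognition errors that insert or delete syllables raise the distance, which is the weakness the abstract describes.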


2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA) | 2017

ICD-10 code retrieval based on distributional semantics of diagnosis descriptions

Tomoyosi Akiba; Bon K. Sy; Ayman Zeidan

In this paper, we propose a method for extracting ICD-10 codes from the natural language description of a patient's illness complaint. The proposed method is based on the distributional semantics of terms that appear in two natural language expressions: a patient's complaint and an ICD-10 code description. In order to locate the relevant fragment of words within a long and noisy patient expression, word-to-word alignment is performed before evaluating the match between a patient's complaint and an ICD code. The data set used for the preliminary study consists of 81 test patient records. For each record, the proposed system retrieves a set of codes from a total of 69,000 ICD-10 codes. Through the experimental evaluation, we found that, on average, the system was able to return 3.6 correct codes in the top 10 results. By making use of user interaction, the performance was further improved to suggest about four correct codes in the top 10.
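One simple way to combine word vectors with word-to-word alignment, in the spirit of this abstract, is sketched below. The scoring rule (each description word is aligned to its most similar complaint word, and the similarities are averaged) is an assumption and may differ from the paper's method; the function names and the `vectors` lookup table are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def alignment_score(description: Sequence[str], complaint: Sequence[str],
                    vectors: Dict[str, Sequence[float]]) -> float:
    """Average, over description words, of the best similarity to any complaint word.

    Words without a vector are ignored, so unrelated words in a long,
    noisy complaint do not lower the score.
    """
    best_scores: List[float] = []
    for d in description:
        if d not in vectors:
            continue
        best = max((cosine(vectors[d], vectors[c]) for c in complaint if c in vectors),
                   default=0.0)
        best_scores.append(best)
    return sum(best_scores) / len(best_scores) if best_scores else 0.0


def rank_codes(complaint: Sequence[str], code_descriptions: Dict[str, Sequence[str]],
               vectors: Dict[str, Sequence[float]], top_k: int = 10) -> List[str]:
    """Return the `top_k` code identifiers with the highest alignment score."""
    scored = [(alignment_score(desc, complaint, vectors), code)
              for code, desc in code_descriptions.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [code for _, code in scored[:top_k]]
```

The default `top_k=10` mirrors the top-10 evaluation setting mentioned in the abstract; the word vectors themselves would come from a separately trained distributional model.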


2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA) | 2017

Addressing unknown word problem for neural machine translation using distributed representations of words as input features

Tomoki Nishimura; Tomoyosi Akiba

In recent years, machine translation systems based on neural networks, called Neural Machine Translation (NMT), have attracted much attention; in NMT, the entire translation process is implemented in a single large neural network. In this framework, dealing with a large vocabulary on the input (source) and output (target) sides often makes training computationally intractable. Therefore, the most frequent words in the training data are retained to form a small vocabulary (shortlist), and all other, less frequent words are mapped to a single shared token. This causes the so-called unknown word problem. In this work, we propose three rather simple methods to overcome the unknown word problem based on distributed representations of words. We compared the translation performance of the baseline and the proposed methods through experimental evaluation. Though the proposed methods did not improve on the baseline in terms of BLEU, we found evidence that the proposed methods successfully select appropriate target words even when their source words are out-of-vocabulary.
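The shortlist construction that gives rise to the unknown word problem can be shown with a small sketch. It illustrates only the baseline vocabulary truncation described in this abstract, not the three proposed methods; the token `<unk>` and the function names are assumptions.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Set

UNK = "<unk>"  # hypothetical name for the single shared token


def build_shortlist(sentences: Iterable[Sequence[str]], size: int) -> Set[str]:
    """Keep the `size` most frequent words of the training data."""
    counts = Counter(tok for sentence in sentences for tok in sentence)
    return {tok for tok, _ in counts.most_common(size)}


def map_to_shortlist(sentence: Sequence[str], shortlist: Set[str]) -> List[str]:
    """Map every word outside the shortlist to the shared unknown token."""
    return [tok if tok in shortlist else UNK for tok in sentence]
```

After this mapping, all rare words look identical to the network, so information that would distinguish them is lost on the input side and cannot be generated on the output side.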


NTCIR | 2011

Overview of the IR for Spoken Documents Task in NTCIR-9 Workshop

Tomoyosi Akiba; Hiromitsu Nishizaki; Kiyoaki Aikawa; Tatsuya Kawahara; Tomoko Matsui


Conference of the International Speech Communication Association | 2010

Constructing Japanese Test Collections for Spoken Term Detection

Yoshiaki Itoh; Hiromitsu Nishizaki; Xinhui Hu; Hiroaki Nanjo; Tomoyosi Akiba; Tatsuya Kawahara; Seiichi Nakagawa; Tomoko Matsui; Yoichi Yamashita; Kiyoaki Aikawa


NTCIR | 2004

Question Answering using "Common Sense" and Utility Maximization Principle

Tomoyosi Akiba; Katunobu Itou; Atsushi Fujii


NTCIR | 2007

Non-factoid Question Answering Experiments at NTCIR-6: Towards Answer Type Detection for Realworld Questions

Junta Mizuno; Tomoyosi Akiba; Atsushi Fujii; Katunobu Itou


Conference of the International Speech Communication Association | 2010

Metric subspace indexing for fast spoken term detection

Taisuke Kaneko; Tomoyosi Akiba

Collaboration


Researchers who have collaborated with Tomoyosi Akiba.

Top Co-Authors

Kiyoaki Aikawa

Tokyo University of Technology

Tomoko Matsui

International Christian University

Yoshiaki Itoh

Iwate Prefectural University
