Network


Latest external collaboration at the country level.

Hotspot


Dive into the research topics where Takanobu Nishiura is active.

Publication


Featured research published by Takanobu Nishiura.


IEEE Automatic Speech Recognition and Understanding Workshop | 2007

Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance

Norihide Kitaoka; Kazumasa Yamamoto; Tomohiro Kusamizu; Seiichi Nakagawa; Takeshi Yamada; Satoru Tsuge; Chiyomi Miyajima; Takanobu Nishiura; Masato Nakayama; Yuki Denda; Masakiyo Fujimoto; Tetsuya Takiguchi; Satoshi Tamura; Shingo Kuroiwa; Kazuya Takeda; Satoshi Nakamura

Voice activity detection (VAD) plays an important role in speech processing, including speech recognition, speech enhancement, and speech coding in noisy environments. We developed an evaluation framework for VAD in such environments, called corpus and environment for noisy speech recognition 1 concatenated (CENSREC-1-C). This framework consists of noisy continuous digit utterances and evaluation tools for VAD results. By adopting two evaluation measures, one for frame-level detection performance and the other for utterance-level detection performance, we provide the evaluation results of a power-based VAD method as a baseline. When using VAD in a speech recognizer, the detected speech segments are extended to avoid the loss of speech frames, and the pause segments are then absorbed by a pause model. We investigate the balance of an explicit segmentation by VAD and an implicit segmentation by a pause model using an experimental simulation of segment extension and show that a small extension improves speech recognition.
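As a rough illustration of the power-based VAD baseline and the frame-level evaluation measure described above, the following Python sketch thresholds per-frame log power and scores the result against reference labels. The frame sizes and threshold are illustrative assumptions, not the CENSREC-1-C settings.

```python
import numpy as np

def power_based_vad(signal, sr, frame_len=0.025, frame_shift=0.010, thresh_db=-40.0):
    """Label each frame as speech (1) or non-speech (0) by log-power thresholding."""
    n_len, n_shift = int(frame_len * sr), int(frame_shift * sr)
    n_frames = 1 + max(0, (len(signal) - n_len) // n_shift)
    labels = np.zeros(n_frames, dtype=int)
    for i in range(n_frames):
        frame = signal[i * n_shift : i * n_shift + n_len]
        power_db = 10.0 * np.log10(np.mean(frame ** 2) + 1e-12)
        labels[i] = int(power_db > thresh_db)
    return labels

def frame_level_scores(hyp, ref):
    """Frame-level detection performance: speech hit rate and false-alarm rate."""
    hyp, ref = np.asarray(hyp), np.asarray(ref)
    hit_rate = np.sum((hyp == 1) & (ref == 1)) / max(np.sum(ref == 1), 1)
    fa_rate = np.sum((hyp == 1) & (ref == 0)) / max(np.sum(ref == 0), 1)
    return hit_rate, fa_rate
```

Extending each detected segment by a few frames before scoring, as the paper's segment-extension simulation does, would simply mean dilating the `labels` array before passing it to the recognizer.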


IEICE Transactions on Information and Systems | 2006

Robust Talker Direction Estimation Based on Weighted CSP Analysis and Maximum Likelihood Estimation

Yuki Denda; Takanobu Nishiura; Yoichi Yamashita

This paper describes a new talker direction estimation method for front-end processing to capture distant-talking speech by using a microphone array. The proposed method consists of two algorithms: one is a TDOA (Time Delay Of Arrival) estimation algorithm based on a weighted CSP (Cross-power Spectrum Phase) analysis with an average speech spectrum and CSP coefficient subtraction; the other is a talker direction estimation algorithm based on ML (Maximum Likelihood) estimation over a time sequence of the estimated TDOAs. To evaluate the effectiveness of the proposed method, talker direction estimation experiments were carried out in an actual office room. The results confirmed that the talker direction estimation performance of the proposed method is superior to that of conventional methods in both diffused- and directional-noise environments.
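For reference, a plain (unweighted) CSP analysis, the starting point that the paper's weighted variant builds on, can be sketched as follows in Python; the average-speech-spectrum weighting, CSP coefficient subtraction, and ML direction tracking from the paper are not shown.

```python
import numpy as np

def csp_tdoa(x1, x2, sr, max_delay=None):
    """Estimate the TDOA between two microphone signals with CSP (GCC-PHAT)."""
    n = len(x1) + len(x2)                 # zero-pad to avoid circular wrap-around
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    cross = X1 * np.conj(X2)
    csp = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n)    # phase-only weighting
    csp = np.concatenate((csp[-(n // 2):], csp[:n // 2 + 1]))  # reorder to lag -n/2..n/2
    lags = np.arange(-(n // 2), n // 2 + 1)
    if max_delay is not None:             # restrict to physically possible delays
        keep = np.abs(lags) <= int(max_delay * sr)
        lags, csp = lags[keep], csp[keep]
    return lags[np.argmax(csp)] / sr      # peak lag, in seconds
```

The direction of arrival then follows from the estimated TDOA and the microphone spacing; the paper additionally smooths a time sequence of such TDOAs with ML estimation to reject unreliable peaks.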


International Conference on Computer Graphics and Interactive Techniques | 2010

Virtual Yamahoko parade in virtual Kyoto

Woong Choi; Takahiro Fukumori; Kohei Furukawa; Kozaburo Hachimura; Takanobu Nishiura; Keiji Yano

Recently, extensive research has been undertaken on digital archiving of cultural properties in the field of cultural heritage. These investigations have examined the processes of recording and preserving both tangible and intangible materials through the use of digital technologies.


Journal of the Acoustical Society of America | 2003

Study of environmental sound source identification based on hidden Markov model for robust speech recognition

Takanobu Nishiura; Satoshi Nakamura

Humans communicate with each other through speech by focusing on the target speech among environmental sounds in real acoustic environments. We can easily identify the target sound from other environmental sounds. For hands-free speech recognition, the identification of the target speech from environmental sounds is imperative. This mechanism may also be important for a self-moving robot to sense acoustic environments and communicate with humans. Therefore, this paper first proposes hidden Markov model (HMM)-based environmental sound source identification. Environmental sounds are modeled by three-state HMMs and evaluated using 92 kinds of environmental sounds. The identification accuracy was 95.4%. This paper also proposes a new HMM composition method that composes speech HMMs and an HMM of categorized environmental sounds for robust environmental sound-added speech recognition. As a result of the evaluation experiments, we confirmed that the proposed HMM composition outperforms the conventional ...
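A minimal sketch of the identification step, assuming the third-party hmmlearn package and pre-computed 2-D feature sequences (frames x dims, e.g. MFCCs; the feature choice is an assumption). The paper's HMM composition for sound-added speech recognition is not shown.

```python
import numpy as np
from hmmlearn import hmm  # third-party package: pip install hmmlearn

def train_sound_models(features_by_class, n_states=3):
    """Train one 3-state HMM per environmental-sound class."""
    models = {}
    for label, seqs in features_by_class.items():
        X = np.vstack(seqs)                   # hmmlearn takes stacked sequences
        lengths = [len(s) for s in seqs]      # plus per-sequence frame counts
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
        m.fit(X, lengths)
        models[label] = m
    return models

def identify(models, features):
    """Pick the class whose HMM assigns the input the highest log-likelihood."""
    return max(models, key=lambda label: models[label].score(features))
```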


International Conference on Multimedia and Expo | 2002

Design and collection of acoustic sound data for hands-free speech recognition and sound scene understanding

Satoshi Nakamura; Kazuo Hiyane; Futoshi Asano; Yutaka Kaneda; Takeshi Yamada; Takanobu Nishiura; Tetsunori Kobayashi; Shiro Ise; Hiroshi Saruwatari

Sound data for open evaluation are necessary for studies such as sound source localization, sound retrieval, sound recognition, and hands-free speech recognition in real acoustic environments. This paper reports on our project for acoustic data collection. There are many kinds of sound scenes in real environments. A sound scene is specified by sound sources and room acoustics. The number of combinations of sound sources, source positions, and rooms is huge in real acoustic environments. We assumed that the sound in these environments can be simulated by convolution of isolated sound sources and impulse responses. As isolated sound sources, a hundred kinds of environmental sounds and speech sounds were collected. The impulse responses were collected in various acoustic environments. Additionally, we collected sounds from a moving source. In this paper, the progress of our sound scene database collection project and its application to environmental sound recognition and hands-free speech recognition are described.
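The convolution assumption stated above translates directly into code. A minimal Python sketch follows; the SNR-based noise mixing is an illustrative addition beyond what the abstract specifies.

```python
import numpy as np
from scipy.signal import fftconvolve

def simulate_scene(dry_source, impulse_response, noise, snr_db=10.0):
    """Simulate a sound scene: convolve an isolated (dry) source with a room
    impulse response, then add background noise at a chosen SNR."""
    reverberant = fftconvolve(dry_source, impulse_response)[: len(dry_source)]
    noise = np.resize(noise, reverberant.shape)        # tile/trim noise to match
    sig_pow = np.mean(reverberant ** 2)
    noise_pow = np.mean(noise ** 2) + 1e-12
    gain = np.sqrt(sig_pow / (noise_pow * 10.0 ** (snr_db / 10.0)))
    return reverberant + gain * noise
```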


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013

Estimation of speech recognition performance in noisy and reverberant environments using PESQ score and acoustic parameters

Takahiro Fukumori; Masato Nakayama; Takanobu Nishiura; Yoichi Yamashita

Automatic speech recognition (ASR) performance is degraded in noisy and reverberant environments. Although various techniques against this degradation have been proposed, it is difficult to apply them properly in evaluation environments with unknown noise and reverberation conditions. These techniques could be applied properly to improve ASR performance if we could estimate the relationship between ASR performance and degradation factors covering both noise and reverberation. In this study, we propose new criteria for noisy and reverberant environments, referred to as “Noisy and Reverberant Speech Recognition with the PESQ and the Dn (NRSR-PDn)”. We first designed the NRSR-PDn criteria using the relationships among the D value, the PESQ score, and ASR performance. We then estimated ASR performance with the designed criteria in evaluation experiments. Experimental evaluations demonstrated that the proposed criteria are well suited for robustly estimating ASR performance in noisy and reverberant environments.
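The abstract does not specify the functional form of the criteria, so purely as a hypothetical illustration of estimating ASR performance from the PESQ score and the D value, a simple least-squares fit on development data might look like this:

```python
import numpy as np

def fit_asr_estimator(pesq_scores, d_values, word_accuracies):
    """Fit accuracy ~ a*PESQ + b*D + c (hypothetical linear form;
    the paper does not give its exact criteria in the abstract)."""
    A = np.column_stack([pesq_scores, d_values, np.ones(len(pesq_scores))])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(word_accuracies), rcond=None)
    return coeffs

def estimate_asr(coeffs, pesq, d_value):
    """Predict ASR word accuracy for an unseen noisy/reverberant condition."""
    return coeffs[0] * pesq + coeffs[1] * d_value + coeffs[2]
```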


International Symposium on Consumer Electronics | 2009

A fundamental study of novel speech interface for computer games

Hiroaki Nanjo; Hiroki Mikami; Suguru Kunimatsu; Hiroshi Kawano; Takanobu Nishiura

A novel speech interface for computer game systems is addressed. In typical computer game systems, users input game commands with touch-based devices such as gamepads, keyboards, touch panels, and foot boards. Acceleration sensors that can detect players' motion have also become common. Some systems have a speech interface for controlling games, namely a voice controller. A simple speech interface just detects large sound signals, and some game systems can recognize what users say with an automatic speech recognition (ASR) module. Although ASR works well for polite speech, it does not work well for excited speech such as shouts by excited game players. In this paper, we focus on a speech interface that can deal with excited speech, and describe ASR that can discriminate shouts from natural speech based on the understanding of speech events.
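The paper's discrimination is ASR-based; purely as a hypothetical contrast, a crude feature-threshold discriminator using loudness and pitch cues could be sketched like this (the feature inputs and both thresholds are made-up illustrative values):

```python
import numpy as np

def is_shout(frame_power_db, f0_hz, power_thresh_db=-20.0, f0_thresh_hz=250.0):
    """Flag an utterance as a shout when it is both loud and high-pitched.

    frame_power_db: per-frame log power; f0_hz: per-frame F0 (0 = unvoiced).
    """
    voiced = f0_hz[f0_hz > 0]
    if voiced.size == 0:
        return False                      # no voiced frames: not speech-like
    loud = np.mean(frame_power_db) > power_thresh_db
    high_pitched = np.median(voiced) > f0_thresh_hz
    return bool(loud and high_pitched)
```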


Information Assurance and Security | 2009

Acoustic-Based Security System: Towards Robust Understanding of Emergency Shout

Hiroaki Nanjo; Takanobu Nishiura; Hiroshi Kawano

We have been investigating a speech processing system for ensuring safety and security, namely an acoustic-based security system. Focusing on indoor security, we have been studying an advanced security system that can discriminate emergency shouts from other acoustic sound events based on automatic understanding of speech events. In this paper, we present our investigations and describe fundamental results.


International Symposium on Mixed and Augmented Reality | 2007

A Two-by-Two Mixed Reality System That Merges Real and Virtual Worlds in Both Audio and Visual Senses

Kyota Higa; Takanobu Nishiura; Asako Kimura; Fumihisa Shibata; Hideyuki Tamura

There have been many implementations of virtual reality, using audio and visual senses. However, implementations of mixed reality (MR) have thus far only dealt with the visual sense. We have developed an MR system that merges real and virtual worlds in both the audio and visual senses, wherein the geometric consistency of the audio sense was fully coordinated with the visual sense. We tried two approaches for merging real and virtual worlds in the audio sense, using open-air and closed-air headphones.
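The audio-visual geometric consistency described above amounts to rendering each virtual source with cues that match its position relative to the listener. A very rough Python sketch of distance attenuation plus an interaural time difference, an illustrative stand-in rather than the system's actual rendering pipeline:

```python
import numpy as np

def render_binaural(mono, src_pos, listener_pos, sr,
                    speed_of_sound=343.0, ear_dist=0.18):
    """Rough geometric rendering: inverse-distance attenuation plus an
    interaural time difference; x is assumed to be the interaural axis."""
    rel = np.asarray(src_pos, float) - np.asarray(listener_pos, float)
    dist = np.linalg.norm(rel) + 1e-6
    gain = 1.0 / dist                                  # inverse-distance attenuation
    itd = ear_dist * (rel[0] / dist) / speed_of_sound  # >0: source on the right
    shift = int(abs(itd) * sr)
    delayed = np.pad(mono, (shift, 0))[: len(mono)]    # far ear hears it later
    left, right = (delayed, mono) if itd > 0 else (mono, delayed)
    return gain * np.stack([left, right])
```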


International Conference on Data Engineering | 2005

CENSREC-3: Data Collection for In-Car Speech Recognition and Its Common Evaluation Framework

Masakiyo Fujimoto; Satoshi Nakamura; Toshiki Endo; Kazuya Takeda; Chiyomi Miyajima; Shingo Kuroiwa; Takeshi Yamada; Norihide Kitaoka; Kazumasa Yamamoto; Mitsunori Mizumachi; Takanobu Nishiura; Akira Sasou

This paper introduces a common database, an evaluation framework, and baseline recognition results for in-car speech recognition, CENSREC-3, as an outcome of the IPSJ-SIG SLP Noisy Speech Recognition Evaluation Working Group. CENSREC-3, a sequel to AURORA-2J, is designed as an evaluation framework for isolated word recognition in real driving-car environments. Speech data was collected using two microphones, a close-talking microphone and a hands-free microphone, under 16 carefully controlled driving conditions, i.e., combinations of 3 car speeds and 5 car conditions. CENSREC-3 provides 6 evaluation environments designed using speech data collected in these conditions.

Collaboration


Dive into Takanobu Nishiura's collaboration.

Top Co-Authors
Satoshi Nakamura

Nara Institute of Science and Technology


Yuki Denda

Ritsumeikan University
