Akira Sasou
National Institute of Advanced Industrial Science and Technology
Publication
Featured research published by Akira Sasou.
Speech Communication | 2006
Akira Sasou; Futoshi Asano; Satoshi Nakamura; Kazuyo Tanaka
In this paper, we describe a hidden Markov model (HMM)-based feature-compensation method. The proposed method compensates for noise-corrupted speech features in the mel-frequency cepstral coefficient (MFCC) domain using the output probability density functions (pdfs) of the HMM. In compensating the features, the output pdfs are adaptively weighted according to forward path probabilities. Because of this, the proposed method can minimize degradation of feature-compensation accuracy due to a temporarily changing noise environment. We evaluated the proposed method based on the AURORA2 database. All the experiments were conducted under clean conditions. The experimental results indicate that the proposed method, combined with cepstral mean subtraction, can achieve a word accuracy of 87.64%. We also show that the proposed method is useful in a transient pulse noise environment.
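The core idea of weighting the HMM output pdfs by forward path probabilities can be illustrated with a minimal sketch. Everything below (the 3-state model, its means, variance, and transition matrix) is an assumed toy setup for illustration, not the parameters or exact algorithm of the paper; each noisy MFCC frame is replaced by a mixture of the clean state means weighted by the normalized forward probabilities, an MMSE-style estimate.

```python
import numpy as np

# Toy 3-state clean-speech HMM with diagonal Gaussian output pdfs.
# All values are assumed for illustration only.
means = np.array([[1.0, 0.5], [0.0, -0.5], [-1.0, 0.2]])  # per-state clean means
var = 0.5                                                  # shared diagonal variance
A = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])                            # state-transition matrix
pi = np.array([1 / 3, 1 / 3, 1 / 3])                       # initial probabilities

def gauss(x, mu):
    """Diagonal-covariance Gaussian likelihood of frame x under mean mu."""
    d = x - mu
    return np.exp(-0.5 * np.sum(d * d) / var) / (2 * np.pi * var) ** (len(x) / 2)

def compensate(frames):
    """Replace each frame by the forward-probability-weighted mixture
    of the clean state means (an MMSE-style compensated feature)."""
    alpha = pi.copy()
    out = []
    for x in frames:
        lik = np.array([gauss(x, mu) for mu in means])
        alpha = alpha * lik
        alpha /= alpha.sum()          # normalized forward probabilities
        out.append(alpha @ means)     # weighted sum of clean means
        alpha = alpha @ A             # propagate to the next frame
    return np.array(out)

noisy = np.array([[1.2, 0.4], [0.9, 0.6], [-0.8, 0.3]])
print(compensate(noisy).shape)
```

Because the weights are updated frame by frame via the forward recursion, the estimate can track a temporarily changing noise environment, which is the property the abstract emphasizes.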
international conference on acoustics, speech, and signal processing | 2005
Akira Sasou; Masataka Goto; Satoru Hayamizu; Kazuyo Tanaka
We have previously described an auto-regressive hidden Markov model (AR-HMM) and an accompanying parameter estimation method. The AR-HMM was obtained by combining an AR process with an HMM introduced as a non-stationary excitation model. We demonstrated that the AR-HMM can accurately estimate the characteristics of both articulatory systems and excitation signals from high-pitched speech. In this paper, we apply the AR-HMM to feature extraction from singing voices and evaluate the recognition accuracy of the AR-HMM-based approach.
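The generative idea behind the AR-HMM (an AR process driven by a non-stationary, state-switched excitation) can be sketched with a toy synthesizer. The AR(2) coefficients and the 2-state excitation chain below are assumed values chosen only to illustrate the structure, not anything estimated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a stable AR(2) filter (articulatory system) driven by an
# excitation whose statistics switch with a 2-state Markov chain
# (impulse-like vs. noise-like). Assumed values for illustration only.
ar = np.array([1.6, -0.8])          # AR(2) coefficients (poles inside unit circle)
A = np.array([[0.95, 0.05],
              [0.10, 0.90]])        # excitation-state transition matrix

def synthesize(n=200):
    state = 0
    y = np.zeros(n + 2)
    for t in range(2, n + 2):
        if state == 0:
            # impulse-like excitation: occasional strong pulses
            e = 5.0 if rng.random() < 0.1 else 0.0
        else:
            # noise-like excitation
            e = 0.3 * rng.standard_normal()
        y[t] = ar[0] * y[t - 1] + ar[1] * y[t - 2] + e
        state = int(rng.choice(2, p=A[state]))
    return y[2:]

sig = synthesize()
print(sig.shape)
```

In the paper's setting the interesting direction is the inverse one, jointly estimating the AR coefficients and the excitation state sequence from observed speech; the sketch only shows the forward (generation) model those estimates describe.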
EURASIP Journal on Advances in Signal Processing | 2009
Akira Sasou; Hiroaki Kojima
Conventional voice-driven wheelchairs usually employ headset microphones, which can achieve sufficient recognition accuracy even in the presence of surrounding noise. However, such interfaces require users to wear a sensor such as a headset microphone, which can be an impediment, especially for people with hand disabilities. Conversely, it is well known that speech recognition accuracy degrades drastically when the microphone is placed far from the user. In this paper, we develop a noise-robust speech recognition system for a voice-driven wheelchair that achieves almost the same recognition accuracy as a headset microphone without requiring the user to wear any sensors. We verified the effectiveness of our system through experiments in different environments.
international conference on robot communication and coordination | 2009
Akira Sasou
In this paper, we propose an acoustic-based head orientation estimation method using a microphone array mounted on a wheelchair, and apply it to a novel interface for controlling a powered wheelchair. The proposed interface does not require disabled people to wear any microphones or utter recognizable voice commands. By mounting the microphone array system on the wheelchair, our system can easily distinguish user utterances from other voices without using a speaker identification technique. The proposed interface is also robust to interference from surrounding noise. From the experimental results, we confirm the feasibility and effectiveness of the proposed method.
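A basic building block of acoustic orientation estimation with a microphone array is estimating the time difference of arrival (TDOA) between microphone pairs and converting it to an angle. The sketch below is a minimal two-microphone illustration with synthetic signals and assumed geometry; the paper's wheelchair-mounted array and its actual estimation method are more involved.

```python
import numpy as np

# Assumed setup: two microphones 0.2 m apart, 16 kHz sampling, far-field source.
fs = 16000          # sample rate [Hz]
c = 343.0           # speed of sound [m/s]
d = 0.2             # microphone spacing [m]

rng = np.random.default_rng(1)
src = rng.standard_normal(2048)

true_delay = 4                       # samples by which mic 2 lags mic 1
m1 = src
m2 = np.concatenate([np.zeros(true_delay), src[:-true_delay]])

# Cross-correlate over a small lag window and pick the peak lag.
lags = np.arange(-16, 17)
corr = [np.dot(m1[16:-16], np.roll(m2, -k)[16:-16]) for k in lags]
tdoa = lags[int(np.argmax(corr))] / fs

# Convert TDOA to a direction-of-arrival angle (far-field assumption).
theta = np.degrees(np.arcsin(np.clip(tdoa * c / d, -1.0, 1.0)))
print(tdoa, round(theta, 1))
```

With more microphones, the same delay information can be combined across pairs (or used to steer a beamformer over candidate directions), which is the kind of spatial cue an array-based head orientation estimator exploits.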
international conference on data engineering | 2005
Masakiyo Fujimoto; Satoshi Nakamura; Toshiki Endo; Kazuya Takeda; Chiyomi Miyajima; Shingo Kuroiwa; Takeshi Yamada; Norihide Kitaoka; Kazumasa Yamamoto; Mitsunori Mizumachi; Takanobu Nishiura; Akira Sasou
This paper introduces a common database, an evaluation framework, and its baseline recognition results for in-car speech recognition, CENSREC-3, as an outcome of the IPSJ-SIG SLP Noisy Speech Recognition Evaluation Working Group. CENSREC-3, a sequel to AURORA-2J, is designed as an evaluation framework for isolated word recognition in real driving-car environments. Speech data were collected using two microphones, a close-talking microphone and a hands-free microphone, under 16 carefully controlled driving conditions, i.e., combinations of 3 car speeds and 5 car conditions. CENSREC-3 provides 6 evaluation environments designed using speech data collected under these car conditions.
international symposium on universal communication | 2008
Akira Sasou
In this paper, we propose an acoustic-based head orientation estimation method using a microphone array mounted on a chair, and also propose a novel strategy for integrating the head orientation estimation with a speech recognition system to improve noise robustness. We apply the proposed system to voice-driven control of home electronics. In the proposed system, the user can indicate a target simply by facing it, which is more intuitive than uttering the target's name. The experimental results confirm that the proposed head orientation estimation is very stable and reliable irrespective of the mixed noise levels, and that integrating it with speech recognition makes the system more robust to noise.
Proceedings of the 2004 14th IEEE Signal Processing Society Workshop Machine Learning for Signal Processing, 2004. | 2004
Akira Sasou; Masataka Goto; Satoru Hayamizu; Kazuyo Tanaka
Previously, we proposed an auto-regressive hidden Markov model (AR-HMM) and an accompanying parameter estimation method. The AR-HMM was obtained by combining an AR process with an HMM introduced as a non-stationary excitation model. We demonstrated that the AR-HMM can accurately estimate the characteristics of both articulatory systems and excitation signals from high-pitched speech. Because the parameter estimation method iteratively executes the learning processes of the HMM parameters, it was calculation-intensive. Here, we propose two novel parameter estimation methods for auto-regressive, non-stationarily excited signals that reduce the amount of calculation required.
international workshop on machine learning for signal processing | 2011
Akira Sasou
The importance of video-surveillance applications has been increasing with the rise in crime and terrorism. In addition to traditional video cameras, the use of acoustic sensors in surveillance and monitoring applications is becoming increasingly important. In this paper, we apply High-order Local Auto-Correlation (HLAC), which has been successful in video-surveillance applications, to extract features from acoustic signals for acoustic-surveillance systems. Experimental results confirmed that the proposed acoustic-surveillance system outperforms a cepstrum-based one under all SNR conditions.
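The HLAC idea can be sketched on a time-frequency image: each feature is the sum, over all reference points, of the product of image values at the reference point and at a fixed set of local displacements. The tiny three-mask set below is an assumed illustration (the standard HLAC sets are much larger), and circular boundary handling via `np.roll` is a simplification.

```python
import numpy as np

def hlac_features(spec, masks):
    """Compute HLAC-style features from a 2-D array (freq x time).

    Each mask is a list of (freq, time) displacements; the feature is the
    sum over the image of the product of values at those displacements.
    Uses circular (np.roll) boundaries for simplicity.
    """
    feats = []
    for mask in masks:
        prod = np.ones_like(spec, dtype=float)
        for (df, dt) in mask:
            prod = prod * np.roll(np.roll(spec, df, axis=0), dt, axis=1)
        feats.append(prod.sum())
    return np.array(feats)

# Example 0th-, 1st-, and 2nd-order masks (assumed, for illustration).
masks = [
    [(0, 0)],                   # order 0: total intensity
    [(0, 0), (0, 1)],           # order 1: correlation with the next frame
    [(0, 0), (1, 0), (0, 1)],   # order 2: a three-point local pattern
]

rng = np.random.default_rng(2)
spec = np.abs(rng.standard_normal((32, 64)))  # stand-in magnitude spectrogram
print(hlac_features(spec, masks).shape)
```

Because each feature is a sum of local products, the resulting descriptor is shift-invariant over the image, which is the property that makes HLAC attractive for both video and acoustic surveillance.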
Signal Processing | 2003
Akira Sasou; Kazuyo Tanaka
This paper describes a novel method for segregating concurrent monaural sounds. In a real environment, there are many types of sounds, such as periodic, aperiodic, and impulsive sounds, and several sounds usually occur simultaneously. In order to recognize the sounds, it is necessary to model these various types of sounds and to segregate the concurrent sounds. The proposed method adopts a waveform generation model consisting of an auto-regressive (AR) process and a hidden Markov model (HMM) as a template model, and achieves segregation of monaural concurrent sounds based on the mixed AR-HMMs. Experiments were conducted to confirm the feasibility of the method using ten types of non-speech sounds. The experimental results indicate that the proposed method is effective for various types of sounds.
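One simplified way to see how AR templates can drive segregation is in the frequency domain: if each source's spectrum is modeled by an AR filter, the mixture can be split with a Wiener-style mask built from the two AR power spectra. This is only a stand-in for the paper's mixed AR-HMM scheme, and the AR coefficients below are assumed values.

```python
import numpy as np

def ar_power(a, nfft):
    """Power spectrum 1/|A(e^{jw})|^2 of an AR model with coefficients a."""
    w = np.fft.rfftfreq(nfft) * 2 * np.pi
    A = 1 - sum(ak * np.exp(-1j * w * (k + 1)) for k, ak in enumerate(a))
    return 1.0 / np.abs(A) ** 2

nfft = 512
p1 = ar_power([1.5, -0.9], nfft)    # assumed low-frequency resonance
p2 = ar_power([-1.5, -0.9], nfft)   # assumed high-frequency resonance

rng = np.random.default_rng(3)
mix = rng.standard_normal(nfft)     # stand-in for one mixture frame
X = np.fft.rfft(mix)

mask1 = p1 / (p1 + p2)              # Wiener-style mask for source 1
s1 = np.fft.irfft(mask1 * X, n=nfft)
s2 = np.fft.irfft((1 - mask1) * X, n=nfft)
print(np.allclose(s1 + s2, mix))
```

The two masks sum to one, so the separated components reconstruct the mixture exactly; the AR-HMM template additionally lets the spectral models switch over time with the excitation state, which a fixed AR filter cannot capture.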
international conference on information and communication technologies | 2018
Akira Sasou; Nyamerdene Odontsengel; Shumpei Matsuoka
Japan is becoming a super-aged society: the population of elderly people is increasing even as the overall population decreases. To support a safe, secure, autonomous life and to improve the quality of life for elderly people living alone, the development of monitoring and life-support systems is a pressing matter. In this paper, we propose a monitoring system that enables relatives and other interested parties to easily monitor the daily life of elderly people from afar using mobile devices. With such a monitoring system, it is very important to protect the privacy of the people being monitored. The proposed system therefore reconstructs the status of elderly people's daily lives approximately but recognizably, using computer graphics (CG) based on information obtained from various types of sensors. These sensors mainly consist of acoustic sensors, such as microphone arrays, that track the walking patterns of elderly people based solely on the sound of their footsteps.