Publication


Featured research published by Akinori Kawamura.


International Conference on Acoustics, Speech, and Signal Processing | 2006

Robust Endpoint Detection for Speech Recognition Based on Discriminative Feature Extraction

Koichi Yamamoto; Firas Jabloun; Klaus Reinhard; Akinori Kawamura

Accurate endpoint detection is important for improving speech recognition performance. This paper proposes a novel endpoint detection method that combines energy-based and likelihood ratio-based voice activity detection (VAD) criteria, where the likelihood ratio is calculated with speech/non-speech Gaussian mixture models (GMMs). Moreover, the proposed method introduces the discriminative feature extraction (DFE) technique in order to improve the speech/non-speech classification. The DFE is used in the training of the parameters required for calculating the likelihood ratio. Experimental results show that the proposed endpointer outperforms an energy-based endpointer in terms of start-of-speech (SOS) and end-of-speech (EOS) detection. Owing to the improved endpointer, the performance of automatic speech recognition (ASR) also improves.
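
As a rough illustration of the combined criterion described in the abstract above, the following sketch labels a frame as speech using both an energy test and a GMM log-likelihood ratio. It is a minimal sketch under stated assumptions, not the paper's method: the scalar feature, the toy GMM parameters, the thresholds, and the rule of requiring both tests to pass are all hypothetical, and the DFE training step is not shown.

```python
import math

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of a scalar feature under a 1-D Gaussian mixture."""
    terms = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - 0.5 * (x - m) ** 2 / v
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def is_speech(feature, energy_db, speech_gmm, noise_gmm,
              llr_threshold=0.0, energy_threshold_db=-40.0):
    """Assumed combination rule: both the energy and the LLR test must pass."""
    llr = gmm_loglik(feature, *speech_gmm) - gmm_loglik(feature, *noise_gmm)
    return energy_db > energy_threshold_db and llr > llr_threshold

# Hypothetical toy models: (weights, means, variances)
speech_gmm = ([0.5, 0.5], [2.0, 4.0], [1.0, 1.0])
noise_gmm = ([1.0], [0.0], [1.0])

loud_speech_like = is_speech(3.0, -20.0, speech_gmm, noise_gmm)
loud_noise_like = is_speech(0.0, -20.0, speech_gmm, noise_gmm)
quiet_speech_like = is_speech(3.0, -60.0, speech_gmm, noise_gmm)
```

With these toy parameters, only the frame that is both loud enough and better explained by the speech model is labelled as speech.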


International Conference on Document Analysis and Recognition | 1997

An on-line Japanese character recognition method using length-based stroke correspondence algorithm

Yojiro Tonouchi; Akinori Kawamura

Because most Japanese characters consist of several strokes, many recognition methods determine stroke correspondence before calculating the distance between the input character and the template character. When the writing style is cursive, however, the number of strokes can vary. If stroke numbers differ between the input and template characters, stroke correspondence is not one-to-one, and it is not easy to determine the correspondence. The proposed method uses stroke lengths, because these lengths are relatively stable even when the writing is cursive. Recognition using stroke lengths is also faster to compute than recognition using other multidimensional feature values such as coordinates. Experimental results show the effectiveness of the proposed method.


International Conference on Document Analysis and Recognition | 2007

Text Input System Using Online Overlapped Handwriting Recognition for Mobile Devices

Yojiro Tonouchi; Akinori Kawamura

This paper proposes a novel online overlapped handwriting recognition system for mobile devices such as cellular phones. Users can input characters continuously, without pauses, in a single writing area. The system has three features: a small writing area, quick response, and direct operation with handwritten gestures. It is therefore well suited to mobile devices such as cellular phones. The system realizes a new handwriting interface similar to touch-typing. We evaluated the system in two experiments: character recognition performance and text entry speed for Japanese sentences. These experiments showed the effectiveness of the proposed system.


2014 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA) | 2014

An auxiliary-function approach to online independent vector analysis for real-time blind source separation

Toru Taniguchi; Nobutaka Ono; Akinori Kawamura; Shigeki Sagayama

This paper proposes online independent vector analysis (IVA) based on an auxiliary-function approach for real-time blind speech separation. A batch auxiliary-function approach is naturally extended with an autoregressive approximation of an auxiliary variable. Experimental evaluations show that the proposed online algorithm works in real time and, under both spatially stationary and dynamic conditions, attains relatively high signal-to-interference ratios compared with conventional real-time IVA methods that use natural gradient updates or block-wise updates, without environment-sensitive tuning parameters such as a step size. Our implementation of the proposed algorithm works in real time for four-channel observations on PCs and ran stably for over 7 hours in realistic noisy environments.


International Conference on Acoustics, Speech, and Signal Processing | 2006

Speech Recognition Using Syllable Duration Ratio Model

Masahide Ariu; Takashi Masuko; Shinichi Tanaka; Akinori Kawamura

This paper describes a novel approach to duration information modeling for speech recognition. To eliminate the influence of speaking rate on the duration model, we propose a model that represents the duration ratios of two successive syllables with log-normal distributions. We refer to this model as a syllable duration ratio model (SDRM), and compare it with a syllable duration model (SDM) that represents the duration of the syllable itself. These duration models are compared in isolated word and connected digit recognition tasks under noisy conditions. Experimental results show that the SDRM outperformed the SDM, and reduced the errors by approximately 30% compared to the baseline system without a duration model at 15 dB or higher SNR in 10 digits recognition tasks. In addition, we show that the SDRM is robust with respect to the difference in speaking rate between training and test data.
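
The reason a ratio-based model removes the speaking-rate effect can be sketched in a few lines: scaling every syllable duration by the same factor leaves the ratios of successive durations unchanged. The code below is a minimal sketch under assumptions, not the paper's implementation; in particular, the function names are hypothetical and a single shared pair `mu`, `sigma` is used for all syllable pairs, whereas a real model would presumably hold separate parameters per context.

```python
import math

def log_normal_logpdf(ratio, mu, sigma):
    """Log-density of a log-normal distribution evaluated at a positive ratio."""
    z = (math.log(ratio) - mu) / sigma
    return -math.log(ratio * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z

def duration_ratios(durations):
    """Ratios d[i+1] / d[i] of successive syllable durations."""
    return [b / a for a, b in zip(durations, durations[1:])]

def sdrm_score(durations, mu, sigma):
    """Sum of log-normal log-densities over successive duration ratios."""
    return sum(log_normal_logpdf(r, mu, sigma) for r in duration_ratios(durations))

# Hypothetical syllable durations in seconds, spoken slowly and twice as fast.
slow = [0.30, 0.24, 0.36]
fast = [d * 0.5 for d in slow]

slow_score = sdrm_score(slow, mu=0.0, sigma=0.5)
fast_score = sdrm_score(fast, mu=0.0, sigma=0.5)
```

Here `slow_score` and `fast_score` agree because the uniform speed-up cancels in every ratio, whereas a model of the raw durations would score the two utterances differently.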


International Symposium on Chinese Spoken Language Processing | 2014

Joint-character-POC N-gram language modeling for Chinese speech recognition

Bin Wang; Zhijian Ou; Jian Li; Akinori Kawamura

The state-of-the-art language models (LMs) for Chinese speech recognition are word n-gram models. However, in Chinese, characters are morphological in meaning and words are not consistently defined. There has been recent interest in building character n-gram LMs and combining them with word n-gram LMs. In this paper, in order to exploit both character-level and word-level constraints, we propose the joint n-gram LM, an n-gram model based on joint states, where each joint state is a pair of a character and its position-of-character (POC) tag. We point out the pitfalls in naive solutions to the smoothing and scoring problems for joint n-gram models, and provide corrected solutions. For experimental comparison, different LMs (including word 4-grams, character 6-grams and joint 6-grams) are tested for speech recognition, using a training corpus of 1.9 billion characters. The joint n-gram LM achieves performance improvements, especially in recognizing utterances containing OOV words.
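
To make the joint-state idea in the abstract above concrete, the sketch below turns a word-segmented sentence into (character, POC tag) pairs and counts n-grams over them. It is a minimal sketch under assumptions: the abstract does not specify the tag set, so the B/M/E/S scheme (begin, middle, end, single) is an assumption, as are the function names, the sentence-boundary symbols, and the example sentence. The smoothing and scoring corrections discussed in the paper are not shown.

```python
from collections import Counter

def to_joint_states(words):
    """Map segmented words to (character, POC tag) pairs using B/M/E/S tags."""
    states = []
    for word in words:
        if len(word) == 1:
            states.append((word, "S"))
            continue
        for i, ch in enumerate(word):
            tag = "B" if i == 0 else "E" if i == len(word) - 1 else "M"
            states.append((ch, tag))
    return states

def count_ngrams(states, n):
    """Count n-grams over joint states, padded with boundary symbols."""
    padded = [("<s>", "S")] * (n - 1) + states + [("</s>", "S")]
    return Counter(tuple(padded[i:i + n]) for i in range(len(padded) - n + 1))

# Hypothetical segmented sentence: "speech / recognition / very / hard"
sentence = ["语音", "识别", "很", "难"]
joint = to_joint_states(sentence)
bigrams = count_ngrams(joint, 2)
```

Because each unit carries both the character and its position within a word, an n-gram over these units sees character-level context while still encoding where word boundaries fall.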


Archive | 2001

Apparatus, method, and program for handwriting recognition

Akinori Kawamura; Yojiro Tonouchi


Archive | 2005

Noise suppression apparatus and method

Tadashi Amada; Akinori Kawamura; Ryosuke Koshiba


Archive | 2011

Television apparatus and a remote operation apparatus

Kazushige Ouchi; Akinori Kawamura; Masaru Sakai; Kaoru Suzuki; Yusuke Kida


Archive | 2005

Device, program, and method for sound signal processing

Akinori Kawamura; Koichi Yamamoto
