Publication


Featured research published by Akio Amano.


Journal of the Acoustical Society of America | 1993

Speech recognition apparatus using neural network and fuzzy logic

Akio Amano; Akira Ichikawa; Nobuo Hataoka

A speech recognition apparatus has: a speech input unit for inputting speech; a speech analysis unit for analyzing the input speech to output a time series of feature vectors; a candidate selection unit for receiving the time series of feature vectors from the speech analysis unit and selecting a plurality of recognition-result candidates from the speech categories; and a discrimination processing unit for discriminating among the selected candidates to obtain a final recognition result. The discrimination processing unit includes three components: a pair generation unit for generating all two-candidate combinations of the n candidates selected by the candidate selection unit; a pair discrimination unit for deciding, for each of the nC2 combinations (pairs), which candidate of the pair is more certain on the basis of the extracted acoustic features intrinsic to each candidate speech; and a final decision unit for collecting the pair discrimination results obtained for all nC2 combinations (pairs) to decide the final result. The pair discrimination unit treats the extracted acoustic features intrinsic to each candidate speech as fuzzy information and performs the discrimination on the basis of fuzzy logic algorithms, and the final decision unit performs its collection on the basis of the fuzzy logic algorithms.
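The pairwise scheme lends itself to a short illustration: generate all nC2 candidate pairs, score each pair with a fuzzy certainty, and let a final decision step aggregate the pairwise results. The sketch below is a minimal interpretation of that flow; the membership function, the per-candidate acoustic scores, and the aggregation rule are illustrative assumptions, not the patent's actual formulas.

```python
# Minimal sketch of the pairwise-discrimination scheme described above.
# The fuzzy membership function and the acoustic-feature scores are
# hypothetical stand-ins; the patent does not specify their exact form.
from itertools import combinations

def pair_discriminate(score_a: float, score_b: float) -> float:
    """Return a fuzzy certainty in [0, 1] that candidate A beats candidate B.

    The 'acoustic feature intrinsic to each candidate' is reduced here to a
    single illustrative score per candidate; 0.5 means "equally certain".
    """
    diff = score_a - score_b
    # Simple piecewise-linear membership function (an assumption).
    return min(1.0, max(0.0, 0.5 + diff))

def final_decision(candidates: dict[str, float]) -> str:
    """Aggregate all C(n, 2) pairwise certainties and pick the winner."""
    totals = {name: 0.0 for name in candidates}
    for (a, sa), (b, sb) in combinations(candidates.items(), 2):
        certainty_a = pair_discriminate(sa, sb)
        totals[a] += certainty_a          # evidence for A over B
        totals[b] += 1.0 - certainty_a    # complementary evidence for B
    return max(totals, key=totals.get)

if __name__ == "__main__":
    # Hypothetical candidate set with per-candidate acoustic scores.
    candidates = {"ba": 0.42, "da": 0.55, "ga": 0.38}
    print(final_decision(candidates))  # -> "da"
```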


Journal of the Acoustical Society of America | 1994

Noise reduction system using neural network

Toshiyuki Aritsuka; Akio Amano; Nobuo Hataoka; Akira Ichikawa

A noise reduction system used for transmission and/or recognition of speech includes a speech analyzer for analyzing a noisy speech input signal, thereby converting the speech signal into feature vectors such as autocorrelation coefficients, and a neural network that receives the feature vectors of the noisy speech signal as its input. The neural network extracts from a codebook the index of the prototype vector corresponding to a noise-free equivalent of the noisy speech input signal. Feature vectors of clean speech are then read out from the codebook on the basis of the index delivered by the neural network, and the speech input is reproduced from these feature vectors.
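As a rough illustration of this codebook-lookup flow, the sketch below maps a noisy feature vector to a codebook index and reads back the corresponding clean prototype. The "network" here is an untrained single linear layer and the codebook is random; both are placeholders standing in for the trained components described in the patent.

```python
# Minimal sketch of the codebook-lookup idea described above: a (stand-in)
# network maps noisy feature vectors to a codebook index, and the clean
# prototype vector is read back from the codebook.  The network weights and
# the codebook contents are random placeholders, not trained values.
import numpy as np

rng = np.random.default_rng(0)

FEATURE_DIM = 12      # e.g. autocorrelation coefficients (assumed dimension)
CODEBOOK_SIZE = 64

# Hypothetical codebook of clean-speech prototype feature vectors.
codebook = rng.normal(size=(CODEBOOK_SIZE, FEATURE_DIM))

# Stand-in for the trained neural network: a single linear layer + argmax.
W = rng.normal(size=(FEATURE_DIM, CODEBOOK_SIZE))

def denoise(noisy_features: np.ndarray) -> np.ndarray:
    """Map a noisy feature vector to the clean prototype chosen by the net."""
    index = int(np.argmax(noisy_features @ W))   # network output -> index
    return codebook[index]                        # read the clean vector back

if __name__ == "__main__":
    noisy = rng.normal(size=FEATURE_DIM)
    clean_estimate = denoise(noisy)
    print(clean_estimate.shape)  # (12,)
```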


international conference on acoustics, speech, and signal processing | 1998

Development of robust speech recognition middleware on microprocessor

Nobuo Hataoka; Hiroaki Kokubo; Yasunari Obuchi; Akio Amano

We have developed speech recognition middleware on a RISC microprocessor which has robust processing functions against environmental noise and speaker differences. The speech recognition middleware enables developers and users to employ a speech recognition process in many possible speech applications, such as car navigation systems and handheld PCs. We report implementation issues of the speech recognition process as middleware on microprocessors and propose robust noise handling functions using ANC (adaptive noise cancellation) and noise-adaptive models. We also propose a new speaker adaptation algorithm in which the relationships among HMM (hidden Markov model) transfer vectors are provided as a set of pre-trained interpolation coefficients. Experimental evaluations on 1000-word vocabulary speech recognition showed promising results for both the proposed noise handling methods and the proposed speaker adaptation method.
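The ANC component mentioned above is conventionally realized with an adaptive filter driven by a noise reference channel. The following sketch shows a generic LMS-based noise canceller of that kind, standing in for the paper's ANC stage; the filter length, step size, and simulated signals are illustrative only.

```python
# Generic LMS adaptive noise cancellation (ANC) sketch: a noise reference
# channel is adaptively filtered to estimate and subtract the noise component
# from the primary (speech + noise) channel.  Parameters and signals are
# illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(3)

def lms_anc(primary: np.ndarray, reference: np.ndarray,
            taps: int = 16, mu: float = 0.01) -> np.ndarray:
    """Return the enhanced signal: primary minus the adaptively estimated noise."""
    w = np.zeros(taps)
    out = np.zeros_like(primary)
    for n in range(taps - 1, len(primary)):
        x = reference[n - taps + 1:n + 1][::-1]   # ref[n], ref[n-1], ...
        noise_hat = w @ x                          # current noise estimate
        e = primary[n] - noise_hat                 # error = enhanced sample
        w += 2 * mu * e * x                        # LMS weight update
        out[n] = e
    return out

if __name__ == "__main__":
    N = 4000
    t = np.arange(N) / 8000.0
    speech = np.sin(2 * np.pi * 300.0 * t)                         # toy "speech"
    noise_ref = rng.normal(size=N)                                  # reference mic
    noise_in_primary = np.convolve(noise_ref, [0.6, 0.3, 0.1])[:N]  # noise path
    noisy = speech + noise_in_primary
    enhanced = lms_anc(noisy, noise_ref)
    # Noise power before vs. residual power after cancellation.
    print(np.mean(noise_in_primary ** 2), np.mean((enhanced - speech) ** 2))
```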


multimedia signal processing | 2002

Compact and robust speech recognition for embedded use on microprocessors

Nobuo Hataoka; Hiroaki Kokubo; Yasunari Obuchi; Akio Amano

We propose a compact and noise-robust embedded speech recognition system implemented on microprocessors, aimed at sophisticated HMIs (human machine interfaces) for car information systems. Compactness is essential for embedded systems because of strict restrictions on CPU (central processing unit) power and available memory capacity. In this paper, we first report noise-robust acoustic HMMs (hidden Markov models) and a compact spectral subtraction (SS) method, developed after exhaustive evaluation using real speech data recorded in running cars. Next, we propose a novel memory assignment for the acoustic models based on product codes (sub-vector quantization), resulting in a one-fourth memory reduction for the 2000-word vocabulary.
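The product-code (sub-vector quantization) memory assignment can be pictured as follows: each acoustic-model mean vector is split into sub-vectors, each sub-vector is replaced by a one-byte index into a small shared codebook, and only the indices plus the codebooks are stored. The sketch below is a minimal illustration under assumed dimensions and codebook sizes; it is not the paper's actual configuration and the codebooks are not trained.

```python
# Minimal sketch of sub-vector (product-code) quantization as a memory-
# reduction device.  Codebook sizes and vector dimensions are illustrative,
# and the codebooks are random samples rather than trained (e.g. k-means)
# centroids.
import numpy as np

rng = np.random.default_rng(2)

N_MEANS, DIM, N_SUB = 1000, 16, 4            # 4 sub-vectors of length 4
CODEBOOK_SIZE = 256                          # one byte per index

means = rng.normal(size=(N_MEANS, DIM)).astype(np.float32)
sub_dim = DIM // N_SUB

# Hypothetical codebooks, one per sub-vector position.
codebooks = [means[rng.choice(N_MEANS, CODEBOOK_SIZE, replace=False),
                   s * sub_dim:(s + 1) * sub_dim]
             for s in range(N_SUB)]

def encode(vectors: np.ndarray) -> np.ndarray:
    """Replace each sub-vector with the index of its nearest codeword."""
    idx = np.empty((len(vectors), N_SUB), dtype=np.uint8)
    for s, cb in enumerate(codebooks):
        sub = vectors[:, s * sub_dim:(s + 1) * sub_dim]
        dists = ((sub[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2)
        idx[:, s] = dists.argmin(axis=1)
    return idx

indices = encode(means)
original_bytes = means.nbytes
compressed_bytes = indices.nbytes + sum(cb.nbytes for cb in codebooks)
print(original_bytes, compressed_bytes)   # compressed storage is roughly a third here
```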


Journal of the Acoustical Society of America | 1993

Speech recognition apparatus capable of discriminating between similar acoustic features of speech

Akio Amano; Nobuo Hataoka; Shunichi Yajima; Akira Ichikawa

A speech recognition apparatus including a memory for storing, for each feature specific to a particular phoneme, the name of a process and the procedure of that process, which is performed in order to determine whether a feature specific to a certain type of speech is present in a feature vector series, and for storing a table in which the names of the processes to be performed are registered for all the categories of speech to be recognized. The information stored in the memory is used to discriminate between two categories and provides ways of interpreting the results of each process. The recognition processes are performed and the discrimination is carried out in accordance with the information stored in the table.
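A hypothetical rendering of such stored tables might look like the sketch below: one mapping from process names to procedures, and a second table listing which processes to run when discriminating a given pair of categories. All process names, categories, and the interpretation rule are invented for illustration; the patent does not specify them.

```python
# Minimal sketch of a table-driven arrangement: a registry maps each
# feature-detection process name to its procedure, and a second table lists
# which processes to run for a given pair of speech categories.  Everything
# here (names, thresholds, interpretation rule) is a hypothetical placeholder.
from typing import Callable, Sequence

FeatureVectors = Sequence[Sequence[float]]

def detect_burst(frames: FeatureVectors) -> bool:
    # Placeholder procedure: "is a burst-like feature present?"
    return any(frame[0] > 0.8 for frame in frames)

def detect_nasal_murmur(frames: FeatureVectors) -> bool:
    # Placeholder procedure: "is a nasal-murmur-like feature present?"
    return any(frame[1] > 0.5 for frame in frames)

# Memory part 1: process name -> procedure.
PROCESSES: dict[str, Callable[[FeatureVectors], bool]] = {
    "burst": detect_burst,
    "nasal_murmur": detect_nasal_murmur,
}

# Memory part 2: category pair -> names of processes used to tell them apart.
DISCRIMINATION_TABLE: dict[tuple[str, str], list[str]] = {
    ("b", "m"): ["burst", "nasal_murmur"],
    ("d", "n"): ["burst"],
}

def discriminate(cat_a: str, cat_b: str, frames: FeatureVectors) -> str:
    """Run the processes listed for this pair and interpret the results."""
    names = DISCRIMINATION_TABLE[(cat_a, cat_b)]
    results = {name: PROCESSES[name](frames) for name in names}
    # Simplified interpretation rule (an assumption): a detected burst
    # favours the plosive candidate, otherwise the nasal candidate wins.
    return cat_a if results.get("burst", False) else cat_b

if __name__ == "__main__":
    frames = [[0.9, 0.1], [0.2, 0.6]]
    print(discriminate("b", "m", frames))  # -> "b"
```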


international conference on acoustics, speech, and signal processing | 2009

DOA estimation method based on sparseness of speech sources for human symbiotic robots

Masahito Togami; Akio Amano; Takashi Sumiyoshi; Yasunari Obuchi

In this paper, direction-of-arrival (DOA) estimation methods (for both azimuth and elevation) based on the sparseness of human speech, “modified delay-and-sum beamformer based on sparseness (MDSBF)” and “stepwise phase difference restoration (SPIRE)”, are introduced for human symbiotic robots. MDSBF achieves good DOA estimation, but its computational cost is proportional to the resolution of the azimuth-elevation space. The DOA estimate of SPIRE is less accurate than that of MDSBF, but its computational cost is independent of the resolution. To achieve a more accurate DOA estimate than SPIRE at small computational cost, we propose a novel DOA estimation method that combines MDSBF and SPIRE. In the proposed method, MDSBF is performed at a rough resolution prior to SPIRE, and SPIRE then precisely estimates the DOA of the sources. Experimental results show that the sparseness-based methods are superior to conventional methods. The proposed combination achieved more accurate DOA estimates than SPIRE with smaller computational cost than MDSBF.
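The coarse-then-fine strategy can be illustrated with a generic two-stage scan: a coarse delay-and-sum power scan (standing in for MDSBF) followed by a fine scan around the coarse peak (standing in for the SPIRE refinement). The sketch below assumes a two-microphone, far-field, azimuth-only setup; the actual MDSBF and SPIRE update rules are not reproduced, and the geometry and simulated source are illustrative.

```python
# Coarse-then-fine DOA sketch: a coarse delay-and-sum power scan picks a
# rough azimuth, and a fine scan around that peak sharpens it.  Geometry,
# sampling rate, and the simulated source are illustrative assumptions.
import numpy as np

FS = 16000.0                    # sampling rate [Hz]
C = 343.0                       # speed of sound [m/s]
MIC_X = np.array([0.0, 0.05])   # two mics 5 cm apart on the x-axis

def steering_delays(azimuth_rad: float) -> np.ndarray:
    """Far-field inter-microphone delays (in samples) for one azimuth."""
    return MIC_X * np.cos(azimuth_rad) / C * FS

def das_power(signals: np.ndarray, azimuth_rad: float) -> float:
    """Delay-and-sum output power for one candidate azimuth (freq. domain)."""
    n = signals.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / FS)
    spectra = np.fft.rfft(signals, axis=1)
    delays_sec = steering_delays(azimuth_rad) / FS
    phases = np.exp(2j * np.pi * freqs[None, :] * delays_sec[:, None])
    beam = (spectra * phases).sum(axis=0)
    return float(np.sum(np.abs(beam) ** 2))

def estimate_doa(signals: np.ndarray) -> float:
    # Coarse scan every 10 degrees.
    coarse = np.deg2rad(np.arange(0, 181, 10))
    best = coarse[np.argmax([das_power(signals, a) for a in coarse])]
    # Fine scan at 1-degree steps around the coarse peak.
    fine = best + np.deg2rad(np.arange(-10, 11, 1))
    return float(np.rad2deg(fine[np.argmax([das_power(signals, a) for a in fine])]))

if __name__ == "__main__":
    # Simulate a 1 kHz tone arriving from 60 degrees.
    true_az = np.deg2rad(60.0)
    t = np.arange(1024) / FS
    delays = steering_delays(true_az) / FS
    signals = np.stack([np.sin(2 * np.pi * 1000.0 * (t - d)) for d in delays])
    print(estimate_doa(signals))  # close to 60
```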


ieee automatic speech recognition and understanding workshop | 1997

A novel speaker adaptation algorithm and its implementation on a RISC microprocessor

Yasunari Obuchi; Akio Amano; Nobuo Hataoka

We have developed speech recognition middleware on a RISC microprocessor. The speech recognition function is required in many applications of RISC microprocessors, such as car navigation systems and handheld PCs. The speech recognition middleware provides a fundamental library for developers to build such applications. Speaker adaptation is one of the most important functions for realizing robust recognition performance. As part of the speech recognition middleware, we have developed a new speaker adaptation algorithm in which the relationships among HMM (hidden Markov model) transfer vectors are provided as a set of pre-trained interpolation coefficients. Experimental evaluations showed promising results: recognition errors are reduced by 28% using 10 words for adaptation and by 52% using 50 words.
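The adaptation idea, as far as it is described here, is that transfer vectors estimated for a few models are propagated to the remaining models through pre-trained interpolation coefficients. The sketch below shows only that propagation step, with invented dimensions, coefficients, and transfer vectors; it is not the paper's training or estimation procedure.

```python
# Minimal sketch of interpolation-based speaker adaptation: transfer vectors
# observed for a few adapted HMM mean vectors are propagated to the remaining
# means through pre-trained interpolation coefficients.  All numbers are
# illustrative placeholders, not values from the paper.
import numpy as np

rng = np.random.default_rng(1)

N_MODELS, DIM = 6, 4
means = rng.normal(size=(N_MODELS, DIM))           # original HMM mean vectors

# Transfer vectors estimated from adaptation data for models 0 and 1 only.
observed = {0: np.array([0.3, -0.1, 0.2, 0.0]),
            1: np.array([0.1,  0.2, -0.1, 0.1])}

# Pre-trained interpolation coefficients: one row per model, one column per
# observed transfer vector (rows sum to 1; hypothetical values).
coeffs = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.7, 0.3],
                   [0.4, 0.6],
                   [0.5, 0.5],
                   [0.2, 0.8]])

transfer_matrix = np.stack([observed[0], observed[1]])   # (2, DIM)
all_transfers = coeffs @ transfer_matrix                  # (N_MODELS, DIM)
adapted_means = means + all_transfers

print(adapted_means.shape)  # (6, 4)
```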


multimedia signal processing | 1999

Sophisticated speech processing middleware on microprocessor

Nobuo Hataoka; Hiroaki Kokubo; Nobuo Nukaga; Yasunari Obuchi; Akio Amano; Yoshinori Kitahara

This paper describes speech processing middleware developed on RISC microprocessors for embedded speech applications. The middleware consists of a speech recognition module and a speech synthesis module; the speech recognition middleware in particular is robust against environmental noise and speaker differences. The speech middleware provides sophisticated user interfaces for multimedia systems that use microprocessors as CPUs, such as car navigation systems, mobile information equipment, and game machines.


international conference on acoustics, speech, and signal processing | 1986

VCV Segmentation and phoneme recognition in continuous speech

Nobuo Hataoka; Akio Amano; Shunichi Yajima; H. En'doh

An algorithm for phoneme recognition in continuous speech is described. The main features of this algorithm are SElf-Adjusted SEGmentation (SEASEG), a method for segmenting continuous speech into CV or VCV units, and a novel consonant recognition method that weights a distance measure according to speech transition information. Experimental results confirm a vowel interval segmentation rate of 97.4% and a consonant recognition score of 85% for a specified speaker.
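The idea of weighting a distance measure by transition information can be sketched generically: frames with a high spectral change rate receive larger weight in a frame-by-frame distance. The weighting function below is an assumption for illustration; neither SEASEG nor the paper's actual weighting is reproduced.

```python
# Generic sketch of a transition-weighted frame distance.  The spectral-change
# weight is an assumed stand-in for the paper's weighting function, and the
# frames/template are random placeholders.
import numpy as np

def transition_weights(frames: np.ndarray) -> np.ndarray:
    """Weight each frame by its spectral change rate (transitional frames high)."""
    delta = np.linalg.norm(np.diff(frames, axis=0), axis=1)
    delta = np.concatenate([[delta[0]], delta])          # pad to frame count
    return delta / (delta.max() + 1e-9)

def weighted_distance(frames: np.ndarray, template: np.ndarray) -> float:
    """Frame-by-frame Euclidean distance, emphasised at transitional frames."""
    w = transition_weights(frames)
    d = np.linalg.norm(frames - template, axis=1)
    return float(np.sum(w * d) / np.sum(w))

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    frames = rng.normal(size=(20, 12))       # hypothetical feature frames
    template = rng.normal(size=(20, 12))     # hypothetical consonant template
    print(weighted_distance(frames, template))
```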


international conference on acoustics, speech, and signal processing | 1990

Large vocabulary speech recognition using neural-fuzzy and concept networks

Nobuo Hataoka; Akio Amano; Toshiyuki Aritsuka; Akira Ichikawa

An algorithm for large vocabulary speech recognition using two kinds of connectionist models is described. The first is a phoneme recognition model that combines neural nets and fuzzy inference, called neural-fuzzy; it uses neural nets as acoustic feature detectors and fuzzy logic as the decision procedure. The other is a connected-word sequence selection method using semantic information about conceptual relationships among vocabulary words. The basic idea of this method derives from the fact that human beings can recognize words and content precisely from the topic and/or the context even when ambiguous utterances appear in conversation. The proposed method selects, from the several candidates, only word sequences that are related to each other in meaning, by using excitatory and inhibitory interactions among units (words).
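The concept-network selection can be pictured as an interactive-activation process: semantically related words at different positions excite each other, competing candidates at the same position inhibit each other, and the most active word per position survives. The sketch below implements that picture with an invented vocabulary, relation set, and update constants; it is not the paper's actual network dynamics.

```python
# Minimal sketch of concept-network word-sequence selection with excitatory
# and inhibitory interactions.  Vocabulary, relations, scores, and update
# constants are hypothetical placeholders.

# Candidates per position with (hypothetical) acoustic scores.
candidates = [
    {"train": 0.6, "rain": 0.55},        # position 0
    {"station": 0.5, "nation": 0.52},    # position 1
]
# Hypothetical semantic relations (excitatory links across positions).
related = {("train", "station")}

EXCITE, INHIBIT, STEPS = 0.3, 0.2, 10

# Initial activations come from the acoustic scores.
act = [dict(pos) for pos in candidates]

for _ in range(STEPS):
    new_act = [dict(pos) for pos in act]
    for i, pos in enumerate(act):
        for word, a in pos.items():
            # Excitation from semantically related words at other positions.
            boost = sum(other_a
                        for j, other in enumerate(act) if j != i
                        for other_word, other_a in other.items()
                        if (word, other_word) in related
                        or (other_word, word) in related)
            # Inhibition from competitors at the same position.
            rivals = sum(other_a for other_word, other_a in pos.items()
                         if other_word != word)
            new_act[i][word] = min(1.0, max(0.0, a + EXCITE * boost - INHIBIT * rivals))
    act = new_act

print([max(pos, key=pos.get) for pos in act])   # -> ['train', 'station']
```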
