Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Shigeru Katagiri is active.

Publication


Featured research published by Shigeru Katagiri.


IEEE Transactions on Signal Processing | 1992

Discriminative learning for minimum error classification (pattern recognition)

Biing-Hwang Juang; Shigeru Katagiri

A formulation is proposed for minimum-error classification, in which the misclassification probability is to be minimized based on a given set of training samples. A fundamental technique for designing a classifier that approaches the objective of minimum classification error in a more direct manner than traditional methods is given. The method is contrasted with several traditional classifier designs in typical experiments to demonstrate the superiority of the new learning formulation. The method can be applied to other classifier structures as well. Experimental results pertaining to a speech recognition task are provided to show the effectiveness of the technique.
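
The heart of the formulation is a differentiable surrogate for counting classification errors. The sketch below is a minimal NumPy illustration of that idea rather than the paper's exact notation: the soft-max combination of competitor scores and the sigmoid slope gamma are assumptions chosen for readability.

```python
import numpy as np

def mce_loss(scores, label, eta=5.0, gamma=1.0):
    """Smoothed minimum-classification-error loss for one sample.

    scores : array of discriminant values g_k(x), one per class (higher = better)
    label  : index of the correct class
    eta    : softness of the max over competing classes
    gamma  : slope of the sigmoid that smooths the 0/1 error
    """
    g_correct = scores[label]
    competitors = np.delete(scores, label)
    # Soft maximum over competing discriminants (approaches a hard max as eta grows)
    g_best_rival = np.log(np.mean(np.exp(eta * competitors))) / eta
    # Misclassification measure: positive when the sample is misclassified
    d = -g_correct + g_best_rival
    # Smooth 0/1 loss, differentiable with respect to the classifier parameters
    return 1.0 / (1.0 + np.exp(-gamma * d))

# Example: class 2 scores highest, so the loss is small
print(mce_loss(np.array([0.1, 0.3, 2.0]), label=2))
```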


Proceedings of the IEEE | 1998

Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method

Shigeru Katagiri; Biing-Hwang Juang; Chin-Hui Lee

This paper provides a comprehensive introduction to a novel approach to pattern recognition which is based on the generalized probabilistic descent method (GPD) and its related design algorithms. The paper contains a survey of recent recognizer design techniques, the formulation of GPD, the concept of minimum classification error learning that is closely related to the GPD formalization, a relational analysis between GPD and other important design methods, and various embodiments of GPD-based design, including segmental-GPD, minimum spotting error training, discriminative utterance verification, and discriminative feature extraction. GPD development has its origins in basic pattern recognition and Bayes decision theory. It represents a simple but careful re-investigation of the classical theory and successfully leads to an innovative framework. For clarity of presentation, detailed discussions about its embodiments are provided for examples of speech pattern recognition tasks that use a distance-based classifier. Experimental results in speech pattern recognition tasks clearly demonstrate the remarkable utility of the family of GPD-based design algorithms.
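
In essence, GPD adapts the classifier parameters token by token, descending the gradient of a smoothed classification loss with a step size that decays over time so that the adaptation converges in probability. A generic sketch under those assumptions (the loss gradient, schedule, and constants below are illustrative, not the paper's settings):

```python
import numpy as np

def gpd_train(params, samples, labels, grad_loss, epochs=10, eps0=0.1):
    """Generalized probabilistic descent: sample-by-sample gradient descent
    on a smoothed classification loss, with a decaying step size.

    grad_loss(params, x, y) must return the gradient of the per-token loss
    with respect to params (same shape as params).
    """
    t = 0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            t += 1
            eps_t = eps0 / (1.0 + 0.01 * t)   # decaying step size (illustrative schedule)
            params = params - eps_t * grad_loss(params, x, y)
    return params
```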


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error

E. McDermott; Timothy J. Hazen; J. Le Roux; Atsushi Nakamura; Shigeru Katagiri

The minimum classification error (MCE) framework for discriminative training is a simple and general formalism for directly optimizing recognition accuracy in pattern recognition problems. The framework applies directly to the optimization of hidden Markov models (HMMs) used for speech recognition problems. However, few if any studies have reported results for the application of MCE training to large-vocabulary, continuous-speech recognition tasks. This article reports significant gains in recognition performance and model compactness as a result of discriminative training based on MCE training applied to HMMs, in the context of three challenging large-vocabulary (up to 100k-word) speech recognition tasks: the Corpus of Spontaneous Japanese lecture speech transcription task, a telephone-based name recognition task, and the MIT Jupiter telephone-based conversational weather information task. On these tasks, starting from maximum likelihood (ML) baselines, MCE training yielded relative reductions in word error ranging from 7% to 20%. Furthermore, this paper evaluates the use of different methods for optimizing the MCE criterion function, as well as the use of precomputed recognition lattices to speed up training. An overview of the MCE framework is given, with an emphasis on practical implementation issues.


Speech Communication | 1990

ATR Japanese speech database as a tool of speech recognition and synthesis

Akira Kurematsu; Kazuya Takeda; Yoshinori Sagisaka; Shigeru Katagiri; Hisao Kuwabara; Kiyohiro Shikano

A large-scale Japanese speech database is described. The database basically consists of (1) a word speech database, (2) a continuous speech database, (3) a database for a large number of speakers, and (4) a database for speech synthesis. Multiple transcriptions have been made in five different layers, from simple phonemic descriptions to fine acoustic-phonetic transcriptions. The database has been used to develop algorithms in speech recognition and synthesis studies and to find acoustic, phonetic and linguistic evidence that will serve as basic data for speech technologies.


Neural Networks for Signal Processing: Proceedings of the 1991 IEEE Workshop | 1991

New discriminative training algorithms based on the generalized probabilistic descent method

Shigeru Katagiri; Chin-Hui Lee; Biing-Hwang Juang

The authors developed a generalized probabilistic descent (GPD) method by extending the classical theory on adaptive training by Amari (1967). Their generalization makes it possible to treat dynamic patterns (of a variable duration or dimension) such as speech as well as static patterns (of a fixed duration or dimension), for pattern classification problems. The key ideas of GPD formulations include the embedding of time normalization and the incorporation of smooth classification error functions into the gradient search optimization objectives. As a result, a family of new discriminative training algorithms can be rigorously formulated for various kinds of classifier frameworks, including the popular dynamic time warping (DTW) and hidden Markov model (HMM). Experimental results are also provided to show the superiority of this new family of GPD-based, adaptive training algorithms for speech recognition.
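
The "embedding of time normalization" means that a class discriminant can itself be a time-warped distance between the input sequence and a class reference, so variable-duration patterns fit the same gradient framework. A rough sketch of such a DTW cost (the symmetric local path and Euclidean frame distance are assumptions for illustration):

```python
import numpy as np

def dtw_distance(x, r):
    """DTW alignment cost between input frames x (T1 x D) and reference r (T2 x D).
    The negative of this cost can serve as the class discriminant g_k(x)."""
    T1, T2 = len(x), len(r)
    D = np.full((T1 + 1, T2 + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, T1 + 1):
        for j in range(1, T2 + 1):
            cost = np.linalg.norm(x[i - 1] - r[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[T1, T2]
```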


IEEE Transactions on Speech and Audio Processing | 2001

An application of discriminative feature extraction to filter-bank-based speech recognition

Alain Biem; Shigeru Katagiri; Erik McDermott; Biing-Hwang Juang

A pattern recognizer is usually a modular system which consists of a feature extractor module and a classifier module. Traditionally, these two modules have been designed separately, which may not result in an optimal recognition accuracy. To alleviate this fundamental problem, the authors have developed a design method, named discriminative feature extraction (DFE), that enables one to design the overall recognizer, i.e., both the feature extractor and the classifier, in a manner consistent with the objective of minimizing recognition errors. This paper investigates the application of this method to designing a speech recognizer that consists of a filter-bank feature extractor and a multi-prototype distance classifier. Carefully investigated experiments demonstrate that DFE achieves the design of a better recognizer and provides an innovative recognition-oriented analysis of the filter-bank, as an alternative to conventional analysis based on psychoacoustic expertise or heuristics.
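
The essential mechanism of DFE is that the feature extractor carries trainable parameters of its own, so the minimum-classification-error gradient is propagated through it by the chain rule rather than stopping at the classifier. The toy sketch below replaces the filter-bank with a trainable linear weighting of spectral bins and the classifier with a nearest-prototype scorer, and uses a finite-difference gradient purely for brevity; every name and shape here is an assumption, not the paper's design.

```python
import numpy as np

def features(W, spectrum):
    # "Filter-bank": trainable linear weighting of spectral bins (toy stand-in)
    return W @ spectrum

def mce_loss(W, prototypes, spectrum, label, gamma=1.0):
    f = features(W, spectrum)
    # Distance-based discriminants: smaller distance = higher score
    g = -np.array([np.sum((f - m) ** 2) for m in prototypes])
    d = -g[label] + np.max(np.delete(g, label))      # hard-max competitor for brevity
    return 1.0 / (1.0 + np.exp(-gamma * d))

def dfe_step(W, prototypes, spectrum, label, lr=0.01, h=1e-5):
    """One DFE update: finite-difference gradient of the MCE loss with respect
    to BOTH the feature-extractor weights W and the classifier prototypes."""
    base = mce_loss(W, prototypes, spectrum, label)
    gW = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        Wp = W.copy(); Wp[idx] += h
        gW[idx] = (mce_loss(Wp, prototypes, spectrum, label) - base) / h
    gP = np.zeros_like(prototypes)
    for idx in np.ndindex(prototypes.shape):
        Pp = prototypes.copy(); Pp[idx] += h
        gP[idx] = (mce_loss(W, Pp, spectrum, label) - base) / h
    return W - lr * gW, prototypes - lr * gP
```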


IEEE Transactions on Signal Processing | 1997

Pattern recognition using discriminative feature extraction

Alain Biem; Shigeru Katagiri; Biing-Hwang Juang

We propose a new design method, called discriminative feature extraction for practical modular pattern recognizers. A key concept of discriminative feature extraction is the design of an overall recognizer in a manner consistent with recognition error minimization. The utility of the method is demonstrated in a Japanese vowel recognition task.


International Conference on Acoustics, Speech, and Signal Processing | 1993

Feature extraction based on minimum classification error/generalized probabilistic descent method

Alain Biem; Shigeru Katagiri

A novel approach to pattern recognition which comprehensively optimizes both a feature extraction process and a classification process is introduced. Assuming that the best features for recognition are the ones that yield the lowest classification error rate over unknown data, an overall recognizer, consisting of a feature extractor module and a classifier module, is trained using the minimum classification error (MCE)/generalized probabilistic descent (GPD) method. Although the proposed discriminative feature extraction approach is a direct and simple extension of MCE/GPD, it is a significant departure from conventional approaches, providing a comprehensive basis for the entire system design. Experimental results are presented for the simple example of optimally designing a cepstrum representation for vowel recognition. The results clearly demonstrate the effectiveness of the proposed method.


International Conference on Acoustics, Speech, and Signal Processing | 1989

Shift-invariant, multi-category phoneme recognition using Kohonen's LVQ2

Erik McDermott; Shigeru Katagiri

The authors describe a shift-tolerant neural network architecture for phoneme recognition. The system is based on LVQ2, an algorithm which pays close attention to approximating the optimal Bayes decision line in a discrimination task. Recognition performances in the 98-99% correct range were obtained for LVQ2 networks aimed at speaker-dependent recognition of phonemes in small but ambiguous Japanese phonemic classes. A correct recognition rate of 97.7% was achieved by a single, larger LVQ2 network covering all Japanese consonants. These recognition results are at least as high as those obtained in the time delay neural network system and suggest that LVQ2 could be the basis for a successful speech recognition system.
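
For context, an LVQ2-style update moves the two prototypes nearest to a training token only when the token falls inside a window around the decision boundary between a correct-class and a wrong-class prototype, pulling the correct one closer and pushing the wrong one away. The sketch below follows the common LVQ2.1 variant of Kohonen's rule; the window and learning-rate values are assumptions, not the paper's settings.

```python
import numpy as np

def lvq2_step(prototypes, proto_labels, x, y, alpha=0.05, window=0.3):
    """One LVQ2-style update. prototypes: (M, D) codebook, proto_labels: (M,) class ids."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    i, j = np.argsort(dists)[:2]                      # two nearest prototypes
    # Update only if exactly one of the two is correct and x lies near their boundary
    if (proto_labels[i] == y) != (proto_labels[j] == y):
        ratio = min(dists[i] / dists[j], dists[j] / dists[i])
        if ratio > (1 - window) / (1 + window):
            c = i if proto_labels[i] == y else j      # correct-class prototype
            w = j if c == i else i                    # wrong-class prototype
            prototypes[c] += alpha * (x - prototypes[c])   # pull correct prototype closer
            prototypes[w] -= alpha * (x - prototypes[w])   # push wrong prototype away
    return prototypes
```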


Computer Speech & Language | 1994

Prototype-based minimum classification error/generalized probabilistic descent training for various speech units

Erik McDermott; Shigeru Katagiri

In previous work we reported high classification rates for learning vector quantization (LVQ) networks trained to classify phoneme tokens shifted in time. It has since been shown that the framework of minimum classification error (MCE) and generalized probabilistic descent (GPD) can treat LVQ as a special case of a general method for gradient descent on a rigorously defined classification loss measure that closely reflects the misclassification rate. This framework allows us to extend LVQ into a prototype-based minimum error classifier (PBMEC) appropriate for the classification of various speech units which the original LVQ was unable to treat. Speech categories are represented using a prototype-based multi-state architecture incorporating a dynamic time warping procedure. We present results for the difficult E-set task, as well as for isolated word recognition for a vocabulary of 5240 words, that reveal clear gains in performance as a result of using PBMEC. In addition, we discuss the issue of smoothing the loss function from the perspective of increasing classifier robustness.
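
Roughly, PBMEC scores a speech category by aligning the input frames to a short sequence of states, each holding several prototypes, and accumulating the distance from every frame to the nearest prototype of its aligned state; the negative of that cost serves as the discriminant fed to the MCE/GPD loss. The sketch below substitutes a crude linear frame-to-state alignment for the full dynamic time warping procedure, purely as an assumption for brevity.

```python
import numpy as np

def pbmec_score(frames, state_prototypes):
    """Category score for a PBMEC-style multi-state prototype classifier.

    frames           : (T, D) input feature vectors
    state_prototypes : list of (P, D) arrays, one prototype set per state
    Returns the negative accumulated distance (higher = better match).
    """
    S = len(state_prototypes)
    # Crude linear assignment of frames to states (full PBMEC uses DTW alignment)
    state_of = np.minimum((np.arange(len(frames)) * S) // len(frames), S - 1)
    cost = 0.0
    for t, f in enumerate(frames):
        protos = state_prototypes[state_of[t]]
        cost += np.min(np.linalg.norm(protos - f, axis=1))  # nearest prototype in state
    return -cost
```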

Collaboration


Dive into Shigeru Katagiri's collaborations.

Top Co-Authors

Hideyuki Watanabe
National Institute of Information and Communications Technology

Atsushi Nakamura
Nippon Telegraph and Telephone

Shigeki Matsuda
National Institute of Information and Communications Technology

Yasuhiro Minami
Nippon Telegraph and Telephone

Chin-Hui Lee
Georgia Institute of Technology