Publication


Featured research published by Michiel Bacchiani.


Speech Communication | 1999

Joint lexicon, acoustic unit inventory and model design

Michiel Bacchiani; Mari Ostendorf

Although most parameters in a speech recognition system are estimated from data by the use of an objective function, the unit inventory and lexicon are generally hand crafted and therefore unlikely to be optimal. This paper proposes a joint solution to the related problems of learning a unit inventory and corresponding lexicon from data. On a speaker-independent read speech task with a 1k vocabulary, the proposed algorithm outperforms phone-based systems at both high and low complexities.


International Conference on Spoken Language Processing | 1996

Speech recognition based on acoustically derived segment units

Toshiaki Fukada; Michiel Bacchiani; Kuldip Kumar Paliwal; Yoshinori Sagisaka

The paper describes a new method of word model generation based on acoustically derived segment units (henceforth ASUs). An ASU-based approach has the advantages of growing out of human pre-determined phonemes and of consistently generating acoustic units by using the maximum likelihood (ML) criterion. The former advantage is effective when it is difficult to map acoustics to a phone, such as with highly co-articulated spontaneous speech. In order to implement an ASU-based modeling approach in a speech recognition system, one must first address two questions: (1) how does one design an inventory of acoustically derived segmental units, and (2) how does one model the pronunciations of lexical entries in terms of the ASUs? As for the second question, the authors propose an ASU-based word model generation method by composing the ASU statistics, that is, their means, variances and durations. The effectiveness of the proposed method is shown through spontaneous word recognition experiments.


International Conference on Acoustics, Speech, and Signal Processing | 1996

Design of a speech recognition system based on acoustically derived segmental units

Michiel Bacchiani; Mari Ostendorf; Yoshinori Sagisaka; Kuldip Kumar Paliwal

The design of a speech recognition system based on acoustically derived segmental units can be divided into three steps: unit design, lexicon building and pronunciation modeling. We formulate an iterative unit design procedure which consistently uses a maximum likelihood (ML) objective in successive application of resegmentation and model re-estimation. The lexicon building allows multi-word entries in the lexicon but restricts the number of these entries in order to avoid an overly costly search. Selected multi-word lexical entries are those with high frequency (such as function words) and those which consistently exhibit cross-word phone assimilation. The stochastic pronunciation model represents the likelihood of a particular acoustic segment sequence given the phonetic baseform of a lexical item, where the sequence of baseform phones is treated as a Markov state sequence and each state can emit multiple segments.


IEEE Workshop on Neural Networks for Signal Processing | 1995

Simultaneous design of feature extractor and pattern classifier using the minimum classification error training algorithm

Kuldip Kumar Paliwal; Michiel Bacchiani; Yoshinori Sagisaka

Recently, a minimum classification error training algorithm has been proposed for minimizing the misclassification probability based on a given set of training samples using a generalized probabilistic descent method. This algorithm is a type of discriminative learning algorithm, but it approaches the objective of minimum classification error in a more direct manner than the conventional discriminative training algorithms. We apply this algorithm for simultaneous design of feature extractor and pattern classifier, and demonstrate some of its properties and advantages.


International Conference on Acoustics, Speech, and Signal Processing | 1994

Optimization of time-frequency masking filters using the minimum classification error criterion

Michiel Bacchiani; Kiyoaki Aikawa

The dynamic cepstrum parameter representing a masked spectrum performed extremely well in continuous speech recognition. This paper proposes a new algorithm for optimizing the dynamic cepstrum lifter array. The masking filter is represented by a set of Gaussian-shaped lifters. The standard deviation and the gain of the Gaussians are trained in order to improve the performance of the time-frequency filter. Parameterizing the lifter shape provides robustness against unknown speech samples. Because of the parameterized lifters' small degree of freedom, it can avoid over-learning. The gradient descent optimizing algorithm is formulated for both a neural network classifier and an HMM classifier. The optimized dynamic cepstrum successfully improved the speech recognition performance even for speech spoken in a different speaking style.


Conference of the International Speech Communication Association | 2016

Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling

Ehsan Variani; Tara N. Sainath; Izhak Shafran; Michiel Bacchiani

State-of-the-art automatic speech recognition (ASR) systems typically rely on pre-processed features. This paper studies the time-frequency duality in ASR feature extraction methods and proposes extending the standard acoustic model with a complex-valued linear projection layer to learn and optimize features that minimize standard cost functions such as cross-entropy. The proposed Complex Linear Projection (CLP) features achieve superior performance compared to pre-processed Log Mel features.


Conference of the International Speech Communication Association | 2005

Fast Vocabulary-Independent Audio Search Using Path-Based Graph Indexing

Olivier Siohan; Michiel Bacchiani


Archive | 1996

Modeling Systematic Variations in Pronunciation via a Language-Dependent Hidden Speaking Mode

Mari Ostendorf; B. Byrne; Michiel Bacchiani; Michael Finke; A. Gunawardana; Kenneth N. Ross; Sam T. Roweis; Elizabeth Shriberg; D. Talkin; Alex Waibel; B. Wheatley; Torsten Zeppenfeld


Conference of the International Speech Communication Association | 2015

Large Vocabulary Automatic Speech Recognition for Children

Hank Liao; Golan Pundak; Olivier Siohan; Melissa K. Carroll; Noah Coccaro; Qi-Ming Jiang; Tara N. Sainath; Andrew W. Senior; Françoise Beaufays; Michiel Bacchiani


Conference of the International Speech Communication Association | 1995

Minimum classification error training algorithm for feature extractor and pattern classifier in speech recognition

Kuldip Kumar Paliwal; Michiel Bacchiani; Yoshinori Sagisaka

Collaboration


Top Co-Authors

Mari Ostendorf

University of Washington

Ehsan Variani

Johns Hopkins University

Khe Chai Sim

National University of Singapore