Publication


Featured research published by Michael Savic.


international conference on acoustics, speech, and signal processing | 1990

Variable parameter speaker verification system based on hidden Markov modeling

Michael Savic; Sunil K. Gupta

A text-independent speaker verification system based on an adaptive vocal tract model which emulates the vocal tract of the speaker is described. Each speaker is represented by a set of feature vectors derived from speech segments belonging to different classes of phonemes. Linear predictive hidden Markov modeling and maximum-likelihood Viterbi decoding are applied to a speech utterance to obtain different classes of phonemes pronounced by a speaker. It is shown that different classes of phonemes are not equally effective in discriminating between speakers and that verification performance can be considerably improved by separately classifying speech segments representing each broad phonetic category as belonging to an impostor or as belonging to the true speaker. A weighted linear combination of scores for individual categories can be used as the final verification score. The weights are chosen to reflect the effectiveness of particular classes of phonemes in discriminating between speakers and are adjusted to maximize the verification performance.
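The weighted combination of per-category scores described in this abstract can be illustrated with a minimal Python sketch. The category names, weights, scores, and threshold below are hypothetical, and dividing by the total weight is an added assumption so that a missing category does not shrink the fused score; the abstract itself specifies only a weighted linear combination.

```python
def fuse_scores(class_scores, class_weights):
    """Weighted linear combination of per-category verification scores.

    Normalizing by the total weight of the observed categories is an
    assumption of this sketch, not something stated in the abstract.
    """
    total_weight = sum(class_weights[c] for c in class_scores)
    if total_weight == 0:
        raise ValueError("weights for the observed categories sum to zero")
    weighted = sum(class_weights[c] * s for c, s in class_scores.items())
    return weighted / total_weight


def verify(class_scores, class_weights, threshold):
    """Accept the claimed identity when the fused score reaches the threshold."""
    return fuse_scores(class_scores, class_weights) >= threshold


# Hypothetical values: larger weights for categories assumed more discriminative.
weights = {"vowel": 0.5, "nasal": 0.3, "fricative": 0.2}
scores = {"vowel": 0.9, "nasal": 0.6, "fricative": 0.1}

fused = fuse_scores(scores, weights)
accepted = verify(scores, weights, threshold=0.5)
```

In a real system the weights would be tuned on held-out data to maximize verification performance, as the abstract indicates.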


Journal of the Acoustical Society of America | 1995

Detection of cholesterol deposits in arteries

Michael Savic

A method of diagnosing arterial stenosis from a Doppler-shifted acoustic signal received from an artery over multiple heartbeats, each having systolic and diastolic phases, applies pattern recognition techniques to the signal to determine the presence of turbulence during the diastolic phases, which indicates stenosis. This is advantageously done by comparing the systolic and diastolic phases as part of the pattern recognition technique: large differences indicate a healthy artery, and smaller differences indicate stenosis.


international conference on acoustics, speech, and signal processing | 1992

Phoneme based speaker verification

Michael Savic; Jeffrey Sorensen

Text-independent speaker verification systems typically depend upon averaging over a long utterance to obtain a feature set for classification. However, not all speech is equally suited to the task of speaker verification. An approach to text-independent speaker verification that uses a two-stage classifier is presented. The first stage consists of a speaker-independent phoneme detector trained to recognize a phoneme that is distinctive from speaker to speaker. The second stage is trained to recognize the frames of speech from the target speaker that are admitted by the phoneme detector. A common feature vector based on the linear predictive coding (LPC) cepstrum is projected in different directions for each of these pattern recognition tasks. Results of tests using the described speaker verification system are shown.
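The two-stage structure described here, a phoneme detector that gates frames followed by a speaker classifier that scores only the admitted frames, might be sketched as follows. The two-dimensional frames, the projection directions, and the 0.5 gate threshold are all hypothetical stand-ins for LPC-cepstrum features and trained projections.

```python
def project(vec, direction):
    """Project a feature vector onto a direction (dot product)."""
    return sum(v * d for v, d in zip(vec, direction))


# Hypothetical projection directions; a trained system would learn these.
PHONEME_DIRECTION = (1.0, 0.0)   # stage 1: is this the distinctive phoneme?
SPEAKER_DIRECTION = (0.0, 1.0)   # stage 2: does the frame match the target?


def is_target_phoneme(frame):
    """Stage 1: speaker-independent gate (threshold is an assumption)."""
    return project(frame, PHONEME_DIRECTION) > 0.5


def speaker_score(frame):
    """Stage 2: score of an admitted frame against the target speaker."""
    return project(frame, SPEAKER_DIRECTION)


def two_stage_score(frames):
    """Average the stage-2 score over the frames admitted by stage 1."""
    admitted = [f for f in frames if is_target_phoneme(f)]
    if not admitted:
        return None  # no usable speech in this utterance
    return sum(speaker_score(f) for f in admitted) / len(admitted)


frames = [(0.9, 0.8), (0.1, -0.5), (0.7, 0.6)]  # toy feature vectors
score = two_stage_score(frames)
```

The second frame is rejected by the gate, so only the first and third contribute to the utterance score.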


international conference on acoustics, speech, and signal processing | 1991

An automatic language identification system

Michael Savic; Elena Acosta; Sunil K. Gupta

An automatic language identification system which makes use of several language-specific features is described. The system concurrently utilizes features associated with two methodologies, the hidden Markov model (HMM) and language-specific pitch contours. Experimental results show that transition probabilities and pitch contours differ between the languages. However, no single criterion gives a definitive answer in every case, although each points in the correct direction. Therefore, a voting classifier should be used to combine the results from each of the criteria.
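A voting classifier of the kind recommended here can be illustrated with a simple majority vote. The criterion names and language labels below are hypothetical, and returning `None` on a tie is a design choice of this sketch rather than anything the abstract specifies.

```python
from collections import Counter


def vote(decisions):
    """Return the label chosen by most criteria, or None on a tie.

    `decisions` maps a criterion name to the language it selected.
    """
    ranked = Counter(decisions.values()).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear winner among the criteria
    return ranked[0][0]


# Hypothetical per-criterion decisions for one utterance.
decisions = {
    "hmm_transitions": "Spanish",
    "pitch_contour": "Spanish",
    "hmm_likelihood": "English",
}
language = vote(decisions)
```

A weighted vote would be a natural extension if some criteria prove more reliable than others.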


Digital Signal Processing | 2004

Speech reconstruction using a generalized HSMM (GHSMM)

Michael D. Moore; Michael Savic

Speech reconstruction is a relatively new application for stochastic processes such as the hidden Markov model (HMM) and hidden semi-Markov model (HSMM). While reconstruction has been attempted within the acoustic (actual speech) vector level, statistical reconstruction at the phoneme level has received less attention. Because the regeneration time (memory) of the HMM is on the order of a single acoustic vector, HMMs are relatively unsuited for reconstruction. HSMMs have a regeneration time (memory) that is on the order of a single phoneme, and thus are capable of reconstructing multiple damaged acoustic vectors within phonemes. We describe a dual-regeneration time generalized HSMM (GHSMM) that can reconstruct damaged acoustic vectors and multiple damaged phonemes in a longer utterance. This GHSMM uses a non-stationary transition matrix that is constructed to operate over two time scales: the regeneration time of a single phoneme and the regeneration time of an utterance.


international conference on acoustics, speech, and signal processing | 1993

Leak monitoring system for gas pipelines

Igal Brodetsky; Michael Savic

An approach and a solution to the continuous leak monitoring problem in underground gas pipelines are presented. This approach places permanent monitoring units along the pipeline. These units detect acoustic signals in the pipeline and discriminate leak sounds from other man-made or natural nonleak sounds that can occur. The system uses the kNN classifier as the detector with LPC (linear predictive coding) cepstrums as signal features. To increase system performance, pipeline effects on acoustic signals were taken into account during the classifier training phase. Each unit can detect 1/4-in-diameter leaks from a distance of 300 m, yielding 600 m as the maximum distance between units.
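The k-nearest-neighbour detector and the unit-spacing arithmetic from this abstract can be sketched in a few lines of Python. The two-dimensional feature vectors and the choice of k = 3 are hypothetical; the real system uses LPC cepstral features, and the abstract does not state the value of k.

```python
import math
from collections import Counter


def knn_classify(sample, training_set, k=3):
    """Label a sample by majority vote among its k nearest training vectors."""
    neighbors = sorted(training_set, key=lambda item: math.dist(sample, item[0]))[:k]
    labels = Counter(label for _, label in neighbors)
    return labels.most_common(1)[0][0]


# Toy training data standing in for cepstral feature vectors.
training = [
    ((0.0, 0.1), "leak"), ((0.1, 0.0), "leak"), ((0.2, 0.1), "leak"),
    ((1.0, 1.1), "nonleak"), ((0.9, 1.0), "nonleak"), ((1.1, 0.9), "nonleak"),
]
label = knn_classify((0.15, 0.05), training)

# Each unit hears a leak up to 300 m away in either direction, so two
# neighbouring units can be at most twice that distance apart.
DETECTION_RANGE_M = 300
max_unit_spacing_m = 2 * DETECTION_RANGE_M
```

With units 600 m apart, any point between two neighbours lies within 300 m of at least one of them.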


international conference on acoustics, speech, and signal processing | 1997

Co-channel speaker separation using constrained nonlinear optimization

Daniel S. Benincasa; Michael Savic

This paper describes a technique to separate the speech of two speakers recorded over a single channel. The main focus of this research is to separate overlapping voiced speech signals using constrained nonlinear optimization. Based on the assumption that voiced speech can be modeled as a slowly-varying vocal tract filter with a quasi-periodic train of impulses, the speech waveform is represented as a sum of sine waves with time-varying amplitude, frequency and phase. In this work the unknown parameters of our speech model are the amplitude, frequency and phase of the harmonics of both speech signals. Using constrained nonlinear optimization, we determine, on a frame-by-frame basis, the best possible parameters that provide the least mean square error (LMSE) between the original co-channel speech signal and the sum of the reconstructed speech signals.
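The objective being minimized can be illustrated with a short Python sketch: each speaker's frame is synthesized as a sum of sinusoids, and the error is the mean squared difference between the observed co-channel frame and the sum of the two reconstructions. The sample rate, frame length, and harmonic parameters are hypothetical, the parameters are held constant within a frame as a simplification, and the constrained optimizer itself is not shown.

```python
import math


def synthesize(params, n_samples, sample_rate):
    """Sum of sinusoids; params is a list of (amplitude, freq_hz, phase)."""
    return [
        sum(a * math.cos(2 * math.pi * f * n / sample_rate + p) for a, f, p in params)
        for n in range(n_samples)
    ]


def frame_error(observed, params_a, params_b, sample_rate):
    """Mean squared error between the frame and the two-speaker model."""
    n = len(observed)
    est_a = synthesize(params_a, n, sample_rate)
    est_b = synthesize(params_b, n, sample_rate)
    return sum((o - (x + y)) ** 2 for o, x, y in zip(observed, est_a, est_b)) / n


SAMPLE_RATE = 8000  # Hz, assumed
FRAME_LEN = 160     # 20 ms at 8 kHz, assumed

# Hypothetical harmonics: speaker A at 100 Hz pitch, speaker B at 150 Hz.
speaker_a = [(1.0, 100.0, 0.0), (0.5, 200.0, 0.3)]
speaker_b = [(0.8, 150.0, 1.0), (0.4, 300.0, 0.2)]
frame = [
    x + y
    for x, y in zip(
        synthesize(speaker_a, FRAME_LEN, SAMPLE_RATE),
        synthesize(speaker_b, FRAME_LEN, SAMPLE_RATE),
    )
]

error_true = frame_error(frame, speaker_a, speaker_b, SAMPLE_RATE)
detuned_b = [(0.8, 160.0, 1.0), (0.4, 320.0, 0.2)]  # wrong pitch for speaker B
error_wrong = frame_error(frame, speaker_a, detuned_b, SAMPLE_RATE)
```

An optimizer would search the amplitude, frequency, and phase values, subject to constraints, for the setting that drives this error toward its minimum.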


international conference on acoustics, speech, and signal processing | 1992

Use of semi-Markov models for speaker-independent phoneme recognition

Nimal Ratnayake; Michael Savic; Jeffrey Sorensen

Hidden Markov models (HMMs) have been used to model speech in many areas of speech processing. One characteristic of the HMM is that the probability of time spent in a particular state, or state occupancy, is geometrically distributed. This, however, becomes a serious limitation and results in inaccurate modeling when the HMMs are used for phoneme recognition. The authors use hidden semi-Markov models (HSMM) to overcome the above limitation. Semi-Markov models are a more general class of Markov chains in which the state occupancy can be explicitly modeled by an arbitrary probability mass distribution. The authors use non-parametric distributions to describe the state occupancies instead of parametric distributions such as gamma, Poisson, or binomial, as analysis of actual data shows that the duration of some phonemes could not be approximated by any of the above. Preliminary tests conducted using only the linear prediction coding (LPC) cepstrum as features have shown that the use of HSMM increased the phoneme recognition accuracy to 53.7% from the 48.4% obtained using an HMM.
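The limitation discussed here follows from the HMM self-transition: with self-transition probability a, the probability of staying in a state for exactly d frames is a^(d-1)(1 - a), which always peaks at d = 1. The Python sketch below contrasts that with a non-parametric duration distribution estimated as a histogram; the self-transition value and the observed durations are hypothetical.

```python
def geometric_duration_pmf(self_transition_prob, max_duration):
    """Implicit HMM state-duration pmf: P(d) = a**(d - 1) * (1 - a)."""
    a = self_transition_prob
    return [a ** (d - 1) * (1 - a) for d in range(1, max_duration + 1)]


def empirical_duration_pmf(durations, max_duration):
    """Non-parametric duration pmf: normalized histogram of observed durations."""
    counts = [0] * max_duration
    for d in durations:
        if 1 <= d <= max_duration:
            counts[d - 1] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no durations fall inside the modeled range")
    return [c / total for c in counts]


MAX_DURATION = 10
geometric = geometric_duration_pmf(0.8, MAX_DURATION)

# Hypothetical phoneme durations (in frames) clustered around 5 frames.
observed = [3, 4, 4, 5, 5, 5, 6, 6, 7]
empirical = empirical_duration_pmf(observed, MAX_DURATION)

geometric_mode = geometric.index(max(geometric)) + 1   # most likely duration
empirical_mode = empirical.index(max(empirical)) + 1
```

The geometric model puts its mode at a single frame regardless of a, while the histogram can place it wherever the data do, which is the flexibility an explicit-duration model provides.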


international conference on acoustics, speech, and signal processing | 1998

Voicing state determination of co-channel speech

Daniel S. Benincasa; Michael Savic

This paper presents a voicing state determination algorithm (VSDA) that is used to simultaneously estimate the voicing state of two speakers present in a segment of co-channel speech. Supervised learning trains a Bayesian classifier to predict the voicing states. The possible voicing states are silence, voiced/voiced, voiced/unvoiced, unvoiced/voiced and unvoiced/unvoiced. We have assumed the silent state as a subset of the unvoiced class, except when both speakers are silent. We have chosen a binary tree decision structure. Our feature set is a projection of a 37-dimensional feature vector onto a single dimension applied at each branch of the decision tree, using the Fisher linear discriminant. We have produced co-channel speech from the TIMIT database which is used for training and testing. Preliminary results, at a signal-to-interference ratio of 0 dB, have produced classification accuracy of 82.6%, 73.45%, and 68.24% on male/female, male/male and female/female mixtures respectively.
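The projection used at each branch of the decision tree can be illustrated with a two-class Fisher linear discriminant, whose direction is the inverse within-class scatter matrix applied to the difference of class means. This sketch uses two-dimensional toy data instead of the 37-dimensional vectors in the abstract so that the matrix inverse can be written out by hand, and it shows a single branch only.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def scatter(vectors, mu):
    """2x2 scatter matrix of the vectors about the mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - mu[0], v[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Fisher direction w = Sw^-1 (m1 - m0) for two classes of 2-D points."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(vec, direction):
    """Reduce a feature vector to a single number along the direction."""
    return sum(v * d for v, d in zip(vec, direction))


# Toy data for one branch, e.g. two groups of voicing states (hypothetical).
class0 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
class1 = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0), (4.0, 4.0)]

w = fisher_direction(class0, class1)
proj0 = [project(v, w) for v in class0]
proj1 = [project(v, w) for v in class1]
```

After projection the two classes occupy disjoint ranges on a single axis, so a one-dimensional classifier can make the branch decision.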


international conference on acoustics, speech, and signal processing | 1994

Hierarchical pattern classification for high performance text-independent speaker verification systems

Jeffrey S. Sorensen; Michael Savic

A new algorithm, the hierarchical speaker verification algorithm, is introduced. This algorithm employs a set of unique mapping functions determined from an enrolment utterance that characterize the target voice as a multidimensional martingale random walk process. For sufficiently long verification utterances, the central limit theorem ensures that the accumulated scores for the target speaker will be distributed normally about the origin. Impostor speakers, which violate the martingale property, are distributed arbitrarily and widely scattered in the verification space. Excerpts of verification performance experiments are given and extensions to the algorithm for handling noisy channels and speaker template aging are discussed.

Collaboration


Top Co-Authors

Jeffrey Sorensen
Rensselaer Polytechnic Institute

Daniel S. Benincasa
Air Force Research Laboratory

Mahesh Chugani
Rensselaer Polytechnic Institute

Michael D. Moore
Rensselaer Polytechnic Institute

Zlatko Macek
Rensselaer Polytechnic Institute

Atiya Husain
Rensselaer Polytechnic Institute

Etienne Marcheret
Rensselaer Polytechnic Institute

Huiqin Gao
Rensselaer Polytechnic Institute

Il-Hyun Nam
Rensselaer Polytechnic Institute

Sunil K. Gupta
Rensselaer Polytechnic Institute