Publication


Featured research published by V.R. Algazi.


workshop on applications of signal processing to audio and acoustics | 2001

The CIPIC HRTF database

V.R. Algazi; Richard O. Duda; Dennis Thompson; Carlos Avendano

This paper describes a public-domain database of high-spatial-resolution head-related transfer functions measured at the UC Davis CIPIC Interface Laboratory and the methods used to collect the data. Release 1.0 (see http://interface.cipic.ucdavis.edu) includes head-related impulse responses for 45 subjects at 25 different azimuths and 50 different elevations (1250 directions) at approximately 5° angular increments. In addition, the database contains anthropometric measurements for each subject. Statistics of anthropometric parameters and correlations between anthropometry and some temporal and spectral features of the HRTFs are reported.
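
The direction count in the abstract (25 azimuths × 50 elevations = 1250) can be checked with a short sketch. The specific angle values below are an assumption drawn from the database's accompanying documentation rather than from the abstract itself, and the function names are hypothetical.

```python
# Sketch of the Release 1.0 sampling grid described in the abstract.
# ASSUMPTION: the angle values follow the database's accompanying
# documentation; the abstract itself only states 25 azimuths, 50 elevations,
# and roughly 5-degree increments.

def cipic_azimuths():
    """25 azimuths in degrees: 5-degree steps near the front, coarser at the sides."""
    inner = list(range(-45, 50, 5))  # -45, -40, ..., 45 (19 values)
    return [-80, -65, -55] + inner + [55, 65, 80]


def cipic_elevations():
    """50 elevations in degrees, uniformly spaced by 360/64 = 5.625."""
    step = 360.0 / 64.0
    return [-45.0 + step * k for k in range(50)]


def cipic_directions():
    """All (azimuth, elevation) pairs in the grid."""
    return [(az, el) for az in cipic_azimuths() for el in cipic_elevations()]


if __name__ == "__main__":
    print(len(cipic_azimuths()), len(cipic_elevations()), len(cipic_directions()))
```

The azimuth spacing is only 5° within ±45° and widens toward the sides, which is consistent with the abstract's "approximately 5°" wording.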


international conference on acoustics, speech, and signal processing | 1991

Directional interpolation of images based on visual properties and rank order filtering

V.R. Algazi; Gary E. Ford; R. Potharlanka

The goal of this research is to develop interpolation techniques which preserve or enhance the local structure critical to image quality. Preliminary results are presented which exploit either the properties of vision or the properties of the image in order to achieve the goals. Directional image interpolation is considered which is based on a local analysis of the spatial image structure. The extension of techniques for the design of linear filters based on properties of human perception reported previously to enhance the perceived quality of interpolated images is considered.


international conference on acoustics, speech, and signal processing | 1999

An adaptable ellipsoidal head model for the interaural time difference

Richard O. Duda; Carlos Avendano; V.R. Algazi

Experimentally measured head-related transfer functions reveal that the interaural time delay varies from person to person. Furthermore, it is not constant around a cone of confusion, but can vary by as much as 18% of the maximum interaural delay. The major sources for this variation are shown to be the shape of the head and the displacement of the ears from the center of the head. A simple ellipsoidal head model is presented that can accurately account for this ITD variation and can be adapted to individual listeners.


workshop on applications of signal processing to audio and acoustics | 2001

Structural composition and decomposition of HRTFs

V.R. Algazi; Richard O. Duda; R.P. Morrison; Dennis Thompson

The analysis and modeling of the response of parts of the body provides valuable insight into many features of the head-related transfer function (HRTF). In spatial sound simulations, partial models, such as the spherical head model, can also generate simple and effective approximate localization cues. We consider the composition of an approximate HRTF from the responses of structural components by making use of detailed measurements of isolated pinnae and of a pinna-less head and torso. We determine that such a composition is sensitive to additional geometric parameters that can be obtained from anthropometry. We show that, with such parameters, simple composition rules can produce a good correspondence between measured and composite HRTFs.


IEEE Transactions on Speech and Audio Processing | 1993

Transform representation of the spectra of acoustic speech segments with applications. I. General approach and application to speech recognition

V.R. Algazi; K.L. Brown; M.J. Ready; D.H. Irvine; Christie L. Cadwell; S. Chung

An approach to modeling and capturing the time-varying structure of the spectral envelope of speech is reported. Acoustic subword decomposition and the Karhunen-Loeve transform (KLT) are used to extract and efficiently represent the highly correlated structure of the spectral envelope. Integration of the KLT with acoustic subword modeling provides concise representation of both steady-state and dynamic features of the spectra in a unified framework that very effectively captures acoustic-phonetic patterns. The physiological and perceptual basis for the approach, the frame-based and acoustic-subword-based spectral representation, and applications to speaker-dependent recognition are presented. The performance of the recognition algorithm based on this approach compares favorably with that of other techniques.


IEEE Transactions on Communications | 1990

Compression of binary facsimile images by preprocessing and color shrinking

V.R. Algazi; P.L. Kelly; R.R. Estes

Algorithms for preprocessing, global modeling and segmentation, local modeling and representation, and binary encoding of facsimile images are proposed, and their performance is examined. Some binary morphological operations that can be used to improve the quality and increase the compressibility of binary images and the modeling of binary images are discussed. The techniques are incorporated into a set of comprehensive encoding schemes, the PCSE codes. Their effectiveness in compressing the International Telegraph and Telephone Consultative Committee (CCITT) binary images is demonstrated. The PCSE code using preprocessing and color shrinking prior to encoding is shown to outperform the READ code by 7 to 42% for the standard CCITT test images with a very small change in image quality.


workshop on applications of signal processing to audio and acoustics | 1999

A head-and-torso model for low-frequency binaural elevation effects

Carlos Avendano; V.R. Algazi; Richard O. Duda

Low-frequency elevation-dependent features appear in HRTF (head-related transfer function) measurements because of torso and shoulder reflections and head diffraction effects. A simple structural model that accounts for these features is presented. Listening tests show that the model produces significant elevation cues for virtual sound sources whose spectra are limited to frequencies below 3 kHz. The low-frequency binaural elevation cues are perceptually significant away from the median plane, and complement high-frequency monaural pinna cues.


international conference on acoustics, speech, and signal processing | 1989

Characterization of spectral transitions with applications to acoustic sub-word segmentation and automatic speech recognition

K.L. Brown; V.R. Algazi

A mathematical model has been developed for tracking spectral transitions within the spectral envelope of a speech signal. This technique incorporates linguistic knowledge into a mathematical framework to determine time-varying acoustic-phonetic features and describe formant transitions. The proposed model is quite robust and is capable of extracting not only rapid spectral movement, but also smoother spectral transitions that occur in vowel and sonorant sequences. This basic approach has been previously used to extract steady-state acoustic-phonetic features across spectrally homogeneous regions and to perform speaker-dependent recognition in which quite successful results were attained in clean as well as noisy speech. It has now been augmented to capture the dynamics of spectral acoustic-phonetic features.


international conference on acoustics, speech, and signal processing | 1989

Robust LPC analysis and synthesis using the KL transformation of acoustic subwords spectra

V.R. Algazi; S. Chung; M.J. Ready; K.L. Brown

The authors propose a novel approach to the modeling and estimation of the speech spectral envelope over acoustic subwords that exhibits robust performance in noise. The technique exploits the underlying signal structure of speech to improve parameter estimates, and it uses the perceptual properties of hearing to decrease the computational requirements in a perceptually meaningful way. The approach provides a considerable speech quality improvement over other methods.


international conference on acoustics, speech, and signal processing | 1989

A radius-bucketing approach to fast vector quantization encoding

A. Madisetti; R. Subramonian; V.R. Algazi

The authors present a computationally efficient encoding scheme for vector quantization. Efficiency is achieved by combining techniques: homes are bucketed into the subset of codewords in the same region as the input point; the energy of the input point eliminates codewords not in the same energy range; the smallest hyperrectangle parallel to the coordinate axes that bounds the Voronoi region associated with the codeword acts as a discriminant; and approximations to the actual distortion are used to avoid multiplications. Simulations on Gaussian sources with ranges of codebook sizes and block sizes indicate that the encoding time, measured in multiplications, actually falls with increasing codebook size. It is shown that with no increase in signal/noise ratio the algorithm substantially outperforms tree search and binary hyperplane testing search.
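
One of the ideas in the abstract, eliminating codewords whose energy is too far from that of the input point, can be illustrated with a minimal sketch. This is not the authors' radius-bucketing algorithm: it omits the region bucketing, the hyperrectangle test, and the multiplication-free distortion approximations, and it relies only on the reverse triangle inequality, ||x - c|| >= | ||x|| - ||c|| |. The function names `build_index` and `encode` are hypothetical.

```python
import bisect
import math


def build_index(codebook):
    """Sort codewords by Euclidean norm so they can be searched by radius."""
    entries = sorted(
        (math.sqrt(sum(v * v for v in c)), i) for i, c in enumerate(codebook)
    )
    norms = [n for n, _ in entries]
    order = [i for _, i in entries]
    return norms, order


def encode(x, codebook, norms, order):
    """Return (index of a nearest codeword, number of full distance evaluations).

    Codewords are visited in order of increasing norm gap | ||x|| - ||c|| |.
    The search stops once that gap reaches the best distance found so far,
    because no remaining codeword can be closer.
    """
    r = math.sqrt(sum(v * v for v in x))
    hi = bisect.bisect_left(norms, r)  # first codeword with norm >= r
    lo = hi - 1                        # last codeword with norm < r
    best_i, best_d, evals = -1, math.inf, 0
    while lo >= 0 or hi < len(norms):
        gap_lo = r - norms[lo] if lo >= 0 else math.inf
        gap_hi = norms[hi] - r if hi < len(norms) else math.inf
        if min(gap_lo, gap_hi) >= best_d:
            break  # every unvisited codeword is at least this far away
        if gap_lo <= gap_hi:
            pos, lo = lo, lo - 1
        else:
            pos, hi = hi, hi + 1
        i = order[pos]
        d = math.dist(x, codebook[i])
        evals += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, evals
```

The result matches an exhaustive search in distortion, while `evals` reports how many full distance computations the norm test left to perform.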

Collaboration

Top Co-Authors

K.L. Brown

University of California

Richard O. Duda

San Jose State University

Gary E. Ford

University of California

M.J. Ready

University of California

S. Chung

University of California

D.H. Irvine

University of California

P.L. Kelly

University of California
