Alan V. McCree
Texas Instruments
Publication
Featured research published by Alan V. McCree.
IEEE Transactions on Speech and Audio Processing | 1995
Alan V. McCree; Thomas P. Barnwell
Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech. These vocoders can produce intelligible speech at low data rates (800-2400 b/s), but they often sound synthetic and generate annoying artifacts such as buzzes, thumps, and tonal noises. These problems increase dramatically if acoustic background noise is present at the speech input. This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech. The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise. A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system. Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment. In fact, diagnostic acceptability measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder.
international conference on acoustics, speech, and signal processing | 1997
Lynn M. Supplee; Ronald P. Cohn; John S. Collura; Alan V. McCree
This paper describes the new U.S. Federal Standard at 2400 bps. The mixed excitation linear prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10). This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10. The MELP coder is based on the traditional LPC model, but includes additional features to improve its performance.
international conference on acoustics, speech, and signal processing | 1996
Alan V. McCree; Kwan Truong; E.B. George; Thomas P. Barnwell; V. Viswanathan
This paper describes our enhanced mixed excitation linear prediction (MELP) speech coder, which is a candidate for the new U.S. Federal Standard at 2.4 kbit/s. The new coder is based on the MELP model, and it uses a number of enhancements as well as efficient quantization algorithms to improve performance while maintaining a low bit rate. In addition, the coder has been optimized for performance in acoustic background noise and in channel errors, as well as for efficient real-time implementation. Listening tests confirm that the enhanced 2.4 kbit/s MELP coder performs as well as the higher bit rate 4.8 kbit/s FS1016 CELP standard.
international conference on acoustics, speech, and signal processing | 1995
Levent M. Arslan; Alan V. McCree; Vishu R. Viswanathan
We propose three new adaptive noise suppression algorithms for enhancing noise-corrupted speech: smoothed spectral subtraction (SSS), vector quantization of line spectral frequencies (VQ-LSF), and modified Wiener filtering (MWF). SSS is an improved version of the well-known spectral subtraction algorithm, while the other two methods are based on generalised Wiener filtering. We have compared these three algorithms with each other and with spectral subtraction on both simulated noise and actual car noise. All three proposed methods perform substantially better than spectral subtraction, primarily because of the absence of any musical noise artifacts in the processed speech. Listening tests showed preference for MWF and SSS over VQ-LSF. Also, MWF provides a much higher mean opinion score (MOS) than does spectral subtraction. Finally, VQ-LSF provides a relatively good spectral match to the clean speech, and may, therefore, be better suited for speech recognition.
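For orientation, the baseline that the abstract compares against, plain magnitude spectral subtraction, can be sketched as below. This is not SSS, VQ-LSF, or MWF, whose details the abstract does not give. The function names, the spectral floor value, and the use of a naive DFT (to stay within the standard library) are all assumptions made for illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def spectral_subtract(frame, noise_mag, floor=0.02):
    """Subtract a noise magnitude estimate from one frame's spectrum.

    frame     -- list of time-domain samples
    noise_mag -- per-bin noise magnitude estimate (same length as frame)
    floor     -- hypothetical spectral floor, as a fraction of the noisy magnitude
    """
    out = []
    for s, nm in zip(dft(frame), noise_mag):
        mag = abs(s)
        # Clamp to a small floor so the magnitude never goes negative.
        clean = max(mag - nm, floor * mag)
        # Reuse the noisy phase; only the magnitude is modified.
        out.append(cmath.rect(clean, cmath.phase(s)))
    return idft(out)
```

With a zero noise estimate the frame passes through unchanged; with a very large estimate every bin is clamped to the floor, so the output is the input scaled by `floor`.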
international conference on acoustics, speech, and signal processing | 1997
Erdal Paksoy; Alan V. McCree; Vishu R. Viswanathan
In general, a variable rate coder can obtain the same speech quality as a fixed rate coder, while reducing the average bit rate. We have developed a variable-rate multimodal speech coder with an average bit rate of 3 kb/s for a speech activity factor of 80% and quality comparable to the GSM full rate coder. The coder has four coding modes and uses a robust classification method involving the pitch gain, zero crossings, and a peakiness measure. The coder also employs a novel gain-matched analysis-by-synthesis technique for very low rate coding of unvoiced frames and an improved noise-level-dependent postfilter. This paper describes the details of our algorithm and presents the results from subjective listening tests.
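Two of the classification features named above can be sketched as follows. The abstract does not define them or give thresholds, so the definitions here are assumptions: zero crossings are counted as sign changes between adjacent samples, and peakiness is taken as the ratio of RMS value to mean absolute value, one common definition in LPC-style coders.

```python
import math


def zero_crossings(frame):
    """Count sign changes between adjacent samples (zero counts as positive)."""
    signs = [s >= 0 for s in frame]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def peakiness(frame):
    """Ratio of RMS to mean absolute value (assumed definition).

    Close to 1 for flat signals, large for frames dominated by a few
    isolated peaks. Returns 0.0 for an all-zero frame.
    """
    mean_abs = sum(abs(s) for s in frame) / len(frame)
    if mean_abs == 0.0:
        return 0.0
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms / mean_abs
```

A single impulse in a 16-sample frame gives a peakiness of 4 (the square root of the frame length), while a constant-magnitude frame gives 1. How these values map to the four coding modes is not stated in the abstract.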
international conference on acoustics, speech, and signal processing | 2000
Alan V. McCree
This paper describes a new 14 kb/s wideband speech coder. The coder uses a split-band approach, where the input signal, sampled at 16 kHz, is split into two equal frequency bands from 0-4 kHz and 4-8 kHz, each of which is decimated to an 8 kHz sampling rate. The lower band is coded with a high-quality narrowband speech coder, the 11.8 kb/s G.729 Annex E, while the higher band is represented by a simple but effective parametric model. Two new features facilitate efficient coding of the high-band signal: noise modulation and high-frequency reversal. Since the encoding of the lower band is independent of the high-band signal, the narrowband encoder output can be embedded in the overall bitstream. Subjective test results show that this wideband speech coder is capable of producing high quality output speech.
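The split-and-decimate step described above can be illustrated with the simplest possible two-band filter bank. This sketch uses 2-tap Haar filters purely for brevity; the abstract does not say which filters the coder uses, and a practical coder would need much sharper ones. The function names are hypothetical.

```python
def split_bands(x):
    """Split a signal into low and high bands, each at half the input rate.

    Uses 2-tap Haar analysis filters (an assumption for illustration):
    the low band is the pairwise average, the high band the pairwise
    half-difference.
    """
    if len(x) % 2:
        raise ValueError("input length must be even")
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high


def merge_bands(low, high):
    """Reconstruct the full-rate signal from the two half-rate bands."""
    x = []
    for l, h in zip(low, high):
        x.append(l + h)
        x.append(l - h)
    return x
```

For a 16 kHz input, each returned band has an 8 kHz sampling rate, matching the structure in the abstract. If the 14 kb/s bitstream contains only the two bands, the 11.8 kb/s low-band coder would leave about 2.2 kb/s for the high-band parameters; that split is an inference, not a figure stated in the abstract.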
international conference on acoustics, speech, and signal processing | 2005
Takahiro Unno; Alan V. McCree
It is well-known that wideband speech (0-7 kHz) provides better quality and intelligibility than narrowband speech (300-3400 Hz), but typically only narrowband speech information is available in current wireless communication systems. Narrowband to wideband extension technology has been recently investigated to artificially generate wideband speech from narrowband speech for better speech quality and intelligibility. This paper presents a robust split-band narrowband to wideband extension system based on algorithmic enhancements to the codebook mapping technique for high-band parameter estimation. Numerical measurements confirm the performance improvements of the codebook mapping process, and informal listening evaluations show the potential of the system and its robustness to input distortions and non-speech input signals.
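The basic codebook mapping idea can be sketched as a nearest-neighbour lookup: find the closest entry in a narrowband codebook and return the high-band parameters paired with it. The paper's algorithmic enhancements are not described in the abstract and are not reproduced here; the function name, feature vectors, and squared-error distance are assumptions.

```python
def map_to_highband(nb_features, nb_codebook, hb_codebook):
    """Estimate high-band parameters from narrowband features.

    nb_codebook and hb_codebook are paired lists: entry i of hb_codebook
    holds the high-band parameters associated with narrowband entry i.
    """
    def dist2(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(range(len(nb_codebook)),
               key=lambda i: dist2(nb_features, nb_codebook[i]))
    return hb_codebook[best]
```

In practice the paired codebooks would be trained on wideband speech so that each narrowband entry has a statistically matched high-band entry; the training procedure is outside the scope of this sketch.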
international conference on acoustics, speech, and signal processing | 2001
Alan V. McCree; Takahiro Unno; Anand K. Anandakumar; Alexis Bernard; Erdal Paksoy
This paper presents a multi-rate wideband speech coder with bit rates from 8 to 32 kb/s. The coder uses a split-band approach, where the input signal, sampled at 16 kHz, is split into two equal frequency bands from 0-4 kHz and 4-8 kHz, each of which is decimated to an 8 kHz sampling rate. The lower band is coded using the adaptive multi-rate (AMR) family of high-quality narrowband speech coders, while the higher band is represented by a simple but effective parametric model. A complete solution including this wideband speech coder, channel coding for various GSM channels, and dynamic rate adaptation, easily passed all Selection Rules and ranked second overall in the 3GPP AMR Wideband Selection Testing. Besides the high performance, additional advantages of the embedded split-band approach include ease of implementation, reduced complexity, and simplified interoperation with narrowband speech coders.
international conference on acoustics, speech, and signal processing | 1998
Alan V. McCree; J.C. de Martin
This paper describes our new mixed excitation linear predictive (MELP) coder designed for very low bit rate applications. This new coder, through algorithmic improvements and enhanced quantization techniques, produces better speech quality at 1.7 kb/s than the new U.S. Federal Standard MELP coder at 2.4 kb/s. Key features of the coder are an improved pitch estimation algorithm and a line spectral frequencies (LSF) quantization scheme that requires only 21 bits per frame. With channel coding, this new MELP coder is capable of maintaining good speech quality even in severely degraded channels, at a total bit rate of only 3 kb/s.
international conference on acoustics, speech, and signal processing | 1991
Alan V. McCree; Thomas P. Barnwell
The authors introduce a novel synthesizer structure for an LPC (linear predictive coding) vocoder which increases the clarity and naturalness of the output speech. This synthesizer enhances the usual excitations of either periodic pulses or white noise by allowing pulse/noise mixtures and aperiodic pulses, and thus can generate a wider range of possible speech signals. The control algorithms for this new model replace the traditional binary voicing decision with more robust periodicity, peakiness, and power level detectors, without a significant increase in bit rate. As a result, the vocoder produces synthetic speech which is free of the usual LPC synthesis artifacts, even at bit rates below 2400 bps.
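A minimal sketch of the excitation idea described above, a pulse/noise mixture with optionally aperiodic pulses, is given below. It mixes the two sources with a single broadband weight and jitters the pulse spacing at random; the paper's actual control algorithms and mixing structure are not given in the abstract, so the function name and all parameters are assumptions.

```python
import random


def mixed_excitation(n, pitch_period, voicing, jitter=0.0, seed=0):
    """Generate n samples of mixed pulse/noise excitation.

    pitch_period -- nominal pulse spacing in samples
    voicing      -- 0.0 (all noise) to 1.0 (all pulses); replaces a binary
                    voiced/unvoiced switch with a continuous mix
    jitter       -- fractional random variation of the pulse spacing;
                    0.0 gives strictly periodic pulses
    """
    rng = random.Random(seed)
    pulses = [0.0] * n
    pos = 0.0
    while pos < n:
        pulses[int(pos)] = 1.0
        # Perturb the spacing to make the pulse train aperiodic.
        step = pitch_period * (1.0 + jitter * rng.uniform(-1.0, 1.0))
        pos += max(step, 1.0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [voicing * p + (1.0 - voicing) * z for p, z in zip(pulses, noise)]
```

With `voicing=1.0` and `jitter=0.0` this reduces to the classic periodic pulse train, and with `voicing=0.0` to white noise, which are the two excitations of a traditional LPC vocoder. The result would then drive an LPC synthesis filter, which is omitted here.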