Lawrence J. Fransen
United States Naval Research Laboratory
Publication
Featured research published by Lawrence J. Fransen.
International Conference on Acoustics, Speech, and Signal Processing | 1985
George S. Kang; Lawrence J. Fransen
A low-bit-rate speech encoder must employ bit-saving measures to achieve intelligible and natural sounding synthesized speech. Some important measures are: (a) quantization of parameters based on their spectral-error sensitivities (i.e., coarser quantization for spectrally less sensitive parameters), and (b) quantization of parameters in accordance with properties of auditory perception (i.e., coarser quantization of the higher frequency components of the speech spectral envelope, and finer representation of spectral peaks than valleys). The use of Line-Spectrum Pairs (LSPs) makes it possible to employ these measures more readily than the better known reflection coefficients. As a result, the intelligibility of an LSP-based, pitch-excited vocoder operating at 800 bits/second (b/s) can be made as high as 87 for three male speakers (as measured by the Diagnostic Rhyme Test (DRT)) which is only 1.4 below that of the 2400-b/s LPC. Likewise, the intelligibility of a 4800-b/s nonpitch-excited vocoder is as high as 92.3 which compares favorably with scores from current 9600-b/s vocoders.
Journal of the Acoustical Society of America | 2000
George S. Kang; Lawrence J. Fransen
A system that synchronously segments a speech waveform using pitch period and a center of the pitch waveform. The pitch waveform center is determined by finding a local minimum of a centroid histogram waveform of the low-pass filtered speech waveform for one pitch period. The speech waveform can then be represented by one or more of such pitch waveforms or segments during speech compression, reconstruction or synthesis. The pitch waveform can be modified by frequency enhancement/filtering, waveform stretching/shrinking in speech synthesis or speech disguise. The utterance rate can also be controlled to speed up or slow down the speech.
IEEE Transactions on Circuits and Systems | 1984
W. Mikhael; F. Wu; George S. Kang; Lawrence J. Fransen
A new simple formulation for the choice of the optimum convergence factor μ in adaptive filtering using gradient techniques is given. This leads to several optimum adaptive filtering algorithms, each of which is optimum under the conditions for which it was derived. Several of these algorithms have been tested successfully, proving their optimality and yielding faster and more accurate adaptation compared with the existing conventional algorithm (CA) that uses a fixed or empirical μ. Two algorithms are derived and examined here. The first, the Homogeneous Algorithm (HA), results in a time-varying μ which is the same for each filter but is updated at each iteration to yield optimum performance. The second, the Individual Adaptation Algorithm (IAA), has a time-varying μ which is chosen suitably for each coefficient at each iteration. The HA and IAA always outperformed the CA. Computer simulations and experimental results are given which are in agreement with theory.
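For orientation, the fixed-convergence-factor baseline that the abstract calls the conventional algorithm can be sketched as a textbook LMS gradient update. This is only an illustration under assumptions: the abstract does not give the HA or IAA formulas, so neither is reproduced here, and the function name `lms_identify`, the tap count, and the update convention w <- w + 2*mu*e*u are choices made for this sketch rather than details taken from the paper.

```python
def lms_identify(x, d, num_taps, mu):
    """Textbook LMS with a fixed convergence factor mu (illustrative only).

    x: input samples, d: desired samples, num_taps: FIR filter length.
    Returns the final coefficients and the per-sample error sequence.
    """
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent num_taps inputs, newest first; zeros before the start.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))   # filter output
        e = d[n] - y                               # instantaneous error
        # Gradient step: the same fixed mu for every coefficient and iteration.
        w = [wk + 2.0 * mu * e * uk for wk, uk in zip(w, u)]
        errors.append(e)
    return w, errors
```

With a noise-free desired signal generated by a short FIR system, the coefficients returned by `lms_identify` approach that system's impulse response, provided mu is small enough for the input power. The speed of that approach depends on the fixed mu, which is the quantity the paper proposes to choose optimally at each iteration.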
Journal of the Acoustical Society of America | 1996
George S. Kang; Lawrence J. Fransen
A voice communication processing system and method for processing a speech waveform as a digital bit stream having a reduced number of bits representing speech parameters. The bit representation of amplitude parameters is reduced by storing only probable amplitude parameter transitions corresponding to amplitude parameter indices in an amplitude table and by joint encoding the amplitude parameter indices over multiple frames. The bit representation of the pitch period is reduced by storing a range of pitch periods in a pitch table and by joint encoding pitch period indices corresponding to an average pitch period over two frames. The bit representation of the vocal tract filter coefficients is reduced by storing only probable filter coefficient transitions corresponding to filter coefficient indices in a filter coefficient table and by joint encoding the filter coefficient indices over two frames. Voicing decisions are inferred by an associated vocal tract filter coefficient index obtained by searching the filter coefficient table where the table is divided according to the voicing decisions, and thus separate voicing decisions do not have to be transmitted. By providing a reduced bit representation of the various speech parameters as explained above, the present invention processes the speech waveform at a more efficient data rate. In addition, the present invention converts prediction coefficients (PCs) into line spectra pairs (LSPs) to be used as filter parameters when performing a linear predictive coder (LPC) analysis. Thus, by using LSPs, the present invention is able to more efficiently encode and decode speech.
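The conversion from prediction coefficients to LSPs mentioned above can be illustrated with the standard textbook construction: form the symmetric polynomial P(z) = A(z) + z^-(p+1) A(1/z) and the antisymmetric polynomial Q(z) = A(z) - z^-(p+1) A(1/z), then locate their zeros on the unit circle. This is a minimal sketch, not necessarily the procedure used in the patent. The function name `lpc_to_lsf`, the grid size, and the grid-search-plus-bisection root finder are assumptions made for illustration, and the sketch assumes A(z) is minimum phase with coefficients given as [1, a1, ..., ap].

```python
import math

def lpc_to_lsf(a, grid=1024):
    """Return line spectral frequencies (radians, ascending) in (0, pi).

    a: prediction polynomial coefficients [1, a1, ..., ap] (assumed layout).
    """
    p = len(a) - 1
    ext = list(a) + [0.0]                                    # a_0..a_p, a_{p+1} = 0
    sym = [ext[k] + ext[p + 1 - k] for k in range(p + 2)]    # coefficients of P(z)
    asym = [ext[k] - ext[p + 1 - k] for k in range(p + 2)]   # coefficients of Q(z)
    m = (p + 1) / 2.0

    # On the unit circle, P and Q reduce (up to a phase factor) to real functions.
    def f_p(w):
        return sum(c * math.cos((k - m) * w) for k, c in enumerate(sym))

    def f_q(w):
        return sum(c * math.sin((k - m) * w) for k, c in enumerate(asym))

    def interior_roots(f):
        found = []
        ws = [math.pi * i / grid for i in range(1, grid)]    # excludes 0 and pi
        for lo, hi in zip(ws, ws[1:]):
            flo, fhi = f(lo), f(hi)
            if flo * fhi < 0.0:
                for _ in range(60):                          # bisection refinement
                    mid = 0.5 * (lo + hi)
                    fmid = f(mid)
                    if flo * fmid <= 0.0:
                        hi = mid
                    else:
                        lo, flo = mid, fmid
                found.append(0.5 * (lo + hi))
            elif fhi == 0.0:
                found.append(hi)                             # root exactly on a grid point
        return found

    return sorted(interior_roots(f_p) + interior_roots(f_q))
```

For a minimum-phase A(z), the zeros of P and Q interlace on the unit circle, so the sorted list alternates between the two polynomials. The trivial zeros at 0 and pi are excluded because they carry no information about the filter.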
IEEE Transactions on Circuits and Systems | 1987
George S. Kang; Lawrence J. Fransen
The presence of noise in speech has severely adverse effects on speech produced by a low-bit-rate voice terminal. This paper describes an adaptive noise-cancellation filter that was designed to suppress the broad-band, nonstationary, and intense noise often encountered in military tracked vehicles, helicopters, and high-performance aircraft. This adaptive noise-cancellation filter was developed for real-time operation using a TMS32010 microprocessor. The filter was tested under various environmental conditions with a wide range of filter parameters. According to our measurements, it reduces the noise floor by 10 to 15 dB without degrading the voice quality.
IEEE Transactions on Acoustics, Speech, and Signal Processing | 1987
George S. Kang; Lawrence J. Fransen
Line-spectrum pairs (LSPs) are frequency-domain parameters similar to formant frequencies. Thus, they have frequency-selective spectral-error characteristics which allow LSP quantization in accordance with auditory perception. In addition, the ease of estimating the spectral-error sensitivity of each line spectrum makes it possible to encode each line spectrum efficiently. This correspondence, for the first time, demonstrates that a 31 bit representation of LSPs provides intelligibility similar to that of a 41 bit representation of reflection coefficients in a current 2400 bit/s LPC. Even with a 12 bit quantization of LSPs, the loss of speech intelligibility is minor, only 2.4 points below that of a 41 bit quantization of reflection coefficients as measured by the diagnostic rhyme test (DRT) which tests initial-consonant discrimination.
International Conference on Acoustics, Speech, and Signal Processing | 1981
George S. Kang; Lawrence J. Fransen; Evans L. Kline
The Navy has developed a Multirate Processor (MRP) which generates digitized speech at 2.4, 9.6, and 16 kb/s by the linear predictive coding principle. This multirate capability is achieved by embedding the 2.4 kb/s data in the 9.6 kb/s data stream and the 9.6 kb/s data in the 16 kb/s data stream. Conversion between the rates is accomplished by truncating a certain portion of the bits from the higher-data rate signal or appending extra bits to the lower-data rate signal. The MRP mediumband (9.6 kb/s or 16 kb/s) mode is a baseband residual excited LPC in which the baseband residual is transmitted in terms of Fourier spectral components. Under various operational conditions, the Diagnostic Rhyme Test (DRT) scores for the 9.6 kb/s rate of the MRP compare favorably to the DRT scores of an existing 16 kb/s rate Continuously Variable Slope Delta (CVSD) encoder.
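The embedded-rate idea described above, where lower-rate data sits at the front of each higher-rate frame so that conversion is a truncation or a padding step, can be sketched as follows. The abstract gives only the three rates. The 22.5 ms frame duration, the bit ordering within a frame, the use of zeros as the appended bits, and the names `bits_per_frame` and `convert_frame` are all assumptions made for illustration.

```python
FRAME_DURATION_US = 22500           # assumed frame duration (22.5 ms), not from the source
RATES_BPS = (2400, 9600, 16000)     # the three MRP rates named in the abstract

def bits_per_frame(rate_bps):
    """Number of bits in one frame at the given rate (integer arithmetic)."""
    return rate_bps * FRAME_DURATION_US // 1_000_000

def convert_frame(frame_bits, target_rate_bps):
    """Convert one frame to another rate by truncating or appending bits.

    Assumes the lower-rate data occupies the leading bits of the frame.
    """
    n = bits_per_frame(target_rate_bps)
    if len(frame_bits) >= n:
        return list(frame_bits[:n])                  # drop the enhancement bits
    # Appended bits are zeros here; the real filler content is not specified.
    return list(frame_bits) + [0] * (n - len(frame_bits))
```

Under these assumptions a 16 kb/s frame holds 360 bits, of which the first 216 form the 9.6 kb/s frame and the first 54 form the 2.4 kb/s frame, so a relay node can lower the rate without re-encoding the speech.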
Real-Time Signal Processing II | 1979
George S. Kang; Lawrence J. Fransen; E. L. Kline
A number of real-time linear predictive coders (LPCs) have been developed to compress speech waveforms to 2400 bits per second (bps). Most of these LPCs employ a central processing unit (CPU) to analyze a stream of speech samples on a frame-by-frame (block-form) basis. While physical size, weight and power dissipation of these units have been decreasing steadily, the operation of a battery-powered hand-carried unit is far from realization. This paper presents the flow-form implementation of an LPC as an alternative to the block-form CPU intensive approach. The flow-form implementation of an LPC allows for decentralized, semi-autonomous, arithmetic-intensive segments which are supplemented by a microprocessor. The microprocessor performs relatively non-taxing logic operations and computations. Flow-form analysis computation is highly systematic and repetitive making this form of analysis well-suited for Very Large Scale Integration (VLSI). Because flow-form analysis does not require a large array of stored data, less data memory is required and power dissipation is reduced. With current technology an LPC can be implemented using fewer than ten chips and having a total power dissipation of less than three watts. The flow-form LPC is comparable in performance to the block-form LPC, and the two units are interoperable, provided the same coding rules and data transmission formats are used.
IEEE Transactions on Circuits and Systems | 1986
Wasfy B. Mikhael; F. Wu; Leonid G. Kazovsky; George S. Kang; Lawrence J. Fransen
Archive | 1980
George S. Kang; Lawrence J. Fransen; Evans L. Kline