Henri Leich
Faculté polytechnique de Mons
Publication
Featured research published by Henri Leich.
Speech Communication | 1993
Thierry Dutoit; Henri Leich
Abstract The use of the Time-Domain Pitch Synchronous OverLap-Add (TD-PSOLA) algorithm in a Text-To-Speech synthesizer is reviewed. Its drawbacks are underlined and three conditions on the speech database are examined. In order to satisfy them, a previously described high quality resynthesis process is developed and enhanced, which makes use of the well-known Multi-Band Excited (MBE) model. An important by-product of this operation is that optimal Pitch Marking turns out to be automatic. A temporal interpolation block is finally added. The resulting Multi-Band Resynthesis Pitch Synchronous OverLap Add (MBR-PSOLA) synthesis algorithm supports spectral interpolation between voiced parts of segments, with virtually no increase in complexity. It provides the basis of a high-quality Text-To-Speech (TTS) synthesizer.
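As an illustration of the pitch-synchronous overlap-add principle the synthesizer relies on, the sketch below places Hann-windowed two-period segments at synthesis marks spaced by the modified pitch period. The function name, window choice and buffer handling are illustrative assumptions, not the MBR-PSOLA implementation itself.

```python
# Minimal TD-PSOLA-style pitch modification sketch, assuming a monophonic
# signal `x`, analysis pitch marks `marks` (sample indices) and a pitch
# modification factor `beta` (> 1 raises the pitch).
import numpy as np

def psola(x, marks, beta):
    y = np.zeros(len(x) + int(max(np.diff(marks))) + 1)   # output buffer
    out_pos = float(marks[0])
    for i in range(1, len(marks) - 1):
        # extract a two-period, Hann-windowed short-term signal around each mark
        left, right = marks[i] - marks[i - 1], marks[i + 1] - marks[i]
        seg = x[marks[i] - left: marks[i] + right]
        seg = seg * np.hanning(len(seg))
        # place it at the current synthesis mark
        start = int(out_pos) - left
        if start >= 0 and start + len(seg) <= len(y):
            y[start:start + len(seg)] += seg
        # advance the synthesis mark by the modified pitch period
        out_pos += (marks[i + 1] - marks[i]) / beta
    return y
```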
Signal Processing | 1981
René Boite; Henri Leich
Abstract The approximation problem for high-order minimum-phase FIR filters is solved without requiring any polynomial factorization. A modified Parks-McClellan program is used to compute the amplitude function; the minimum-phase function is then derived by a method using the FFT algorithm. The procedure is illustrated by the design of various high-order filters; short computation times are achieved with no numerical difficulties.
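The FFT-based minimum-phase derivation can be sketched with the standard real-cepstrum construction: the log of the desired amplitude response is folded in the cepstral domain and exponentiated back. The FFT size, the magnitude floor and the truncation to `n_taps` below are illustrative assumptions.

```python
# Minimum-phase impulse response from a desired magnitude response via the
# real cepstrum. `mag` is the magnitude sampled on nfft//2 + 1 uniformly
# spaced frequency points.
import numpy as np

def minimum_phase_from_magnitude(mag, n_taps, nfft=4096):
    # real cepstrum of the desired magnitude
    log_mag = np.log(np.maximum(mag, 1e-12))
    cep = np.fft.irfft(log_mag, nfft)
    # fold the cepstrum: keep c[0], double the causal part, zero the rest
    folded = np.zeros(nfft)
    folded[0] = cep[0]
    folded[1:nfft // 2] = 2.0 * cep[1:nfft // 2]
    folded[nfft // 2] = cep[nfft // 2]
    # minimum-phase spectrum and its impulse response, truncated to n_taps
    h_min = np.fft.irfft(np.exp(np.fft.rfft(folded)), nfft)
    return h_min[:n_taps]
```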
international conference on acoustics, speech, and signal processing | 1993
Gao Yang; Henri Leich; René Boite
The authors introduce a novel algorithm, called forward-backward waveform prediction (FBWP), for voiced speech coding at very low bit rates, which aims to produce high-quality speech with low complexity. This algorithm encodes and transmits partial representative waveforms (RWs) from which the complete voiced speech waveforms are reconstructed in the time domain. The RW can be encoded at 20-30 ms intervals, taking into account the special initial conditions of the short- and long-term filters. The basic idea of the FBWP method is essentially consistent with that of the PWI algorithm, which is capable of reproducing high-quality voiced speech. The FBWP algorithm does not require two synchronous prototype waveforms decomposed into sinusoidal components; it is therefore very fast, while high-quality speech can be obtained at a bit rate of about 3 kbit/s. Like the PWI method, the proposed algorithm is easily combined with other linear-prediction-based speech coders that use a noise-like excitation to reproduce unvoiced speech.
IEEE Transactions on Speech and Audio Processing | 1995
Gao Yang; Henri Leich; René Boite
Techniques for coding voiced speech at very low bit rates are investigated and a new algorithm, designed to produce high-quality speech with low complexity, is proposed. This algorithm encodes and transmits partial representative waveforms (RWs) from which the complete speech waveforms are reconstructed by a method called forward-backward waveform prediction (FBWP). The RW is encoded at 20-30 ms intervals with a low-complexity approach, taking into account the special initial conditions of the short- and long-term filters. The basic idea of FBWP is essentially consistent with that of the prototype waveform interpolation (PWI) algorithm, which was reported to be capable of producing high-quality voiced speech at a bit rate of between 3.0 and 4.0 kb/s. By implementing FBWP in the time domain, fast computation is made possible while high-quality speech can be obtained at a bit rate of about 3 kb/s. As in the PWI method, the proposed algorithm may be combined with an LP-based speech coder which uses a noise-like excitation to reproduce unvoiced speech.
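The reconstruction itself is not detailed in the abstract, but the prototype-waveform idea that FBWP is related to can be sketched as a cycle-by-cycle linear interpolation between the two representative waveforms bounding a frame. The equal-period assumption and all names below are illustrative, not the authors' implementation.

```python
# Rebuild a voiced segment between two representative waveforms (RWs) by
# linear interpolation, one pitch cycle at a time.
import numpy as np

def interpolate_prototypes(rw_start, rw_end, n_cycles):
    period = len(rw_start)                       # assume equal-length RWs
    assert len(rw_end) == period
    out = np.empty(n_cycles * period)
    for k in range(n_cycles):
        alpha = (k + 1) / n_cycles               # interpolation weight
        cycle = (1.0 - alpha) * rw_start + alpha * rw_end
        out[k * period:(k + 1) * period] = cycle
    return out
```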
Signal Processing | 1993
Gao Yang; Henri Leich; René Boite
Abstract This paper presents a new speech coding model targeted at bit rates above 4 kbit/s, referred to as multiband code-excited linear prediction (MBCELP). The analysis and synthesis of speech are carried out in the time domain by comparing the original with the synthetic speech under a perceptual criterion. A usual short-term linear predictive filter is employed as the synthesis filter; the excitation signal is modelled as a linear combination of a long-term predictive excitation, periodic multiband excitations and a noise-like excitation; no voiced/unvoiced decision is required. The periodic multiband excitation is produced by convolving a periodic impulse sequence with a sinc function corresponding to a frequency band; the noise-like excitation is represented by a codebook. We estimate a pitch which is appropriate not only to the long-term predictive filter but also to the periodic multiband excitations and to the ‘pitch’ prefilter in the decoder. Several CELP vocoders are developed as references to assess the properties of the MBCELP vocoder. Listening tests clearly indicate that this vocoder reconstructs very high-quality speech without ‘buzziness’ or ‘hoarseness’ for both clean and noisy speech. A 4.8 kbit/s MBCELP vocoder is shown as an example. Its perceptual quality is virtually identical to that of the original 8 kbit/s CELP vocoder and of the improved 7.2 kbit/s CELP vocoder. Since fewer subframes are used for the MBCELP vocoders, their complexity is not greater than that of usual CELP vocoders with the same type of codebook. Many of the techniques used to simplify CELP coding can also be adopted for MBCELP coding.
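A minimal sketch of one periodic multiband excitation component as described above: a periodic impulse sequence convolved with a band-limited sinc (modelled here as the difference of two low-pass sincs). Band edges, the filter length and the absence of any gain normalization are illustrative assumptions.

```python
# One periodic multiband excitation component for the band [f_low, f_high].
import numpy as np

def multiband_periodic_excitation(pitch_period, f_low, f_high, fs, length):
    impulses = np.zeros(length)
    impulses[::pitch_period] = 1.0               # periodic impulse sequence
    # band-pass "sinc" impulse response for the chosen frequency band
    t = np.arange(-64, 65)
    band = (2 * f_high / fs) * np.sinc(2 * f_high * t / fs) \
         - (2 * f_low / fs) * np.sinc(2 * f_low * t / fs)
    return np.convolve(impulses, band, mode="same")
```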
international conference on acoustics speech and signal processing | 1996
Vincent Fontaine; Christophe Ris; Henri Leich
We present and compare two different hybrid HMM/MLP approaches. The first uses MLPs as labelers coupled with a discrete HMM, while the second takes advantage of the ability of MLPs trained as classifiers to estimate a posteriori probabilities. Both approaches bring a noticeable improvement over classical methods, since they free the system from some of the restrictive hypotheses inherent in pure HMM design (no time correlation between successive acoustic vectors, assumptions on the probability distributions, etc.). Our experiments were designed to provide fair comparisons, which implied the use of a standard environment: standard software and standard databases with common training and test sets.
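For the second approach, the usual way of exploiting MLP a posteriori estimates in an HMM is to convert them into scaled likelihoods by dividing by the class priors (Bayes' rule, dropping the constant p(x)). The sketch below assumes the MLP outputs and prior estimates are already available; the function name and flooring are illustrative.

```python
# Convert MLP posteriors into scaled log-likelihoods usable as HMM emissions.
import numpy as np

def posteriors_to_scaled_likelihoods(posteriors, priors, floor=1e-8):
    """posteriors: (T, n_states) MLP outputs per frame; priors: (n_states,)
    relative state frequencies estimated on the training alignment."""
    scaled = posteriors / np.maximum(priors, floor)      # p(x|q) up to p(x)
    return np.log(np.maximum(scaled, floor))             # log domain for Viterbi
```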
Signal Processing | 1992
René Boite; Henri Leich; Gao Yang
Abstract In this paper we propose a very simple and efficient weighting filter with which the computational complexity of CELP coders can be considerably reduced. Other algorithms using a weighting filter could also benefit from the advantages of this simplified weighting filter. The estimation of the long-term prediction with the closed-loop method is described. A binary codebook is used for the excitation vectors. It is shown how the excitation sequence can be obtained by a non-exhaustive, two-step method using a simplified algorithm and the simple weighting filter. Several coders have been implemented, showing that the perceptual quality of the simplified algorithm is equivalent to that of the original CELP.
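The simplified filter itself is not reproduced in the abstract; as background, the standard CELP perceptual weighting filter it replaces has the form W(z) = A(z) / A(z/γ), where A(z) is the short-term prediction error filter and γ (typically 0.8-0.9) controls the formant de-emphasis. The sketch below shows that standard filter only, under the assumption that scipy.signal.lfilter is available.

```python
# Standard CELP perceptual weighting W(z) = A(z) / A(z/gamma).
import numpy as np
from scipy.signal import lfilter

def perceptual_weighting(signal, lpc, gamma=0.85):
    """`lpc` = [1, a1, ..., ap], the short-term prediction error filter A(z)."""
    a = np.asarray(lpc, dtype=float)
    a_gamma = a * gamma ** np.arange(len(a))     # bandwidth-expanded A(z/gamma)
    return lfilter(a, a_gamma, signal)
```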
international conference on acoustics, speech, and signal processing | 1995
Christophe Ris; Vincent Fontaine; Henri Leich
This paper presents a new pre-processing method developed with the objective of representing the relevant information of a signal with a minimum number of parameters. The originality of this work is to propose a new, efficient pre-processing algorithm producing acoustic vectors at a variable frame rate. The length of the speech frames is no longer fixed a priori to a constant value but is determined by a study of the signal's stationarity. Both segmentation and signal analysis are based on Malvar wavelets, since the orthogonality properties of this transform are the key to comparing measures made on frames of different lengths. Some speech recognition results based on this pre-processing are presented.
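The stationarity-driven, variable frame-length idea can be sketched as a recursive split: a frame is kept long while its two halves look spectrally similar, and is split otherwise. The spectral distance used here is a stand-in for illustration; the paper's actual analysis and splitting criterion rely on Malvar wavelets.

```python
# Variable frame-rate segmentation driven by a simple stationarity measure.
import numpy as np

def segment(x, start, length, min_len=128, threshold=2.0):
    """Return a list of (start, length) frames covering x[start:start+length]."""
    if length <= min_len:
        return [(start, length)]
    half = length // 2
    left = np.abs(np.fft.rfft(x[start:start + half]))
    right = np.abs(np.fft.rfft(x[start + half:start + 2 * half]))
    # log-spectral distance between the two halves as a stationarity measure
    dist = np.mean((np.log(left + 1e-8) - np.log(right + 1e-8)) ** 2)
    if dist < threshold:
        return [(start, length)]                 # stationary enough: keep long frame
    return segment(x, start, half, min_len, threshold) + \
           segment(x, start + half, length - half, min_len, threshold)
```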
international conference on digital signal processing | 1997
Henri Leich
In this paper, we describe a complete toolbox for the design of IIR filters. The first problem to be solved is the approximation problem, which is developed for simultaneous approximation of the attenuation and the group delay. The result is the transfer function for the direct or cascade form. The second step is the quantization of the coefficients for fixed-point arithmetic. This is done for the two structures by means of a sensitivity function followed by an exact computation. The third step is to compute the scaling factors and the noise performance of the different structures (direct, transposed and immediate) with internal or external scaling factors. This toolbox can also be used to quantize the coefficients in a discrete space. At the end of the procedure, the IIR digital filter can be implemented on a fixed-point DSP, in an FPGA or in an ASIC.
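The coefficient-quantization step for the cascade form can be illustrated as follows: each second-order section is rounded to a fixed-point grid of a given number of fractional bits and the quantized denominator is checked against the stability triangle. This is only an illustration of the step, not the toolbox's sensitivity-based procedure.

```python
# Quantize cascade (second-order section) coefficients to fixed point.
import numpy as np

def quantize_sos(sos, bits=12):
    """sos: array of shape (n_sections, 6) as [b0, b1, b2, 1, a1, a2] rows."""
    step = 2.0 ** (-bits)
    q = np.round(np.asarray(sos, dtype=float) / step) * step
    for row in q:
        a1, a2 = row[4], row[5]                  # denominator 1 + a1 z^-1 + a2 z^-2
        # stability triangle for a second-order section
        if not (abs(a2) < 1.0 and abs(a1) < 1.0 + a2):
            raise ValueError("quantized section is unstable; increase word length")
    return q
```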
international conference on speech image processing and neural networks | 1994
Jun He; Henri Leich
The trajectory of a speech signal on a self-organizing feature map (SOFM) is usually obtained by concatenating the cells with peak neural excitation for each input vector. This usually results in an unsmooth speech trajectory. We introduce a new method, firmly grounded in Bayes' rule, to find the response trajectory in the SOFM. It takes into account not only the present response of the cells given the input vector but also the a priori information about the response in the SOFM for each class. To test the effectiveness of this method, a Multilayer Perceptron (MLP) is used to classify the trajectory into the class it belongs to. It is shown experimentally that with our new method the recognition rate is increased from 91.6% to 96.2%.
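A minimal sketch of the Bayes-rule trajectory idea: for each input vector, each map cell's response (modelled here as a Gaussian likelihood of the distance to its codebook vector) is weighted by that cell's a priori response probability, and the maximum a posteriori cell becomes the trajectory point. The Gaussian model, sigma and the per-class prior map are illustrative assumptions.

```python
# Maximum a posteriori trajectory on a self-organizing feature map.
import numpy as np

def map_trajectory(frames, codebook, prior, sigma=1.0):
    """frames: (T, d) input vectors; codebook: (n_cells, d) SOFM weights;
    prior: (n_cells,) a priori response probabilities for one class."""
    traj = []
    for x in frames:
        dist2 = np.sum((codebook - x) ** 2, axis=1)
        likelihood = np.exp(-dist2 / (2.0 * sigma ** 2))
        posterior = likelihood * prior            # Bayes' rule, up to a constant
        traj.append(int(np.argmax(posterior)))
    return traj
```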