Publication


Featured research published by Christophe d'Alessandro.


IEEE Transactions on Speech and Audio Processing | 1998

An iterative algorithm for decomposition of speech signals into periodic and aperiodic components

B. Yegnanarayana; Christophe d'Alessandro; Vassilios Darsinos

The speech signal may be considered as the output of a time-varying vocal tract system excited with quasiperiodic and/or random sequences of pulses. The quasiperiodic part may be considered as the deterministic or periodic component and the random part as the stochastic or aperiodic component of the excitation. We discuss issues involved in identifying and separating the periodic and aperiodic components of the source. The decomposition is performed on an approximation to the excitation signal, instead of decomposing the speech signal directly. The linear prediction residual signal is used as an approximation to the excitation signal of the vocal tract system. Speech is first analyzed to determine the voiced and unvoiced parts of the signal. Decomposition of the voiced part into periodic and aperiodic components is then accomplished by first identifying the frequency regions of harmonic and noise components in the spectral domain. The signal corresponding to the noise regions is used as a first approximation to the aperiodic component. An iterative algorithm is proposed which reconstructs the aperiodic component in the harmonic regions. The periodic component is obtained by subtracting the reconstructed aperiodic component signal from the residual signal. The individual components of the residual are then used to excite the derived all-pole model of the vocal tract system to obtain the corresponding components of the speech signal. Experiments were conducted using synthetic speech. They demonstrated the ability of the algorithm to decompose a synthetic speech signal made of a mixture of periodic and aperiodic components. Application to natural speech is also discussed.
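The abstract's first step, using the linear prediction residual as an approximation to the excitation, can be illustrated with a minimal sketch. It assumes the autocorrelation method solved by the Levinson-Durbin recursion, which the abstract does not specify; the function names, the prediction order, and the synthetic AR(2) test frame are all hypothetical choices made for illustration.

```python
import random

def autocorrelation(x, max_lag):
    """Autocorrelation estimates r[0..max_lag] of a frame."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations. Returns a with a[0] = 1 such that
    e[n] = sum_j a[j] * x[n - j] is the prediction error."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def lp_residual(x, a):
    """Inverse-filter x with A(z); samples before the frame count as zero."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

# Hypothetical test frame: an AR(2) process driven by white noise.
random.seed(0)
x = [0.0, 0.0]
for _ in range(2000):
    x.append(1.3 * x[-1] - 0.8 * x[-2] + random.gauss(0.0, 1.0))

a = levinson_durbin(autocorrelation(x, 2), 2)
residual = lp_residual(x, a)
```

On this test frame the estimated coefficients should land close to the generating filter, and the residual carries noticeably less energy than the frame itself. The later steps of the paper (harmonic and noise region identification, iterative reconstruction) are not covered here.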


IEEE Signal Processing Letters | 2005

Zeros of Z-transform representation with application to source-filter separation in speech

Baris Bozkurt; Boris Doval; Christophe d'Alessandro; Thierry Dutoit

We propose a new spectral representation called the zeros of z-transform (ZZT), which is an all-zero representation of the z-transform of the signal. We show that separate patterns exist in ZZT representations of speech signals for the glottal flow and the vocal tract contributions. A decomposition method for source-tract separation is presented based on ZZT. The ZZT-decomposition consists in grouping the zeros into two sets, according to their location in the z-plane. This type of decomposition leads to separating glottal flow contribution (without a return phase) from vocal tract contribution in the z domain.
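As a rough illustration of the grouping step, the sketch below computes the zeros of the z-transform of a frame with `numpy.roots` and splits them by modulus. The split at the unit circle is an assumption: the abstract only says the zeros are grouped "according to their location in the z-plane". The function name `zzt_split` is hypothetical.

```python
import numpy as np

def zzt_split(frame):
    """Return (zeros outside the unit circle, zeros inside or on it).

    X(z) = sum_n x[n] z^-n equals z^-(N-1) times a polynomial in z whose
    coefficients, highest power first, are the samples x[0], x[1], ...
    so the zeros of X(z) are the roots of that polynomial.
    """
    zeros = np.roots(frame)
    modulus = np.abs(zeros)
    return zeros[modulus > 1.0], zeros[modulus <= 1.0]

# Toy frame with one zero at 0.5 and one at 2.0 (hypothetical example).
frame = np.convolve([1.0, -0.5], [1.0, -2.0])
outside, inside = zzt_split(frame)
```

Root finding on high-degree polynomials can be numerically delicate, so frame length and windowing matter in practice; this sketch does not address those choices.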


IEEE Transactions on Speech and Audio Processing | 1998

Effectiveness of a periodic and aperiodic decomposition method for analysis of voice sources

Christophe d'Alessandro; Vassilios Darsinos; B. Yegnanarayana

Decomposition of speech into periodic and aperiodic components is useful in analyzing and describing the characteristics of voice sources. Such a decomposition is also useful in controlling the excitation source for synthesis. This paper addresses the issue of decomposition of speech into periodic and aperiodic components in the context of speech production. The effectiveness of a recently proposed algorithm for decomposing speech into these components is examined for analysis of voice sources. Synthetic signals are generated using formant synthesis. Different sources of aperiodicity encountered in normal speech production are considered, using a set of parameters to control the synthetic signals. The sources of aperiodicity studied are: (1) additive pulsed or continuous random noise, and (2) modulation aperiodicities due to variation in the fundamental frequency, jitter, and shimmer. Three types of measures are used to characterize these voices: ratio of energies in the periodic and aperiodic components, perceptual spectral distance, and spectrograms. The results demonstrate the effectiveness of the periodic-aperiodic decomposition algorithm for analyzing aperiodicities for a wide variety of voices, and point out the limitations of the algorithm.
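The first of the three measures, the ratio of energies in the periodic and aperiodic components, can be sketched as follows. Expressing the ratio in decibels is an assumption made here for readability; the abstract does not state the unit, and the function name is hypothetical.

```python
import math

def periodic_aperiodic_ratio_db(periodic, aperiodic):
    """Energy ratio of the two components, in dB (assumed unit)."""
    energy_periodic = sum(v * v for v in periodic)
    energy_aperiodic = sum(v * v for v in aperiodic)
    return 10.0 * math.log10(energy_periodic / energy_aperiodic)

# Toy components: the aperiodic part has one tenth of the amplitude.
ratio = periodic_aperiodic_ratio_db([1.0, -1.0, 1.0, -1.0],
                                    [0.1, -0.1, 0.1, -0.1])
```

An amplitude ratio of 10 corresponds to an energy ratio of 100, which is 20 dB.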


Computer Speech & Language | 1998

Objective evaluation of grapheme to phoneme conversion for text-to-speech synthesis in French

François Yvon; P Boula de Mareüil; Christophe d'Alessandro; Véronique Aubergé; Michel Bagein; Gérard Bailly; Frédéric Béchet; S Foukia; J F Goldman; E Keller; D. O'Shaughnessy; V Pagel; F. Sannier; Jean Véronis; B Zellner

This paper reports on a cooperative international evaluation of grapheme-to-phoneme (GP) conversion for text-to-speech synthesis in French. Test methodology and test corpora are described. The results for eight systems are provided and analysed in some detail. The contribution of this paper is twofold: on the one hand, it gives an accurate picture of the state-of-the-art in the domain of GP conversion for French, and points out the problems still to be solved. On the other hand, much room is devoted to a discussion of methodological issues for this task. We hope this could help future evaluations of similar systems in other languages.


International Conference on Acoustics, Speech, and Signal Processing | 1997

Spectral correlates of glottal waveform models: an analytic study

Boris Doval; Christophe d'Alessandro

This paper deals with the spectral representation of the glottal flow. The LF and the KLGLOTT88 models of the glottal flow are studied. We compute analytically the spectrum of the LF-model. Then, formulas are given for computing spectral tilt and amplitudes of the first harmonics as functions of the LF-model parameters. We then consider the spectrum of the KLGLOTT88 model. It is shown that this model can be represented in the spectral domain by an all-pole third-order linear filter. Moreover, the anticausal impulse response of this filter is a good approximation of the glottal flow model. Parameter estimation seems easier in the spectral domain. Therefore our results can be used for modification of the (hidden) glottal flow characteristic of natural speech signals, by processing directly the spectrum, without needing time-domain parameter estimation.
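For readers unfamiliar with the time-domain shape behind these spectra, here is a minimal sketch of a cubic glottal flow pulse of the form u(t) = a t^2 - b t^3, the polynomial shape commonly associated with KLGLOTT88. The sample-based parameterization, the peak normalization, and the function name are assumptions made for this sketch and are not taken from the paper.

```python
def cubic_glottal_pulse(num_samples, open_samples, amplitude=1.0):
    """One period of u(t) = a*t^2 - b*t^3 during the open phase, else 0.

    a and b are chosen so that the flow returns to zero at
    t = open_samples and peaks at `amplitude` (assumed normalization).
    """
    te = float(open_samples)
    a = 27.0 * amplitude / (4.0 * te ** 2)
    b = 27.0 * amplitude / (4.0 * te ** 3)
    return [a * t * t - b * t ** 3 if t < open_samples else 0.0
            for t in range(num_samples)]

# Hypothetical period of 100 samples with a 60-sample open phase.
pulse = cubic_glottal_pulse(100, 60)
```

With this shape the flow maximum falls at two thirds of the open phase, because the derivative 2 a t - 3 b t^2 vanishes at t = 2a / (3b).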


Speech Communication | 1996

Analysis/synthesis and modification of the speech aperiodic component

Gaël Richard; Christophe d'Alessandro

The general framework of this paper is speech analysis and synthesis. The speech signal may be separated into two components: (1) a periodic component (which includes the quasi-periodic or voiced sounds produced by regular vocal cord vibrations); (2) an aperiodic component (which includes the non-periodic part of voiced sounds (e.g. fricative noise in /v/) or sound emitted without any vocal cord vibration (e.g. unvoiced fricatives, or plosives)). This work is intended to contribute to a precise modelling of this second component and particularly of modulated noises. First, a synthesis method inspired by the “shot noise effect” is introduced. This technique uses random point processes which define the times of arrival of spectral events (represented by Formant Wave Forms (FWF)). Based on the theoretical framework provided by the Rice representation and the random modulation theory, an analysis/synthesis scheme is proposed. Perception tests show that this method makes it possible to synthesize very natural speech signals. The proposed representation also enables new types of voice quality modifications (time scaling, vocal effort, breathiness of a voice, etc.).
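The shot-noise idea, a random point process whose arrival times each trigger a short waveform, can be sketched as below. This is only an illustration under stated assumptions: arrivals follow a Poisson process, each event is an exponentially damped sinusoid standing in for a formant waveform, and amplitudes are Gaussian. None of these choices, nor the function and parameter names, come from the paper.

```python
import math
import random

def shot_noise(duration, rate, freq, bandwidth, sample_rate, seed=0):
    """Sum of damped sinusoids triggered at Poisson arrival times.

    duration in seconds, rate in events per second, freq and bandwidth
    in Hz. The damped sinusoid is an assumed stand-in for an FWF.
    """
    rng = random.Random(seed)
    n = int(duration * sample_rate)
    out = [0.0] * n
    # Truncate each event once its envelope has decayed by exp(-5).
    event_len = int(5.0 / (math.pi * bandwidth) * sample_rate)
    t = rng.expovariate(rate)
    while t < duration:
        start = int(t * sample_rate)
        amp = rng.gauss(0.0, 1.0)
        for k in range(min(event_len, n - start)):
            tau = k / sample_rate
            out[start + k] += (amp * math.exp(-math.pi * bandwidth * tau)
                               * math.sin(2.0 * math.pi * freq * tau))
        t += rng.expovariate(rate)
    return out

# Hypothetical settings: 50 ms of noise centred near 3 kHz.
noise = shot_noise(duration=0.05, rate=2000.0, freq=3000.0,
                   bandwidth=200.0, sample_rate=16000)
```

Raising `rate` makes the result sound more like continuous noise, while a low rate yields audibly separate bursts, which is one way to think about modulated noises.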


International Conference on Acoustics, Speech, and Signal Processing | 1995

Decomposition of speech signals into deterministic and stochastic components

Christophe d'Alessandro; B. Yegnanarayana; Vassilios Darsinos

This paper presents a new method for decomposition of the speech signal into a deterministic and a stochastic component. The method is based on iterative signal reconstruction. The method involves: (1) separation of speech into an approximate excitation and filter components using linear predictive (LP) analysis; (2) identification of frequency regions of noise and deterministic components of excitation using cepstrum; (3) reconstruction of the two excitation components of the residual using an iterative algorithm; (4) and finally, the deterministic and stochastic components of the excitation are then obtained by combining the reconstructed frames of data using an overlap-add procedure. The deterministic and stochastic components are then passed through the time varying all-pole filter to obtain the components of the speech signal. The algorithm is able to decompose varying mixtures of stochastic and deterministic signals, like the noise bursts produced at the glottal closure and the deterministic glottal pulses. This new algorithm is a powerful tool for analysis of relevant features of the source component of speech signals.
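Step (4), combining reconstructed frames with an overlap-add procedure, can be sketched generically as follows. The periodic Hann window, the 50% overlap, and the function names are assumptions chosen because they make the windows sum to one; the abstract does not give the window or hop used in the paper.

```python
import numpy as np

def split_frames(x, frame_len, hop):
    """Cut x into Hann-windowed frames (periodic Hann, assumed choice)."""
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(frame_len) / frame_len)
    count = 1 + (len(x) - frame_len) // hop
    return np.array([window * x[i * hop : i * hop + frame_len]
                     for i in range(count)])

def overlap_add(frames, hop):
    """Sum the frames back into one signal at multiples of hop."""
    frame_len = frames.shape[1]
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + frame_len] += frame
    return out

# Round trip on a toy signal; per-frame processing would go in between.
x = np.arange(64, dtype=float)
frames = split_frames(x, frame_len=16, hop=8)
y = overlap_add(frames, hop=8)
```

With this window and hop, every sample covered by two frames is reconstructed exactly; only the first and last half-frames, covered by a single window, are attenuated.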


International Conference on Acoustics, Speech, and Signal Processing | 2009

Glottal closure instant detection using Lines of Maximum Amplitudes (LOMA) of the wavelet transform

Nicolas Sturmel; Christophe d'Alessandro; Francois Rigaud

The Lines Of Maximum Amplitude (LOMA) of the wavelet transform are used for glottal closure instant (GCI) detection. Following Kadambe et al. (1992), the wavelet transform modulus maxima can be used for singularity detection. The LOMA method extends this idea. All the lines chaining maxima of a wavelet transform across scales are built. Then a back-tracking procedure allows for selection of the optimal line for each pitch period, the top of which indicates the GCI. The LOMA method is then evaluated by comparing its results to the DYPSA (Naylor et al.) algorithm, with the option of using inverse filtering as preprocessing. The LOMA method compares favorably to DYPSA, particularly on accuracy. One of the advantages of the LOMA method is its ability to deal with variations in the glottal source parameters.


Journal of Voice | 2003

Just noticeable differences of open quotient and asymmetry coefficient in singing voice

Nathalie Henrich; Gunilla Sundin; Daniel Ambroise; Christophe d'Alessandro; Michèle Castellengo; Boris Doval

This study aims to explore the perceptual relevance of the variations of glottal flow parameters and to what extent a small variation can be detected. Just Noticeable Differences (JNDs) have been measured for three values of open quotient (0.4, 0.6, and 0.8) and two values of asymmetry coefficient (2/3 and 0.8), and the effect of changes of vowel, pitch, vibrato, and amplitude parameters has been tested. Two main groups of subjects have been analyzed: a group of 20 untrained subjects and a group of 10 trained subjects. The results show that the JND for open quotient is highly dependent on the target value: an increase of the JND is noticed when the open quotient target value is increased. The relative JND is constant: ΔOq/Oq = 14% for the untrained subjects and 10% for the trained subjects. In the same way, the JND for asymmetry coefficient is also slightly dependent on the target value: an increase of the asymmetry coefficient value leads to a decrease of the JND. The results show that there is no effect from the selected vowel or frequency (two values have been tested), but that the addition of a vibrato has a small effect on the JND of open quotient. The choice of an amplitude parameter also has a great effect on the JND of open quotient.
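A constant relative JND implies an absolute JND that grows in proportion to the target value. The short sketch below simply applies the two reported ratios (14% and 10%) to the three tested open quotient values; the function name is hypothetical and the resulting absolute values are derived here, not quoted from the paper.

```python
def absolute_jnd(target, relative_jnd):
    """Absolute JND implied by a constant relative JND."""
    return target * relative_jnd

# Open quotient targets and relative JNDs as reported in the abstract.
for oq in (0.4, 0.6, 0.8):
    untrained = absolute_jnd(oq, 0.14)
    trained = absolute_jnd(oq, 0.10)
    print(f"Oq = {oq}: untrained {untrained:.3f}, trained {trained:.3f}")
```

For example, at Oq = 0.4 the implied absolute JND is 0.4 × 0.14 = 0.056 for untrained subjects, and it doubles at Oq = 0.8.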


Speech Communication | 1990

Time-frequency speech transformation based on an elementary waveform representation

Christophe d'Alessandro

A representation of the speech signal as a sum of elementary waveforms (Elementary Waveform Speech Model or EWSM) is introduced and some of its features for modifying localized time-frequency events are demonstrated. The elementary waveforms model the local spectro-temporal maxima of energy within the speech signal using simple mathematical functions. An automatic analysis-synthesis system estimates the waveform parameters using frame-by-frame processing: spectral modelling and segmentation using short-time Fourier transform and LPC spectrum, Fourier filtering according to this segmentation, waveform spotting in each channel, waveform modelling using simple functions. The classical theory of speech production proves the validity of the EWSM parameters; their modifications yield well-localized time-frequency transformations, including frequency compression/expansion, pitch, formant and noise modification.

Collaboration


Dive into Christophe d'Alessandro's collaborations.

Top Co-Authors

Albert Rilliard, Centre national de la recherche scientifique

Sylvain Le Beux, Centre national de la recherche scientifique

Sophie Rosset, Centre national de la recherche scientifique

Baris Bozkurt, İzmir Institute of Technology

Philippe Boula de Mareüil, Centre national de la recherche scientifique

Brian F. G. Katz, Centre national de la recherche scientifique

David Doukhan, Centre national de la recherche scientifique

Gérard Bailly, Centre national de la recherche scientifique