Jean-Sylvain Liénard
Centre national de la recherche scientifique
Publication
Featured research published by Jean-Sylvain Liénard.
international conference on acoustics, speech, and signal processing | 1987
Jean-Sylvain Liénard
We consider the speech signal to be composed of elementary waveforms (wfs), which are windowed sinusoids, each one defined by a small number of parameters. The typical duration of a wf is of the order of magnitude of a pitch period in the voiced segments, and a few milliseconds in the noise segments. No preliminary evaluation of voicing or pitch is required; this largely differentiates the approach from the classical pitch-synchronous analysis. The analysis process uses a filterbank, designed to introduce as few time distortions as possible. The signal at the output of each filter is segmented according to successive amplitude minima, and each segment is modeled by a wf. This decomposition can be validated by reconstructing the wfs from their parameters, and summing them in order to recover a signal perceptually equivalent to the original.
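The reconstruction step described in this abstract (rebuilding each wf from its parameters and summing) can be illustrated with a minimal sketch. The abstract does not list the parameter set or the window shape, so the `Waveform` fields (onset, length, frequency, amplitude, phase) and the Hann window are assumptions made for illustration only, not the author's actual model.

```python
import math
from dataclasses import dataclass


@dataclass
class Waveform:
    """Hypothetical parameter set for one elementary waveform (wf)."""
    onset: int          # index of the first sample in the output signal
    length: int         # duration in samples
    freq_hz: float      # frequency of the sinusoid
    amplitude: float    # peak amplitude before windowing
    phase: float = 0.0  # initial phase in radians


def render(wf, sample_rate):
    """Return the samples of one Hann-windowed sinusoid (assumed window)."""
    samples = []
    for n in range(wf.length):
        if wf.length > 1:
            window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (wf.length - 1))
        else:
            window = 1.0
        angle = 2 * math.pi * wf.freq_hz * n / sample_rate + wf.phase
        samples.append(wf.amplitude * window * math.sin(angle))
    return samples


def reconstruct(wfs, total_len, sample_rate):
    """Sum all waveforms at their onsets; samples past the end are dropped."""
    signal = [0.0] * total_len
    for wf in wfs:
        for n, s in enumerate(render(wf, sample_rate)):
            idx = wf.onset + n
            if 0 <= idx < total_len:
                signal[idx] += s
    return signal


# Two overlapping waveforms summed into a 100-sample signal at 8 kHz.
signal = reconstruct(
    [Waveform(0, 80, 100.0, 1.0), Waveform(40, 80, 100.0, 0.5)],
    total_len=100,
    sample_rate=8000,
)
```

Overlap-adding short windowed segments is only one plausible reading of "summing them"; the analysis side (filterbank and segmentation at amplitude minima) is not covered here.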
international conference on acoustics, speech, and signal processing | 2013
Delphine Charlet; Claude Barras; Jean-Sylvain Liénard
The overlapping speech detection systems developed by Orange and LIMSI for the ETAPE evaluation campaign on French broadcast news and debates are described. Using either cepstral features or a multi-pitch analysis, an F1-measure for overlapping speech detection up to 59.2% is reported on the TV data of the ETAPE evaluation set, where 6.7% of the speech was measured as overlapping, ranging from 1.2% in the news to 10.4% in the debates. Overlapping speech segments were excluded during the speaker diarization stage, and these segments were further labelled with the two nearest speaker labels, taking into account the temporal distance. We describe the effects of this strategy for various overlapping speech systems and we show that it improves the diarization error rate in all situations and up to 26.1% relative in our best configuration.
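For readers unfamiliar with the two figures quoted in this abstract, the sketch below shows the standard definitions of the F1-measure and of a relative error-rate improvement. The function names are hypothetical, and the abstract does not say whether precision and recall were computed per frame or by duration.

```python
def f1_measure(precision, recall):
    """Harmonic mean of precision and recall (both in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Illustrative values only, not results from the paper.
example_f1 = f1_measure(0.5, 0.5)
example_gain = relative_improvement(20.0, 15.0)
```

A "26.1% relative" gain in this sense means the diarization error rate fell by 26.1% of its baseline value, not by 26.1 percentage points.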
Journal of the Acoustical Society of America | 2008
François Signol; Claude Barras; Jean-Sylvain Liénard
Reliably tracking the fundamental frequency F0 of the components is an important step in the separation of superimposed speech signals. Several Pitch Estimation Algorithms are potentially usable and a rigorous evaluation method is needed in order to compare them. However, even in the monopitch case, many variations between them render such a comparison difficult. The extent of the F0min‐F0max interval, the use of a priori information on the whole sequence or database, and above all the arbitrary setting of the voicing threshold, yield large differences in the results. These biases can be removed by setting the F0 bounds to fixed values acceptable for many voices, by proceeding with the evaluation on a strictly frame‐to‐frame basis, and by fixing the voicing threshold in order to get an equal error rate for overvoiced and undervoiced frames. In the multipitch case any frame may exhibit 0, 1, or 2 valid voicings according to the coincidence between the voiced and unvoiced parts of both signals. This problem...
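The idea of fixing the voicing threshold so that overvoiced and undervoiced errors balance can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: each frame is assumed to carry a scalar voicing score (higher means more voiced) and a reference voiced/unvoiced label, and the function names are hypothetical.

```python
def error_rates(scores, voiced_ref, threshold):
    """Return (overvoiced_rate, undervoiced_rate) at a given threshold.

    A frame is declared voiced when its score is >= threshold.
    Overvoiced: reference unvoiced, declared voiced.
    Undervoiced: reference voiced, declared unvoiced.
    """
    n_voiced = sum(1 for v in voiced_ref if v)
    n_unvoiced = len(voiced_ref) - n_voiced
    if n_voiced == 0 or n_unvoiced == 0:
        raise ValueError("need both voiced and unvoiced reference frames")
    over = sum(1 for s, v in zip(scores, voiced_ref) if not v and s >= threshold)
    under = sum(1 for s, v in zip(scores, voiced_ref) if v and s < threshold)
    return over / n_unvoiced, under / n_voiced


def equal_error_threshold(scores, voiced_ref):
    """Pick the observed score whose two error rates are closest."""
    def gap(t):
        over, under = error_rates(scores, voiced_ref, t)
        return abs(over - under)
    return min(sorted(set(scores)), key=gap)


# Toy frame scores and reference labels (illustrative only).
scores = [0.1, 0.2, 0.6, 0.4, 0.7, 0.9]
voiced_ref = [False, False, False, True, True, True]
threshold = equal_error_threshold(scores, voiced_ref)
over_rate, under_rate = error_rates(scores, voiced_ref, threshold)
```

Sweeping only the observed scores keeps the search finite; with real data the two rates rarely match exactly, so the closest pair is used.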
international conference on acoustics, speech, and signal processing | 1988
Christophe d'Alessandro; Jean-Sylvain Liénard
Representation of the speech signal by a set of discrete elements which respect its acoustical and perceptive structures is considered. The signal is pre-analyzed frame by frame, and the spectral envelope obtained for each frame is segmented into regions comprising a single peak. The signal is then filtered in each region, and the elementary waveforms are spotted in the time domain. The problem of grouping the waveforms in adjacent channels is thus circumvented. The resulting representation is satisfactory, as is the signal reconstruction, except for some modeling problems remaining in the lowest part of the spectrum.
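The envelope segmentation step can be illustrated by cutting a sampled spectral envelope at its local minima, so that each region contains one peak. The abstract does not state how the region boundaries are chosen, so cutting at local minima is an assumption, and `single_peak_regions` is a hypothetical helper.

```python
def single_peak_regions(envelope):
    """Split a non-empty envelope into (start, end) index pairs.

    Boundaries are placed at interior local minima (assumed rule), so
    adjacent regions share their boundary bin.
    """
    if not envelope:
        raise ValueError("envelope must not be empty")
    boundaries = [0]
    for k in range(1, len(envelope) - 1):
        # Strict on the left, non-strict on the right: a flat valley
        # produces a single boundary at its first bin.
        if envelope[k] < envelope[k - 1] and envelope[k] <= envelope[k + 1]:
            boundaries.append(k)
    boundaries.append(len(envelope) - 1)
    return list(zip(boundaries[:-1], boundaries[1:]))


# Toy envelope with three peaks (illustrative values only).
regions = single_peak_regions([1, 3, 2, 5, 4, 1, 2, 6, 3])
```

Each returned pair could then define the band of one filter, in line with the abstract's "the signal is then filtered in each region".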
Archive | 1999
Jean-Sylvain Liénard
Among the many problems that impede our understanding of perception, three appear to be of primary importance: data variability, selective attention, and learning. Data variability is the hidden part of what is called Categorisation by psychologists and Pattern Recognition by engineers. Selective attention is the process used by cognitive systems to locate or identify some entities considered as relevant, disseminated in a set of comparable entities considered as non relevant in a given situation. Learning is the process by which the perceptual system works out the representations of data at each abstraction level, as well as the transformations of information from one level to the next. In the present chapter we shall examine those three problems and present a perceptual model, which aims at integrating them into a single view.
Archive | 1999
Jean-Sylvain Liénard
From the viewpoint of psychology, perception is the function by which an organism gains knowledge of its environment. Equivalently, perception is the function by which sensory information is transformed into meaningful elements. In other words, perception makes use of two different logics: the logic of the physical world, from which the organism extracts the information it needs by means of specialized sensors, and the logic of cognition, where information is structured in the form of abstract knowledge.
Journal of the Acoustical Society of America | 1983
Maxine Eskenazi; Jean-Sylvain Liénard
Since the work of Ladefoged and Broadbent [J. Acoust. Soc. Am. 29, 98–104 (1957)], it is generally accepted that the identification of a specific vowel is linked to the relation of all of the vowels in a language. The type of acoustical characterization chosen for the vowels of a given language, and the result obtained in its use, depend in part on the number of phonetically different vowels in that language. A substantial database has been constituted, including several tokens for each of 13 French vowels pronounced by 30 male and female speakers, in order to study the acoustical features of vowels. Results of automatic recognition tests based on a statistical characterization of smoothed, amplitude‐independent spectra [Eskenazi and Lienard, GALF‐AFCET Seminar, Toulouse, Sept. 81, pp. 54–69] are compared to the results of perceptual recognition experiments on the same data, and discussed in light of some other vowel identification experiments [Komatsu et al., IEEE‐ICASSP Paris, May 82; Cole et al., Art...
Journal of the Acoustical Society of America | 2008
Jean-Sylvain Liénard; Claude Barras; François Signol
conference of the international speech communication association | 2013
Jean-Sylvain Liénard; Claude Barras
Speech Communication | 1983
Maxine Eskenazi; Jean-Sylvain Liénard