Alain de Cheveigné
École Normale Supérieure
Publications
Featured research published by Alain de Cheveigné.
Journal of the Acoustical Society of America | 2002
Alain de Cheveigné; Hideki Kawahara
An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or musical sounds. It is based on the well-known autocorrelation method with a number of modifications that combine to prevent errors. The algorithm has several desirable features. Error rates are about three times lower than the best competing methods, as evaluated over a database of speech recorded together with a laryngograph signal. There is no upper limit on the frequency search range, so the algorithm is suited for high-pitched voices and music. The algorithm is relatively simple and may be implemented efficiently and with low latency, and it involves few parameters that must be tuned. It is based on a signal model (periodic signal) that may be extended in several ways to handle various forms of aperiodicity that occur in particular applications. Finally, interesting parallels may be drawn with models of auditory processing.
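The core idea described here, comparing the signal with a lagged copy of itself and reading the period off the best-matching lag, can be illustrated with a minimal squared-difference sketch. This is not the paper's actual algorithm (which adds several error-preventing refinements); the function names and search range are illustrative:

```python
import numpy as np

def difference_function(x, max_lag):
    """Squared difference between the signal and a lagged copy of itself.
    For a periodic signal, d[tau] dips toward zero at the period."""
    n = len(x)
    d = np.empty(max_lag)
    d[0] = 0.0
    for tau in range(1, max_lag):
        diff = x[:n - tau] - x[tau:]
        d[tau] = np.dot(diff, diff)
    return d

def estimate_period(x, min_lag, max_lag):
    """Pick the lag with the smallest difference within a search range."""
    d = difference_function(x, max_lag)
    return min_lag + int(np.argmin(d[min_lag:]))

# A 100 Hz sinusoid sampled at 8 kHz has a period of 80 samples.
fs = 8000
t = np.arange(800) / fs
period = estimate_period(np.sin(2 * np.pi * 100 * t), min_lag=20, max_lag=120)
```

Restricting the search range (here 20 to 120 samples) plays the role of the frequency search range discussed in the abstract; the real method's normalization steps make that restriction far less critical.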
Journal of the Acoustical Society of America | 1993
Alain de Cheveigné
Signal-processing methods and auditory models for separation of concurrent harmonic sounds are reviewed, and a processing principle is proposed that cancels harmonic interference in the time domain. The principle is first formulated in signal processing terms as a time-domain comb filter. The critical issue of fundamental frequency estimation is investigated and an algorithm is proposed. Tested on a restricted database of natural voiced speech, the algorithm successfully found estimates correct within 3% of an octave for 90% of all frames. Next, the principle is formulated in physiological terms. A hypothetical “neural comb filter” is described, based on neural delay lines and inhibitory synapses, and tested using auditory-nerve fiber discharge data obtained in response to concurrent vowels [A. R. Palmer, J. Acoust. Soc. Am. 88, 1412–1426 (1990)]. Processing successfully suppresses the correlates of either vowel in the response of fibers that respond to both, allowing the other vowel to be better represented. The filter belongs to the class of “cancellation models” for which predictions can be made concerning the outcome of certain psychoacoustic experiments. These predictions are discussed in relation to recent experimental results obtained elsewhere.
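The time-domain comb filter at the heart of this principle can be sketched in a few lines: subtracting a copy of the signal delayed by one period nulls the fundamental and all its harmonics while letting other components through. This is an illustrative rendering (function name ours) under the simplifying assumption that the interferer's period is a whole number of samples:

```python
import numpy as np

def cancel_harmonics(x, period):
    """Time-domain comb filter y[t] = x[t] - x[t - period].
    Components whose period divides `period` (a fundamental and its
    harmonics) are nulled; other components pass through, altered in
    amplitude and phase but not removed."""
    y = np.copy(x)
    y[period:] -= x[:-period]
    return y

fs = 8000
t = np.arange(fs) / fs
# Harmonic interferer at 100 Hz (period = 80 samples at 8 kHz).
interferer = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)
target = 0.3 * np.sin(2 * np.pi * 330 * t)   # inharmonic target survives
cleaned = cancel_harmonics(interferer + target, 80)
```

After the initial 80-sample transient, the interferer is cancelled exactly while a filtered version of the 330 Hz target remains, which is the sense in which the filter "suppresses the correlates of either vowel".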
Archive | 2005
Alain de Cheveigné
This chapter discusses models of pitch, old and recent. The aim is to chart their common points – many are variations on a theme – and differences, and build a catalog of ideas for use in understanding pitch perception. The busy reader might read just the next section, a crash course in pitch theory that explains why some obvious ideas don’t work and what are currently the best answers. The brave reader will read on as we delve more deeply into the origin of concepts, and the intricate and ingenious ideas behind the models and metaphors that we use to make progress in understanding pitch.
Journal of Neuroscience Methods | 2007
Alain de Cheveigné; Jonathan Z. Simon
We present an algorithm for removing environmental noise from neurophysiological recordings such as magnetoencephalography (MEG). Noise fields measured by reference magnetometers are optimally filtered and subtracted from brain channels. The filters (one per reference/brain sensor pair) are obtained by delaying the reference signals, orthogonalizing them to obtain a basis, projecting the brain sensors onto the noise-derived basis, and removing the projections to obtain clean data. Simulations with synthetic data suggest that distortion of brain signals is minimal. The method surpasses previous methods by synthesizing, for each reference/brain sensor pair, a filter that compensates for convolutive mismatches between sensors. The method enhances the value of data recorded in health and scientific applications by suppressing harmful noise, and reduces the need for deleterious spatial or spectral filtering. It should be applicable to a wider range of physiological recording techniques, such as EEG, local field potentials, etc.
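The delay/orthogonalize/project pipeline amounts to regressing the brain channels on time-shifted copies of the reference channels and keeping the residual. The toy sketch below uses least squares as a stand-in for the explicit orthogonalization; array shapes, names, and the synthetic data are illustrative, not the paper's implementation:

```python
import numpy as np

def regress_out_references(brain, refs, n_delays=3):
    """Remove from each brain channel whatever can be predicted from
    time-shifted copies of the reference channels.

    brain : (n_samples, n_brain) array
    refs  : (n_samples, n_refs) array
    """
    n = brain.shape[0]
    cols = []
    for d in range(n_delays):                 # basis of delayed references
        shifted = np.zeros_like(refs)
        shifted[d:] = refs[:n - d]
        cols.append(shifted)
    basis = np.hstack(cols)
    coefs, *_ = np.linalg.lstsq(basis, brain, rcond=None)
    return brain - basis @ coefs              # residual = cleaned data

rng = np.random.default_rng(0)
n = 2000
noise = rng.standard_normal((n, 2))                    # environmental noise
signal = np.sin(2 * np.pi * 7 * np.arange(n) / n)[:, None]
brain = signal + noise @ np.array([[0.8], [0.5]])      # sensor: signal + leaked noise
clean = regress_out_references(brain, noise, n_delays=1)
```

The multiple delays are what give each reference/brain pair an FIR filter rather than a single gain, which is how the method compensates for the convolutive mismatches mentioned above.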
Journal of the Acoustical Society of America | 2003
Jeremy Marozeau; Alain de Cheveigné; Stephen McAdams; Suzanne Winsberg
The dependency of the timbre of musical sounds on their fundamental frequency (F0) was examined in three experiments. In experiment I subjects compared the timbres of stimuli produced by a set of 12 musical instruments with equal F0, duration, and loudness. There were three sessions, each at a different F0. In experiment II the same stimuli were rearranged in pairs, each with the same difference in F0, and subjects had to ignore the constant difference in pitch. In experiment III, instruments were paired both with and without an F0 difference within the same session, and subjects had to ignore the variable differences in pitch. Experiment I yielded dissimilarity matrices that were similar at different F0s, suggesting that instruments kept their relative positions within timbre space. Experiment II found that subjects were able to ignore the salient pitch difference while rating timbre dissimilarity. Dissimilarity matrices were symmetrical, suggesting further that the absolute displacement of the set of instruments within timbre space was small. Experiment III extended this result to the case where the pitch difference varied from trial to trial. Multidimensional scaling (MDS) of dissimilarity scores produced solutions (timbre spaces) that varied little across conditions and experiments. MDS solutions were used to test the validity of signal-based predictors of timbre, and in particular their stability as a function of F0. Taken together, the results suggest that timbre differences are perceived independently from differences of pitch, at least for F0 differences smaller than an octave. Timbre differences can be measured between stimuli with different F0s.
Journal of Neuroscience Methods | 2008
Alain de Cheveigné; Jonathan Z. Simon
We present a method for removing unwanted components of biological origin from neurophysiological recordings such as magnetoencephalography (MEG), electroencephalography (EEG), or multichannel electrophysiological or optical recordings. A spatial filter is designed to partition recorded activity into stimulus-related and stimulus-unrelated components, based on a criterion of stimulus-evoked reproducibility. Components that are not reproducible are projected out to obtain clean data. In experiments that measure stimulus-evoked activity, typically about 80% of noise power is removed with minimal distortion of the evoked response. Signal-to-noise ratios of better than 0 dB (50% reproducible power) may be obtained for the single most reproducible spatial component. The spatial filters are synthesized using a blind source separation method known as denoising source separation (DSS) that allows the measure of interest (here proportion of evoked power) to guide the source separation. That method is of greater general use, allowing data denoising beyond the classical stimulus-evoked response paradigm.
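The reproducibility criterion can be illustrated as a generalized eigenvalue problem: find the spatial filter that maximizes trial-averaged (evoked) power relative to total power. The toy version below omits the dimensionality-reduction and normalization stages of actual DSS, and the simulated data are illustrative:

```python
import numpy as np

def dss_first_filter(trials):
    """Toy denoising source separation: the spatial filter maximizing the
    ratio of trial-averaged (reproducible) power to total power, via a
    whitened eigendecomposition.

    trials : (n_trials, n_samples, n_channels) array
    """
    n_trials, n_samples, n_ch = trials.shape
    X = trials.reshape(-1, n_ch)
    c_total = X.T @ X / X.shape[0]            # covariance of raw data
    mean = trials.mean(axis=0)                # evoked (trial-averaged) response
    c_evoked = mean.T @ mean / n_samples      # covariance of the average
    # Whiten by c_total, then take the dominant eigenvector of the
    # evoked covariance in the whitened space.
    w_inv = np.linalg.inv(np.linalg.cholesky(c_total))
    evals, evecs = np.linalg.eigh(w_inv @ c_evoked @ w_inv.T)
    return w_inv.T @ evecs[:, -1]             # filter for the top component

rng = np.random.default_rng(1)
evoked = np.sin(2 * np.pi * 5 * np.arange(500) / 500)
mixing = np.array([1.0, 0.5, -0.5, 0.2])      # source-to-sensor projection
trials = rng.standard_normal((50, 500, 4)) + evoked[None, :, None] * mixing
w = dss_first_filter(trials)
component = trials.mean(axis=0) @ w           # recovered evoked waveform
```

Because the criterion (here, evoked power) appears only in `c_evoked`, swapping in another measure of interest yields a different bias, which is the sense in which DSS generalizes beyond the evoked-response paradigm.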
Journal of the Acoustical Society of America | 1998
Alain de Cheveigné
A model of pitch perception is presented involving an array of delay lines and inhibitory gating neurons. In response to a periodic sound, a minimum appears in the pattern of outputs of the inhibitory neurons at a lag equal to the period of the sound. The position of this minimum is the cue to pitch. The model is similar to the autocorrelation model of pitch, multiplication being replaced by an operation similar to subtraction, and maxima by minima. The two models account for a wide class of pitch phenomena in very much the same way. The principal goal of this paper is to demonstrate this fact. Several features of the cancellation model may be to its advantage: it is closely related to the operation of harmonic cancellation that can account for segregation of concurrent harmonic stimuli, it can be generalized to explain the perception of multiple pitches, and it shows a greater degree of sensitivity to phase than autocorrelation, which may allow it to explain certain phenomena that autocorrelation cannot account for.
Speech Communication | 1999
Alain de Cheveigné; Hideki Kawahara
The pitch of a periodic sound is strongly correlated with its period. To perceive the multiple pitches evoked by several simultaneous sounds, the auditory system must estimate their periods. This paper proposes a process in which the periodic sounds are canceled in turn (multistep cancellation model) or simultaneously (joint cancellation model). As an example of multistep cancellation, the pitch perception model of Meddis and Hewitt (1991a, 1991b) can be associated with the concurrent vowel identification model of Meddis and Hewitt (1992). A first period estimate is used to suppress correlates of the dominant sound, and a second period is then estimated from the remainder. The process may be repeated to estimate further pitches, or else to recursively refine the initial estimates. Meddis and Hewitt's models are spectrotemporal (filter channel selection based on temporal cues), but multistep cancellation can also be performed in the spectral or time domain. In the joint cancellation model, estimation and cancellation are performed together in the time domain: the parameter space of several cascaded cancellation filters is searched exhaustively for a minimum output. The parameters that yield this minimum are the period estimates. Joint cancellation is guaranteed to find all periods, except in certain situations for which the stimulus is inherently ambiguous.
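The exhaustive search over cascaded cancellation filters can be sketched for the two-voice case. Candidate lags, the grid search, and integer-sample periods are illustrative simplifications; the paper's filters and search strategy may differ:

```python
import numpy as np

def cascade_residual_power(x, periods):
    """Cascade one cancellation (comb) filter per candidate period and
    return the residual power, ignoring the filter start-up transient.
    When every true period is among the candidates, all periodic
    components cancel and the residual is near zero."""
    y = np.copy(x)
    for p in periods:
        z = np.copy(y)
        z[p:] -= y[:-p]
        y = z
    skip = sum(periods)                   # discard the start-up transient
    return float(np.mean(y[skip:] ** 2))

def joint_estimate(x, lags):
    """Exhaustive grid search over period pairs (the two-voice case)."""
    best, best_pair = np.inf, None
    for p1 in lags:
        for p2 in lags:
            if p2 <= p1:
                continue
            power = cascade_residual_power(x, (p1, p2))
            if power < best:
                best, best_pair = power, (p1, p2)
    return best_pair

# Two voices at 100 Hz and 125 Hz, sampled at 8 kHz: periods 80 and 64.
fs = 8000
t = np.arange(fs // 4) / fs
mixture = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 125 * t)
pair = joint_estimate(mixture, range(40, 100))
```

Because the cascaded filters are linear and commute, the residual is small only when each voice's period is matched by some filter in the cascade, so the minimizing pair jointly estimates both periods.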
The Journal of Neuroscience | 2007
Maria Chait; David Poeppel; Alain de Cheveigné; Jonathan Z. Simon
Auditory environments vary as a result of the appearance and disappearance of acoustic sources, as well as fluctuations characteristic of the sources themselves. The appearance of an object is often manifest as a transition in the pattern of ongoing fluctuation, rather than an onset or offset of acoustic power. How does the system detect and process such transitions? Based on magnetoencephalography data, we show that the temporal dynamics and response morphology of the neural temporal-edge detection processes depend in precise ways on the nature of the change. We measure auditory cortical responses to transitions between “disorder,” modeled as a sequence of random frequency tone pips, and “order,” modeled as a constant tone. Such transitions embody key characteristics of natural auditory edges. Early cortical responses (from ∼50 ms post-transition) reveal that order–disorder transitions, and vice versa, are processed by different neural mechanisms. Their dynamics suggest that the auditory cortex optimally adjusts to stimulus statistics, even when this is not required for overt behavior. Furthermore, this response profile bears a striking similarity to that measured from another order–disorder transition, between interaurally correlated and uncorrelated noise, a radically different stimulus. This parallelism suggests the existence of a general mechanism that operates early in the processing stream on the abstract statistics of the auditory input, and is putatively related to the processes of constructing a new representation or detecting a deviation from a previously acquired model of the auditory scene. Together, the data reveal information about the mechanisms with which the brain samples, represents, and detects changes in the environment.
Journal of the Acoustical Society of America | 1997
Alain de Cheveigné
This paper presents a “neural cancellation filter” capable of segregating weak targets from competing harmonic backgrounds, and a model of concurrent vowel segregation based upon it. The elementary cancellation filter comprises a delay line and an inhibitory synapse. Filters within each peripheral channel are tuned to the period of the competing sound to suppress its correlates within the neural discharge pattern. In combination with a pattern matching model based on autocorrelation functions summed over channels, the cancellation filter forms a model of concurrent vowel identification. The model predicts the number of vowels reported for each stimulus (when subjects are allowed to report one or two) and identification rates. It belongs to the class of “harmonic cancellation” models that are supported by experimental evidence that vowel identification is better when competing sounds are harmonic than inharmonic. Two alternative schemes using the same filter are also considered. One derives a “place” repre...