Mitsuko Aramaki
Centre national de la recherche scientifique
Publication
Featured research published by Mitsuko Aramaki.
Journal of the Acoustical Society of America | 2007
Mitsuko Aramaki; Henri Baillères; Loïc Brancheriau; Richard Kronland-Martinet; Sølvi Ystad
Xylophone sounds produced by striking wooden bars with a mallet are strongly influenced by the mechanical properties of the wood species chosen by the xylophone maker. In this paper, we address the relationship between sound quality, based on the timbre attributes of impacted wooden bars, and the physical parameters characterizing wood species. For this, a methodology is proposed that associates an analysis-synthesis process with a perceptual classification test. Sounds generated by impacting 59 wooden bars of different species but with the same geometry were recorded and classified by a renowned instrument maker. The sounds were further digitally processed and adjusted to the same pitch before being classified once again. The processing is based on a physical model ensuring that the main characteristics of the wood are preserved during the sound transformation. Statistical analysis of both classifications showed the influence of pitch on the xylophone maker's judgment and pointed out the importance of two timbre descriptors: the frequency-dependent damping and the spectral bandwidth. These descriptors are linked with physical and anatomical characteristics of wood species, providing new clues for choosing wood species that are attractive from a musical point of view.
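As a rough illustration of the first of those descriptors, the sketch below models a struck bar as a sum of decaying partials whose decay rate grows linearly with frequency. The partial ratios, the damping law, and the constants `alpha0` and `alpha_slope` are illustrative assumptions, not values from the paper:

```python
import numpy as np

SR = 44100
DUR = 1.0
t = np.arange(int(SR * DUR)) / SR

# Partial ratios of an ideal free-free bar (first four transverse modes).
RATIOS = [1.0, 2.756, 5.404, 8.933]

def bar_tone(f0, alpha0=3.0, alpha_slope=0.004):
    """Sum of decaying partials with frequency-dependent damping:
    alpha(f) = alpha0 + alpha_slope * f, so high partials die faster.
    The linear law and its constants are illustrative, wood-like guesses."""
    y = np.zeros_like(t)
    for r in RATIOS:
        f = f0 * r
        alpha = alpha0 + alpha_slope * f
        y += np.exp(-alpha * t) * np.sin(2 * np.pi * f * t) / r
    return y / np.max(np.abs(y))

tone = bar_tone(440.0)
```

Increasing `alpha_slope` darkens the decay more quickly, which is the direction of the wood/metal contrast the descriptor captures.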
IEEE Transactions on Audio, Speech, and Language Processing | 2010
Charles Verron; Mitsuko Aramaki; Richard Kronland-Martinet; Grégory Pallone
Nowadays, interactive 3-D environments tend to include both synthesis and spatialization processes to increase the realism of virtual scenes. In typical systems, audio generation is created in two stages: first, a monophonic sound is synthesized (generation of the intrinsic timbre properties) and then it is spatialized (positioned in its environment). In this paper, we present the design of a 3-D immersive synthesizer dedicated to environmental sounds, and intended to be used in the framework of interactive virtual reality applications. The system is based on a physical categorization of environmental sounds (vibrating solids, liquids, aerodynamics). The synthesis engine has a novel architecture combining an additive synthesis model and 3-D audio modules at the prime level of sound generation. An original approach exploiting the synthesis capabilities for simulating the spatial extension of sound sources is also presented. The subjective results, evaluated with a formal listening test, are discussed. Finally, new control strategies based on a global manipulation of timbre and spatial attributes of sound sources are introduced.
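A minimal sketch of the idea of placing spatialization at the prime level of sound generation: each partial of an additive synthesis gets its own constant-power pan position, so the source acquires a spatial extent. This is a simplified stereo stand-in for the paper's 3-D audio modules; the frequencies, amplitudes, and pan values are made up for the example:

```python
import numpy as np

SR = 44100
t = np.arange(SR) / SR

def spatial_additive(freqs, amps, pans):
    """Stereo additive synthesis with per-partial constant-power panning.
    Panning each partial separately spreads the source spatially, a
    toy version of per-component spatialization (pan p in [0, 1],
    0 = hard left, 1 = hard right)."""
    left = np.zeros_like(t)
    right = np.zeros_like(t)
    for f, a, p in zip(freqs, amps, pans):
        s = a * np.sin(2 * np.pi * f * t)
        left += np.cos(p * np.pi / 2) * s
        right += np.sin(p * np.pi / 2) * s
    return np.stack([left, right])

sig = spatial_additive([200.0, 400.0, 600.0], [1.0, 0.5, 0.25], [0.2, 0.5, 0.8])
```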
IEEE Transactions on Audio, Speech, and Language Processing | 2006
Mitsuko Aramaki; Richard Kronland-Martinet
This paper presents a sound synthesis model that reproduces impact sounds by taking into account both the perceptual and the physical aspects of the sound. To this end, we used a subtractive method based on dynamic filtering of noisy input signals that simulates the damping of spectral components. The resulting sound contains the perceptual characteristics of an impact on a given material. Further, the addition of a few modal contributions (using additive or banded digital waveguide synthesis), together with a bandpass filtering that takes into account the interaction with the exciter, allows realistic impact sounds to be synthesized. The synthesis parameters can be linked to a perceptual notion of the material and geometry of the sounding object. To determine the synthesis parameters, we further address the problem of analysis-synthesis, aiming at reconstructing a given impact sound. The physical parameters are extracted through a time-scale analysis of natural sounds. Examples are presented for sounds generated by impacting plates made of different materials and a piano soundboard.
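A toy version of the subtractive idea, assuming a simple frame-based implementation (not the authors' actual filter design): white noise is shaped, frame by frame, by a frequency-dependent exponential decay exp(-alpha(f)·t), so high-frequency components die out faster, which is the perceptual cue for material:

```python
import numpy as np

SR = 44100
N = 1024  # frame length

def damped_noise(dur=0.5, alpha0=4.0, alpha_slope=0.02):
    """Subtractive impact-sound sketch: each noise frame's spectrum is
    attenuated by exp(-alpha(f) * t) with alpha(f) = alpha0 + alpha_slope*f,
    so highs decay faster than lows. Constants are illustrative."""
    rng = np.random.default_rng(0)
    freqs = np.fft.rfftfreq(N, 1 / SR)
    alpha = alpha0 + alpha_slope * freqs
    n_frames = int(dur * SR / N)
    frames = []
    for k in range(n_frames):
        noise = rng.standard_normal(N)
        spec = np.fft.rfft(noise) * np.exp(-alpha * (k * N / SR))
        frames.append(np.fft.irfft(spec, N))
    return np.concatenate(frames)

y = damped_noise()
```

A real implementation would use overlapping windows to avoid frame-boundary clicks; the block only shows the damping principle.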
IEEE Transactions on Audio, Speech, and Language Processing | 2011
Mitsuko Aramaki; Mireille Besson; Richard Kronland-Martinet; Sølvi Ystad
In this paper, we focused on identifying the perceptual properties of impacted materials in order to provide an intuitive control of an impact sound synthesizer. To investigate such properties, impact sounds from everyday objects made of different materials (wood, metal, and glass) were recorded and analyzed. These sounds were synthesized using an analysis-synthesis technique and tuned to the same chroma. Sound continua were created to simulate progressive transitions between materials. Sounds from these continua were then used in a categorization experiment to determine sound categories representative of each material (called typical sounds). We also examined changes in electrical brain activity (using the event-related potential (ERP) method) associated with the categorization of these typical sounds. Moreover, acoustic analysis was conducted to investigate the relevance of acoustic descriptors known to be relevant for both timbre perception and material identification. Both acoustic and electrophysiological data confirmed the importance of damping and highlighted the relevance of spectral content for material perception. Based on these findings, controls for damping and spectral shaping were tested in synthesis applications. A global control strategy with a three-layer architecture was proposed for the synthesizer, allowing the user to intuitively navigate in a “material space” and to define impact sounds directly from the material label. A formal perceptual evaluation was finally conducted to validate the proposed control strategy.
Journal of Cognitive Neuroscience | 2010
Mitsuko Aramaki; Céline Marie; Richard Kronland-Martinet; Sølvi Ystad; Mireille Besson
The aim of these experiments was to compare conceptual priming for linguistic sounds and for a homogeneous class of nonlinguistic sounds, impact sounds, using both behavioral (percentage of errors and RTs) and electrophysiological (ERP) measures. Experiment 1 aimed at studying the neural basis of impact sound categorization by creating typical and ambiguous sounds from different material categories (wood, metal, and glass). Ambiguous sounds were associated with slower RTs, larger N280 and smaller P350/P550 components, and a larger negative slow wave than typical impact sounds. Thus, ambiguous sounds were more difficult to categorize than typical sounds. A category membership task was used in Experiment 2. Typical sounds were followed by sounds from the same or from a different category, or by ambiguous sounds. Words were followed by words, pseudowords, or nonwords. The error rate was highest for ambiguous sounds and for pseudowords, and both elicited larger N400-like components than same-category typical sounds and words. Moreover, both different-category typical sounds and nonwords elicited P300 components. These results are discussed in terms of similar conceptual priming effects for nonlinguistic and linguistic stimuli.
international conference on auditory display | 2009
Mitsuko Aramaki; Charles Gondre; Richard Kronland-Martinet; Thierry Voinier; Sølvi Ystad
In this paper we present a synthesizer developed for musical and Virtual Reality purposes that offers an intuitive control of impact sounds. A three layer control strategy is proposed for this purpose, where the top layer gives access to a control of the sound source through verbal descriptions, the middle layer to a control of perceptually relevant sound descriptors, while the bottom layer is directly linked to the parameters of the additive synthesis model. The mapping strategies between the parameters of the different layers are described. The synthesizer has been implemented using Max/MSP, offering the possibility to manipulate intrinsic characteristics of sounds in real-time through the control of few parameters.
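The three-layer idea can be caricatured as a chain of mappings. The material labels, descriptor names, and numeric values below are hypothetical placeholders, not the synthesizer's actual mapping; they only show how a verbal label at the top layer could expand, via perceptual descriptors, into per-partial parameters at the bottom layer:

```python
# Top layer: verbal material label -> perceptual descriptors.
# (Descriptor names and values are invented for illustration.)
MATERIALS = {
    "wood":  {"global_damping": 8.0, "rel_damping": 0.004},
    "metal": {"global_damping": 0.8, "rel_damping": 0.0005},
    "glass": {"global_damping": 2.5, "rel_damping": 0.001},
}

def low_level_params(label, freqs):
    """Middle -> bottom layer: expand the descriptors into per-partial
    decay rates for an additive synthesis engine (decay grows with
    frequency, faster overall for wood than for metal)."""
    d = MATERIALS[label]
    return {f: d["global_damping"] + d["rel_damping"] * f for f in freqs}

params = low_level_params("wood", [440.0, 880.0, 1320.0])
```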
Computer Music Journal | 2006
Mitsuko Aramaki; Richard Kronland-Martinet; Thierry Voinier; Sølvi Ystad
Synthesis of impact sounds is far from a trivial task owing to the high density of modes generally contained in such signals. Several authors have addressed this problem and proposed different approaches to model such sounds. The majority of these models are based on the physics of vibrating structures, as with for instance modal synthesis (Adrien 1991; Pai et al. 2001; van den Doel, Kry, and Pai 2001; Cook 2002; Rocchesso, Bresin, and Fernstrom 2003). Nevertheless, modal synthesis is not always suitable for complex sounds, such as those with a high density of mixed modes. Other approaches have also been proposed using algorithmic techniques based on digital signal processing. Cook (2002), for example, proposed a granular-synthesis approach based on a wavelet decomposition of sounds. The sound-synthesis model proposed in this article takes into account both physical and perceptual aspects related to sounds. Many subjective tests have shown the existence of perceptual clues allowing the source of the impact sound (its material, size, etc.) to be identified merely by listening (Klatzky, Pai, and Krotkov 2000; Tucker and Brown 2002). Moreover, these tests have brought to the fore some correlations between physical attributes (the nature of the material and dimensions of the structure) and perceptual attributes (perceived material and perceived dimensions). Hence, it has been shown that the perception of the material mainly correlates with the damping coefficient of the spectral components contained in the sound. This damping is frequency-dependent, and high-frequency modes are generally more heavily damped than low-frequency modes. Actually, the dissipation of vibrating energy owing to the coupling between the structure and the air increases with frequency (see, for example, Caracciolo and Valette 1995). To take into account this fundamental sound behavior from a synthesis point of view, a time-varying filtering technique has been chosen.
It is well known that an object's size and shape are mainly perceived through the pitch of the generated sound and its spectral richness. The perception of the pitch primarily correlates with the vibrating modes (Carello, Anderson, and Kunkler-Peck 1998). For complex structures, the modal density generally increases with the frequency, so that high-frequency modes overlap and become indiscernible. This phenomenon is well known and is described, for example, in previous works on room acoustics (Kuttruff 1991). Under such a condition, the human ear determines the pitch of the sound from emergent spectral components with consistent frequency ratios. When a complex percussive sound contains several harmonic or inharmonic series (i.e., spectral components that are not exact multiples of the fundamental frequency), different pitches can generally be heard. The dominant pitch then mainly depends on the frequencies and the amplitudes of the spectral components belonging to a so-called dominant frequency region (Terhardt, Stoll, and Seewann 1982) in which the ear is pitch sensitive. (We will discuss this further in the Tuning section of this article.) With all these aspects in mind, and wishing to propose an easy and intuitive control of the model, we have divided it into three parts represented by an excitation element, a material element, and an object element. The large number of parameters available through such a model necessitates a control strategy. This strategy (generally called a mapping) is of great importance for the expressive capabilities of the instrument, and it inevitably influences the way it can be used in a musical context (Gobin et al. 2004).
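As a toy illustration of the dominant-region idea, the sketch below weights each spectral component by a log-frequency window centered near 700 Hz (following Terhardt's dominant region; the Gaussian window shape and its width are assumptions made for the example) and picks the best-weighted component as the dominant pitch:

```python
import numpy as np

def dominant_pitch(freqs, amps, center=700.0, width_octaves=1.0):
    """Toy dominant-pitch picker: weight each spectral component's
    amplitude by a Gaussian window on log-frequency around the
    dominant region, then return the frequency of the winner.
    Window shape and width are illustrative choices."""
    freqs = np.asarray(freqs, dtype=float)
    amps = np.asarray(amps, dtype=float)
    w = np.exp(-0.5 * (np.log2(freqs / center) / width_octaves) ** 2)
    return float(freqs[np.argmax(amps * w)])
```

With equal amplitudes, a component inside the dominant region wins over stronger components far outside it, but a large enough amplitude elsewhere can still shift the result, as the paper's discussion of frequencies versus amplitudes suggests.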
Journal of Abnormal Psychology | 2012
Jean-Arthur Micoulaud-Franchi; Mitsuko Aramaki; Adrien Merer; M. Cermolacce; Sølvi Ystad; Richard Kronland-Martinet; Jean Naudin; Jean Vion-Dury
The aim of this study was to investigate abnormal perceptual experiences in schizophrenia, in particular the feeling of strangeness commonly found in patients' self-reports. The experimental design included complex auditory stimuli within 2 theoretical frameworks, based on sensory gating deficit and aberrant salience, inspired by conventional perceptual scales. A specific sound corpus was designed with environmental (meaningful) and abstract (meaningless) sounds. The authors compared sound evaluations on 3 perceptual dimensions (bizarre, familiar, and invasive) and 2 emotional dimensions (frightening and reassuring) between 20 patients with schizophrenia (SCZ) and 20 control participants (CTL). The perceptual judgment was rated on independent linear scales for each sound. In addition, the conditioning-testing P50 paradigm was conducted on 10 SCZ and 10 CTL. Both behavioral and electrophysiological data confirmed the authors' expectations according to the 2 theoretical frameworks and showed that abnormal perceptual experiences in SCZ consisted of perceiving meaningful sounds in a distorted manner and as flooding/inundating, but also of perceiving meaningless sounds as things that become meaningful by assigning them some significance. In addition, the use of independent scales for each perceptual dimension highlighted an unexpected ambivalence regarding familiarity and bizarreness in SCZ, compatible with an explanation in terms of impaired semantic processing. The authors further suggest that this ambivalence might be due to a conflicting coactivation of 2 types of listening, namely everyday and musical (or acousmatic) listening.
Psychiatry Research: Neuroimaging | 2011
Jean-Arthur Micoulaud-Franchi; Mitsuko Aramaki; Adrien Merer; M. Cermolacce; Sølvi Ystad; Richard Kronland-Martinet; Jean Vion-Dury
Perception of environmental sounds from impacted materials (Wood, Metal and Glass) was examined by conducting a categorization experiment. Stimuli consisted of sound continua evoking progressive transitions between material categories. Results highlighted shallower response curves in subjects with schizophrenia than healthy participants, and are discussed in the framework of Signal Detection Theory and in terms of impaired perception of specific timbre features in schizophrenia.
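The notion of a "shallower response curve" can be made concrete with a logistic psychometric function whose slope parameter controls the sharpness of the category boundary. The slope values below are illustrative, not fitted to the study's data:

```python
import numpy as np

def psychometric(x, slope, midpoint=0.5):
    """Logistic response curve: probability of categorizing a stimulus
    on a wood-to-metal continuum (x in [0, 1]) as 'metal'. A shallower
    slope models a less sharp category boundary; the parameter values
    used below are illustrative."""
    return 1.0 / (1.0 + np.exp(-slope * (x - midpoint)))

x = np.linspace(0.0, 1.0, 11)
control = psychometric(x, slope=14.0)  # steep boundary
patient = psychometric(x, slope=6.0)   # shallower boundary
```

Both curves cross 0.5 at the same boundary position; only the steepness differs, which is exactly what a slope comparison between groups tests.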
IEEE Transactions on Audio, Speech, and Language Processing | 2010
Damián Marelli; Mitsuko Aramaki; Richard Kronland-Martinet; Charles Verron
The inverse fast Fourier transform (FFT) method was proposed to alleviate the computational complexity of the additive sound synthesis method in real-time applications, and consists in synthesizing overlapping blocks of samples in the frequency domain. However, its application is limited by its inherent tradeoff between time and frequency resolution. In this paper, we propose an alternative to the inverse FFT method for synthesizing colored noise. The proposed approach uses subband signal processing to generate time-frequency noise with an autocorrelation function such that the noise obtained after converting it to time domain has the desired power spectral density. We show that the inverse FFT method can be interpreted as a particular case of the proposed method, and therefore, the latter offers some extra design flexibility. Exploiting this property, we present experimental results showing that the proposed method can offer a better tradeoff between time and frequency resolution, at the expense of some extra computations.
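For context, a bare-bones sketch of the inverse FFT method that the paper takes as its starting point (block length, window, and target spectrum are arbitrary choices for the example): each block receives the target magnitude with random phases, is transformed back to the time domain, windowed, and overlap-added:

```python
import numpy as np

SR = 16000
N = 512        # block length
HOP = N // 2   # 50% overlap

def ifft_colored_noise(psd, n_blocks=40, seed=0):
    """Inverse-FFT noise synthesis sketch: per block, combine the target
    magnitude sqrt(psd) with random phases, take the inverse real FFT,
    window, and overlap-add into the output buffer."""
    rng = np.random.default_rng(seed)
    win = np.hanning(N)
    out = np.zeros(HOP * (n_blocks - 1) + N)
    mag = np.sqrt(psd)
    for b in range(n_blocks):
        phase = rng.uniform(0.0, 2.0 * np.pi, len(psd))
        spec = mag * np.exp(1j * phase)
        out[b * HOP : b * HOP + N] += np.fft.irfft(spec, N) * win
    return out

# Example: a low-pass target spectrum (arbitrary shape for the demo).
freqs = np.fft.rfftfreq(N, 1 / SR)
psd = 1.0 / (1.0 + freqs / 500.0)
y = ifft_colored_noise(psd)
```

The fixed block length `N` is exactly the time/frequency-resolution tradeoff the paper sets out to relax with its subband alternative.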
Collaboration
Centre de coopération internationale en recherche agronomique pour le développement