Esteban Maestre
Pompeu Fabra University
Publication
Featured research published by Esteban Maestre.
Computer Music Journal | 2009
Esteban Maestre; Rafael Ramirez; Stefan Kersten; Xavier Serra
Here we describe an approach to the expressive synthesis of jazz saxophone melodies that reuses audio recordings and carefully concatenates note samples. The aim is to generate an expressive audio sequence from the analysis of an arbitrary input score using a previously induced performance model and an annotated saxophone note database extracted from real performances. We push the idea of using the same corpus for both inducing an expressive performance model and synthesizing sound by concatenating samples in the corpus. Therefore, a connection between the performers’ instrument sound and performance characteristics is kept during the synthesis process.
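A minimal sketch of the sample-selection step described above, assuming a hypothetical note database with per-note annotations; the attribute names, weights, and cost function are illustrative and not the authors' implementation.

```python
# Illustrative sketch of concatenative note selection (not the authors' code).
# Each database note carries annotations extracted from real performances; for
# a target note from the score, we pick the closest sample by a weighted distance.

def note_distance(target, candidate, weights=(1.0, 0.5, 0.5)):
    """Weighted distance between a target note spec and a database sample."""
    w_pitch, w_dur, w_dyn = weights
    return (w_pitch * abs(target["pitch"] - candidate["pitch"])
            + w_dur * abs(target["duration"] - candidate["duration"])
            + w_dyn * abs(target["energy"] - candidate["energy"]))

def select_samples(score_notes, database):
    """Pick one database sample per score note (greedy, no join cost)."""
    return [min(database, key=lambda c: note_distance(n, c)) for n in score_notes]

# Example usage with made-up annotations
database = [{"pitch": 60, "duration": 0.40, "energy": 0.7, "file": "sax_0001.wav"},
            {"pitch": 62, "duration": 0.55, "energy": 0.6, "file": "sax_0002.wav"}]
score = [{"pitch": 60, "duration": 0.45, "energy": 0.65}]
print(select_samples(score, database)[0]["file"])  # -> sax_0001.wav
```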
Lecture Notes in Computer Science | 2006
Amaury Hazan; Rafael Ramirez; Esteban Maestre; Alfonso Pérez; Antonio Pertusa
This paper presents a novel Strongly-Typed Genetic Programming approach for building Regression Trees in order to model expressive music performance. The approach consists of inducing a Regression Tree model from training data (monophonic recordings of Jazz standards) for transforming an inexpressive melody into an expressive one. The work presented in this paper is an extension of [1], where we induced general expressive performance rules explaining part of the training examples. Here, the emphasis is on inducing a generative model (i.e. a model capable of generating expressive performances) which covers all the training examples. We present our evolutionary approach for a one-dimensional regression task: the performed note duration ratio prediction. We then show the encouraging results of experiments with Jazz musical material, and sketch the milestones which will enable the system to generate expressive music performance in a broader sense.
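To make the prediction target concrete, the toy sketch below shows what a regression tree predicting the performed-to-score duration ratio might look like once induced; the split attributes, thresholds, and leaf values are invented for illustration and are not the rules learned by the paper's genetic programming system.

```python
# Toy regression tree predicting a performed-to-score duration ratio from
# a note's context; attributes and thresholds are assumptions, not induced.

def predict_duration_ratio(note):
    """Each leaf returns a duration ratio to apply to the notated duration."""
    if note["metrical_strength"] >= 0.5:        # strong beat
        if note["next_interval"] > 2:           # large upward leap follows
            return 1.15                         # lengthen the note
        return 1.05
    else:                                       # weak beat
        if note["duration"] < 0.25:             # short ornamental note
            return 0.85                         # shorten it
        return 0.95

note = {"metrical_strength": 0.8, "next_interval": 4, "duration": 0.5}
print(predict_duration_ratio(note))  # -> 1.15
```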
IEEE Transactions on Audio, Speech, and Language Processing | 2010
Esteban Maestre; Merlijn Blaauw; Jordi Bonada; Enric Guaus; Alfonso Pérez
Excitation-continuous music instrument control patterns are often not explicitly represented in current sound synthesis techniques when applied to automatic performance. Both physical model-based and sample-based synthesis paradigms would benefit from a flexible and accurate instrument control model, enabling the improvement of naturalness and realism. We present a framework for modeling bowing control parameters in violin performance. Nearly non-intrusive sensing techniques allow for accurate acquisition of relevant timbre-related bowing control parameter signals. We model the temporal contour of bow velocity, bow pressing force, and bow-bridge distance as sequences of short Bézier cubic curve segments. Considering different articulations, dynamics, and performance contexts, a number of note classes are defined. Contours of bowing parameters in a performance database are analyzed at note-level by following a predefined grammar that dictates characteristics of curve segment sequences for each of the classes in consideration. As a result, contour analysis of bowing parameters of each note yields an optimal representation vector that is sufficient for reconstructing original contours with significant fidelity. From the resulting representation vectors, we construct a statistical model based on Gaussian mixtures suitable for both the analysis and synthesis of bowing parameter contours. By using the estimated models, synthetic contours can be generated through a bow planning algorithm able to reproduce possible constraints caused by the finite length of the bow. Rendered contours are successfully used in two preliminary synthesis frameworks: digital waveguide-based bowed string physical modeling and sample-based spectral-domain synthesis.
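The core representation is a contour built from consecutive cubic Bézier segments. The sketch below shows that idea in a few lines, assuming NumPy; the control-point values for the example bow-velocity contour are made up and the fitting, note classes, and Gaussian mixture model are not included.

```python
# Sketch of a bowing-parameter contour as a sequence of cubic Bézier segments.

import numpy as np

def bezier_cubic(p0, p1, p2, p3, n=50):
    """Sample a cubic Bézier segment defined by four control values."""
    t = np.linspace(0.0, 1.0, n)
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

def render_contour(segments, n=50):
    """Concatenate consecutive cubic segments into one contour."""
    return np.concatenate([bezier_cubic(*seg, n=n) for seg in segments])

# Hypothetical bow-velocity contour (cm/s) for one note: attack then release
bow_velocity = render_contour([(0.0, 15.0, 35.0, 40.0),   # accelerating segment
                               (40.0, 38.0, 20.0, 0.0)])  # decaying segment
print(bow_velocity.shape)  # -> (100,)
```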
Archive | 2007
Rafael Ramirez; Amaury Hazan; Esteban Maestre; Xavier Serra
In this chapter we present a data mining approach to one of the most challenging aspects of computer music: modeling the knowledge applied by a musician when performing a score in order to produce an expressive performance of a piece. We apply data mining techniques to real performance data (i.e., audio recordings) in order to induce an expressive performance model. This leads to an expressive performance system consisting of three components: (1) a melodic transcription component that extracts a set of acoustic features from the audio recordings, (2) a data mining component that induces an expressive transformation model from the set of extracted acoustic features, and (3) a melody synthesis component that generates expressive monophonic output (MIDI or audio) from inexpressive melody descriptions using the induced expressive transformation model. We describe, explore, and compare different data mining techniques for inducing the expressive transformation model.
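The three-component architecture can be summarized as a simple pipeline. The sketch below is schematic only: the function bodies are placeholders standing in for the transcription, data mining, and synthesis stages, not the chapter's implementation.

```python
# Schematic sketch of the three-component pipeline described above.

def transcribe_melody(audio_path):
    """(1) Melodic transcription: audio -> notes with acoustic features."""
    # e.g. onset, duration, pitch, energy per note (placeholder output)
    return [{"pitch": 67, "onset": 0.0, "duration": 0.5, "energy": 0.7}]

def induce_model(performed_notes, score_notes):
    """(2) Data mining: learn a mapping from score context to transformations."""
    # placeholder: a constant transformation "learned" from the examples
    return {"duration_ratio": 1.05, "energy_offset": 0.1}

def synthesize_expressive(score_notes, model):
    """(3) Melody synthesis: apply the induced transformations to a flat score."""
    return [{**n,
             "duration": n["duration"] * model["duration_ratio"],
             "energy": n.get("energy", 0.5) + model["energy_offset"]}
            for n in score_notes]

score = [{"pitch": 67, "duration": 0.5}]
model = induce_model(transcribe_melody("performance.wav"), score)
print(synthesize_expressive(score, model))
```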
Journal of New Music Research | 2014
Marco Marchini; Rafael Ramirez; Panos Papiotis; Esteban Maestre
Computational approaches for modelling expressive music performance have produced systems that emulate music expression, but few steps have been taken in the domain of ensemble performance. In this paper, we propose a novel method for building computational models of ensemble expressive performance and show how this method can be applied for deriving new insights about collaboration among musicians. In order to address the problem of inter-dependence among musicians we propose the introduction of inter-voice contextual attributes. We evaluate the method on data extracted from multi-modal recordings of string quartet performances in two different conditions: solo and ensemble. We used machine-learning algorithms to produce computational models for predicting intensity, timing deviations, vibrato extent, and bowing speed of each note. As a result, the introduced inter-voice contextual attributes generally improved the prediction of the expressive parameters. Furthermore, results on attribute selection show that the models trained on ensemble recordings took more advantage of inter-voice contextual attributes than those trained on solo recordings.
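As a rough illustration of what an inter-voice contextual attribute might be, the sketch below augments a note's own context with features of the notes sounding simultaneously in the other voices; the attribute names and the specific features are assumptions, not the paper's feature set.

```python
# Sketch of deriving inter-voice contextual attributes for a note.

def inter_voice_attributes(note, other_voices):
    """Add features of the concurrent notes in the other ensemble voices."""
    concurrent = [n for voice in other_voices for n in voice
                  if n["onset"] <= note["onset"] < n["onset"] + n["duration"]]
    if not concurrent:
        return {"iv_pitch_gap": None, "iv_mean_energy": None}
    return {
        # pitch distance to the closest simultaneous note in another voice
        "iv_pitch_gap": min(abs(note["pitch"] - c["pitch"]) for c in concurrent),
        # average dynamic level of the other voices at this moment
        "iv_mean_energy": sum(c["energy"] for c in concurrent) / len(concurrent),
    }

violin2 = [[{"pitch": 60, "onset": 0.0, "duration": 1.0, "energy": 0.6}]]
print(inter_voice_attributes({"pitch": 67, "onset": 0.2, "duration": 0.5}, violin2))
```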
User Centric Media | 2009
Antonio Camurri; Gualtiero Volpe; Hugues Vinet; Roberto Bresin; Marco Fabiani; Gaël Dubus; Esteban Maestre; Jordi Llop; Jari Kleimola; Sami Oksanen; Vesa Välimäki; Jarno Seppänen
This paper surveys a collection of sample applications for networked user-centric context-aware embodied music listening. The applications have been designed and developed in the framework of the EU-ICT Project SAME (www.sameproject.eu) and were presented at the Agora Festival (IRCAM, Paris, France) in June 2009. All of them address in different ways the concept of embodied, active listening to music, i.e., enabling listeners to interactively operate in real time on the music content by means of their movements and gestures as captured by mobile devices. On the occasion of the Agora Festival the applications were also evaluated by both expert and non-expert users.
Journal of New Music Research | 2005
Rafael Ramirez; Amaury Hazan; Emilia Gómez; Esteban Maestre; Xavier Serra
If-then rules are one of the most expressive and intuitive knowledge representations, and their application to representing musical knowledge raises particularly interesting questions. In this paper, we describe an approach to learning expressive performance rules from monophonic recordings of jazz standards by a skilled saxophonist. We have first developed a melodic transcription system which extracts a set of acoustic features from the recordings, producing a melodic representation of the expressive performance played by the musician. We apply machine learning techniques, namely inductive logic programming, to this representation in order to induce first-order logic rules of expressive music performance.
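For a flavor of what such a rule expresses, the sketch below renders one hypothetical if-then expressive rule as a Python function; the induced rules in the paper are first-order logic clauses, so this is only an informal paraphrase, and both the condition and the prescribed transformation are invented.

```python
# Illustrative paraphrase of one if-then expressive performance rule.

def lengthen_rule(note_context):
    """IF the note ends a descending phrase AND falls on a strong beat
    THEN lengthen it; otherwise the rule does not fire."""
    if (note_context["phrase_position"] == "end"
            and note_context["prev_interval"] < 0
            and note_context["metrical_strength"] >= 0.5):
        return {"duration_ratio": 1.2}   # play ~20% longer than notated
    return None                          # rule does not apply

ctx = {"phrase_position": "end", "prev_interval": -2, "metrical_strength": 0.75}
print(lengthen_rule(ctx))  # -> {'duration_ratio': 1.2}
```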
ACM Multimedia | 2013
Oscar Mayor; Quim Llimona; Marco Marchini; Panagiotis Papiotis; Esteban Maestre
In this technical demo we present repoVizz (http://repovizz.upf.edu), an integrated online system capable of structural formatting and remote storage, browsing, exchange, annotation, and visualization of synchronous multi-modal, time-aligned data. Motivated by a growing need for data-driven collaborative research, repoVizz aims to resolve commonly encountered difficulties in sharing or browsing large collections of multi-modal data. In its current state, repoVizz is designed to hold time-aligned streams of heterogeneous data: audio, video, motion capture, physiological signals, extracted descriptors, annotations, et cetera. Most popular formats for audio and video are supported, while Broadcast WAVE or CSV formats are adopted for streams other than audio or video (e.g., motion capture or physiological signals). The data itself is structured via customized XML files, allowing the user to (re-)organize multi-modal data in any hierarchical manner, as the XML structure only holds metadata and pointers to data files. Datasets are stored in an online database, allowing the user to interact with the data remotely through a powerful HTML5 visual interface accessible from any standard web browser; this feature can be considered a key aspect of repoVizz, since data can be explored, annotated, or visualized from any location or device. Data exchange and upload/download is made easy and secure via a number of data conversion tools and a user/permission management system.
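The sketch below illustrates the general idea of an XML description that holds only metadata and pointers to the actual data files; the element and attribute names are invented for illustration and do not reflect the actual repoVizz schema.

```python
# Toy XML structure with metadata and file pointers only (not repoVizz's schema).

import xml.etree.ElementTree as ET

root = ET.Element("Recording", name="quartet_take1")
audio = ET.SubElement(root, "Stream", type="audio", file="violin1.wav")
mocap = ET.SubElement(root, "Stream", type="mocap", file="bow_markers.csv")
ET.SubElement(audio, "Annotation", file="note_onsets.csv")

print(ET.tostring(root, encoding="unicode"))
```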
IEEE Transactions on Evolutionary Computation | 2012
Rafael Ramirez; Esteban Maestre; Xavier Serra
We describe an evolutionary approach to one of the most challenging problems in computer music: modeling how skilled musicians manipulate sound properties such as timing and amplitude in order to express their view of the emotional content of musical pieces. Starting with a collection of audio recordings of real performances, we apply a sequential-covering genetic algorithm in order to obtain computational models for different aspects of expressive performance. We use these models to automatically synthesize performances with the timing and energy expressiveness that characterizes the music generated by a professional musician. The reported results indicate that evolutionary computation is an appropriate technique for solving the problem considered. Specifically, our evolutionary algorithm provides a number of potential advantages over other supervised learning algorithms, such as a method for non-deterministically obtaining models capturing different possible interpretations of a musical piece.
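The sequential-covering scheme itself can be sketched compactly: learn one rule, remove the examples it covers, repeat. In the sketch below the rule-learning step is a placeholder rather than the paper's genetic algorithm, and the data and prediction values are invented.

```python
# Sketch of sequential covering; learn_one_rule stands in for the GA step.

def learn_one_rule(examples):
    """Placeholder: return (covers, predict) for one rule over the examples."""
    pivot = examples[0]["duration"]
    covers = lambda e: abs(e["duration"] - pivot) < 0.1
    predict = lambda e: 1.1            # e.g. a fixed duration-ratio prediction
    return covers, predict

def sequential_covering(examples):
    """Repeatedly induce a rule and drop the examples it covers."""
    rules, remaining = [], list(examples)
    while remaining:
        covers, predict = learn_one_rule(remaining)
        rules.append((covers, predict))
        remaining = [e for e in remaining if not covers(e)]
    return rules

data = [{"duration": 0.5}, {"duration": 0.52}, {"duration": 1.0}]
print(len(sequential_covering(data)))  # -> 2 rules induced
```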
Journal of the Acoustical Society of America | 2013
Esteban Maestre
A prominent challenge in instrumental sound synthesis is to reproduce the expressive nuances naturally conveyed by a musician when controlling a musical instrument. Despite the flexibility offered by physical modeling synthesis, appropriately mapping score annotations to sound synthesis controls still remains an interesting research problem, especially for the case of excitation-continuous instruments. Here we present our work on modeling bowing control in violin performance, and its application to sound synthesis via physical models. Minimally invasive sensing techniques allow for accurate acquisition of relevant timbre-related bowing control parameter signals. The temporal contours of bowing control parameters (bow velocity, bow force, and bow-bridge distance) are represented as sequences of low-order polynomial curves. A database of parametric representations of real performance data is used to construct a generative model able to synthesize bowing controls from an annotated score. Synthetic bowing con...