Publication


Featured research published by Arshia Cont.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2010

A Coupled Duration-Focused Architecture for Real-Time Music-to-Score Alignment

Arshia Cont

The capacity for real-time synchronization and coordination is a common ability among trained musicians performing a music score that presents an interesting challenge for machine intelligence. Compared to speech recognition, which has influenced many music information retrieval systems, music's temporal dynamics and complexity pose challenging problems to common approximations regarding time modeling of data streams. In this paper, we propose a design for a real-time music-to-score alignment system. Given a live recording of a musician playing a music score, the system is capable of following the musician in real time within the score and decoding the tempo (or pace) of its performance. The proposed design features two coupled audio and tempo agents within a unique probabilistic inference framework that adaptively updates its parameters based on the real-time context. Online decoding is achieved through the collaboration of the coupled agents in a Hidden Hybrid Markov/semi-Markov framework, where prediction feedback of one agent affects the behavior of the other. We perform evaluations for both real-time alignment and the proposed temporal model. An implementation of the presented system has been widely used in real concert situations worldwide and the readers are encouraged to access the actual system and experiment with the results.


International Conference on Acoustics, Speech, and Signal Processing | 2006

Realtime Audio to Score Alignment for Polyphonic Music Instruments, using Sparse Non-Negative Constraints and Hierarchical HMMS

Arshia Cont

We present a new method for realtime alignment of audio to score for polyphonic music signals. In this paper, we focus mostly on the proposed multiple-pitch observation algorithm, which is based on realtime non-negative matrix factorization with sparseness constraints, and on hierarchical hidden Markov models for sequential modeling, using particle filtering for decoding. The proposed algorithm has the advantage of having an explicit instrument model for pitch obtained through unsupervised learning, as well as access to the single-note contribution probabilities which construct a complex chord, instead of modeling the chord as one event.


International Conference on Acoustics, Speech, and Signal Processing | 2011

A unified approach to real time audio-to-score and audio-to-audio alignment using sequential Montecarlo inference techniques

Nicola Montecchio; Arshia Cont

We present a methodology for the real time alignment of music signals using sequential Montecarlo inference techniques. The alignment problem is formulated as the state tracking of a dynamical system, and differs from traditional Hidden Markov Model - Dynamic Time Warping based systems in that the hidden state is continuous rather than discrete. The major contribution of this paper is addressing both problems of audio-to-score and audio-to-audio alignment within the same framework in a real time setting. Performances of the proposed methodology on both problems are then evaluated and discussed.
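The idea of tracking a continuous hidden state with sequential Monte Carlo inference can be illustrated with a small particle filter. The following is only a minimal sketch under strong assumptions and is not the authors' system: the hidden state is taken to be (score position in beats, tempo in beats per second), each audio frame is reduced to a single observed pitch value, and the names `expected_pitch` and `particle_filter_align`, the noise levels, and the score representation are all hypothetical choices made for illustration.

```python
import math
import random


def expected_pitch(score, pos):
    """Pitch of the note sounding at score position `pos` (in beats).

    `score` is a list of (onset_in_beats, pitch) pairs sorted by onset.
    """
    pitch = score[0][1]
    for onset, p in score:
        if onset <= pos:
            pitch = p
        else:
            break
    return pitch


def particle_filter_align(score, observations, dt, n_particles=500, seed=0):
    """Track (position, tempo) over a stream of observed pitches.

    Returns one weighted-mean (position, tempo) estimate per observation.
    All noise parameters below are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each particle is [position in beats, tempo in beats per second].
    particles = [[0.0, rng.uniform(1.0, 3.0)] for _ in range(n_particles)]
    estimates = []
    for obs in observations:
        weights = []
        for p in particles:
            # Transition: tempo drifts slowly, position advances by tempo * dt.
            p[1] = max(0.1, p[1] + rng.gauss(0.0, 0.05))
            p[0] = max(0.0, p[0] + p[1] * dt + rng.gauss(0.0, 0.01))
            # Gaussian likelihood of the observed pitch given the score,
            # with a small floor so the weights never sum to zero.
            err = obs - expected_pitch(score, p[0])
            weights.append(math.exp(-0.5 * (err / 0.5) ** 2) + 1e-12)
        total = sum(weights)
        weights = [w / total for w in weights]
        # Point estimate: weighted mean of the particle cloud.
        pos = sum(w * p[0] for w, p in zip(weights, particles))
        tempo = sum(w * p[1] for w, p in zip(weights, particles))
        estimates.append((pos, tempo))
        # Multinomial resampling; copy so particles evolve independently.
        chosen = rng.choices(particles, weights=weights, k=n_particles)
        particles = [list(p) for p in chosen]
    return estimates
```

Because the state is continuous, the same loop could in principle follow either a symbolic score or a reference recording by swapping the likelihood function; here only the symbolic case is sketched.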


Archive | 2013

Real-Time Detection of Overlapping Sound Events with Non-Negative Matrix Factorization

Arnaud Dessein; Arshia Cont; Guillaume Lemaitre

In this paper, we investigate the problem of real-time detection of overlapping sound events by employing non-negative matrix factorization techniques. We consider a setup where audio streams arrive in real-time to the system and are decomposed onto a dictionary of event templates learned off-line prior to the decomposition. An important drawback of existing approaches in this context is the lack of controls on the decomposition. We propose and compare two provably convergent algorithms that address this issue, by controlling respectively the sparsity of the decomposition and the trade-off of the decomposition between the different frequency components. Sparsity regularization is considered in the framework of convex quadratic programming, while frequency compromise is introduced by employing the beta-divergence as a cost function. The two algorithms are evaluated on the multi-source detection tasks of polyphonic music transcription, drum transcription and environmental sound recognition. The obtained results show how the proposed approaches can improve detection in such applications, while maintaining low computational costs that are suitable for real-time.
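To make the decomposition step concrete, the sketch below projects one incoming spectrum frame onto a fixed dictionary of templates using the standard multiplicative update for the beta-divergence. This is the well-known heuristic update, not the provably convergent algorithms proposed in the paper, and it has no sparsity control. The function name `decompose_frame` and the plain list-of-lists layout are assumptions made for illustration.

```python
def decompose_frame(v, W, beta=1.0, n_iter=200, eps=1e-12):
    """Estimate non-negative activations h such that W @ h approximates v.

    v    : list of F non-negative spectrum values (one audio frame)
    W    : F x K list of lists; column k is an event template learned off-line
    beta : beta-divergence parameter (2 = Euclidean, 1 = Kullback-Leibler,
           0 = Itakura-Saito)
    """
    n_bins, n_templates = len(W), len(W[0])
    h = [1.0] * n_templates
    for _ in range(n_iter):
        # Current reconstruction of the frame; eps avoids division by zero.
        approx = [
            sum(W[f][k] * h[k] for k in range(n_templates)) + eps
            for f in range(n_bins)
        ]
        # Multiplicative update: all h[k] use the same reconstruction.
        for k in range(n_templates):
            num = sum(
                W[f][k] * approx[f] ** (beta - 2) * v[f] for f in range(n_bins)
            )
            den = sum(
                W[f][k] * approx[f] ** (beta - 1) for f in range(n_bins)
            ) + eps
            h[k] *= num / den
    return h
```

In a streaming setting, this function would be called once per frame with the dictionary `W` held fixed, and an event would be reported when its activation exceeds a threshold.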


Discrete Event Dynamic Systems | 2013

Operational semantics of a domain specific language for real time musician-computer interaction

José Echeveste; Arshia Cont; Jean-Louis Giavitto; Florent Jacquemard

With the advent and availability of powerful personal computing, computer music research and industry have been focusing on real-time musical interactions between musicians and computers, delegating human-like actions to computers that interact with a musical environment. One common use case of this kind is Automatic Accompaniment, where the system comprises a real-time machine listening system that, in reaction to the recognition of score events from a human performer, launches the necessary actions for the accompaniment section. While the real-time detection of score events from live musicians’ performance has been widely addressed in the literature, score accompaniment (or the reactive part of the process) has rarely been discussed. This paper deals with this missing component in the literature from a formal language perspective. We show how language considerations would enable better authoring of time and interaction during programming/composing and how it addresses critical aspects of a musical performance (such as errors) in real-time. We sketch the real-time features required by automatic musical accompaniment seen as a reactive system. We formalize the timing strategies for musical events taking into account the various temporal scales used in music. Various strategies for the handling of synchronization constraints and the handling of errors are presented. We give a formal semantics to model the possible behaviors of the system in terms of Parametric Timed Automata.


Proceedings of the 1st ACM Workshop on Audio and Music Computing Multimedia | 2006

OMax brothers: a dynamic topology of agents for improvization learning

Gérard Assayag; Georges Bloch; Marc Chemillier; Arshia Cont; Shlomo Dubnov

We describe a multi-agent architecture for an improvization oriented musician-machine interaction system that learns in real time from human performers. The improvization kernel is based on sequence modeling and statistical learning. The working system involves a hybrid architecture using two popular composition/performance environments, Max and OpenMusic, that are put to work and communicate together, each one handling the process at a different time/memory scale. The system is capable of processing real-time audio/video as well as MIDI. After discussing the general cognitive background of improvization practices, the statistical modeling tools and the concurrent agent architecture are presented. Finally, a prospective Reinforcement Learning scheme for enhancing the system's realism is described.


IEEE Transactions on Audio, Speech, and Language Processing | 2011

On the Information Geometry of Audio Streams With Applications to Similarity Computing

Arshia Cont; Shlomo Dubnov; Gérard Assayag

This paper proposes methods for information processing of audio streams using methods of information geometry. We lay the theoretical groundwork for a framework allowing the treatment of signal information as information entities, suitable for similarity and symbolic computing on audio signals. The theoretical basis of this paper is based on the information geometry of statistical structures representing audio spectrum features, and specifically through the bijection between the generic families of Bregman divergences and that of exponential distributions. The proposed framework, called Music Information Geometry, allows online segmentation of audio streams to metric balls where each ball represents a quasi-stationary continuous chunk of audio, and discusses methods to qualify and quantify information between entities for similarity computing. We define an information geometry that approximates a similarity metric space, redefine general notions in music information retrieval such as similarity between entities, and address methods for dealing with nonstationarity of audio signals. We demonstrate the framework on two sample applications for online audio structure discovery and audio matching.
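The link between Bregman divergences and exponential families mentioned above can be checked numerically in a simple case. The sketch below is an illustration only, not the paper's framework: it assumes audio frames are represented as normalized spectra (discrete distributions), and shows that the Bregman divergence generated by the negative entropy coincides with the Kullback-Leibler divergence on such frames. All function names are hypothetical.

```python
import math


def bregman_divergence(F, grad_F, p, q):
    """Generic Bregman divergence D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_F(q), p, q))
    return F(p) - F(q) - inner


def neg_entropy(x):
    """Convex generator F(x) = sum_i x_i log x_i (requires x_i > 0)."""
    return sum(xi * math.log(xi) for xi in x)


def grad_neg_entropy(x):
    """Gradient of the negative entropy: d/dx_i = log x_i + 1."""
    return [math.log(xi) + 1.0 for xi in x]


def kl_divergence(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

For example, with `p = [0.2, 0.3, 0.5]` and `q = [0.4, 0.4, 0.2]`, `bregman_divergence(neg_entropy, grad_neg_entropy, p, q)` and `kl_divergence(p, q)` agree up to rounding. The divergence is not symmetric in its arguments, which is one reason a framework built on it only approximates a metric space.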


Simulation of Adaptive Behavior | 2007

Anticipatory Model of Musical Style Imitation Using Collaborative and Competitive Reinforcement Learning

Arshia Cont; Shlomo Dubnov; Gérard Assayag

The role of expectation in listening to and composing music has drawn much attention in music cognition for about half a century. In this paper, we provide a first attempt to model some aspects of musical expectation, specifically those pertaining to short-time and working memories, in an anticipatory framework. In our proposition, anticipation is the mental realization of possible predicted actions and their effect on the perception of the world at an instant in time. We demonstrate the model in applications to automatic improvisation and style imitation. The proposed model, based on cognitive foundations of musical expectation, is an active model using reinforcement learning techniques with multiple agents that learn competitively and in collaboration. We show that compared to similar models, this anticipatory framework needs little training data and demonstrates complex musical behavior such as long-term planning and formal shapes as a result of the anticipatory architecture. We provide sample results and discuss further research.


International Conference on Acoustics, Speech, and Signal Processing | 2005

Training Ircam's score follower [audio to musical score alignment system]

Arshia Cont; Diemo Schwarz; Norbert Schnell

This paper describes our attempt to make the hidden Markov model (HMM) score following system, developed at Ircam, sensitive to past experiences in order to obtain better real-time audio-to-score alignment for musical applications. A new observation model based on Gaussian mixture models is developed, which is trainable using a learning algorithm we call automatic discriminative training. The novelty of this system lies in the fact that this method, unlike classical methods for HMM training, is not concerned with modeling the music signal but with correctly choosing the sequence of music events that was performed. Besides obtaining better alignment, the new system's parameters are controllable in a physical manner and the training algorithm learns different styles of music performance as discussed.


IEEE Signal Processing Letters | 2013

An Information-Geometric Approach to Real-Time Audio Segmentation

Arnaud Dessein; Arshia Cont

We present a generic approach to real-time audio segmentation in the framework of information geometry for exponential families. The proposed system detects changes by monitoring the information rate of the signals as they arrive in time. We also address shortcomings of traditional cumulative sum approaches to change detection, which assume known parameters before change. This is done by considering exact generalized likelihood ratio test statistics, with a complete estimation of the unknown parameters in the respective hypotheses. We derive an efficient sequential scheme to compute these statistics through convex duality. We finally provide results for speech segmentation into speakers, and polyphonic music segmentation into note slices.
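A generalized likelihood ratio (GLR) test with unknown parameters on both sides of the change can be sketched for the simplest exponential family. The example below assumes scalar Gaussian samples with a known standard deviation and unknown means before and after the change; it scans a fixed window rather than implementing the paper's sequential scheme or its convex-duality formulation. The function name `glr_change_point` is hypothetical.

```python
def glr_change_point(x, sigma=1.0):
    """Scan all split points of window `x` and return (index, statistic).

    For a split after i samples, both segment means are replaced by their
    maximum likelihood estimates m0 and m1, and the log-likelihood ratio
    against the no-change hypothesis reduces to
        0.5 * i * (n - i) / n * (m0 - m1)**2 / sigma**2.
    Returns (None, 0.0) if no split has a positive statistic.
    """
    n = len(x)
    total = sum(x)
    best_idx, best_stat = None, 0.0
    prefix = 0.0
    for i in range(1, n):
        prefix += x[i - 1]
        m0 = prefix / i                # estimated mean before the change
        m1 = (total - prefix) / (n - i)  # estimated mean after the change
        stat = 0.5 * i * (n - i) / n * (m0 - m1) ** 2 / sigma ** 2
        if stat > best_stat:
            best_idx, best_stat = i, stat
    return best_idx, best_stat
```

A change would be declared when the returned statistic exceeds a chosen threshold, with the returned index marking the first sample of the new segment. Unlike a cumulative sum test, nothing about the pre-change mean needs to be known in advance.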

Collaboration


Dive into Arshia Cont's collaborations.

Top Co-Authors

Shlomo Dubnov

University of California

Grigore Burloiu

Politehnica University of Bucharest

Georges Bloch

University of Strasbourg

Guillaume Lemaitre

Centre national de la recherche scientifique
