Régine André-Obrecht
University of Toulouse
Publication
Featured research published by Régine André-Obrecht.
Multimedia Tools and Applications | 2014
Svebor Karaman; Jenny Benois-Pineau; Vladislavs Dovgalecs; Rémi Mégret; Julien Pinquier; Régine André-Obrecht; Yann Gaëstel; Jean-François Dartigues
This paper presents a method for indexing activities of daily living in videos acquired from wearable cameras. It addresses the problem of analyzing the complex multimedia data acquired from wearable devices, a growing concern given the increasing volume of such data. In the context of dementia diagnosis, patient activities are recorded in their home environment with a lightweight wearable device, to be later reviewed by medical practitioners. This recording mode poses great challenges, since the video data consists of a single sequence shot in which strong motion and sharp lighting changes frequently appear. Because of the length of the recordings, tools for efficient navigation in terms of activities of interest are crucial. Our work introduces a video structuring approach that combines automatic motion-based segmentation of the video with activity recognition by a hierarchical two-level Hidden Markov Model. We define a multi-modal description space over visual and audio features, including mid-level features such as motion, location, speech and noise detections. We show their complementarity both globally and for specific activities. Experiments on real data obtained from recordings of several patients at home show the difficulty of the task and the promising results of the proposed approach.
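A minimal sketch of the general idea of HMM-based activity recognition over multimodal feature vectors, not the authors' exact hierarchical two-level model: one Gaussian HMM is trained per activity, and a new segment is labelled by the model giving the highest log-likelihood. Feature extraction and segmentation are assumed to be done upstream.

```python
# Sketch only: per-activity Gaussian HMMs, maximum-likelihood labelling of a segment.
import numpy as np
from hmmlearn import hmm

def train_activity_models(segments_by_activity, n_states=4):
    """segments_by_activity: dict activity -> list of (n_frames, n_features) arrays."""
    models = {}
    for activity, segments in segments_by_activity.items():
        X = np.vstack(segments)
        lengths = [len(s) for s in segments]
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[activity] = m
    return models

def classify_segment(models, segment):
    """Return the activity whose HMM gives the highest log-likelihood for the segment."""
    return max(models, key=lambda a: models[a].score(segment))
```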
acm multimedia | 2010
Rémi Mégret; Vladislavs Dovgalecs; Hazem Wannous; Svebor Karaman; Jenny Benois-Pineau; Elie Khoury; Julien Pinquier; Philippe Joly; Régine André-Obrecht; Yann Gaëstel; Jean-François Dartigues
In this paper, we describe a new application of multimedia indexing: a system that monitors the instrumental activities of daily living in order to assess the cognitive decline caused by dementia. The system is composed of a wearable camera device designed to capture audio and video data of a patient's instrumental activities, combined with multimedia indexing techniques that allow medical specialists to analyze several-hour-long observation shots efficiently.
Proceedings of the 2010 international workshop on Searching spontaneous conversational speech | 2010
Benjamin Bigot; Isabelle Ferrané; Julien Pinquier; Régine André-Obrecht
In the audio indexing context, we present our recent contributions to the field of speaker role recognition, applied in particular to conversational speech. We assume that clues about roles such as Anchor, Journalist or Other exist in temporal, acoustic and prosodic features extracted from the results of speaker segmentation and from the audio files. In this paper, investigations are carried out on the EPAC corpus, which mainly contains conversational documents. First, an automatic clustering approach is used to validate the proposed features and the role definitions. In a second study, we propose a hierarchical supervised classification system, and investigate the use of dimensionality reduction methods as well as feature selection. This system correctly classifies 92% of speaker roles.
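As an illustration of the first, unsupervised step described above, the sketch below clusters per-speaker feature vectors and cross-tabulates the clusters against annotated roles to check whether the roles emerge as natural groups. The feature matrix and its contents are assumed, not taken from the paper.

```python
# Illustrative sketch: unsupervised clustering of speaker feature vectors vs. role labels.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import confusion_matrix

def cluster_speaker_roles(features, roles, n_clusters=3):
    """features: (n_speakers, n_features) array; roles: list of role labels per speaker."""
    X = StandardScaler().fit_transform(features)
    clusters = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    labels = sorted(set(roles))
    y = np.array([labels.index(r) for r in roles])
    # Contingency table between annotated roles (rows) and clusters (columns)
    return confusion_matrix(y, clusters)
```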
international conference on acoustics, speech, and signal processing | 2013
Patrice Guyot; Julien Pinquier; Régine André-Obrecht
This article describes an audio signal processing algorithm to detect water sounds, built in the context of a larger system aiming to monitor the daily activities of elderly people. While previous proposals for water sound recognition relied on classical machine learning and generic audio features to characterize water sounds as a flow texture, we describe here a recognition system based on a physical model of air bubble acoustics. This system is able to recognize a wide variety of water sounds and does not require training. It is validated on a home environmental sound corpus with a classification task, in which all water sounds are correctly detected. In a free detection task on a real-life recording, it outperforms the classical systems and obtains an F-measure of 70%.
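A sketch of the physical intuition behind bubble-based water sound detection: the Minnaert resonance relates an air bubble's radius to the frequency it radiates when excited, so a detector can look for short tonal components whose frequencies correspond to realistic bubble sizes. The radius range and detection logic below are illustrative assumptions, not the published algorithm.

```python
# Minnaert resonance of an air bubble in water, and a plausibility check for tonal components.
import numpy as np

GAMMA = 1.4        # adiabatic index of air
P0 = 101325.0      # ambient pressure (Pa)
RHO = 998.0        # density of water (kg/m^3)

def minnaert_frequency(radius_m):
    """Resonance frequency (Hz) of an air bubble of the given radius in water."""
    return np.sqrt(3 * GAMMA * P0 / RHO) / (2 * np.pi * radius_m)

def plausible_bubble(frequency_hz, r_min=2e-4, r_max=5e-3):
    """True if a tonal component could come from a bubble of realistic size."""
    f_max = minnaert_frequency(r_min)   # small bubbles resonate at high frequencies
    f_min = minnaert_frequency(r_max)   # large bubbles resonate at low frequencies
    return f_min <= frequency_hz <= f_max

# Example: a 1 mm bubble resonates near 3.3 kHz
print(round(minnaert_frequency(1e-3)))   # ~3287 Hz
```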
Multimedia Tools and Applications | 2012
Benjamin Bigot; Isabelle Ferrané; Julien Pinquier; Régine André-Obrecht
In the field of automatic audiovisual content-based indexing and structuring, finding events like interviews, debates, reports, or live commentaries requires bridging the gap between low-level feature extraction and such high-level event detection. In our work, we consider that detecting speaker roles like Anchor, Journalist and Other is a first step towards enriching interaction sequences between speakers. Our work relies on the assumption that clues about speaker roles exist in temporal, prosodic and basic signal features extracted from audio files and from speaker segmentations. Each speaker is therefore represented by a 36-feature vector. Contrary to most state-of-the-art propositions, we do not use the structure of the document to recognize the speakers' roles. We investigate the influence of two dimensionality reduction techniques (Principal Component Analysis and Linear Discriminant Analysis) and different classification methods (Gaussian Mixture Models, K-nearest neighbours and Support Vector Machines). Experiments are done on the 13-hour corpus of the ESTER2 evaluation campaign. The best result reaches about 82% of correctly recognized roles, which corresponds to more than 89% of speech duration correctly labelled.
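A minimal sketch of the classification setup described above (36 features per speaker, dimensionality reduction, then a classifier). The hyperparameters, feature extraction and the choice of an SVM with an RBF kernel are placeholders for illustration, not the paper's exact configuration.

```python
# Sketch: 36-dimensional speaker vectors -> PCA or LDA -> classifier.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def role_classifier(reduction="lda"):
    # With three role classes, LDA can project onto at most 2 dimensions.
    reducer = LinearDiscriminantAnalysis(n_components=2) if reduction == "lda" \
        else PCA(n_components=10)
    return make_pipeline(StandardScaler(), reducer, SVC(kernel="rbf"))

# X: (n_speakers, 36) feature matrix, y: role labels ("anchor", "journalist", "other")
# scores = cross_val_score(role_classifier("lda"), X, y, cv=5)
```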
content based multimedia indexing | 2012
Patrice Guyot; Julien Pinquier; Régine André-Obrecht
This paper presents a new system for water flow detection on real-life recordings and its application to a medical context. The recognition system is based on an original feature for sound event detection in real life. This feature, called "spectral cover", shows interesting behaviour for recognizing water flow in a noisy environment. The system is based only on thresholds; it is simple, robust, and can be used on any corpus without training. An experiment is carried out on more than 7 hours of video recorded by a wearable device. Our system obtains good results for water flow event recognition (F-measure of 66%). A comparison with classical approaches using MFCC or low-level descriptors with GMM classifiers attests to the good performance of our system. Adding the spectral cover to the low-level descriptors also improves their performance and confirms that this feature is relevant.
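An illustrative sketch of a threshold-based detector in the spirit of the spectral cover idea: water flow tends to excite a broad portion of the spectrum, so a simple per-frame measure of how much of the spectrum is "covered" can be thresholded without any training. The exact feature definition and threshold values below are assumptions, not the published ones.

```python
# Sketch: per-frame broadband "cover" measure on an STFT, thresholded without training.
import numpy as np
import librosa

def spectral_cover_like(y, sr, bin_ratio=0.1, cover_threshold=0.5):
    """Return a boolean per-frame decision: True where the spectrum is broadly 'covered'."""
    S = np.abs(librosa.stft(y, n_fft=1024, hop_length=512))
    # Fraction of bins whose magnitude exceeds bin_ratio * the frame's maximum
    cover = (S > bin_ratio * S.max(axis=0, keepdims=True)).mean(axis=0)
    return cover > cover_threshold
```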
International Conference on Statistical Language and Speech Processing | 2017
Abdelwahab Heba; Thomas Pellegrini; Tom Jorquera; Régine André-Obrecht; Jean-Pierre Lorré
Expressiveness and non-verbal information in speech are active research topics in speech processing. In this work, we are interested in detecting emphasis at word level as a means of identifying the focus words in a given utterance. We compare several machine learning techniques (Linear Discriminant Analysis, Support Vector Machines, Neural Networks) for this task, carried out on SIWIS, a French speech synthesis database. Our approach consists first in aligning the spoken words to the speech signal and second in feeding a classifier with filter bank coefficients in order to take a binary decision at word level: neutral/emphasized. Evaluation results show that a three-layer neural network performed best, with 93% accuracy.
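A minimal sketch of the pipeline described above, not the authors' exact architecture or parameters: log filter bank features are pooled over each aligned word and fed to a small feed-forward network for a binary neutral/emphasized decision. Word alignment is assumed to be available upstream.

```python
# Sketch: per-word mean log filter-bank vectors -> small MLP -> neutral/emphasized.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def word_features(y, sr, word_bounds, n_mels=40, hop=160):
    """word_bounds: list of (start_s, end_s); returns one mean filter-bank vector per word."""
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels, hop_length=hop)
    logfb = librosa.power_to_db(mel)
    frames = librosa.time_to_frames([t for se in word_bounds for t in se], sr=sr, hop_length=hop)
    frames = frames.reshape(-1, 2)
    # Guard against zero-length words after frame rounding
    return np.stack([logfb[:, s:max(e, s + 1)].mean(axis=1) for s, e in frames])

# clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
# clf.fit(word_features(y, sr, train_bounds), train_labels)
```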
content based multimedia indexing | 2016
Marwa Thlithi; Julien Pinquier; Thomas Pellegrini; Régine André-Obrecht
Audio segmentation is often the first step of audio indexing systems. It provides segments supposed to be acoustically homogeneous. In this paper, we report our recent experiments on segmenting music recordings into singer turns, by analogy with speaker turns in speech processing. We compare several acoustic features for this task: filter bank coefficients (FBANK) and Mel frequency cepstral coefficients (MFCC). FBANK features were shown to outperform MFCC on a "clean" singing corpus. We describe a coefficient selection method that allowed further improvement on this corpus. A 75.8% F-measure was obtained with FBANK features selected with this method, corresponding to a 30.6% absolute gain compared to MFCC. On another corpus, composed of ethno-musicological recordings, both feature types showed a similar performance of about 60%. This corpus presents an increased difficulty due to instruments overlapping with the singing and to lower audio recording quality.
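For reference, the sketch below extracts the two feature types compared above with librosa: log mel filter bank energies (FBANK) and the MFCC derived from them. The frame and filter bank parameters are illustrative, not those of the paper.

```python
# Sketch: FBANK (log mel energies) and MFCC extraction from the same mel spectrogram.
import librosa

def fbank_and_mfcc(y, sr, n_mels=40, n_mfcc=13):
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    fbank = librosa.power_to_db(mel)                      # (n_mels, n_frames)
    mfcc = librosa.feature.mfcc(S=fbank, n_mfcc=n_mfcc)   # (n_mfcc, n_frames)
    return fbank, mfcc
```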
international conference on acoustics, speech, and signal processing | 2013
Hélène Lachambre; Lionel Koenig; Régine André-Obrecht
In the context of acoustic-to-articulatory inversion, various unsupervised HMM-based feature-mapping methods are assessed and compared. In a previous study, we introduced an unsupervised HMM as an alternative model to the phone-HMM. We propose here to evaluate this approach using different inversion methods, in order to assess the behavior of our model and its compatibility with the most efficient inversion algorithms available. The best configuration leads to a root mean square error (up to 1.44 mm) similar to that of the phoneme-based HMM.
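The evaluation metric quoted above is the root mean square error, in millimetres, between predicted and measured articulatory trajectories (e.g. EMA coil coordinates). A small sketch, with hypothetical data handling:

```python
# Sketch: RMSE between predicted and measured articulatory trajectories, in mm.
import numpy as np

def articulatory_rmse(predicted_mm, measured_mm):
    """Both arrays: (n_frames, n_articulatory_dims), expressed in millimetres."""
    return float(np.sqrt(np.mean((predicted_mm - measured_mm) ** 2)))
```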
european signal processing conference | 2009
Hélène Lachambre; Régine André-Obrecht; Julien Pinquier