Publication


Featured research published by Odile Mella.


Computer Speech & Language | 2010

A wavelet-based parameterization for speech/music discrimination

Emmanuel Didiot; Irina Illina; Dominique Fohr; Odile Mella

This paper addresses the problem of parameterization for speech/music discrimination. The current successful parameterization based on cepstral coefficients uses the Fourier transform (FT), which is well adapted for stationary signals. In order to take into account the non-stationarity of music/speech signals, this work proposes to study wavelet-based signal decomposition instead of FT. Three wavelet families and several numbers of vanishing moments have been evaluated. Different types of energy, calculated for each frequency band obtained from wavelet decomposition, are studied. Static, dynamic and long-term parameters were evaluated. The proposed parameterizations are integrated into two class/non-class classifiers: one for speech/non-speech, one for music/non-music. Different experiments on realistic corpora, including different styles of speech and music (Broadcast News, Entertainment, Scheirer), illustrate the performance of the proposed parameterization, especially for music/non-music discrimination. Our parameterization yielded a significant reduction of the error rate. More than 30% relative improvement was obtained for the envisaged tasks compared to MFCC parameterization.


Speech Communication | 2006

Optimizing the coverage of a speech database through a selection of representative speaker recordings

Sacha Krstulovic; Frédéric Bimbot; Olivier Boëffard; Delphine Charlet; Dominique Fohr; Odile Mella

In the context of the Neologos French speech database creation project, we have defined a general methodology for the selection of representative speaker recordings. The selection aims at ensuring a good coverage in terms of speaker variability while limiting the number of recorded speakers. This makes the resulting database both better suited to the development of recently proposed multi-model methods and cheaper to collect. The presented methodology proposes to perform a selection by optimizing a quality criterion defined in a variety of speaker similarity modeling frameworks. The selection can be performed and validated with respect to a unique similarity criterion, using classical clustering methods such as Hierarchical or K-Medians clustering, or it can be performed and validated across several speaker similarity criteria, thanks to a newly developed clustering method called Focal Speakers Selection. In this framework, four different speaker similarity criteria are tested, and three different speaker clustering algorithms are compared. Results pertaining to the collection of the Neologos database are also discussed.


International Journal of Pattern Recognition and Artificial Intelligence | 2011

FRAME-SYNCHRONOUS AND LOCAL CONFIDENCE MEASURES FOR AUTOMATIC SPEECH RECOGNITION

Joseph Razik; Odile Mella; Dominique Fohr; Jean Paul Haton

In this paper, we introduce two new confidence measures for large vocabulary speech recognition systems. The major feature of these measures is that they can be computed without waiting for the end of the audio stream. We propose two kinds of confidence measures: frame-synchronous and local. The frame-synchronous ones can be computed as soon as a frame is processed by the recognition engine and are based on a likelihood ratio. The local measures estimate a local posterior probability in the vicinity of the word to analyze. We evaluated our confidence measures within the framework of the automatic transcription of French broadcast news with the EER criterion. Our local measures achieved results very close to the best state-of-the-art measure (EER of 23% compared to 22.0%). We then conducted a preliminary experiment to assess the contribution of our confidence measure in improving the comprehension of an automatic transcription for the hearing impaired. We introduced several modalities to highlight words of low confidence in this transcription. We showed that these modalities used with our local confidence measure improved the comprehension of automatic transcription.


conference of the international speech communication association | 2015

Qualitative investigation of the display of speech recognition results for communication with deaf people

Agnès Piquard-Kipffer; Odile Mella; Jérémy Miranda; Denis Jouvet; Luiza Orosanu

Speech technologies provide ways of helping people with hearing loss by improving their autonomy. This study focuses on a French-language application developed in the collaborative project RAPSODIE in order to improve communication between a hearing person and a deaf or hard-of-hearing person. Our goal is to investigate different ways of displaying the speech recognition results that also take into account the reliability of the recognized items. In this qualitative study, 10 people were interviewed to find the best way of displaying the speech transcription results. All the participants are deaf, with different levels of hearing loss and various modes of communication.


Procedia Computer Science | 2017

Development of the Arabic Loria Automatic Speech Recognition system (ALASR) and its evaluation for Algerian dialect

Mohamed Amine Menacer; Odile Mella; Dominique Fohr; Denis Jouvet; David Langlois; Kamel Smaïli

This paper addresses the development of an Automatic Speech Recognition system for Modern Standard Arabic (MSA) and its extension to Algerian dialect. Algerian dialect is very different from Arabic dialects of the Middle-East, since it is highly influenced by the French language. In this article, we start by presenting the new automatic speech recognition system named ALASR (Arabic Loria Automatic Speech Recognition). The acoustic model of ALASR is based on a DNN approach and the language model is a classical n-gram. Several options are investigated in this paper to find the best combination of models and parameters. ALASR achieves good results for MSA in terms of WER (14.02%), but it completely collapses on an Algerian dialect data set of 70 minutes (a WER of 89%). To take into account the impact of the French language on the Algerian dialect, we combine in ALASR two acoustic models, the original one (MSA) and a French one trained on the ESTER corpus. This solution has been adopted because no transcribed speech data for Algerian dialect are available. This combination leads to a substantial absolute reduction of 24% in the word error rate.


information sciences, signal processing and their applications | 2007

Frame-synchronous and local confidence measures for on-the-fly keyword spotting

Joseph Razik; Odile Mella; Dominique Fohr; Jean Paul Haton

This paper presents several new confidence measures for speech recognition applications. The major advantage of these measures is that they can be evaluated with only a part of the whole sentence. Two of these measures can be computed directly within the first step of the recognition process, synchronously with the decoding engine. Such measures are useful to drive the recognition process by modifying the likelihood score or to validate recognized words in on-the-fly applications such as keyword spotting and on-line automatic speech transcription for deaf people. Two kinds of results are given. Firstly, an EER evaluation on a French broadcast news corpus shows performance close to the batch version of these measures (an EER of 23.9% against 23.8%). Secondly, for the keyword spotting application, our best measure reduces the false-acceptance rate by 50% while reducing the number of correct words by only 5%.


conference of the international speech communication association | 2007

Speaker Diarization using Normalized Cross Likelihood Ratio

Viet Bac Le; Odile Mella; Dominique Fohr


conference of the international speech communication association | 2004

The Automatic News Transcription System: ANTS some Real Time experiments

Dominique Fohr; Odile Mella; Christophe Cerisara; Irina Illina


conference of the international speech communication association | 2000

The automatic speech recognition engine ESPERE : experiments on telephone speech

Dominique Fohr; Odile Mella; Christophe Antoine


language resources and evaluation | 2014

Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process

Camille Fauth; Anne Bonneau; Frank Zimmerer; Juergen Trouvain; Bistra Andreeva; Vincent Colotte; Dominique Fohr; Denis Jouvet; Jeanin Jügler; Yves Laprie; Odile Mella; Bernd Möbius

Collaboration


Dive into Odile Mella's collaborations.

Top Co-Authors

Anne Bonneau

Centre national de la recherche scientifique

Jean-Paul Haton

City University of Hong Kong

Camille Fauth

Centre national de la recherche scientifique

Jean Paul Haton

French Institute for Research in Computer Science and Automation

Christine Sénac

Centre national de la recherche scientifique
