Denis Jouvet
University of Lorraine
Publication
Featured research published by Denis Jouvet.
International Conference on Speech and Computer | 2013
Katarina Bartkova; Denis Jouvet
This paper presents an automatic approach for detecting the prosodic structures of speech utterances. The algorithm relies on a hierarchical representation of the prosodic organization of the utterances. The approach is applied to a corpus of French radio broadcast news and to radio and TV shows, which contain more spontaneous speech. The algorithm detects prosodic boundaries whether or not they are followed by a pause. The detection of prosodic boundaries and prosodic structures integrates little linguistic knowledge and relies mainly on the amplitude of the F0 slopes and the inversion of the slopes, as described in [1], as well as on phone durations. The automatic prosodic segmentation results are then compared to a manual prosodic segmentation made by an expert phonetician. Finally, the results obtained by this automatic approach provide insight into the most frequently used prosodic structures in the broadcast speech style as well as in a more spontaneous speech style.
Proceedings of the Third Arabic Natural Language Processing Workshop | 2017
Mohamed Amine Menacer; Odile Mella; Dominique Fohr; Denis Jouvet; David Langlois; Kamel Smaïli
Automatic speech recognition for Arabic is a very challenging task. Although the classical techniques for Automatic Speech Recognition (ASR) can be applied efficiently to Arabic speech recognition, it is essential to take the specificities of the language into account to improve system performance. In this article, we focus on Modern Standard Arabic (MSA) speech recognition. We introduce the challenges related to the Arabic language, namely its complex morphology and the absence of short vowels in written text, which leads to several potential, often conflicting, vowelizations for each grapheme. We develop an ASR system for MSA using the Kaldi toolkit. Several acoustic and language models are trained. We obtain a Word Error Rate (WER) of 14.42% for the baseline system and a 12.2% relative improvement by rescoring the lattice and by rewriting the output with the right hamza above or below Alif.
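To make the reported figures concrete, here is a minimal Python sketch of how a WER and a relative improvement are usually computed. It is not the authors' scoring pipeline: the function names are hypothetical, and it assumes that the figures in the abstract are percentages and that the relative improvement is a relative reduction of the baseline WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return 100.0 * prev[-1] / len(ref)


def wer_after_relative_gain(baseline_wer: float, relative_gain_percent: float) -> float:
    """WER left after a relative reduction (assumed reading of the abstract)."""
    return baseline_wer * (1.0 - relative_gain_percent / 100.0)


print(word_error_rate("a b c d", "a x c"))   # one substitution + one deletion
print(wer_after_relative_gain(14.42, 12.2))  # about 12.66 under the assumptions above
```

Under this reading, a 12.2% relative improvement over a 14.42% baseline would correspond to a WER of roughly 12.66%; the abstract itself does not state the final figure.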
SLSP 2015 Proceedings of the Third International Conference on Statistical Language and Speech Processing - Volume 9449 | 2015
Luiza Orosanu; Denis Jouvet
This article analyzes the automatic detection of sentence modality in French using both prosodic and linguistic information. The goal is to later use such an approach to support communication with deaf people. Two sentence modalities are evaluated: questions and statements. As linguistic features, we considered the presence of discriminative interrogative patterns and two log-likelihood ratios of the sentence being a question rather than a statement: one based on words and the other based on part-of-speech tags. The prosodic features are based on duration, energy and pitch features estimated over the last prosodic group of the sentence. The evaluations consider linguistic features stemming either from manual transcriptions or from an automatic speech transcription system. The behavior of various sets of features is analyzed and compared. The combination of linguistic and prosodic features gives a slight improvement on automatic transcriptions, where the correct classification performance reaches 72%.
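The word-based log-likelihood ratio mentioned in this abstract can be illustrated with a small Python sketch. This is an assumption-laden toy, not the authors' models: it uses add-one smoothed unigram models (the abstract does not specify the language model order), and the function names and training sentences are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Word counts for one class (questions or statements)."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts


def log_prob(sentence, counts, vocab_size, alpha=1.0):
    """Add-alpha smoothed unigram log-probability of a sentence."""
    total = sum(counts.values())
    return sum(math.log((counts[w] + alpha) / (total + alpha * vocab_size))
               for w in sentence.lower().split())


def log_likelihood_ratio(sentence, question_counts, statement_counts):
    """Positive values favor 'question', negative values favor 'statement'."""
    # One extra vocabulary slot is reserved for unseen words.
    vocab_size = len(set(question_counts) | set(statement_counts)) + 1
    return (log_prob(sentence, question_counts, vocab_size)
            - log_prob(sentence, statement_counts, vocab_size))


questions = train_unigram(["est-ce que tu viens", "est-ce que il pleut"])
statements = train_unigram(["il pleut", "tu viens demain"])
print(log_likelihood_ratio("est-ce que tu pars", questions, statements) > 0)
```

The same ratio could be computed over part-of-speech tags by replacing the word tokens with tag sequences; the resulting scores would then be combined with the prosodic features in a classifier.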
SLSP 2015 Proceedings of the Third International Conference on Statistical Language and Speech Processing - Volume 9449 | 2015
Mathilde Dargnat; Katarina Bartkova; Denis Jouvet
Detecting the correct syntactic function of a word is of great importance for language and speech processing. The semantic load of a word differs depending on whether it functions as a discourse particle or as a preposition. Words functioning as discourse particles (DP) are very frequent in spontaneous speech, and their discursive function is often expressed only by prosodic means. Our study analyzes some prosodic correlates of two French words, quoi and voilà, used either as discourse particles or as a pronoun (quoi) or a preposition (voilà). Our goal is to determine to what extent intrinsic and contextual prosodic properties characterize DP and non-DP functions. Prosodic parameters are analyzed with respect to the DP or non-DP function of these words, extracted from large speech corpora. A preliminary test concerning the automatic detection of the word function is also carried out using prosodic parameters only, leading to an encouraging result of 70% correct identification.
Procedia Computer Science | 2018
Luiza Orosanu; Denis Jouvet
This article presents a study on how to automatically add new words to a language model without re-training or adapting it (which requires a lot of new data). The proposed approach consists in finding a list of similar words for each new word to be added to the language model. Based on a small set of sentences containing the new words and on a set of n-gram counts containing the known words, we search for the known words whose neighbor distribution (over the few preceding and few following neighbor words) is most similar to that of the new words. The similar words are determined by computing KL divergences between the distributions of neighbor words. The n-gram parameter values associated with the similar words are then used to define the n-gram parameter values of the new words. In the context of speech recognition, the performance assessment on an LVCSR task shows the benefit of the proposed approach.
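The similar-word search described in this abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published method: neighbor counts for known words are taken from raw sentences rather than from n-gram counts, preceding and following neighbors are pooled into one distribution, add-alpha smoothing is used, and all function names are hypothetical.

```python
import math
from collections import Counter


def neighbor_distribution(sentences, target, window=2):
    """Count the words found within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(tokens[j] for j in range(start, end) if j != i)
    return counts


def kl_divergence(p_counts, q_counts, vocab, alpha=0.1):
    """KL(P || Q) over a shared vocabulary, with add-alpha smoothing."""
    p_total = sum(p_counts.values()) + alpha * len(vocab)
    q_total = sum(q_counts.values()) + alpha * len(vocab)
    kl = 0.0
    for word in vocab:
        p = (p_counts[word] + alpha) / p_total
        q = (q_counts[word] + alpha) / q_total
        kl += p * math.log(p / q)
    return kl


def most_similar(new_word, known_words, sentences, window=2, top_k=1):
    """Rank known words by how closely their neighbors match the new word's."""
    new_dist = neighbor_distribution(sentences, new_word, window)
    known = {w: neighbor_distribution(sentences, w, window) for w in known_words}
    vocab = set(new_dist).union(*known.values())
    ranked = sorted(known_words,
                    key=lambda w: kl_divergence(new_dist, known[w], vocab))
    return ranked[:top_k]


corpus = ["we ate a croissant today", "we ate a bagel today",
          "the car drove past quickly"]
print(most_similar("croissant", ["bagel", "car"], corpus))
```

In the approach described above, the n-gram parameters of the selected similar words would then be reused for the new word; that step depends on the language model format and is not shown here.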
MISSI | 2018
Kamel Smaïli; Dominique Fohr; Carlos-Emiliano González-Gallardo; Michał Grega; Lucjan Janowski; Denis Jouvet; Artur Komorowski; Arian Koźbiał; David Langlois; Mikołaj Leszczuk; Odile Mella; Mohamed Menacer; Amaia Méndez; Elvys Linhares Pontes; Eric SanJuan; Damian Świst; Juan-Manuel Torres-Moreno; Begoña Garcia-Zapirain
In this paper, we present the first results of the project AMIS (Access Multilingual Information opinionS) funded by Chist-Era. The main goal of this project is to understand the content of a video in a foreign language. In this work, we consider understanding as the ability to capture the most important ideas contained in media expressed in a foreign language. In other words, understanding is approached through the global meaning of the content rather than through the meaning of each fragment of a video.
International Conference on Statistical Language and Speech Processing | 2018
Amal Houidhek; Vincent Colotte; Zied Mnasri; Denis Jouvet
This paper investigates the use of deep neural networks (DNN) for Arabic speech synthesis. In parametric speech synthesis, whether HMM-based or DNN-based, each speech segment is described with a set of contextual features. These contextual features correspond to linguistic, phonetic and prosodic information that may affect the pronunciation of the segments. Gemination and vowel quantity (short vowel vs. long vowel) are two particular and important phenomena in the Arabic language. Hence, it is worth investigating whether those phenomena must be handled by using specific speech units, or whether their specification in the contextual features is enough. Consequently, four modelling approaches are evaluated by considering geminated consonants (respectively long vowels) either as fully-fledged phoneme units or as the same phoneme as their simple (respectively short) counterparts. Although no significant difference was observed in previous studies relying on HMM-based modelling, this paper examines these modelling variants in the framework of DNN-based speech synthesis. Listening tests are conducted to evaluate the four modelling approaches, and to assess the performance of DNN-based Arabic speech synthesis with respect to the previous HMM-based approach.
Lecture Notes in Computer Science | 2017
Michał Grega; Kamel Smaïli; Mikołaj Leszczuk; Carlos-Emiliano González-Gallardo; Juan-Manuel Torres-Moreno; Elvys Linhares Pontes; Dominique Fohr; Odile Mella; Mohamed Menacer; Denis Jouvet
In this paper, we present the results of the integration work on a system designed for automated summarization and translation of newscasts and reports. We show the proposed system architectures and list the available software modules. Thanks to well-defined interfaces, the software modules may be used as building blocks, allowing easy experimentation with different summarization scenarios.
SLSP 2015 Proceedings of the Third International Conference on Statistical Language and Speech Processing - Volume 9449 | 2015
Denis Jouvet; Katarina Bartkova
Speech technology enables computing statistics on word pronunciation variants as well as investigating various phonetic phenomena. This is achieved through a forced alignment of large amounts of speech signals with their possible pronunciation variants. Such alignments are usually performed using an acoustic analysis with a 10 ms frame shift. Therefore, the three-emitting-state structure of conventional acoustic hidden Markov models introduces a minimum duration constraint of 30 ms for each phone segment. This constraint is not critical at low speaking rates, but may introduce artefacts at high speaking rates. Thus, this paper investigates the impact of the acoustic frame rate on corpus-based phonetic statistics. Statistics on pronunciation variants obtained with a shorter frame shift (5 ms) are compared to the statistics resulting from the standard 10 ms frame shift. Statistics are computed on a large speech corpus of more than 3 million running words, and are analyzed with respect to the estimated local speaking rate. Results exhibit some discrepancies between the two sets of statistics, in particular for high speaking rates, where the usual acoustic analysis frame shift of 10 ms leads to an under-estimation of the frequency of the longest pronunciation variants.
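The minimum duration constraint mentioned in this abstract follows from simple arithmetic, sketched below in Python. The sketch assumes a left-to-right HMM topology without skip transitions, so that each emitting state consumes at least one frame; the function names are hypothetical.

```python
def min_phone_duration_ms(emitting_states: int, frame_shift_ms: float) -> float:
    """Shortest phone segment the alignment can produce, in milliseconds."""
    return emitting_states * frame_shift_ms


def max_phones_per_second(emitting_states: int, frame_shift_ms: float) -> float:
    """Highest phone rate the alignment can represent without artefacts."""
    return 1000.0 / min_phone_duration_ms(emitting_states, frame_shift_ms)


for shift in (10, 5):
    print(shift, min_phone_duration_ms(3, shift), max_phones_per_second(3, shift))
# 10 ms shift -> 30 ms minimum; 5 ms shift -> 15 ms minimum
```

Halving the frame shift halves the shortest representable phone segment, which is why the two sets of statistics are expected to diverge mainly at high speaking rates.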
CHiME - 2nd International Workshop on Machine Listening in Multisource Environments - 2013 | 2013
Dung Tran; Emmanuel Vincent; Denis Jouvet; Kamil Adiloglu
Collaboration
Denis Jouvet's collaborations.
French Institute for Research in Computer Science and Automation