Blaise Potard
Idiap Research Institute
Publication
Featured research published by Blaise Potard.
International Conference on Acoustics, Speech, and Signal Processing | 2015
Ivan Himawan; Petr Motlicek; David Imseng; Blaise Potard; Nam-hoon Kim; Jae-won Lee
Automatic speech recognition from distant microphones is a difficult task because recordings are affected by reverberation and background noise. First, the application of deep neural network (DNN)/hidden Markov model (HMM) hybrid acoustic models to the distant speech recognition task on the AMI meeting corpus is investigated. The paper then proposes a feature transformation that removes reverberation and background-noise artefacts from bottleneck features, using a DNN trained to learn the mapping between distant-talking speech features and close-talking speech bottleneck features. Experimental results on the AMI meeting corpus reveal that the mismatch between close-talking and distant-talking conditions is largely reduced, with about 16% relative improvement over the conventional bottleneck system (trained on close-talking speech). When the feature mapping is applied to close-talking speech, only a minor degradation of 4% relative is observed.
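The feature-mapping idea above can be sketched as a small regression network. The following is a hypothetical illustration, not the paper's code: all dimensions, the architecture, and the training loop are assumptions; the only point carried over from the abstract is that a network is trained on parallel data to map distant-talking features to close-talking bottleneck targets.

```python
import numpy as np

rng = np.random.default_rng(0)

def init(d_in, d_hid, d_out):
    # One hidden layer stands in for the paper's deeper DNN.
    return {
        "W1": rng.normal(0, 0.1, (d_in, d_hid)),
        "b1": np.zeros(d_hid),
        "W2": rng.normal(0, 0.1, (d_hid, d_out)),
        "b2": np.zeros(d_out),
    }

def forward(p, x):
    h = np.tanh(x @ p["W1"] + p["b1"])    # hidden activation
    return h, h @ p["W2"] + p["b2"]       # predicted bottleneck features

def train_step(p, x, y, lr=0.1):
    h, y_hat = forward(p, x)
    err = y_hat - y                        # MSE gradient at the output
    p["W2"] -= lr * h.T @ err / len(x)
    p["b2"] -= lr * err.mean(0)
    dh = (err @ p["W2"].T) * (1 - h**2)    # backprop through tanh
    p["W1"] -= lr * x.T @ dh / len(x)
    p["b1"] -= lr * dh.mean(0)
    return float((err**2).mean())

# Toy parallel data: "distant" features x, synthetic "close-talking"
# bottleneck targets y (dimensions are arbitrary choices).
x = rng.normal(size=(256, 39))
y = x[:, :13] @ rng.normal(size=(13, 13))
p = init(39, 64, 13)
losses = [train_step(p, x, y) for _ in range(200)]
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

At test time only the forward pass is needed, so the mapped features can be fed to an existing close-talking acoustic model without retraining it.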
International Conference on Acoustics, Speech, and Signal Processing | 2015
Milos Cernak; Blaise Potard; Philip N. Garner
We investigate a vocoder based on artificial neural networks that uses a phonological speech representation. Speech decomposition relies on phonological encoders, realised as neural-network classifiers trained for a particular language. Speech reconstruction uses a deep neural network (DNN) to map phonological feature posteriors to speech parameters - line spectra and glottal signal parameters - followed by LPC resynthesis. This DNN is trained on a target voice without transcriptions, in a semi-supervised manner. Both encoder and decoder are based on neural networks, so vocoding is achieved with a simple, fast forward pass. An experiment on French vocoding with a target male voice trained on a 21-hour audio book is presented. An application of the phonological vocoder to low-bit-rate speech coding is shown, where the transmitted phonological posteriors are pruned and quantized. The vocoder with scalar quantization operates at 1 kbps, with potential for lower bit rates.
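A back-of-the-envelope sketch of how pruning and scalar quantization reach a bit rate in the 1 kbps range. All numbers below (frame rate, counts of kept posteriors, bit allocations) are illustrative assumptions, not figures from the paper:

```python
# Assumed 10 ms frame shift, i.e. 100 posterior vectors per second.
FRAME_RATE_HZ = 100

def bit_rate(kept_features, bits_per_value, index_bits):
    """Bits/s when only `kept_features` posteriors survive pruning per
    frame, each scalar-quantized to `bits_per_value` bits, plus
    `index_bits` per frame to signal which features were kept."""
    per_frame = kept_features * bits_per_value + index_bits
    return per_frame * FRAME_RATE_HZ

# e.g. keep 2 posteriors at 3 bits each, with 4 bits of index overhead:
rate = bit_rate(kept_features=2, bits_per_value=3, index_bits=4)
print(rate)  # 1000 bits/s
```

Lowering the per-frame budget (fewer kept posteriors or coarser quantization) is what opens the door to the sub-1 kbps operation the abstract hints at.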
International Conference on Acoustics, Speech, and Signal Processing | 2014
David Imseng; Blaise Potard; Petr Motlicek; Alexandre Nanchen
Manual transcription of audio databases for automatic speech recognition (ASR) training is a costly and time-consuming process. State-of-the-art hybrid ASR systems based on deep neural networks (DNNs) can exploit untranscribed foreign data during unsupervised DNN pre-training or semi-supervised DNN training. We investigate the relevance of foreign-data characteristics, in particular domain and language. Using three different datasets from the MediaParl and Ester databases, our experiments suggest that domain and language are equally important; foreign data recorded under matched conditions (language and domain) yields the largest improvement. The resulting ASR system achieves about 5% relative improvement over the baseline system trained only on transcribed data. Our studies also reveal that the amount of foreign data used for semi-supervised training can be significantly reduced without degrading ASR performance if confidence-measure-based data selection is employed.
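The confidence-based selection mentioned above can be pictured as a simple filter over automatically transcribed utterances. This is an illustrative sketch only: the function name, data layout, threshold, and averaging rule are assumptions, not the paper's implementation.

```python
def select_utterances(utterances, threshold=0.8):
    """Keep utterances whose per-word ASR confidences average at or
    above `threshold`; the rest are excluded from semi-supervised
    training."""
    selected = []
    for utt_id, word_confidences in utterances:
        avg = sum(word_confidences) / len(word_confidences)
        if avg >= threshold:
            selected.append(utt_id)
    return selected

utts = [
    ("utt1", [0.95, 0.90, 0.85]),  # high confidence: kept
    ("utt2", [0.40, 0.55, 0.60]),  # low confidence: dropped
    ("utt3", [0.80, 0.82]),        # borderline (avg 0.81): kept
]
print(select_utterances(utts))     # ['utt1', 'utt3']
```

The point the abstract makes is that filtering like this shrinks the foreign training set substantially while keeping the reliably decoded utterances, so ASR performance does not degrade.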
International Conference on Speech and Computer | 2015
Alexandros Lazaridis; Blaise Potard; Philip N. Garner
Deep neural networks (DNNs) have recently been introduced in speech synthesis. In this paper, an investigation into the importance of input features and training data for speaker-dependent (SD) DNN-based speech synthesis is presented. Various aspects of the DNN training procedure are investigated, and several training sets of different sizes (13.5, 3.6 and 1.5 h of speech) are evaluated.
Archive | 2015
Blaise Potard; Petr Motlicek; David Imseng
Language Resources and Evaluation | 2014
Volha Petukhova; Martin Gropp; Dietrich Klakow; Gregor Eigner; Mario Topf; Stefan Srb; Petr Motlicek; Blaise Potard; John Dines; Olivier Deroo; Ronny Egeler; Uwe Meinz; Steffen Liersch; Anna Schmidt
Archive | 2015
Philip N. Garner; Milos Cernak; Blaise Potard
EURASIP Journal on Audio, Speech, and Music Processing | 2015
Petr Motlicek; David Imseng; Blaise Potard; Philip N. Garner; Ivan Himawan
ISCA Speech Synthesis Workshop | 2013
Lakshmi Saheer; Blaise Potard
SSW | 2016
Blaise Potard; Matthew P. Aylett; David A. Braude; Petr Motlicek