Blaise Potard
Idiap Research Institute
Publication
Featured research published by Blaise Potard.
International Conference on Acoustics, Speech, and Signal Processing | 2015
Ivan Himawan; Petr Motlicek; David Imseng; Blaise Potard; Nam-hoon Kim; Jae-won Lee
Automatic speech recognition from distant microphones is a difficult task because recordings are affected by reverberation and background noise. First, the application of deep neural network (DNN)/hidden Markov model (HMM) hybrid acoustic models to the distant speech recognition task on the AMI meeting corpus is investigated. The paper then proposes a feature transformation that removes reverberation and background-noise artefacts from bottleneck features, using a DNN trained to learn the mapping between distant-talking speech features and close-talking speech bottleneck features. Experimental results on the AMI meeting corpus reveal that the mismatch between close-talking and distant-talking conditions is largely reduced, with about 16% relative improvement over the conventional bottleneck system (trained on close-talking speech). When the feature mapping is applied to close-talking speech, only a minor degradation of 4% relative is observed.
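The feature-mapping idea above can be sketched as a small regression network. The following is a hypothetical illustration, not the paper's code: all dimensions, the architecture, and the training loop are assumptions; the only point carried over from the abstract is that a network is trained on parallel data to map distant-talking features to close-talking bottleneck targets.

```python
import numpy as np

rng = np.random.default_rng(0)

def init(d_in, d_hid, d_out):
    # One hidden layer stands in for the paper's deeper DNN.
    return {
        "W1": rng.normal(0, 0.1, (d_in, d_hid)),
        "b1": np.zeros(d_hid),
        "W2": rng.normal(0, 0.1, (d_hid, d_out)),
        "b2": np.zeros(d_out),
    }

def forward(p, x):
    h = np.tanh(x @ p["W1"] + p["b1"])    # hidden activation
    return h, h @ p["W2"] + p["b2"]       # predicted bottleneck features

def train_step(p, x, y, lr=0.1):
    h, y_hat = forward(p, x)
    err = y_hat - y                        # MSE gradient at the output
    p["W2"] -= lr * h.T @ err / len(x)
    p["b2"] -= lr * err.mean(0)
    dh = (err @ p["W2"].T) * (1 - h**2)    # backprop through tanh
    p["W1"] -= lr * x.T @ dh / len(x)
    p["b1"] -= lr * dh.mean(0)
    return float((err**2).mean())

# Toy parallel data: "distant" features x, synthetic "close-talking"
# bottleneck targets y (dimensions are arbitrary choices).
x = rng.normal(size=(256, 39))
y = x[:, :13] @ rng.normal(size=(13, 13))
p = init(39, 64, 13)
losses = [train_step(p, x, y) for _ in range(200)]
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

At test time only the forward pass is needed, so the mapped features can be fed to an existing close-talking acoustic model without retraining it.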
International Conference on Acoustics, Speech, and Signal Processing | 2015
Milos Cernak; Blaise Potard; Philip N. Garner
We investigate a vocoder based on artificial neural networks that uses a phonological speech representation. Speech decomposition relies on phonological encoders, realised as neural-network classifiers trained for a particular language. Speech reconstruction uses a deep neural network (DNN) to map phonological feature posteriors to speech parameters - line spectra and glottal signal parameters - followed by LPC resynthesis. This DNN is trained on a target voice without transcriptions, in a semi-supervised manner. Both encoder and decoder are based on neural networks, so vocoding is achieved with a simple, fast forward pass. An experiment on French vocoding with a target male voice trained on a 21-hour audio book is presented. An application of the phonological vocoder to low-bit-rate speech coding is shown, where the transmitted phonological posteriors are pruned and quantized. The vocoder with scalar quantization operates at 1 kbps, with potential for lower bit rates.
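A back-of-the-envelope sketch of how pruning and scalar quantization reach a bit rate in the 1 kbps range. All numbers below (frame rate, counts of kept posteriors, bit allocations) are illustrative assumptions, not figures from the paper:

```python
# Assumed 10 ms frame shift, i.e. 100 posterior vectors per second.
FRAME_RATE_HZ = 100

def bit_rate(kept_features, bits_per_value, index_bits):
    """Bits/s when only `kept_features` posteriors survive pruning per
    frame, each scalar-quantized to `bits_per_value` bits, plus
    `index_bits` per frame to signal which features were kept."""
    per_frame = kept_features * bits_per_value + index_bits
    return per_frame * FRAME_RATE_HZ

# e.g. keep 2 posteriors at 3 bits each, with 4 bits of index overhead:
rate = bit_rate(kept_features=2, bits_per_value=3, index_bits=4)
print(rate)  # 1000 bits/s
```

Lowering the per-frame budget (fewer kept posteriors or coarser quantization) is what opens the door to the sub-1 kbps operation the abstract hints at.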
International Conference on Acoustics, Speech, and Signal Processing | 2014
David Imseng; Blaise Potard; Petr Motlicek; Alexandre Nanchen
Manual transcription of audio databases for automatic speech recognition (ASR) training is a costly and time-consuming process. State-of-the-art hybrid ASR systems based on deep neural networks (DNNs) can exploit untranscribed foreign data during unsupervised DNN pre-training or semi-supervised DNN training. We investigate the relevance of foreign-data characteristics, in particular domain and language. Using three different datasets from the MediaParl and Ester databases, our experiments suggest that domain and language are equally important; foreign data recorded under matched conditions (language and domain) yields the largest improvement. The resulting ASR system achieves about 5% relative improvement over the baseline system trained only on transcribed data. Our studies also reveal that the amount of foreign data used for semi-supervised training can be significantly reduced without degrading ASR performance if confidence-measure-based data selection is employed.
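The confidence-based selection mentioned above can be pictured as a simple filter over automatically transcribed utterances. This is an illustrative sketch only: the function name, data layout, threshold, and averaging rule are assumptions, not the paper's implementation.

```python
def select_utterances(utterances, threshold=0.8):
    """Keep utterances whose per-word ASR confidences average at or
    above `threshold`; the rest are excluded from semi-supervised
    training."""
    selected = []
    for utt_id, word_confidences in utterances:
        avg = sum(word_confidences) / len(word_confidences)
        if avg >= threshold:
            selected.append(utt_id)
    return selected

utts = [
    ("utt1", [0.95, 0.90, 0.85]),  # high confidence: kept
    ("utt2", [0.40, 0.55, 0.60]),  # low confidence: dropped
    ("utt3", [0.80, 0.82]),        # borderline (avg 0.81): kept
]
print(select_utterances(utts))     # ['utt1', 'utt3']
```

The point the abstract makes is that filtering like this shrinks the foreign training set substantially while keeping the reliably decoded utterances, so ASR performance does not degrade.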
International Conference on Speech and Computer | 2015
Alexandros Lazaridis; Blaise Potard; Philip N. Garner
Deep neural networks (DNNs) have recently been introduced in speech synthesis. In this paper, an investigation into the importance of input features and training data for speaker-dependent (SD) DNN-based speech synthesis is presented. Various aspects of the DNN training procedure are investigated, and several training sets of different sizes (13.5, 3.6 and 1.5 h of speech) are evaluated.
Archive | 2015
Blaise Potard; Petr Motlicek; David Imseng
Language Resources and Evaluation | 2014
Volha Petukhova; Martin Gropp; Dietrich Klakow; Gregor Eigner; Mario Topf; Stefan Srb; Petr Motlicek; Blaise Potard; John Dines; Olivier Deroo; Ronny Egeler; Uwe Meinz; Steffen Liersch; Anna Schmidt
Archive | 2015
Philip N. Garner; Milos Cernak; Blaise Potard
EURASIP Journal on Audio, Speech, and Music Processing | 2015
Petr Motlicek; David Imseng; Blaise Potard; Philip N. Garner; Ivan Himawan
ISCA Speech Synthesis Workshop | 2013
Lakshmi Saheer; Blaise Potard
SSW | 2016
Blaise Potard; Matthew P. Aylett; David A. Braude; Petr Motlicek