
Publication


Featured research published by Daniel Soutner.


Text, Speech and Dialogue | 2011

Web Text Data Mining for Building Large Scale Language Modelling Corpus

Jan Švec; Jan Hoidekr; Daniel Soutner; Jan Vavruška

The paper describes a system for collecting a large text corpus from Internet news servers. The architecture and text preprocessing algorithms are described, along with the duplicate detection algorithm used. The resulting corpus contains more than 1 billion tokens in more than 3 million articles, with topics assigned and duplicates identified. Corpus statistics such as consistency and perplexity are presented.
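
The abstract does not spell out the duplicate detection algorithm, so the sketch below is a generic shingle-and-Jaccard stand-in rather than the authors' method; the n-gram size, threshold and sample articles are invented.

```python
# Illustrative near-duplicate detection for a news corpus. This is a generic
# shingle/Jaccard stand-in, not the algorithm described in the paper.

def shingles(text, n=5):
    """Set of word n-grams (shingles) of a document."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

def find_duplicates(docs, threshold=0.8):
    """Indices of article pairs whose shingle overlap exceeds the threshold."""
    sets = [shingles(d) for d in docs]
    return [(i, j)
            for i in range(len(sets))
            for j in range(i + 1, len(sets))
            if jaccard(sets[i], sets[j]) >= threshold]

articles = [
    "The city council approved the new budget on Monday evening.",
    "The city council approved the new budget on Monday evening.",  # duplicate
    "The local team won the regional championship after extra time.",
]
print(find_duplicates(articles))  # -> [(0, 1)]
```

At billion-token scale the all-pairs loop above would be replaced by hashing the shingles (e.g., MinHash/LSH) so that only candidate pairs are compared.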


Text, Speech and Dialogue | 2013

Application of LSTM Neural Networks in Language Modelling

Daniel Soutner; Luděk Müller

Artificial neural networks have become the state of the art in language modelling on small corpora. While feed-forward networks can take into account only a fixed context length when predicting the next word, recurrent neural networks (RNNs) can take advantage of all previous words. Because RNNs are difficult to train, the Long Short-Term Memory (LSTM) neural network architecture offers a promising way forward.
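
As a minimal illustration of the architecture the paper argues for, here is a small LSTM language model in PyTorch; the layer sizes, vocabulary and loss computation are illustrative assumptions, not the paper's configuration.

```python
# A minimal LSTM language model sketch in PyTorch; sizes are placeholders.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.decoder = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, state=None):
        # The recurrent state lets the model use all previous words,
        # not just a fixed-length context as in feed-forward models.
        emb = self.embedding(tokens)            # (batch, seq, emb_dim)
        out, state = self.lstm(emb, state)
        return self.decoder(out), state         # logits over the next word

vocab_size = 10_000
model = LSTMLanguageModel(vocab_size)
x = torch.randint(0, vocab_size, (2, 8))        # dummy batch of word ids
logits, _ = model(x)
# Train by predicting token t+1 from tokens up to t:
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size), x[:, 1:].reshape(-1))
print(loss.item())
```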


Text, Speech and Dialogue | 2012

Neural Network Language Model with Cache

Daniel Soutner; Zdeněk Loose; Luděk Müller; Aleš Pražák

In this paper we investigate whether a combination of statistical, neural network and cache language models can outperform a basic statistical model. These models have been developed, tested and exploited on Czech spontaneous speech data, which differs considerably from common written Czech and is characterized by a small amount of available data and highly inflected words. As a baseline we used a trigram model; after its training, several cache models interpolated with the baseline were tested and measured by perplexity. Finally, the model with the lowest perplexity was evaluated on speech recordings of phone calls.
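
The interpolation itself is standard; a minimal sketch follows, assuming a unigram cache over the last few hundred words and a fixed interpolation weight (the paper tuned these on perplexity, and its cache variants may differ). The `trigram` stand-in below is hypothetical.

```python
# Sketch of a unigram cache model linearly interpolated with a baseline
# trigram model: P(w|h) = lam * P_cache(w) + (1 - lam) * P_trigram(w|h).
# Cache size and weight lam are placeholders, not the paper's tuned values.
from collections import deque

class UnigramCache:
    """Word probability estimated from counts in the recent history."""
    def __init__(self, capacity=200):
        self.history = deque(maxlen=capacity)

    def prob(self, word):
        if not self.history:
            return 0.0
        return self.history.count(word) / len(self.history)

    def update(self, word):
        self.history.append(word)

def interpolated_prob(word, context, trigram_prob, cache, lam=0.2):
    """Combine cache and trigram estimates with a fixed weight lam."""
    return lam * cache.prob(word) + (1 - lam) * trigram_prob(word, context)

# Usage with a hypothetical stand-in for the trained trigram model:
trigram = lambda w, ctx: 0.01                 # placeholder baseline P(w|h)
cache = UnigramCache()
for w in "dnes bylo hezky a zitra bude take hezky".split():
    cache.update(w)
print(interpolated_prob("hezky", ("bude", "take"), trigram, cache))
```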


International Conference on Statistical Language and Speech Processing | 2017

A Regularization Post Layer: An Additional Way How to Make Deep Neural Networks Robust

Jan Vaněk; Jan Zelinka; Daniel Soutner; Josef Psutka

Neural networks (NNs) are prone to overfitting, deep neural networks especially so when training data are not abundant. Several techniques help prevent overfitting, e.g., L1/L2 regularization, unsupervised pre-training, early stopping, dropout, bootstrapping, or aggregation of cross-validation models. In this paper, we propose a regularization post-layer that may be combined with these techniques and brings additional robustness to the NN. We trained the regularization post-layer in a cross-validation (CV) aggregation scenario: the CV held-out folds were used to train an additional neural network post-layer that boosts the network's robustness. We tested various post-layer topologies and compared the results with other regularization techniques. As a benchmark task we selected TIMIT phone recognition, a well-known and still popular task where training data are limited and the regularization techniques used play a key role. The regularization post-layer is, however, a general method and may be employed in any classification task.
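
A conceptual sketch of the CV-aggregation scenario follows, using scikit-learn stand-ins: one base network per fold, held-out posteriors collected across folds, and a post-layer trained on those posteriors. The logistic-regression post-layer and all data here are for illustration only; the paper trains neural network post-layer topologies.

```python
# Sketch of CV aggregation with a post-layer trained on held-out posteriors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
held_out = np.zeros((len(X), 3))
base_nets = []
for train_idx, held_idx in kf.split(X):
    net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
    net.fit(X[train_idx], y[train_idx])
    held_out[held_idx] = net.predict_proba(X[held_idx])  # unseen by this net
    base_nets.append(net)

# Trained on held-out posteriors, the post-layer corrects the base networks'
# systematic overconfidence instead of memorizing their training fit.
post_layer = LogisticRegression(max_iter=1000).fit(held_out, y)

# Inference: average the base networks' posteriors, then apply the post-layer
# (evaluated on the same data here only for brevity).
avg_post = np.mean([net.predict_proba(X) for net in base_nets], axis=0)
print("accuracy with post-layer:", post_layer.score(avg_post, y))
```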


Text, Speech and Dialogue | 2014

Inter-Annotator Agreement on Spontaneous Czech Language

Tomáš Valenta; Luboš Šmídl; Jan Švec; Daniel Soutner

The goal of this article is to show that for some tasks in automatic speech recognition (ASR), especially recognition of spontaneous telephony speech, the reference annotation differs substantially among human annotators and thus sets an upper bound on ASR accuracy. In this paper, we focus on evaluating inter-annotator agreement (IAA) and ASR accuracy in the context of imperfect IAA. We evaluated these on a part of our Czech Switchboard-like spontaneous speech corpus called Toll-free calls. This data set was annotated by three different annotators, yielding three parallel transcriptions. The results give us additional insights for understanding ASR accuracy.
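
One common way to quantify (dis)agreement between parallel transcriptions is pairwise word error rate (WER); the sketch below uses that as an illustrative metric with invented sentences, since the abstract does not fix the exact IAA measure used.

```python
# Pairwise WER between annotators as an illustrative agreement measure.
from itertools import combinations

def wer(ref, hyp):
    """Word error rate: Levenshtein distance on words / reference length."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            d[i][j] = min(d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]),
                          d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1)      # insertion
    return d[-1][-1] / max(len(ref), 1)

transcripts = {                    # three invented parallel annotations
    "A": "yeah i think we uh we should go".split(),
    "B": "yeah i think we we should go".split(),
    "C": "yeah i think we uh should go now".split(),
}
for a, b in combinations(transcripts, 2):
    print(a, b, round(wer(transcripts[a], transcripts[b]), 3))
```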


Text, Speech and Dialogue | 2014

Continuous Distributed Representations of Words as Input of LSTM Network Language Model

Daniel Soutner; Luděk Müller

The continuous skip-gram model is an efficient algorithm for learning quality distributed vector representations that capture a large number of syntactic and semantic word relationships. Artificial neural networks have become the state of the art in language modelling, and Long Short-Term Memory (LSTM) networks seem to be an efficient architecture.
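
A minimal sketch of the combination described here, assuming gensim for skip-gram training and PyTorch for the LSTM; the toy corpus and all dimensions are placeholders, not the paper's setup.

```python
# Feed skip-gram vectors to an LSTM LM instead of 1-of-N coding: train
# word2vec with sg=1 (skip-gram), then initialize the embedding layer
# from the trained vectors.
import torch
import torch.nn as nn
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "down"],
             ["the", "dog", "ran", "away"],
             ["a", "cat", "ran", "down"]]
w2v = Word2Vec(sentences, vector_size=50, sg=1, min_count=1, epochs=50)

vocab = w2v.wv.key_to_index                   # word -> row index
weights = torch.tensor(w2v.wv.vectors)        # (vocab_size, 50)

embedding = nn.Embedding.from_pretrained(weights, freeze=True)
lstm = nn.LSTM(50, 128, batch_first=True)
decoder = nn.Linear(128, len(vocab))

ids = torch.tensor([[vocab[w] for w in ["the", "cat", "sat"]]])
out, _ = lstm(embedding(ids))                 # continuous input, no one-hot
print(decoder(out).shape)                     # (1, 3, vocab_size) logits
```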


International Conference on Speech and Computer | 2014

On a Hybrid NN/HMM Speech Recognition System with a RNN-Based Language Model

Daniel Soutner; Jan Zelinka; Luděk Müller

In this paper, we present a new NN/HMM speech recognition system with an NN-based acoustic model and an RNN-based language model. The neural-network-based acoustic model computes posteriors for states of context-dependent acoustic units. A recurrent neural network with a maximum entropy extension was used as the language model. This hybrid NN/HMM system was compared with our previous hybrid NN/HMM system equipped with a standard n-gram language model. In our experiments, we also compared it to a standard GMM/HMM system. System performance was evaluated on a British English speech corpus and compared with previous work.
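
In hybrid systems of this kind, the RNN language model is typically applied by rescoring N-best lists from the decoder; the sketch below illustrates that scheme with toy stand-in models (the maximum entropy extension is not modelled), and the interpolation weight and LM scale are invented.

```python
# Schematic N-best rescoring with an interpolated language model.
import math

def sentence_logprob(words, p_rnn, p_ngram, lam=0.5):
    """Log-prob under a linear interpolation of two language models."""
    logp = 0.0
    for i, w in enumerate(words):
        hist = tuple(words[:i])
        p = lam * p_rnn(w, hist) + (1 - lam) * p_ngram(w, hist)
        logp += math.log(max(p, 1e-12))
    return logp

def rescore(nbest, p_rnn, p_ngram, lm_weight=10.0):
    """Pick the hypothesis with the best acoustic + weighted LM score."""
    return max(nbest, key=lambda h: h[1] +
               lm_weight * sentence_logprob(h[0], p_rnn, p_ngram))

# Toy stand-in LMs; a real system would query the RNN and n-gram models here.
p_rnn = lambda w, h: 0.6 if w in ("recognize", "speech") else 0.05
p_ngram = lambda w, h: 0.2

nbest = [(["recognize", "speech"], -120.0),      # (words, acoustic log-prob)
         (["wreck", "a", "nice", "beach"], -118.0)]
print(rescore(nbest, p_rnn, p_ngram)[0])         # -> ['recognize', 'speech']
```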


Text, Speech and Dialogue | 2018

Recurrent Neural Network Based Speaker Change Detection from Text Transcription Applied in Telephone Speaker Diarization System

Zbyněk Zajíc; Daniel Soutner; Marek Hrúz; Luděk Müller; Vlasta Radová

In this paper, we propose a speaker change detection system based on lexical information from transcribed speech. For this purpose, we applied a recurrent neural network to decide whether an utterance ends at the end of each spoken word. Our motivation is to use the transcription of the conversation as an additional feature for a speaker diarization system, refining the segmentation step to achieve better accuracy of the whole diarization system. We compare the proposed transcription-based (text) speaker change detection system with our previous spectrogram-based (audio) system and combine the two modalities to improve the diarization results. We cut the conversation into segments according to the detected changes and represent each segment by an i-vector. We conducted experiments on the English part of the CallHome corpus. The results indicate an improvement in speaker change detection (by 0.5% relative) and also in speaker diarization (by 1% relative) when both modalities are used.
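
A minimal sketch of the text-side model, assuming word indices as input and an LSTM emitting one speaker-change logit per word; the feature choice and layer sizes are illustrative, not the paper's configuration.

```python
# An RNN reads the transcribed words and emits, per word, the probability
# that the speaker changes after it.
import torch
import torch.nn as nn

class SpeakerChangeRNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)          # one logit per word

    def forward(self, tokens):
        out, _ = self.rnn(self.embedding(tokens))
        return self.head(out).squeeze(-1)         # (batch, seq) change logits

model = SpeakerChangeRNN(vocab_size=5_000)
tokens = torch.randint(0, 5_000, (1, 12))         # one transcribed stretch
p_change = torch.sigmoid(model(tokens))           # P(change after word t)
cuts = (p_change > 0.5).nonzero()                 # candidate segment borders
print(p_change.shape, cuts.shape)
```

The resulting cut points define the segments that are then represented by i-vectors and clustered in the diarization pipeline, per the abstract.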


International Conference on Statistical Language and Speech Processing | 2015

On Continuous Space Word Representations as Input of LSTM Language Model

Daniel Soutner; Luděk Müller

Artificial neural networks have become the state of the art in language modelling, and Long Short-Term Memory (LSTM) networks seem to be an efficient architecture. The continuous skip-gram and continuous bag-of-words (CBOW) models are algorithms for learning quality distributed vector representations that capture a large number of syntactic and semantic word relationships. In this paper, we carried out experiments with a combination of these powerful models: continuous representations of words trained with the skip-gram, CBOW or GloVe methods, and a word cache expressed as a vector using latent Dirichlet allocation (LDA). All of these are used at the input of the LSTM network instead of the 1-of-N coding traditionally used in language models. The proposed models are tested on the Penn Treebank and MALACH corpora.
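
A minimal sketch of the combined input, with random tensors standing in for the trained skip-gram/CBOW/GloVe vectors and the LDA cache vector; all dimensions are illustrative assumptions.

```python
# Each word's pretrained vector is concatenated with a topic vector
# summarizing the recent word cache before entering the LSTM, replacing
# 1-of-N coding.
import torch
import torch.nn as nn

emb_dim, topic_dim, hidden, vocab_size = 100, 20, 200, 10_000
lstm = nn.LSTM(emb_dim + topic_dim, hidden, batch_first=True)
decoder = nn.Linear(hidden, vocab_size)

word_vecs = torch.randn(1, 8, emb_dim)     # stand-in pretrained word vectors
topic_vec = torch.randn(1, 1, topic_dim)   # stand-in LDA topic posterior of
topic_vec = topic_vec.expand(-1, 8, -1)    # the cache, repeated per word

out, _ = lstm(torch.cat([word_vecs, topic_vec], dim=-1))
print(decoder(out).shape)                  # (1, 8, vocab_size) next-word logits
```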


Language Resources and Evaluation | 2018

Towards Processing of the Oral History Interviews and Related Printed Documents

Zbyněk Zajíc; Lucie Skorkovská; Petr Neduchal; Pavel Ircing; Josef Psutka; Marek Hrúz; Aleš Pražák; Daniel Soutner; Jan Švec; Lukáš Bureš; Luděk Müller

Collaboration


Daniel Soutner's most frequent co-authors.

Top Co-Authors (all University of West Bohemia)

Luděk Müller
Jan Švec
Jan Zelinka
Josef Psutka
Tomáš Valenta
Aleš Pražák
Luboš Šmídl
Lucie Skorkovská
Marek Hrúz