Publication


Featured research published by Trausti Kristjansson.


Computer Speech & Language | 2010

Super-human multi-talker speech recognition: A graphical modeling approach

John R. Hershey; Steven J. Rennie; Peder A. Olsen; Trausti Kristjansson

We present a system that can separate and recognize the simultaneous speech of two people recorded in a single channel. Applied to the monaural speech separation and recognition challenge, the system outperformed all other participants, including human listeners, with an overall recognition error rate of 21.6%, compared to the human error rate of 22.3%. The system consists of a speaker recognizer, a model-based speech separation module, and a speech recognizer. For the separation models, we explored a range of speech models that incorporate different levels of constraints on temporal dynamics to help infer the source speech signals. The system achieves its best performance when the model of temporal dynamics closely captures the grammatical constraints of the task. For inference, we compare a 2-D Viterbi algorithm and two loopy belief propagation algorithms. We show how belief propagation reduces the complexity of temporal inference from exponential to linear in the number of sources and the size of the language model. The best belief propagation method results in nearly the same recognition error rate as exact inference.
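As a rough illustration of the scaling claim in this abstract, the sketch below counts transition scores per frame under simplifying assumptions that are not taken from the paper: every source has the same number of states, transitions are dense, and only the temporal update is counted. The function names and the example sizes are hypothetical.

```python
def joint_transition_count(num_states: int, num_sources: int) -> int:
    """Transition scores per frame for exact search over the joint state space.

    Assumption: the joint space is the product of the per-source spaces,
    so it has num_states ** num_sources states with dense transitions.
    """
    joint_states = num_states ** num_sources
    return joint_states * joint_states


def per_source_transition_count(num_states: int, num_sources: int) -> int:
    """Transition scores per frame when each source chain is updated separately.

    Assumption: one dense num_states x num_states update per source,
    which grows linearly with the number of sources.
    """
    return num_sources * num_states * num_states


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the growth pattern.
    for sources in (1, 2, 3):
        print(
            sources,
            joint_transition_count(10, sources),
            per_source_transition_count(10, sources),
        )
```

Under these assumptions the joint count grows exponentially with the number of sources, while the per-source count grows linearly. This mirrors the trend described above, not the paper's actual algorithms.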


International Conference on Acoustics, Speech, and Signal Processing | 2012

Music models for music-speech separation

Thad Hughes; Trausti Kristjansson

We consider the task of speech recognition with loud background music interference. We use model-based music-speech separation and train GMM models for music on the audio prior to speech. We show over 8% relative improvement in WER at 10 dB SNR for a real-world Voice Search ASR system. We investigate the relationship between ASR accuracy and both the amount of music background used as prologue and the size of the music models. Our study shows that performance peaks when using a music prologue of around 6 seconds to train the music model. We hypothesize that this is due to the dynamic nature of music and the structure of popular music. Adding more history beyond a certain point does not improve results. Additionally, we show that moderately sized 8-component music GMM models suffice to model this amount of music prologue.
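For readers unfamiliar with the metric, a relative WER improvement is the reduction in word error rate divided by the baseline word error rate. The sketch below shows the arithmetic. The function name and the example error rates are hypothetical and are not figures from the paper.

```python
def relative_wer_improvement(baseline_wer: float, improved_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - improved_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical error rates, in percent, chosen only to illustrate the formula.
    print(relative_wer_improvement(25.0, 23.0))
```

With these illustrative numbers, a drop from 25.0% to 23.0% WER is a 2-point absolute reduction and an 8% relative improvement.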


Archive | 2011

Acoustic model adaptation using geographic information

Matthew I. Lloyd; Trausti Kristjansson


Archive | 2011

Geotagged environmental audio for enhanced speech recognition accuracy

Trausti Kristjansson; Matthew I. Lloyd


Archive | 2011

Speech and noise models for speech recognition

Matthew I. Lloyd; Trausti Kristjansson


Archive | 2011

Word-level correction of speech input

Michael J. Lebeau; William J. Byrne; John Nicholas Jitkoff; Brandon M. Ballinger; Trausti Kristjansson


Archive | 2009

Multisensory speech detection

Dave Burke; Michael J. Lebeau; Konrad Gianno; Trausti Kristjansson; John Nicholas Jitkoff; Andrew W. Senior


Archive | 2010

Matching encoder output to network bandwidth

Matthew I. Lloyd; Trausti Kristjansson


Archive | 2011

Predictive pre-recording of audio for voice input

Trausti Kristjansson; Matthew I. Lloyd


Conference of the International Speech Communication Association | 2008

DySANA: dynamic speech and noise adaptation for voice activity detection

Ron J. Weiss; Trausti Kristjansson
