Publication


Featured research published by Niko Brümmer.


Computer Speech & Language | 2006

Application-independent evaluation of speaker detection

Niko Brümmer; Johan A. du Preez

We propose and motivate an alternative to the traditional error-based or cost-based evaluation metrics for the goodness of speaker detection performance. The metric that we propose is an information-theoretic one, which measures the effective amount of information that the speaker detector delivers to the user. We show that this metric is appropriate for the evaluation of what we call application-independent detectors, which output soft decisions in the form of log-likelihood-ratios, rather than hard decisions. The proposed metric is constructed via analysis and generalization of cost-based evaluation metrics. This construction forms an interpretation of this metric as an expected cost, or as a total error-rate, over a range of different application-types. We further show how the metric can be decomposed into a discrimination and a calibration component. We conclude with an experimental demonstration of the proposed technique to evaluate three speaker detection systems submitted to the NIST 2004 Speaker Recognition Evaluation.
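The information-theoretic metric described in this abstract is usually referred to as the log-likelihood-ratio cost (Cllr). The following is a minimal sketch of how it is commonly computed, assuming natural-log likelihood ratios as input; the function names `softplus` and `cllr` are illustrative and not taken from the paper.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


def cllr(target_llrs, nontarget_llrs):
    """Log-likelihood-ratio cost in bits.

    target_llrs: natural-log LRs for same-speaker (target) trials.
    nontarget_llrs: natural-log LRs for different-speaker trials.
    """
    # Targets are penalised for low LLRs, non-targets for high LLRs.
    tar = sum(softplus(-s) for s in target_llrs) / len(target_llrs)
    non = sum(softplus(s) for s in nontarget_llrs) / len(nontarget_llrs)
    # Average the two class-conditional costs and convert nats to bits.
    return 0.5 * (tar + non) / math.log(2)
```

Under this formulation, a detector that always outputs a log-likelihood ratio of 0 scores 1 bit, well-separated and well-calibrated scores approach 0, and confidently wrong scores exceed 1.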


International Conference on Acoustics, Speech, and Signal Processing | 2009

Comparison of scoring methods used in speaker recognition with Joint Factor Analysis

Ondrej Glembek; Lukas Burget; Najim Dehak; Niko Brümmer; Patrick Kenny

The aim of this paper is to compare the different log-likelihood scoring methods that various sites have used in the latest state-of-the-art Joint Factor Analysis (JFA) speaker recognition systems. The algorithms rely on various assumptions and have been derived from various approximations of the objective functions of JFA. We compare the techniques in terms of speed and performance. We show that approximations of the true log-likelihood ratio (LLR) may lead to a significant speedup without any loss in performance.


International Conference on Acoustics, Speech, and Signal Processing | 2011

Discriminatively trained Probabilistic Linear Discriminant Analysis for speaker verification

Lukas Burget; Oldrich Plchot; Sandro Cumani; Ondrej Glembek; Pavel Matejka; Niko Brümmer

Recently, i-vector extraction and Probabilistic Linear Discriminant Analysis (PLDA) have proven to provide state-of-the-art speaker verification performance. In this paper, the speaker verification score for a pair of i-vectors representing a trial is computed with a functional form derived from the successful PLDA generative model. In our case, however, parameters of this function are estimated based on a discriminative training criterion. We propose to use the objective function to directly address the task in speaker verification: discrimination between same-speaker and different-speaker trials. Compared with a baseline which uses a generatively trained PLDA model, discriminative training provides up to 40% relative improvement on the NIST SRE 2010 evaluation task.


International Conference on Acoustics, Speech, and Signal Processing | 2011

Fast discriminative speaker verification in the i-vector space

Sandro Cumani; Niko Brümmer; Lukas Burget; Pietro Laface

This work presents a new approach to discriminative speaker verification. Rather than estimating speaker models, or a model that discriminates between a speaker class and the class of all the other speakers, we directly solve the problem of classifying pairs of utterances as belonging to the same speaker or not.


2006 IEEE Odyssey - The Speaker and Language Recognition Workshop | 2006

Channel-dependent GMM and Multi-class Logistic Regression models for language recognition

D.A. van Leeuwen; Niko Brümmer

This paper describes two new approaches to spoken language recognition. These were both successfully applied in the NIST 2005 Language Recognition Evaluation. The first approach extends the Gaussian Mixture Model technique with channel dependency, which results in actual detection costs (CDET) of 0.095 in NIST LRE-2005, and which should be compared to a traditional 2-gender dependency of GMM language models achieving 0.120. The second approach is a Multi-class Logistic Regression system, which operates similarly to a Support Vector Machine (SVM), but can be trained for all languages simultaneously. This new approach resulted in a CDET of 0.198. The joint TNO-Spescom Datavoice (TNO-SDV) submission to NIST LRE-2005 contained two more systems and obtained a result of 0.0958.
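The detection costs quoted in this abstract are weighted combinations of miss and false-alarm rates. The following is a minimal sketch, assuming the usual NIST-style cost formula; the function name and the default weights (equal costs, target prior of 0.5) are assumptions for illustration and are not stated in the abstract.

```python
def detection_cost(p_miss, p_fa, p_target=0.5, c_miss=1.0, c_fa=1.0):
    """Expected detection cost for given miss and false-alarm rates.

    p_target, c_miss and c_fa are application parameters; the defaults
    here are illustrative assumptions, not values from the paper.
    """
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
```

With these assumed defaults the cost reduces to the average of the miss rate and the false-alarm rate.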


IEEE Transactions on Audio, Speech, and Language Processing | 2013

Pairwise Discriminative Speaker Verification in the I-Vector Space

Sandro Cumani; Niko Brümmer; Lukas Burget; Pietro Laface; Oldrich Plchot; Vasileios Vasilakakis

This work presents a new and efficient approach to discriminative speaker verification in the i-vector space. We illustrate the development of a linear discriminative classifier that is trained to discriminate between the hypotheses that a pair of feature vectors in a trial belongs to the same speaker or to different speakers. This approach is an alternative to the usual discriminative setup that discriminates between a speaker and all the other speakers. We use a discriminative classifier based on a Support Vector Machine (SVM) that is trained to estimate the parameters of a symmetric quadratic function approximating a log-likelihood ratio score, without explicit modeling of the i-vector distributions as in the generative Probabilistic Linear Discriminant Analysis (PLDA) models. Training these models is feasible because it is not necessary to expand the i-vector pairs, which would be expensive or even impossible even for medium-sized training sets. The results of experiments performed on the tel-tel extended core condition of the NIST 2010 Speaker Recognition Evaluation are competitive with those obtained by generative models, in terms of normalized Detection Cost Function and Equal Error Rate. Moreover, we show that it is possible to train a gender-independent discriminative model that achieves state-of-the-art accuracy, comparable to that of a gender-dependent system, saving memory and execution time both in training and in testing.
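The symmetric quadratic score function mentioned in this abstract can be illustrated with a short sketch. It assumes the commonly used PLDA-derived form with a cross term, a within-vector term, a linear term and an offset; the parameter names `Lam`, `Gam`, `c` and `k` are hypothetical, and the code covers scoring only, not the SVM training.

```python
def bilinear(x, M, y):
    """Return x^T M y for list-based vectors and a list-of-rows matrix."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def pairwise_score(a, b, Lam, Gam, c, k):
    """Symmetric quadratic score for a pair of i-vectors a and b.

    Lam weights the cross terms, Gam the within-vector terms,
    c the linear term and k is a scalar offset (all assumed names).
    """
    cross = bilinear(a, Lam, b) + bilinear(b, Lam, a)
    within = bilinear(a, Gam, a) + bilinear(b, Gam, b)
    linear = sum(ci * (ai + bi) for ci, ai, bi in zip(c, a, b))
    return cross + within + linear + k
```

Because every term treats the two vectors identically, swapping `a` and `b` leaves the score unchanged. The score is also linear in the parameters, which is what allows a linear classifier such as an SVM to estimate them.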


International Conference on Acoustics, Speech, and Signal Processing | 2007

STBU System for the NIST 2006 Speaker Recognition Evaluation

Pavel Matejka; Lukas Burget; Petr Schwarz; Ondrej Glembek; Martin Karafiát; Frantisek Grezl; Jan Cernocky; D.A. van Leeuwen; Niko Brümmer; A. Strasheim

This paper describes the STBU 2006 speaker recognition system, which performed well in the NIST 2006 Speaker Recognition Evaluation. STBU is a consortium of four partners: Spescom DataVoice (South Africa), TNO (Netherlands), BUT (Czech Republic) and the University of Stellenbosch (South Africa). The primary system is a combination of three main kinds of systems: (1) GMM, with short-time MFCC or PLP features, (2) GMM-SVM, using GMM mean supervectors as input and (3) MLLR-SVM, using MLLR speaker adaptation coefficients derived from an English LVCSR system. In this paper, we describe these sub-systems and present results for each system alone and in combination on the NIST Speaker Recognition Evaluation (SRE) 2006 development and evaluation data sets.


International Conference on Acoustics, Speech, and Signal Processing | 2012

Gender independent discriminative speaker recognition in i-vector space

Sandro Cumani; Ondřej Glembek; Niko Brümmer; Edward de Villiers; Pietro Laface

Speaker recognition systems attain their best accuracy when trained with gender-dependent features and tested with known-gender trials. In real applications, however, gender labels are often not given. In this work we illustrate the design of a system that does not make use of gender labels in either training or testing, i.e. a completely Gender Independent (GI) system. It relies on discriminative training, where the trials are i-vector pairs, and the discrimination is between the hypotheses that the pair of feature vectors in the trial belongs to the same speaker or to different speakers. We demonstrate that this pairwise discriminative training can be interpreted as a procedure that estimates the parameters of the best (second-order) approximation of the log-likelihood ratio score function, and that a pairwise SVM can be used for training a gender-independent system. Our results show that a pairwise GI SVM, while saving memory and execution time, achieves state-of-the-art performance on the most recent NIST evaluations, comparable to that of a Gender Dependent (GD) system.


Conference of the International Speech Communication Association | 2009

Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification

Najim Dehak; Réda Dehak; Patrick Kenny; Niko Brümmer; Pierre Ouellet; Pierre Dumouchel


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST Speaker Recognition Evaluation 2006

Niko Brümmer; Lukas Burget; Jan Cernocky; Ondrej Glembek; Frantisek Grezl; Martin Karafiát; D.A. van Leeuwen; Pavel Matejka; Petr Schwarz; Albert Strasheim

Collaboration


Explore Niko Brümmer's collaborations.

Top Co-Authors

Lukas Burget

Brno University of Technology

Ondrej Glembek

Brno University of Technology

Pavel Matejka

Brno University of Technology

Oldrich Plchot

Brno University of Technology

Martin Karafiát

Brno University of Technology

Najim Dehak

Massachusetts Institute of Technology

D.A. van Leeuwen

Radboud University Nijmegen

Albert Strasheim

Brno University of Technology

Jan Cernocky

Brno University of Technology
