D.A. van Leeuwen
Radboud University Nijmegen
Publication
Featured research published by D.A. van Leeuwen.
enterprise distributed object computing | 2003
Henk Jonkers; R. van Burren; Farhad Arbab; F.S. de Boer; Marcello M. Bonsangue; H. Bosma; H.W.L. ter Doest; L.P.J. Groenewegen; Juan Guillen Scholten; Stijn Hoppenbrouwers; Maria Eugenia Iacob; W. Janssen; Marc M. Lankhorst; D.A. van Leeuwen; Erik Proper; Andries Stam; L. van der Torre; G.V. van Zanten
A coherent description of architectures provides insight, enables communication among different stakeholders and guides complicated (business and ICT) change processes. Unfortunately, so far no architecture description language exists that fully enables integrated enterprise modeling. In this paper we focus on the requirements and design of such a language. This language defines generic, organization-independent concepts that can be specialized or composed to obtain more specific concepts to be used within a particular organization. It is not our intention to re-invent the wheel for each architectural domain: wherever possible we conform to existing languages or standards such as UML. We complement them with missing concepts, focusing on concepts to model the relationships among architectural domains. The concepts should also make it possible to define links between models in other languages. The relationship between architecture descriptions at the business layer and at the application layer (business-IT alignment) plays a central role.
IEEE Transactions on Audio, Speech, and Language Processing | 2012
Mitchell McLaren; D.A. van Leeuwen
The recent development of the i-vector framework for speaker recognition has set a new performance standard in the research field. An i-vector is a compact representation of a speaker's utterance extracted from a total variability subspace. Prior to classification using a cosine kernel, i-vectors are projected into a linear discriminant analysis (LDA) space in order to reduce inter-session variability and enhance speaker discrimination. The accurate estimation of this LDA space from a training dataset is crucial to detection performance. A typical training dataset, however, does not consist of utterances acquired through all sources of interest for each speaker. This has the effect of introducing systematic variation related to the speech source in the between-speaker covariance matrix and results in an incomplete representation of the within-speaker scatter matrix used for LDA. The recently proposed source-normalized (SN) LDA algorithm improves the robustness of i-vector-based speaker recognition under both mismatched evaluation conditions and conditions for which inadequate speech resources are available for suitable system development. When evaluated on the recent NIST 2008 and 2010 Speaker Recognition Evaluations (SRE), SN-LDA demonstrated relative improvements of up to 38% in equal error rate (EER) and 44% in minimum DCF over LDA under mismatched and sparsely resourced evaluation conditions, while also providing improvements in the common telephone-only conditions. Building on these initial developments, this study provides a thorough analysis of how SN-LDA transforms the i-vector space to reduce source variation, and of its robustness to varying evaluation and LDA training conditions. The concept of source normalization is further extended to within-class covariance normalization (WCCN) and data-driven source detection.
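The standard (non-source-normalized) LDA projection and cosine scoring that this paper builds on can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the paper's code; the function names and data are invented for the example.

```python
import numpy as np

def lda_project(ivectors, labels, dim):
    """Project i-vectors onto the top LDA directions.

    Plain LDA: maximize between-speaker scatter Sb relative to
    within-speaker scatter Sw (the matrices the paper discusses).
    """
    classes = np.unique(labels)
    mu = ivectors.mean(axis=0)
    d = ivectors.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in classes:
        x = ivectors[labels == c]
        mu_c = x.mean(axis=0)
        Sw += (x - mu_c).T @ (x - mu_c)          # within-speaker scatter
        diff = (mu_c - mu)[:, None]
        Sb += len(x) * (diff @ diff.T)           # between-speaker scatter
    # Solve the generalized eigenproblem Sb v = lambda Sw v
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-eigvals.real)
    W = eigvecs[:, order[:dim]].real
    return ivectors @ W, W

def cosine_score(a, b):
    """Cosine kernel between two (projected) i-vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

SN-LDA differs from this sketch in how the between-speaker scatter is computed: it accumulates the statistics per source (e.g. telephone vs. microphone) so that source variation does not contaminate the speaker subspace.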
international conference on biometrics | 2013
Elie Khoury; B. Vesnicer; Javier Franco-Pedroso; Ricardo Paranhos Velloso Violato; Z. Boulkcnafet; L. M. Mazaira Fernandez; Mireia Diez; J. Kosmala; Houssemeddine Khemiri; T. Cipr; Rahim Saeidi; Manuel Günther; J. Zganec-Gros; R. Zazo Candil; Flávio Olmos Simões; M. Bengherabi; A. Alvarez Marquina; Mikel Penagarikano; Alberto Abad; M. Boulayemen; Petr Schwarz; D.A. van Leeuwen; J. Gonzalez-Dominguez; M. Uliani Neto; E. Boutellaa; P. Gómez Vilda; Amparo Varona; Dijana Petrovska-Delacrétaz; Pavel Matejka; Joaquin Gonzalez-Rodriguez
This paper evaluates the performance of the twelve primary systems submitted to the evaluation on speaker verification in the context of a mobile environment using the MOBIO database. The mobile environment provides a challenging and realistic test-bed for current state-of-the-art speaker verification techniques. Results in terms of equal error rate (EER), half total error rate (HTER) and detection error trade-off (DET) confirm that the best performing systems are based on total variability modeling, and are the fusion of several sub-systems. Nevertheless, the good old UBM-GMM based systems are still competitive. The results also show that the use of additional data for training as well as gender-dependent features can be helpful.
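The EER reported here is the operating point where the false-acceptance rate equals the false-rejection (miss) rate; HTER is the average of the two rates at a fixed decision threshold. A minimal sketch of computing the EER from target and non-target scores (illustrative code, not any submitted system's scoring tool):

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Sweep thresholds over the pooled scores and return the point
    where false-rejection and false-acceptance rates are closest."""
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    eer, best_gap = 1.0, np.inf
    for t in thresholds:
        fr = np.mean(target_scores < t)       # targets rejected (misses)
        fa = np.mean(nontarget_scores >= t)   # non-targets accepted
        gap = abs(fr - fa)
        if gap < best_gap:
            best_gap, eer = gap, (fr + fa) / 2
    return eer
```

A DET curve is simply the trace of (fa, fr) pairs over all thresholds, conventionally plotted on normal-deviate axes.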
international conference on acoustics, speech, and signal processing | 2007
R. Matejka; Lukas Burget; Petr Schwarz; Ondrej Glembek; Martin Karafiát; Frantisek Grezl; Jan Cernocky; D.A. van Leeuwen; Niko Brümmer; A. Strasheim
This paper describes the STBU 2006 speaker recognition system, which performed well in the NIST 2006 speaker recognition evaluation. STBU is a consortium of four partners: Spescom DataVoice (South Africa), TNO (Netherlands), BUT (Czech Republic) and the University of Stellenbosch (South Africa). The primary system is a combination of three main kinds of systems: (1) GMM, with short-time MFCC or PLP features, (2) GMM-SVM, using GMM mean supervectors as input, and (3) MLLR-SVM, using MLLR speaker adaptation coefficients derived from an English LVCSR system. In this paper, we describe these sub-systems and present results for each system alone and in combination on the NIST Speaker Recognition Evaluation (SRE) 2006 development and evaluation data sets.
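The GMM mean supervector used as SVM input in system (2) is formed by MAP-adapting only the UBM means to an utterance and stacking them into one long vector. A simplified sketch under these assumptions (diagonal covariances, mean-only relevance-MAP; real GMM-SVM kernels additionally weight by the mixture weights):

```python
import numpy as np

def mean_supervector(ubm_means, ubm_covs, frames, posteriors, r=16.0):
    """Mean-only MAP adaptation with relevance factor r, then stack.

    ubm_means, ubm_covs: (C, D) UBM parameters (diagonal covariances)
    frames:              (T, D) feature vectors of one utterance
    posteriors:          (T, C) component posteriors per frame
    """
    n = posteriors.sum(axis=0)                     # (C,)  soft occupation counts
    f = posteriors.T @ frames                      # (C, D) first-order statistics
    alpha = (n / (n + r))[:, None]                 # adaptation weight per component
    adapted = alpha * (f / np.maximum(n, 1e-8)[:, None]) + (1 - alpha) * ubm_means
    # Normalize by UBM standard deviations and flatten to a C*D supervector
    return (adapted / np.sqrt(ubm_covs)).reshape(-1)
```

The resulting fixed-length vector (C times D dimensions) is what the SVM classifies, regardless of the utterance duration T.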
IEEE Transactions on Audio, Speech, and Language Processing | 2012
Marijn Huijbregts; D.A. van Leeuwen
Performing speaker diarization of very long recordings is a problem for most diarization systems that are based on agglomerative clustering with a hidden Markov model (HMM) topology. Performing collection-wide speaker diarization, where each speaker is identified uniquely across the entire collection, is an even more challenging task. In this paper we propose a method with which it is possible to efficiently perform diarization of long recordings. We have also applied this method successfully to a collection with a total duration of approximately 15 hours. The method consists of first segmenting long recordings into smaller chunks on which diarization is performed. Next, a speaker detection system is used to link the speech clusters from each chunk and to assign a unique label to each speaker in the long recording or in the small collection. We show for three different audio collections that it is possible to perform high-quality diarization with this approach. The long meetings from the ICSI corpus are processed 5.5 times faster than originally required, and by uniquely labeling each speaker across the entire collection it becomes possible to perform speaker-based information retrieval with high accuracy (mean average precision of 0.57).
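The chunk-then-link strategy described above can be sketched as a small driver loop. This is an illustrative reconstruction, not the authors' code: `diarize` and `speaker_score` stand in for the per-chunk diarization system and the speaker detection system of the paper.

```python
def diarize_collection(recordings, chunk_len, diarize, speaker_score, threshold):
    """Chunk-and-link diarization sketch.

    1. Cut each long recording into fixed-length chunks and diarize each.
    2. Link chunk-level clusters across the collection: a cluster joins an
       existing global speaker if its best detection score exceeds the
       threshold; otherwise it founds a new global speaker.
    """
    global_speakers = []   # one representative per unique speaker so far
    assignments = []       # (recording index, chunk index, cluster, speaker id)
    for ri, rec in enumerate(recordings):
        chunks = [rec[i:i + chunk_len] for i in range(0, len(rec), chunk_len)]
        for ci, chunk in enumerate(chunks):
            for cluster in diarize(chunk):
                scores = [speaker_score(cluster, s) for s in global_speakers]
                if scores and max(scores) > threshold:
                    sid = scores.index(max(scores))      # link to known speaker
                else:
                    sid = len(global_speakers)           # new unique speaker
                    global_speakers.append(cluster)
                assignments.append((ri, ci, cluster, sid))
    return assignments, global_speakers
```

Because each chunk is diarized independently, the quadratic cost of agglomerative clustering applies only within a chunk, which is where the reported speed-up comes from; the linking pass then restores collection-wide speaker identities.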
IEEE Transactions on Audio, Speech, and Language Processing | 2012
Marijn Huijbregts; D.A. van Leeuwen; C. Wooters
In this paper, we describe an analysis of our speaker diarization system based on a series of oracle experiments. In this analysis, each system component is substituted by an oracle component that uses the reference transcripts to perform flawlessly. By placing the original components back into the system one at a time, either in a top-down or bottom-up manner, the performance of each individual system component is measured. The analysis approach can be applied to any speaker diarization system that consists of a concatenation of separate components. Our experimental findings are relevant for most RT09s diarization systems, which all apply similar techniques. The analysis revealed that three components caused most errors: speech activity detection, the inability to handle overlapping speech, and the sensitivity of the merging component to cluster impurity.
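The substitution procedure itself is generic and can be sketched abstractly: start from an all-oracle pipeline and swap the real components back in one at a time, recording the error after each swap; the per-step increase attributes error to the component just restored. The names below are illustrative, not the paper's.

```python
def oracle_analysis(components, oracles, run_system, score):
    """Top-down oracle analysis sketch.

    components: real components, in pipeline order
    oracles:    matching oracle (flawless) components
    run_system: executes a pipeline (a list of components)
    score:      error metric on the system output (e.g. DER)

    Returns the error after restoring the first k real components,
    for k = 0 (all oracle) up to len(components) (all real).
    """
    results = []
    for k in range(len(components) + 1):
        pipeline = components[:k] + oracles[k:]
        results.append(score(run_system(pipeline)))
    return results
```

Differencing consecutive entries of the returned list gives the error contribution attributed to each component under this ordering.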
conference of the international speech communication association | 2016
Emre Yilmaz; H. van den Heuvel; J. Dijkstra; H. Van de Velde; F. Kampstra; J. Algra; D.A. van Leeuwen
In this paper, we present several open source speech and language resources for the under-resourced Frisian language. Frisian is mostly spoken in the province of Fryslân, located in the north of the Netherlands. Native speakers of Frisian are typically Frisian-Dutch bilinguals and often code-switch in daily conversations. The resources presented in this paper include a code-switching speech database containing radio broadcasts, a phonetic lexicon with more than 70k words and a language model trained on a text corpus of more than 38M words. With this contribution, we aim to share the Frisian resources we have collected in the scope of the FAME! project, in which a spoken document retrieval system is built for the disclosure of the regional broadcaster's radio archives. These resources enable research on code-switching and on longitudinal speech and language change. Moreover, a sample automatic speech recognition (ASR) recipe for the Kaldi toolkit will also be provided online to facilitate Frisian ASR research.
IEEE Transactions on Audio, Speech, and Language Processing | 2007
Niko Brümmer; Lukas Burget; Jan Cernocky; Ondrej Glembek; Frantisek Grezl; Martin Karafiát; D.A. van Leeuwen; Pavel Matejka; Petr Schwarz; Albert Strasheim
IEEE Transactions on Audio, Speech, and Language Processing | 2011
D.A. van Leeuwen
conference of the international speech communication association | 2005
K.P. Truong; D.A. van Leeuwen