Publication


Featured research published by Patrick Wambacq.


Remote Sensing Reviews | 1994

Speckle filtering of synthetic aperture radar images: A review

J. S. Lee; L. Jurkevich; Piet Dewaele; Patrick Wambacq; André Oosterlinck

Speckle, appearing in synthetic aperture radar (SAR) images as granular noise, is due to the interference of waves reflected from many elementary scatterers. Speckle in SAR images complicates the image interpretation problem by reducing the effectiveness of image segmentation and classification. To alleviate the deleterious effects of speckle, various ways have been devised to suppress it. This paper surveys several better-known speckle filtering algorithms. The concept of each filtering algorithm and the interrelationships between algorithms are discussed in detail. A set of performance criteria is established, and comparisons are made of the effectiveness of these filters in speckle reduction and in edge, line, and point target contrast preservation, using a simulated SAR image as well as airborne and spaceborne SAR images. In addition, computational efficiency and implementation complexity are compared. This critical evaluation of speckle suppression filters is mostly new and is presented as a survey paper.
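As an illustration of the local-statistics filters covered by this survey, the following is a minimal sketch of a basic Lee filter in Python. The window size and the speckle (noise) variance are illustrative assumptions; none of the surveyed filters is reproduced exactly here.

```python
# Minimal sketch of a Lee-style local-statistics speckle filter.
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(img, win=7, noise_var=0.25):
    """Suppress multiplicative speckle using local mean/variance statistics.

    img       : 2-D intensity image (float array)
    win       : odd window size for the local statistics (assumed)
    noise_var : variance of the unit-mean speckle (assumed known or estimated)
    """
    mean = uniform_filter(img, win)                  # local mean
    mean_sq = uniform_filter(img * img, win)         # local mean of squares
    var = np.maximum(mean_sq - mean ** 2, 0.0)       # local variance

    # Estimated variance of the underlying (speckle-free) signal.
    signal_var = np.maximum((var - mean ** 2 * noise_var) / (1.0 + noise_var), 0.0)
    gain = signal_var / np.maximum(var, 1e-12)       # adaptive gain in [0, 1]

    # Flat areas are pulled towards the local mean, edges stay close to the input.
    return mean + gain * (img - mean)
```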


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Template-Based Continuous Speech Recognition

M. De Wachter; Mike Matton; Kris Demuynck; Patrick Wambacq; Ronald Cools; D. Van Compernolle

Despite their known weaknesses, hidden Markov models (HMMs) have been the dominant technique for acoustic modeling in speech recognition for over two decades. Still, the advances in the HMM framework have not solved its key problems: it discards information about time dependencies and is prone to overgeneralization. In this paper, we attempt to overcome these problems by relying on straightforward template matching. The basis for the recognizer is the well-known DTW algorithm. However, classical DTW continuous speech recognition results in an explosion of the search space. The traditional top-down search is therefore complemented with a data-driven selection of candidates for DTW alignment. We also extend the DTW framework with a flexible subword unit mechanism and a class-sensitive distance measure, two components suggested by state-of-the-art HMM systems. The added flexibility of the unit selection in the template-based framework leads to new approaches to speaker and environment adaptation. The template matching system reaches a performance somewhat worse than the best published HMM results for the Resource Management benchmark, but thanks to the complementarity of errors between the HMM and DTW systems, the combination of both leads to a 17% decrease in word error rate compared to the HMM results.
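The recogniser builds on DTW alignment between feature sequences. The sketch below shows the classic DTW recursion for a single template/test pair; the Euclidean frame distance is an illustrative stand-in for the class-sensitive distance measure used in the paper, and the data-driven candidate selection and subword unit mechanism are not shown.

```python
# Minimal DTW alignment between two (frames x dims) feature sequences.
import numpy as np

def dtw_distance(template, test):
    """Return the length-normalised DTW alignment cost."""
    n, m = len(template), len(test)
    # Local frame-level distances (Euclidean, an illustrative choice).
    local = np.linalg.norm(template[:, None, :] - test[None, :, :], axis=-1)

    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Symmetric step pattern: match, insertion, deletion.
            acc[i, j] = local[i - 1, j - 1] + min(acc[i - 1, j - 1],
                                                  acc[i - 1, j],
                                                  acc[i, j - 1])
    return acc[n, m] / (n + m)   # path-length normalisation
```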


EURASIP Journal on Advances in Signal Processing | 2007

A review of signal subspace speech enhancement and its application to noise robust speech recognition

Kris Hermus; Patrick Wambacq; Hugo Van hamme

The objective of this paper is threefold: (1) to provide an extensive review of signal subspace speech enhancement, (2) to derive an upper bound for the performance of these techniques, and (3) to present a comprehensive study of the potential of subspace filtering to increase the robustness of automatic speech recognisers against stationary additive noise distortions. Subspace filtering methods are based on the orthogonal decomposition of the noisy speech observation space into a signal subspace and a noise subspace. This decomposition is possible under the assumption of a low-rank model for speech, and on the availability of an estimate of the noise correlation matrix. We present an extensive overview of the available estimators, and derive a theoretical estimator to experimentally assess an upper bound to the performance that can be achieved by any subspace-based method. Automatic speech recognition experiments with noisy data demonstrate that subspace-based speech enhancement can significantly increase the robustness of these systems in additive coloured noise environments. Optimal performance is obtained only if no explicit rank reduction of the noisy Hankel matrix is performed. Although this strategy might increase the level of the residual noise, it reduces the risk of removing essential signal information for the recogniser's back end. Finally, it is also shown that subspace filtering compares favourably to the well-known spectral subtraction technique.
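To make the decomposition concrete, the following is a minimal sketch of least-squares subspace filtering for one speech frame: build a Hankel matrix from the noisy samples, truncate its SVD to an assumed signal rank, and map back to a signal by anti-diagonal averaging. The frame length, matrix dimensions and rank are illustrative assumptions; the paper covers several other estimators and, for recognition, argues against explicit rank reduction.

```python
# Minimal sketch of SVD-based (least-squares) subspace filtering of one frame.
import numpy as np

def subspace_enhance(frame, rank=20, rows=None):
    n = len(frame)
    rows = rows or n // 2
    cols = n - rows + 1
    # Hankel matrix whose rows are shifted snapshots of the noisy frame.
    H = np.array([frame[i:i + cols] for i in range(rows)])

    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s[rank:] = 0.0                          # keep only the assumed signal subspace
    H_hat = (U * s) @ Vt                    # low-rank reconstruction

    # Anti-diagonal averaging maps the low-rank matrix back to a time signal.
    enhanced = np.zeros(n)
    counts = np.zeros(n)
    for i in range(rows):
        enhanced[i:i + cols] += H_hat[i]
        counts[i:i + cols] += 1
    return enhanced / counts
```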


Speech Communication | 2000

An efficient search space representation for large vocabulary continuous speech recognition

Kris Demuynck; Jacques Duchateau; Dirk Van Compernolle; Patrick Wambacq

In pursuance of better performance, current speech recognition systems tend to use more and more complicated models for both the acoustic and the language component. Cross-word context dependent (CD) phone models and long-span statistical language models (LMs) are now widely used. In this paper, we present a memory-efficient search topology that enables the use of such detailed acoustic and language models in a one-pass time-synchronous recognition system. Our approach is characterised by (1) the decoupling of the two basic knowledge sources, namely pronunciation information and LM information, and (2) the representation of pronunciation information – the lexicon in terms of CD units – by means of a compact static network. The LM information is incorporated into the search at run-time by means of a slightly modified token-passing algorithm. The decoupling of the LM and lexicon allows great flexibility in the choice of LMs, while the static lexicon representation avoids the cost of dynamic tree expansion and facilitates the integration of additional pronunciation information such as assimilation rules. Moreover, the network representation results in a compact structure when words have various pronunciations, and due to its construction, it offers partial LM forwarding at no extra cost.
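The token-passing idea can be caricatured as follows: tokens carry an accumulated score and a word history through a static network, and the LM score is added only when a token crosses a word end. The sketch below is deliberately simplified (each arc is assumed to consume exactly one frame; HMM state self-loops, beam pruning and the paper's compact network construction are all omitted), and the arc/word-end structures and score callbacks are illustrative assumptions.

```python
# Deliberately simplified sketch of token passing over a static lexicon network.
def token_passing(arcs, word_ends, acoustic_score, lm_score, n_frames, start=0):
    """arcs: {node: [(next_node, phone), ...]}, word_ends: {node: word}."""
    tokens = {start: (0.0, ())}                        # node -> (log score, word history)
    for t in range(n_frames):
        new_tokens = {}
        for node, (score, hist) in tokens.items():
            for nxt, phone in arcs.get(node, []):
                s = score + acoustic_score(phone, t)   # acoustic contribution
                h = hist
                if nxt in word_ends:                   # token leaves a word end:
                    word = word_ends[nxt]
                    s += lm_score(word, hist)          # LM applied at run time
                    h = hist + (word,)
                if nxt not in new_tokens or s > new_tokens[nxt][0]:
                    new_tokens[nxt] = (s, h)           # keep only the best token per node
        tokens = new_tokens
    return max(tokens.values(), default=(float("-inf"), ()))
```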


Speech Communication | 2006

Model-based feature enhancement with uncertainty decoding for noise robust ASR

Veronique Stouten; Hugo Van hamme; Patrick Wambacq

In this paper, several techniques are proposed to incorporate the uncertainty of the clean speech estimate in the decoding process of the backend recogniser in the context of model-based feature enhancement (MBFE) for noise robust speech recognition. Usually, the Gaussians in the acoustic space are sampled in a single point estimate, which means that the backend recogniser considers its input as a noise-free utterance. However, in this way the variance of the estimator is neglected. To solve this problem, it has already been argued that the acoustic space should be evaluated in a probability density function, e.g. a Gaussian observation pdf. We illustrate that this Gaussian observation pdf can be replaced by a computationally more tractable discrete pdf, consisting of a weighted sum of delta functions. We also show how improved posterior state probabilities can be obtained by calculating their maximum likelihood estimates or by using the pdf of clean speech conditioned on both the noisy speech and the backend Gaussian. Another simple and efficient technique is to replace these posterior probabilities by M Kronecker deltas, which results in M front-end feature vector candidates, and to take the maximum over their backend scores. Experimental results are given for the Aurora2 and Aurora4 databases to compare the proposed techniques. A significant decrease of the word error rate of the resulting speech recognition system is obtained.
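The alternatives described above can be summarised for diagonal-covariance Gaussians as follows: the point-estimate likelihood is replaced either by evaluating the back-end Gaussian at the enhanced mean with the front-end variance added (the Gaussian observation pdf), or by a weighted sum of back-end likelihoods at a few candidate clean-speech vectors (the discrete pdf of delta functions). The sketch below is a minimal illustration with assumed variable names, not the MBFE front end itself.

```python
# Minimal sketch (diagonal covariances, assumed variable names) of the
# decoding alternatives: point estimate, Gaussian observation pdf, discrete pdf.
import numpy as np

def log_gauss(x, mu, var):
    """Diagonal-covariance Gaussian log-likelihood."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

def point_estimate_score(x_hat, mu_b, var_b):
    # Baseline: the enhanced feature vector is treated as noise-free.
    return log_gauss(x_hat, mu_b, var_b)

def gaussian_uncertainty_score(mu_s, var_s, mu_b, var_b):
    # Expected back-end likelihood under the front-end posterior:
    # for Gaussians the variances simply add.
    return log_gauss(mu_s, mu_b, var_s + var_b)

def discrete_uncertainty_score(candidates, weights, mu_b, var_b):
    # Weighted sum of delta functions: evaluate the back-end Gaussian at
    # each candidate vector and combine in the probability domain.
    scores = np.array([log_gauss(c, mu_b, var_b) for c in candidates])
    return np.logaddexp.reduce(np.log(np.asarray(weights)) + scores)
```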


Computers and Electronics in Agriculture | 1991

Development and application of computer vision systems for use in livestock production

E. Van Der Stuyft; C.P. Schofield; J.M. Randall; Patrick Wambacq; Vic Goedseels

This paper examines the feasibility of applying computer vision systems to improve health, welfare and efficiency in livestock production. Very little directly relevant literature was revealed when reviewing the subject, so it is examined from first principles. After briefly describing the value of computer vision as a sensor with powerful observational and interpretative ability, the different steps in vision system development are identified and explored. Where possible, this examination is related to computer vision work on livestock as well as other biological objects, which by their typically varied nature offer meaningful paradigms for the livestock-related work. The analysis suggests that most operations in livestock production tend to be at the complex end of the spectrum of vision-related problems currently being tackled in agriculture. Hence, only applications which have a significant production or welfare effect will be viable. Another vital element necessary for success in this application is a simultaneous understanding by the system designers of a diverse set of mechanisms (the production process, the interaction between process and sensor, vision algorithm building, and software and hardware systems). This calls for a multi-disciplinary, interactive approach to develop optimal solutions.


Signal Processing | 2005

Perceptual audio modeling with exponentially damped sinusoids

Kris Hermus; Werner Verhelst; Philippe Lemmerling; Patrick Wambacq; Sabine Van Huffel

This paper presents the derivation of a new perceptual model that represents speech and audio signals by a sum of exponentially damped sinusoids. Compared to a traditional sinusoidal model, the exponential sinusoidal model (ESM) is better suited to model transient segments that are readily found in audio signals. Total least squares (TLS) algorithms are applied for the automatic extraction of the modeling parameters in the ESM, i.e. the amplitude, phase, frequency and damping factors of a user-defined number of damped sinusoids. In order to turn the SNR optimization criterion of these TLS algorithms into a perceptual modeling strategy, we use the psychoacoustic model of MPEG-1 Layer 1 in a subband TLS-ESM scheme. This allows us to model each subband signal in accordance with its perceptual relevance, thereby lowering the number of required modeling components for a given modeling quality. Simulations and listening tests confirm that the perceptual ESM achieves the same perceived quality as the plain ESM while using substantially fewer components, and provide support for applying the new model in the fields of parametric audio processing and coding.
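For reference, the ESM represents a signal as x(n) = sum_k a_k * exp(-d_k * n) * cos(2*pi*f_k*n/fs + phi_k). The sketch below only synthesises a signal from given ESM parameters; the subband TLS-based parameter extraction and the psychoacoustic weighting described in the paper are not shown, and the function and argument names are illustrative.

```python
# Minimal sketch: synthesis of a sum of exponentially damped sinusoids from
# given ESM parameters (amplitudes, phases, frequencies in Hz, damping factors).
import numpy as np

def esm_synthesis(amps, phases, freqs, dampings, n_samples, fs):
    """x(n) = sum_k a_k * exp(-d_k * n) * cos(2*pi*f_k*n/fs + phi_k)."""
    n = np.arange(n_samples)
    x = np.zeros(n_samples)
    for a, phi, f, d in zip(amps, phases, freqs, dampings):
        x += a * np.exp(-d * n) * np.cos(2.0 * np.pi * f * n / fs + phi)
    return x
```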


Speech Communication | 2006

Coping with disfluencies in spontaneous speech recognition: Acoustic detection and linguistic context manipulation

Frederik Stouten; Jacques Duchateau; Jean-Pierre Martens; Patrick Wambacq

Read speech recognition nowadays works quite well, but the recognition of spontaneous speech is much more problematic. There are plenty of reasons for this, and we hypothesize that one of them is the regular occurrence of disfluencies in spontaneous speech. Disfluencies disrupt the normal course of the sentence and, where word interruptions are concerned, they also give rise to word-like speech elements which have no representation in the lexicon of the recognizer. In this paper we propose novel methods that aim at coping with the problems induced by three types of disfluencies, namely filled pauses, repeated words and sentence restarts. Our experiments show that especially the proposed methods for filled pause handling offer a moderate but statistically significant improvement over the more traditional techniques previously presented in the literature.


international conference on acoustics, speech, and signal processing | 2004

Assessment of signal subspace based speech enhancement for noise robust speech recognition

Kris Hermus; Patrick Wambacq

Subspace filtering is an extensively studied technique that has proven very effective for speech enhancement, where it improves speech intelligibility. In this paper, we review different subspace estimation techniques (minimum variance, least squares, singular value adaptation, time domain constrained and spectral domain constrained) in a modified singular value decomposition (SVD) framework, and investigate their capability to improve the noise robustness of speech recognisers. An extensive set of recognition experiments with the Resource Management (RM) database showed that significant reductions in WER can be obtained, both for the white noise and for the coloured noise case. In contrast to speech enhancement approaches, we found that no truncation of the noisy signal subspace should be performed to optimise the recognition accuracy.


ieee automatic speech recognition and understanding workshop | 2009

The ESAT 2008 system for N-Best Dutch speech recognition benchmark

Kris Demuynck; Antti Puurula; Dirk Van Compernolle; Patrick Wambacq

This paper describes the ESAT 2008 Broadcast News transcription system for the N-Best 2008 benchmark, developed in part for testing the recent SPRAAK Speech Recognition Toolkit. The ESAT system was developed for the Southern Dutch Broadcast News subtask of N-Best using standard methods of modern speech recognition. A combination of improvements was made in commonly overlooked areas such as text normalization, pronunciation modeling, lexicon selection and morphological modeling, virtually solving the out-of-vocabulary (OOV) problem for Dutch by reducing the OOV rate to 0.06% on the N-Best development data and 0.23% on the evaluation data. Recognition experiments were run with several configurations comparing one-pass vs. two-pass decoding, high-order vs. low-order n-gram models, lexicon sizes and different types of morphological modeling. The system achieved a 7.23% word error rate (WER) on the broadcast news development data and 20.3% on the much more difficult evaluation data of N-Best.
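For reference, the OOV rates quoted above are simply the fraction of running word tokens in a text that are missing from the recognition lexicon. The sketch below is a minimal illustration of that computation, not part of the ESAT system; the token and lexicon inputs are assumed to be already text-normalized.

```python
# Minimal sketch: out-of-vocabulary (OOV) rate of a token sequence with
# respect to a recognition lexicon, expressed as a percentage.
def oov_rate(tokens, lexicon):
    lexicon = set(lexicon)
    oov = sum(1 for w in tokens if w not in lexicon)
    return 100.0 * oov / len(tokens)

# Example (hypothetical): one of six running words is missing -> about 16.7%.
print(oov_rate("het journaal begint om acht uur".split(),
               {"het", "begint", "om", "acht", "uur"}))
```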

Collaboration


Dive into Patrick Wambacq's collaborations.

Top Co-Authors

André Oosterlinck, Katholieke Universiteit Leuven
Kris Demuynck, Katholieke Universiteit Leuven
Joris Pelemans, Katholieke Universiteit Leuven
Hugo Van hamme, Katholieke Universiteit Leuven
Lyan Verwimp, Katholieke Universiteit Leuven
Jacques Duchateau, Katholieke Universiteit Leuven
Johan Vandeneede, Katholieke Universiteit Leuven
Kris Hermus, Katholieke Universiteit Leuven
Werner Verhelst, Vrije Universiteit Brussel