Bengt J. Borgström
University of California, Los Angeles
Publication
Featured research published by Bengt J. Borgström.
IEEE Transactions on Robotics | 2009
Per Henrik Borgstrom; Brett L. Jordan; Bengt J. Borgström; Michael J. Stealey; Gaurav S. Sukhatme; Maxim A. Batalin; William J. Kaiser
We present the Networked InfoMechanical System for Planar Translation, a novel two-degree-of-freedom (2-DOF) cable-driven robot with self-calibration and online drift-correction capabilities. The system is intended for actuated sensing applications in aquatic environments. Driving in-plane translation with four cables creates actuation redundancy, which yields an infinite set of feasible tension distributions and therefore requires real-time computation of the optimal distribution. To this end, we have implemented a highly efficient, iterative linear programming solver that converges to the optimal value in a very small number of iterations. In addition, we have developed two novel self-calibration methods that leverage the robot's actuation redundancy. The first uses an incremental displacement, or jitter, method, whereas the second uses variations in cable tensions to determine end-effector location. We also propose a novel least-squares drift-detection algorithm, which enables the robot to detect long-term drift. Combined with the self-calibration capabilities, this drift-monitoring algorithm enables long-term autonomous operation. To verify the performance of our algorithms, we have performed extensive experiments in simulation and on a real system.
International Conference on Acoustics, Speech, and Signal Processing | 2010
Lee Ngee Tan; Bengt J. Borgström; Abeer Alwan
This paper proposes a new statistical model-based likelihood ratio test (LRT) voice activity detector (VAD) to obtain reliable speech / non-speech decisions. In the proposed method, the likelihood ratio (LR) is calculated differently for voiced frames than for unvoiced frames: for voiced frames, only DFT bins containing harmonic spectral peaks are selected for LR computation. To evaluate the new VAD's effectiveness in improving the noise robustness of automatic speech recognition (ASR), its decisions are applied to pre-processing techniques such as non-linear spectral subtraction, the minimum mean square error short-time spectral amplitude estimator, and frame dropping. In ASR experiments conducted on the Aurora2 database, the proposed harmonic frequency-based LRTs give better results than conventional LRT-based VADs and the standard G.729B and ETSI AMR VADs.
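The bin-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the widely used Gaussian-model per-bin log likelihood ratio, a known pitch estimate f0, and harmonic bins taken as the DFT bins nearest to multiples of f0. The function names, the fixed a priori SNR values, and the threshold are all hypothetical.

```python
import math

def bin_log_lr(gamma, xi):
    """Per-bin log likelihood ratio under a complex Gaussian model (assumed).
    gamma: a posteriori SNR, xi: a priori SNR."""
    return gamma * xi / (1.0 + xi) - math.log(1.0 + xi)

def harmonic_bins(f0_hz, sample_rate, n_fft):
    """DFT bin indices nearest to multiples of f0, up to the Nyquist frequency."""
    bins = []
    h = 1
    while h * f0_hz <= sample_rate / 2:
        bins.append(round(h * f0_hz * n_fft / sample_rate))
        h += 1
    return sorted(set(bins))

def frame_is_speech(power, noise_var, xi, threshold, bins=None):
    """Average the per-bin log LR over the selected bins and compare to a threshold.
    bins=None uses every bin (conventional full-band test)."""
    selected = list(range(len(power))) if bins is None else list(bins)
    llr = [bin_log_lr(power[k] / noise_var[k], xi[k]) for k in selected]
    return sum(llr) / len(llr) > threshold

# Toy voiced frame: energy only at harmonics of 200 Hz (8 kHz sampling, 256-point DFT).
n_bins = 129
peaks = harmonic_bins(200.0, 8000, 256)
power = [20.0 if k in peaks else 1.0 for k in range(n_bins)]
noise_var = [1.0] * n_bins
xi = [4.0] * n_bins  # fixed a priori SNR, purely for illustration

full_band = frame_is_speech(power, noise_var, xi, threshold=5.0)
harmonic_only = frame_is_speech(power, noise_var, xi, threshold=5.0, bins=peaks)
```

In this toy frame the full-band average is diluted by the noise-only bins between harmonics, while the harmonic-only average is not. In practice the a priori SNR would be estimated per bin rather than fixed.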
IEEE Signal Processing Letters | 2009
Bengt J. Borgström; Abeer Alwan
In this letter, we propose a novel algorithm for reconstructing unreliable spectrographic data, a method applicable to missing feature-based automatic speech recognition (ASR). We provide quantitative analysis illustrating the high compressibility of spectrographic speech data. The existence of sparse representations for spectrographic data motivates posing spectral reconstruction as an optimization problem that minimizes the ℓ1-norm. When applied to the Aurora-2 database, the proposed missing feature estimation algorithm is shown to provide significant improvements in recognition accuracy relative to the baseline MFCC system. Even without an oracle mask, performance approaches that of the ETSI advanced front end (AFE), with less complexity.
Systems, Man, and Cybernetics | 2008
Bengt J. Borgström; Abeer Alwan
This paper proposes a novel low-complexity lip contour model for high-level optic feature extraction in noise-robust audiovisual (AV) automatic speech recognition systems. The model is based on weighted least-squares parabolic fitting of the upper and lower lip contours, does not require the assumption of symmetry across the horizontal axis of the mouth, and is therefore realistic. The proposed model does not depend on the accurate estimation of specific facial points, as do other high-level models. Also, we present a novel low-complexity algorithm for speaker normalization of the optic information stream, which is compatible with the proposed model and does not require parameter training. The use of the proposed model with speaker normalization results in noise robustness improvement in AV isolated-word recognition relative to using the baseline high-level model.
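As a rough illustration of the weighted least-squares parabolic fitting mentioned above, the sketch below fits y = a*x^2 + b*x + c to weighted contour points by solving the 3x3 normal equations. The function name, the sample points, and the weights are hypothetical; how the paper chooses its weights and contour points is not stated in the abstract.

```python
def fit_parabola_wls(xs, ys, ws):
    """Return (a, b, c) minimizing sum_i w_i * (y_i - (a*x_i**2 + b*x_i + c))**2."""
    # Weighted moments that make up the normal equations.
    s = [sum(w * x**k for x, w in zip(xs, ws)) for k in range(5)]
    t = [sum(w * y * x**k for x, y, w in zip(xs, ys, ws)) for k in range(3)]
    A = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    r = [t[2], t[1], t[0]]
    n = 3
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for j in range(col, n):
                A[row][j] -= f * A[col][j]
            r[row] -= f * r[col]
    # Back substitution.
    p = [0.0] * n
    for i in reversed(range(n)):
        p[i] = (r[i] - sum(A[i][j] * p[j] for j in range(i + 1, n))) / A[i][i]
    return tuple(p)

# Hypothetical contour points lying exactly on y = 2x^2 - 3x + 1.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [2 * x * x - 3 * x + 1 for x in xs]
ws = [1.0, 2.0, 3.0, 2.0, 1.0]  # illustrative weights only
a, b, c = fit_parabola_wls(xs, ys, ws)
```

Because the model described above does not assume symmetry between the lips, one would call such a fit twice, once on the upper-lip points and once on the lower-lip points, and keep the two coefficient sets separately.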
International Conference on Acoustics, Speech, and Signal Processing | 2011
Bengt J. Borgström; Abeer Alwan
This paper presents a family of log-spectral amplitude (LSA) estimators for speech enhancement. Generalized Gamma distributed (GGD) priors are assumed for speech short-time spectral amplitudes (STSAs), providing mathematical flexibility in capturing the statistical behavior of speech. Although solutions are not obtainable in closed form, the estimators are expressed as limits and can be efficiently approximated. When applied to the Noizeus database [1], the proposed estimators are shown to provide improvements in segmental signal-to-noise ratio (SSNR) and COSH distance [2], relative to the LSA estimator proposed by Ephraim and Malah [3].
Archive | 2008
Bengt J. Borgström; Alexis Bernard; Abeer Alwan
Distributed Speech Recognition (DSR) systems rely on efficient transmission of speech information from distributed clients to a centralized server. Wireless or network communication channels within DSR systems are typically noisy and bursty. Thus, DSR systems must utilize efficient Error Recovery (ER) schemes during transmission of speech information. Some ER strategies, referred to as forward error control (FEC), aim to create redundancy in the source coded bitstream to overcome the effect of channel errors, while others are designed to create spread or delay in the feature stream in order to overcome the effect of bursty channel errors. Furthermore, ER strategies may be designed as a combination of the previously described techniques. This chapter presents an array of error recovery techniques for remote speech recognition applications and is organized as follows. First, channel characterization and modeling are discussed. Next, media-specific FEC is presented for packet erasure applications, followed by a discussion on media-independent FEC techniques for bit error applications, including general linear block codes, cyclic codes, and convolutional codes. The application of unequal error protection (UEP) strategies utilizing combinations of the aforementioned FEC methods is also presented. Finally, frame-based interleaving is discussed as an alternative for overcoming the effect of bursty channel erasures. The chapter concludes with examples of modern standards for channel coding strategies for DSR.
8.1. Distributed Speech Recognition Systems
Throughout this chapter various error recovery and detection techniques are discussed. It is therefore necessary to present an overview of a complete experimental DSR system, including feature extraction, a noisy channel model, and an automatic speech recognition engine at the server end (Fig. 8.1).
[Fig. 8.1: block diagram of a DSR system. Labels: input speech; Client (Feature Extraction, Source Coding, Channel Coding); Noisy Channel; Server (Channel Decoding, Source Decoding, Speech Recognition).]
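The frame-based interleaving mentioned in the chapter summary above can be sketched with a simple block interleaver. This is a generic textbook construction rather than the chapter's specific scheme, and the function names and the depth of 3 are hypothetical. Frames are written row by row into a table with `depth` rows and read out column by column, so a burst of consecutive channel losses maps back to frames that are far apart in the original order.

```python
def interleave(frames, depth):
    """Block interleaver: write row by row into `depth` rows, read column by column."""
    if len(frames) % depth != 0:
        raise ValueError("number of frames must be a multiple of depth")
    width = len(frames) // depth
    return [frames[r * width + c] for c in range(width) for r in range(depth)]

def deinterleave(frames, depth):
    """Inverse of interleave(): restore the original frame order."""
    if len(frames) % depth != 0:
        raise ValueError("number of frames must be a multiple of depth")
    width = len(frames) // depth
    return [frames[c * depth + r] for r in range(depth) for c in range(width)]

original = list(range(12))          # frame indices 0..11
sent = interleave(original, depth=3)
restored = deinterleave(sent, depth=3)

# A burst that erases three consecutive transmitted frames...
burst = sent[3:6]
# ...corresponds to isolated losses in the original order.
lost_original_indices = sorted(burst)
```

The cost of this spreading is delay: the receiver must buffer a full block before it can restore the original order, which is the spread-versus-delay trade-off the summary alludes to.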
Nature Biotechnology | 2007
Phil Oh; Per Henrik Borgstrom; Halina Witkiewicz; Yan Li; Bengt J. Borgström; Adrian Chrastina; Koji Iwata; Kurt R. Zinn; Richard Baldwin; Jacqueline E. Testa; Jan E. Schnitzer
Conference of the International Speech Communication Association | 2007
Bengt J. Borgström; Abeer Alwan
Conference of the International Speech Communication Association | 2008
Bengt J. Borgström; Abeer Alwan