Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Weiran Wang is active.

Publication


Featured research published by Weiran Wang.


north american chapter of the association for computational linguistics | 2015

Deep Multilingual Correlation for Improved Word Embeddings.

Ang Lu; Weiran Wang; Mohit Bansal; Kevin Gimpel; Karen Livescu

Word embeddings have been found useful for many NLP tasks, including part-of-speech tagging, named entity recognition, and parsing. Adding multilingual context when learning embeddings can improve their quality, for example via canonical correlation analysis (CCA) on embeddings from two languages. In this paper, we extend this idea to learn deep non-linear transformations of word embeddings of the two languages, using the recently proposed deep canonical correlation analysis. The resulting embeddings, when evaluated on multiple word and bigram similarity tasks, consistently improve over monolingual embeddings and over embeddings transformed with linear CCA.
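For reference, the criterion underlying both linear and deep CCA is the correlation between (possibly transformed) views of the data. The formulation below is the standard one from the CCA literature rather than a quotation from the paper; deep CCA replaces the raw embeddings x and y with the outputs of two networks f(x) and g(y) before maximizing the same quantity.

```latex
% First pair of canonical directions for views x (embeddings in language 1)
% and y (embeddings in language 2); deep CCA applies this to network
% outputs f(x) and g(y).  Standard formulation, stated here for reference.
(u^*, v^*) = \arg\max_{u, v}\;
  \frac{u^\top \Sigma_{xy}\, v}
       {\sqrt{u^\top \Sigma_{xx}\, u}\;\sqrt{v^\top \Sigma_{yy}\, v}}
```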


international conference on acoustics, speech, and signal processing | 2015

Unsupervised learning of acoustic features via deep canonical correlation analysis

Weiran Wang; Raman Arora; Karen Livescu; Jeff A. Bilmes

It has been previously shown that, when both acoustic and articulatory training data are available, it is possible to improve phonetic recognition accuracy by learning acoustic features from this multi-view data with canonical correlation analysis (CCA). In contrast with previous work based on linear or kernel CCA, we use the recently proposed deep CCA, where the functional form of the feature mapping is a deep neural network. We apply the approach to a speaker-independent phonetic recognition task using data from the University of Wisconsin X-ray Microbeam Database. Using a tandem-style recognizer on this task, deep CCA features improve over earlier multi-view approaches as well as over articulatory inversion and typical neural network-based tandem features. We also present a new stochastic training approach for deep CCA, which produces both faster training and better-performing features.
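A minimal sketch of the deep CCA training signal, assuming a PyTorch setup: the two view networks produce batch outputs Hx and Hy, and the loss is the negative sum of canonical correlations between them. The function and parameter names (dcca_loss, rx, ry) are illustrative, and this is the standard batch objective from the deep CCA literature rather than the stochastic training variant the paper proposes.

```python
import torch

def dcca_loss(Hx, Hy, rx=1e-4, ry=1e-4):
    """Negative total canonical correlation between two views.

    Hx, Hy: (N, d) network outputs for the two views (same dimension d).
    rx, ry: small ridge terms keeping the covariance estimates invertible.
    Sketch of the standard batch deep CCA objective; names are illustrative.
    """
    N, d = Hx.shape
    Hx = Hx - Hx.mean(dim=0, keepdim=True)
    Hy = Hy - Hy.mean(dim=0, keepdim=True)

    I = torch.eye(d, dtype=Hx.dtype, device=Hx.device)
    Sxx = Hx.T @ Hx / (N - 1) + rx * I
    Syy = Hy.T @ Hy / (N - 1) + ry * I
    Sxy = Hx.T @ Hy / (N - 1)

    # Whitened cross-covariance T = Sxx^{-1/2} Sxy Syy^{-1/2};
    # its singular values are the canonical correlations.
    ex, Vx = torch.linalg.eigh(Sxx)
    ey, Vy = torch.linalg.eigh(Syy)
    Sxx_inv_sqrt = Vx @ torch.diag(ex.clamp_min(1e-12).rsqrt()) @ Vx.T
    Syy_inv_sqrt = Vy @ torch.diag(ey.clamp_min(1e-12).rsqrt()) @ Vy.T
    T = Sxx_inv_sqrt @ Sxy @ Syy_inv_sqrt

    # Minimize the negative sum of canonical correlations.
    return -torch.linalg.svdvals(T).sum()
```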


international conference on acoustics, speech, and signal processing | 2016

Deep convolutional acoustic word embeddings using word-pair side information

Herman Kamper; Weiran Wang; Karen Livescu

Recent studies have been revisiting whole words as the basic modelling unit in speech recognition and query applications, instead of phonetic units. Such whole-word segmental systems rely on a function that maps a variable-length speech segment to a vector in a fixed-dimensional space; the resulting acoustic word embeddings need to allow for accurate discrimination between different word types, directly in the embedding space. We compare several old and new approaches in a word discrimination task. Our best approach uses side information in the form of known word pairs to train a Siamese convolutional neural network (CNN): a pair of tied networks that take two speech segments as input and produce their embeddings, trained with a hinge loss that separates same-word pairs and different-word pairs by some margin. A word classifier CNN performs similarly, but requires much stronger supervision. Both types of CNNs yield large improvements over the best previously published results on the word discrimination task.
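As an illustration of the kind of margin loss described above, here is a minimal PyTorch-style sketch of a pairwise hinge loss on cosine distances between embeddings from the tied networks; the margin value and the use of cosine distance are assumptions for the sketch, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def pair_hinge_loss(emb_anchor, emb_same, emb_diff, margin=0.5):
    """Push same-word pairs closer than different-word pairs by `margin`.

    emb_*: (B, D) embeddings produced by the tied (Siamese) networks.
    Sketch only; the margin and distance choice here are illustrative.
    """
    d_same = 1.0 - F.cosine_similarity(emb_anchor, emb_same, dim=1)
    d_diff = 1.0 - F.cosine_similarity(emb_anchor, emb_diff, dim=1)
    return F.relu(margin + d_same - d_diff).mean()
```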


computer vision and pattern recognition | 2010

Manifold blurring mean shift algorithms for manifold denoising

Weiran Wang; Miguel Á. Carreira-Perpiñán

We propose a new family of algorithms for denoising data assumed to lie on a low-dimensional manifold. The algorithms are based on the blurring mean-shift update, which moves each data point towards its neighbors, but constrain the motion to be orthogonal to the manifold. The resulting algorithms are nonparametric, simple to implement, and very effective at removing noise while preserving the curvature of the manifold and limiting shrinkage. They deal well with extreme outliers and with variations of density along the manifold. We apply them as preprocessing for dimensionality reduction and for nearest-neighbor classification of MNIST digits, with consistent improvements of up to 36% over the original data.
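A rough NumPy sketch of one such denoising iteration, following the recipe in the abstract (a Gaussian blurring mean-shift step with its component along the locally estimated tangent space removed); parameter names and default values are illustrative, not taken from the paper.

```python
import numpy as np

def mbms_step(X, k=10, d=1, sigma=1.0):
    """One manifold blurring mean-shift iteration (sketch).

    X: (N, D) noisy points assumed to lie near a d-dimensional manifold.
    k: number of nearest neighbours for the local tangent (local PCA).
    sigma: bandwidth of the Gaussian mean-shift kernel.
    """
    N, D = X.shape
    X_new = X.copy()
    dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # (N, N)
    for i in range(N):
        # Gaussian blurring mean-shift step towards the other points.
        w = np.exp(-0.5 * dist2[i] / sigma ** 2)
        step = (w[:, None] * X).sum(0) / w.sum() - X[i]
        # Local tangent space from the k nearest neighbours (local PCA).
        nbrs = np.argsort(dist2[i])[:k]
        Z = X[nbrs] - X[nbrs].mean(0)
        _, _, Vt = np.linalg.svd(Z, full_matrices=False)
        U = Vt[:d].T                                          # (D, d) tangent basis
        # Keep only the component of the step orthogonal to the manifold.
        X_new[i] = X[i] + step - U @ (U.T @ step)
    return X_new
```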


allerton conference on communication, control, and computing | 2015

Stochastic optimization for deep CCA via nonlinear orthogonal iterations

Weiran Wang; Raman Arora; Karen Livescu; Nathan Srebro

Deep CCA is a recently proposed deep neural network extension to the traditional canonical correlation analysis (CCA), and has been successful for multi-view representation learning in several domains. However, stochastic optimization of the deep CCA objective is not straightforward, because it does not decouple over training examples. Previous optimizers for deep CCA are either batch-based algorithms or stochastic optimization using large minibatches, which can have high memory consumption. In this paper, we tackle the problem of stochastic optimization for deep CCA with small minibatches, based on an iterative solution to the CCA objective, and show that we can achieve as good performance as previous optimizers and thus alleviate the memory requirement.
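To make the coupling concrete, the constrained form of the deep CCA objective can be written as below (standard formulation, not quoted from the paper). The whitening constraints involve all N training examples at once, which is why the objective does not decompose into a sum over examples and why small-minibatch stochastic optimization is not straightforward.

```latex
% Constrained deep CCA objective: f(X), g(Y) are d x N matrices of network
% outputs for the two views, U and V are projection matrices, N is the
% number of training examples.  Both whitening constraints couple all N
% columns, so the objective does not decouple over examples.
\max_{f, g, U, V}\ \frac{1}{N}\,
  \operatorname{tr}\!\bigl( U^\top f(X)\, g(Y)^\top V \bigr)
\quad \text{s.t.}\quad
U^\top \Bigl( \tfrac{1}{N} f(X) f(X)^\top \Bigr) U = I,\qquad
V^\top \Bigl( \tfrac{1}{N} g(Y) g(Y)^\top \Bigr) V = I
```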


ieee automatic speech recognition and understanding workshop | 2015

Discriminative segmental cascades for feature-rich phone recognition

Hao Tang; Weiran Wang; Kevin Gimpel; Karen Livescu

Discriminative segmental models, such as segmental conditional random fields (SCRFs) and segmental structured support vector machines (SSVMs), have had success in speech recognition via both lattice rescoring and first-pass decoding. However, such models suffer from slow decoding, hampering the use of computationally expensive features, such as segment neural networks or other high-order features. A typical solution is to use approximate decoding, either by beam pruning in a single pass or by beam pruning to generate a lattice followed by a second pass. In this work, we study discriminative segmental models trained with a hinge loss (i.e., segmental structured SVMs). We show that beam search is not suitable for learning rescoring models in this approach, though it gives good approximate decoding performance when the model is already well-trained. Instead, we consider an approach inspired by structured prediction cascades, which use max-marginal pruning to generate lattices. We obtain a high-accuracy phonetic recognition system with several expensive feature types: a segment neural network, a second-order language model, and second-order phone boundary features.
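For reference, the hinge loss used to train such segmental structured SVMs has the standard cost-augmented form below; score_theta and cost are generic placeholders rather than the paper's exact notation.

```latex
% Cost-augmented (structured) hinge loss for a segmental structured SVM:
% x is the utterance, y the reference segmentation and labels, y' ranges
% over candidate segmentations, score_\theta is the segmental model score,
% and cost a task loss such as the label error.  Generic notation.
\ell(x, y; \theta) =
  \max_{y'} \Bigl[ \mathrm{score}_\theta(x, y') + \mathrm{cost}(y, y') \Bigr]
  - \mathrm{score}_\theta(x, y)
```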


spoken language technology workshop | 2014

Reconstruction of articulatory measurements with smoothed low-rank matrix completion

Weiran Wang; Raman Arora; Karen Livescu

Articulatory measurements have been used in a variety of speech science and technology applications. These measurements can be obtained with a number of technologies, such as electromagnetic articulography and X-ray microbeam, typically involving pellets attached to individual articulators. Due to limitations in the recording technologies, articulatory measurements often contain missing data when individual pellets are mis-tracked, leading to relatively high rates of loss in this expensive and time-consuming data source. We present an approach to reconstructing such data, using low-rank matrix factorization techniques combined with temporal smoothness regularization, and apply it to reconstructing the missing entries in the Wisconsin X-ray microbeam database. Our algorithm alternates between two simple steps, each having a closed form as the solution of a linear system. The algorithm gives realistic reconstructions even when a majority of the frames contain missing data, improving over previous approaches to this problem in terms of both root mean squared error and phonetic recognition performance when using the reconstructions.
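One way to write this kind of objective is sketched below, with Y the articulator-by-frame measurement matrix, W a binary mask over observed entries, U V the low-rank factorization, and D_t a first-difference operator over frames; the symbols and regularization weights are illustrative rather than the paper's notation. With V fixed, the update over U is a regularized least-squares problem with a closed-form solution, and likewise for V with U fixed, which matches the two alternating closed-form steps described in the abstract.

```latex
% Smoothed low-rank completion objective (illustrative notation):
% Y: D x T articulatory measurements, W: binary mask of observed entries,
% U V: low-rank factorization, D_t: first-difference operator over the
% T frames, \odot: elementwise product.
\min_{U, V}\ \bigl\| W \odot (Y - U V) \bigr\|_F^2
  + \lambda \bigl( \|U\|_F^2 + \|V\|_F^2 \bigr)
  + \mu\, \bigl\| V D_t^\top \bigr\|_F^2
```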


spoken language technology workshop | 2016

End-to-end training approaches for discriminative segmental models

Hao Tang; Weiran Wang; Kevin Gimpel; Karen Livescu

Recent work on discriminative segmental models has shown that they can achieve competitive speech recognition performance, using features based on deep neural frame classifiers. However, segmental models can be more challenging to train than standard frame-based approaches. While some segmental models have been successfully trained end to end, there is a lack of understanding of their training under different settings and with different losses.


international conference on acoustics, speech, and signal processing | 2016

Signer-independent fingerspelling recognition with deep neural network adaptation

Taehwan Kim; Weiran Wang; Hao Tang; Karen Livescu

We study the problem of recognition of fingerspelled letter sequences in American Sign Language in a signer-independent setting. Fingerspelled sequences are both challenging and important to recognize, as they are used for many content words such as proper nouns and technical terms. Previous work has shown that it is possible to achieve accuracies of almost 90% on fingerspelling recognition in a signer-dependent setting. However, the more realistic signer-independent setting presents challenges due to significant variations among signers, coupled with the dearth of available training data. We investigate this problem with approaches inspired by automatic speech recognition. We start with the best-performing approaches from prior work, based on tandem models and segmental conditional random fields (SCRFs), with features based on deep neural network (DNN) classifiers of letters and phonological features. Using DNN adaptation, we find that it is possible to bridge a large part of the gap between signer-dependent and signer-independent performance. Using only about 115 transcribed words for adaptation from the target signer, we obtain letter accuracies of up to 82.7% with frame-level adaptation labels and 69.7% with only word labels.


conference of the international speech communication association | 2016

Efficient Segmental Cascades for Speech Recognition.

Hao Tang; Weiran Wang; Kevin Gimpel; Karen Livescu

Discriminative segmental models offer a way to incorporate flexible feature functions into speech recognition. However, their appeal has been limited by their computational requirements, due to the large number of possible segments to consider. Multi-pass cascades of segmental models introduce features of increasing complexity in different passes, where in each pass a segmental model rescores lattices produced by a previous (simpler) segmental model. In this paper, we explore several ways of making segmental cascades efficient and practical: reducing the feature set in the first pass, frame subsampling, and various pruning approaches. In experiments on phonetic recognition, we find that with a combination of such techniques, it is possible to maintain competitive performance while greatly reducing decoding, pruning, and training time.

Collaboration


Dive into Weiran Wang's collaborations.

Top Co-Authors

Karen Livescu (Toyota Technological Institute at Chicago)
Nathan Srebro (Toyota Technological Institute at Chicago)
Hao Tang (Toyota Technological Institute at Chicago)
Raman Arora (Johns Hopkins University)
Kevin Gimpel (Toyota Technological Institute at Chicago)
Jeff A. Bilmes (University of Washington)
Qingming Tang (Toyota Technological Institute at Chicago)
Dan Garber (Technion – Israel Institute of Technology)