Wei-Ning Hsu
Massachusetts Institute of Technology
Publications
Featured research published by Wei-Ning Hsu.
North American Chapter of the Association for Computational Linguistics | 2016
Mitra Mohtarami; Yonatan Belinkov; Wei-Ning Hsu; Yu Zhang; Tao Lei; Kfir Bar; Scott Cyphers; James R. Glass
Community question answering platforms need to automatically rank answers and questions with respect to a given question. In this paper, we present approaches for the Answer Selection and Question Retrieval tasks of SemEval-2016 (Task 3). We develop a bag-of-vectors approach with various vector- and text-based features, and different neural network approaches, including CNNs and LSTMs, to capture the semantic similarity between questions and answers for ranking purposes. Our evaluation demonstrates that our approaches significantly outperform the baselines.
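The bag-of-vectors idea above can be illustrated with a minimal sketch: average the word vectors of a text into one fixed-size vector, then rank candidate answers by cosine similarity to the question. The function names and the `embeddings` lookup are hypothetical stand-ins, not the paper's actual feature set.

```python
import numpy as np

def bag_of_vectors(tokens, embeddings):
    """Average the word vectors of the known tokens into one
    fixed-size representation (assumes at least one token is
    present in the embedding table)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    """Cosine similarity with a small epsilon against zero norms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rank_answers(question, answers, embeddings):
    """Score each candidate answer against the question and return
    (score, answer) pairs, best first."""
    q = bag_of_vectors(question, embeddings)
    scored = [(cosine(q, bag_of_vectors(a, embeddings)), a) for a in answers]
    return sorted(scored, reverse=True)
```

In the paper this representation is one feature among several; the neural models (CNNs, LSTMs) learn the similarity function instead of using a fixed cosine.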
Conference of the International Speech Communication Association | 2016
Wei-Ning Hsu; Yu Zhang; Ann Lee; James R. Glass
Deep neural network models have achieved considerable success in a wide range of fields. Several architectures have been proposed to alleviate the vanishing gradient problem and hence enable training of very deep networks. In the speech recognition area, convolutional neural networks, recurrent neural networks, and fully connected deep neural networks have been shown to be complementary in their modeling capabilities. A model combining all three, called a CLDNN, yields the best performance to date. In this paper, we extend the CLDNN model by introducing a highway connection between LSTM layers, which enables direct information flow from cells of lower layers to cells of upper layers. With this design, we are able to better exploit the advantages of a deeper structure. Experiments on the GALE Chinese Broadcast Conversation/News Speech dataset indicate that our model outperforms all previous models and achieves a new benchmark of 22.41% character error rate on the dataset.
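The highway connection described above can be sketched very simply: a learned depth gate decides how much of the lower layer's cell state flows directly into the upper layer's cell state. This is a simplified illustration, not the paper's exact formulation; the parameter names `W_d` and `b_d` are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_cell_update(c_lower, c_candidate, x, W_d, b_d):
    """Depth-gated ('highway') combination between stacked LSTM layers.

    c_lower     -- cell state of the layer below (same time step)
    c_candidate -- this layer's ordinary LSTM cell update
    The depth gate d, computed from the layer input x, controls how
    much of c_lower is passed straight through to this layer's cell.
    """
    d = sigmoid(W_d @ x + b_d)        # depth gate in (0, 1)
    return c_candidate + d * c_lower  # direct cell-to-cell path
```

When the gate saturates at 0 the model reduces to an ordinary stacked LSTM; when it opens, gradients can flow across layers without passing through the usual nonlinearities.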
International Conference on Acoustics, Speech, and Signal Processing | 2015
Cheng-Tao Chung; Wei-Ning Hsu; Cheng-Yi Lee; Lin-Shan Lee
This paper presents a novel approach for enhancing the multiple sets of acoustic patterns automatically discovered from a given corpus. In previous work, it was proposed that different HMM configurations (number of states per model, number of distinct models) for the acoustic patterns form a two-dimensional space. Multiple sets of acoustic patterns automatically discovered with HMM configurations properly located at different points over this two-dimensional space were shown to be complementary to one another, jointly capturing the characteristics of the given corpus. By representing the given corpus as sequences of acoustic patterns over the different HMM sets, the pattern indices in these sequences can be relabeled for context consistency across the different sequences. Good improvements were observed in preliminary spoken term detection (STD) experiments performed with such enhanced patterns on both TIMIT and Mandarin Broadcast News.
Spoken Language Technology Workshop | 2016
Tuka Alhanai; Wei-Ning Hsu; James R. Glass
The Arabic language, with over 300 million speakers, has significant diversity and breadth, which makes building an automated system to understand what is said challenging. This paper describes an Arabic Automatic Speech Recognition system developed on a 1,200-hour speech corpus that was made available for the 2016 Arabic Multi-genre Broadcast (MGB) Challenge. A range of Deep Neural Network (DNN) topologies were modeled, including feed-forward, convolutional, time-delay, recurrent Long Short-Term Memory (LSTM), Highway LSTM (H-LSTM), and Grid LSTM (GLSTM) networks. The best performance came from a sequence-discriminatively trained GLSTM neural network. The best overall Word Error Rate (WER) was 18.3% (p < 0.001) on the development set, after combining hypotheses of 3- and 5-layer sequence-discriminatively trained GLSTM models that had been rescored with a 4-gram language model.
Spoken Language Technology Workshop | 2016
Wei-Ning Hsu; Yu Zhang; James R. Glass
Recurrent neural networks (RNNs) are naturally suitable for speech recognition because of their ability to utilize dynamically changing temporal information. Deep RNNs have been argued to be able to model temporal relationships at different time granularities, but suffer from the vanishing gradient problem. In this paper, we extend stacked long short-term memory (LSTM) RNNs by using grid LSTM blocks that formulate computation along not only the temporal dimension but also the depth dimension, in order to alleviate this issue. Moreover, we prioritize the depth dimension over the temporal one to provide the depth dimension with more updated information, since its output is used for classification. We call this model the prioritized Grid LSTM (pGLSTM). Extensive experiments on four large datasets (AMI, HKUST, GALE, and MGB) indicate that the pGLSTM outperforms alternative deep LSTM models, beating stacked LSTMs with 4% to 7% relative improvement, and achieves new benchmarks among uni-directional models on all datasets.
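The prioritization described above can be sketched as an ordering of updates: the temporal LSTM is updated first, and the depth LSTM then conditions on the already-updated temporal output. This is a simplified sketch under assumed parameter shapes, not the paper's exact equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One generic LSTM update; W maps [x; h] to the four gates."""
    n = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = z[:n], z[n:2*n], z[2*n:3*n], z[3*n:]
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def pglstm_step(x, h_t, c_t, h_d, c_d, W_t, b_t, W_d, b_d):
    """Prioritized grid LSTM step (sketch).

    A grid LSTM keeps separate memories along the temporal (t) and
    depth (d) dimensions. Here the temporal update runs first, and
    the depth update then sees the *new* temporal hidden state h_t2,
    so the depth output (used for classification) gets the most
    up-to-date information.
    """
    h_t2, c_t2 = lstm_step(np.concatenate([x, h_d]), h_t, c_t, W_t, b_t)
    h_d2, c_d2 = lstm_step(np.concatenate([x, h_t2]), h_d, c_d, W_d, b_d)
    return h_t2, c_t2, h_d2, c_d2
```

An unprioritized grid LSTM would compute both updates from the pre-update states; swapping the order of the two `lstm_step` calls' inputs is the entire difference.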
National Conference on Artificial Intelligence | 2015
Wei-Ning Hsu; Hsuan-Tien Lin
Neural Information Processing Systems | 2017
Wei-Ning Hsu; Yu Zhang; James R. Glass
International Conference on Computational Linguistics | 2016
Salvatore Romeo; Giovanni Da San Martino; Alberto Barrón-Cedeño; Alessandro Moschitti; Yonatan Belinkov; Wei-Ning Hsu; Yu Zhang; Mitra Mohtarami; James R. Glass
Conference of the International Speech Communication Association | 2017
Wei-Ning Hsu; Yu Zhang; James R. Glass
arXiv: Computation and Language | 2016
Wei-Ning Hsu; Yu Zhang; James R. Glass