Beilun Wang
University of Virginia
Publication
Featured research published by Beilun Wang.
pacific symposium on biocomputing | 2017
Jack Lanchantin; Ritambhara Singh; Beilun Wang; Yanjun Qi
Deep neural network (DNN) models have recently obtained state-of-the-art prediction accuracy for the transcription factor binding site (TFBS) classification task. However, it remains unclear how these approaches identify meaningful DNA sequence signals and provide insight into why TFs bind to certain locations. In this paper, we propose a toolkit called the Deep Motif Dashboard (DeMo Dashboard), which provides a suite of visualization strategies to extract motifs, or sequence patterns, from deep neural network models for TFBS classification. We demonstrate how to visualize and understand three important DNN models: convolutional, recurrent, and convolutional-recurrent networks. Our first visualization method finds a test sequence's saliency map, which uses first-order derivatives to describe the importance of each nucleotide in making the final prediction. Second, considering that recurrent models make predictions in a temporal manner (from one end of a TFBS sequence to the other), we introduce temporal output scores, indicating the prediction score of a model over time for a sequential input. Lastly, a class-specific visualization strategy finds the optimal input sequence for a given TFBS positive class via stochastic gradient optimization. Our experimental results indicate that a convolutional-recurrent architecture performs the best among the three architectures. The visualization techniques indicate that the CNN-RNN makes predictions by modeling both motifs and the dependencies among them.
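The following is a minimal sketch of the saliency-map idea described above, assuming a toy PyTorch convolutional classifier and a short illustrative DNA sequence: each nucleotide's importance is taken as the first-order derivative of the model's output with respect to the one-hot encoded input. It illustrates the technique only and is not the authors' DeMo Dashboard implementation.

```python
# Minimal saliency-map sketch (illustrative model, not the DeMo Dashboard code).
import torch
import torch.nn as nn

# Toy convolutional TFBS classifier: input shape (batch, 4 channels, sequence length).
model = nn.Sequential(
    nn.Conv1d(4, 16, kernel_size=8, padding=4),
    nn.ReLU(),
    nn.AdaptiveMaxPool1d(1),
    nn.Flatten(),
    nn.Linear(16, 1),  # single logit: binding score before the sigmoid
)

def one_hot(seq):
    """One-hot encode a DNA string into a (4, length) tensor (rows: A, C, G, T)."""
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = torch.zeros(4, len(seq))
    for j, base in enumerate(seq):
        x[idx[base], j] = 1.0
    return x

x = one_hot("ACGTGCATTACGGATCACGT").unsqueeze(0)  # shape (1, 4, 20), illustrative sequence
x.requires_grad_(True)

logit = model(x).squeeze()
logit.backward()  # first-order derivative of the prediction w.r.t. the input

# Per-position importance: gradient at the observed nucleotide (gradient * input).
saliency = (x.grad * x).sum(dim=1).squeeze(0)
print(saliency)  # one importance score per position in the sequence
```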
european conference on machine learning | 2017
Ritambhara Singh; Arshdeep Sekhon; Kamran Kowsari; Jack Lanchantin; Beilun Wang; Yanjun Qi
String Kernel (SK) techniques, especially those using gapped k-mers as features (gk), have obtained great success in classifying sequences like DNA, protein, and text. However, the state-of-the-art gk-SK runs extremely slowly when we increase the dictionary size (\Sigma) or allow more mismatches (M). This is because the current gk-SK uses a trie-based algorithm to calculate the co-occurrence of mismatched substrings, resulting in a time cost proportional to O(\Sigma^{M}). We propose a fast algorithm for calculating the Gapped k-mer string Kernel using Counting (GaKCo).
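As an illustration of the gapped k-mer feature representation mentioned above, the following minimal sketch (plain Python, not the proposed fast algorithm) enumerates, for every length-g window, the ways of keeping k informative positions, counts the resulting gapped k-mer features, and computes the kernel value as a dot product of count vectors; g = 4 and k = 3 are illustrative choices. This brute-force enumeration is exactly the cost that trie-based and counting-based algorithms aim to reduce.

```python
# Brute-force gapped k-mer kernel sketch (illustrative only; g and k are assumptions).
from collections import Counter
from itertools import combinations

def gapped_kmer_counts(seq, g=4, k=3):
    """Count gapped k-mer features: length-g windows with k kept positions, rest gapped."""
    counts = Counter()
    for i in range(len(seq) - g + 1):
        window = seq[i:i + g]
        for kept in combinations(range(g), k):
            feature = "".join(window[j] if j in kept else "_" for j in range(g))
            counts[feature] += 1
    return counts

def gapped_kmer_kernel(s1, s2, g=4, k=3):
    """Kernel value = dot product of the two sequences' gapped k-mer count vectors."""
    c1, c2 = gapped_kmer_counts(s1, g, k), gapped_kmer_counts(s2, g, k)
    return sum(c1[f] * c2[f] for f in c1 if f in c2)

print(gapped_kmer_kernel("ACGTACGTAC", "ACGTTTGTAC"))
```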
Machine Learning | 2017
Beilun Wang; Ritambhara Singh; Yanjun Qi
arXiv: Learning | 2017
Beilun Wang; Ji Gao; Yanjun Qi
Archive | 2016
Beilun Wang; Ji Gao; Yanjun Qi
arXiv: Learning | 2017
Ji Gao; Beilun Wang; Zeming Lin; Weilin Xu; Yanjun Qi
arXiv: Learning | 2017
Ji Gao; Beilun Wang; Yanjun Qi
international conference on machine learning | 2018
Beilun Wang; Arshdeep Sekhon; Yanjun Qi
international conference on artificial intelligence and statistics | 2018
Beilun Wang; Arshdeep Sekhon; Yanjun Qi
international conference on artificial intelligence and statistics | 2017
Beilun Wang; Ji Gao; Yanjun Qi