Guangxu Xun
University at Buffalo
Publication
Featured research published by Guangxu Xun.
bioinformatics and biomedicine | 2015
Guangxu Xun; Xiaowei Jia; Aidong Zhang
Epileptic seizures are a serious health problem worldwide, and a large population suffers from them every year. If an algorithm could automatically detect seizures and deliver therapy to the patient or notify the hospital, that would be of great assistance. Analyzing the scalp EEG is the most common way to detect the onset of a seizure. In this paper, we propose the context-learning based EEG analysis for seizure detection (Context-EEG) algorithm. The proposed method aims at extracting both the hidden inherent features within EEG fragments and the temporal features from EEG contexts. First, we segment the EEG signals into EEG fragments of fixed length. Second, we learn the hidden inherent features from each fragment and reduce the dimensionality of the original data. Third, we translate each EEG fragment to an EEG word so that the EEG context can provide us with temporal information. Finally, we concatenate the hidden features and the temporal features to train a binary classifier. The experimental results show that the proposed model is highly effective in detecting seizures.
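The first step in the abstract above, cutting a signal into fixed-length fragments, can be illustrated with a minimal sketch. The function name `segment_signal`, the non-overlapping layout, and the choice to drop a trailing partial fragment are assumptions made for illustration; the abstract does not specify these details.

```python
from typing import List, Sequence


def segment_signal(signal: Sequence[float], fragment_len: int) -> List[List[float]]:
    """Split a 1-D signal into consecutive, non-overlapping fragments.

    Assumption: trailing samples that do not fill a whole fragment are dropped.
    """
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    n_fragments = len(signal) // fragment_len
    return [
        list(signal[i * fragment_len:(i + 1) * fragment_len])
        for i in range(n_fragments)
    ]


# Toy single-channel "EEG" trace with 10 samples, cut into fragments of 4.
toy_signal = [0.1 * i for i in range(10)]
fragments = segment_signal(toy_signal, fragment_len=4)
print(len(fragments), [len(f) for f in fragments])
```

Each fragment would then be passed to the feature-learning and word-translation steps, which are not sketched here because the abstract does not describe them in enough detail.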
international joint conference on artificial intelligence | 2017
Guangxu Xun; Yaliang Li; Wayne Xin Zhao; Jing Gao; Aidong Zhang
Conventional correlated topic models are able to capture correlation structure among latent topics by replacing the Dirichlet prior with the logistic normal distribution. Word embeddings have been proven to be able to capture semantic regularities in language. Therefore, the semantic relatedness and correlations between words can be directly calculated in the word embedding space, for example, via cosine values. In this paper, we propose a novel correlated topic model using word embeddings. The proposed model enables us to exploit the additional word-level correlation information in word embeddings and directly model topic correlation in the continuous word embedding space. In the model, words in documents are replaced with meaningful word embeddings, topics are modeled as multivariate Gaussian distributions over the word embeddings and topic correlations are learned among the continuous Gaussian topics. A Gibbs sampling solution with data augmentation is given to perform inference. We evaluate our model on the 20 Newsgroups dataset and the Reuters-21578 dataset qualitatively and quantitatively. The experimental results show the effectiveness of our proposed model.
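The abstract above mentions that relatedness between words can be computed in the embedding space via cosine values. A minimal sketch of that calculation follows; the three toy vectors are made up for illustration and do not come from any trained embedding model.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional "embeddings", chosen only to show the behavior.
vec_a = [1.0, 2.0, 0.5]
vec_b = [2.0, 4.0, 1.0]   # same direction as vec_a
vec_c = [-2.0, 1.0, 0.0]  # orthogonal to vec_a

print(cosine_similarity(vec_a, vec_b))
print(cosine_similarity(vec_a, vec_c))
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0, regardless of their lengths.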
advances in social networks analysis and mining | 2016
Xiaowei Jia; Xiaoyi Li; Kang Li; Vishrawas Gopalakrishnan; Guangxu Xun; Aidong Zhang
The development of social networks has not only improved the online experience, but also stimulated the advances in knowledge mining so as to assist people in planning their offline social events. Users can explore their favorite events, such as celebrations and symposiums, through the pictures and the posts from their friends on social networks. An effective event recommendation can offer great convenience for both event organizers and participants, which yet remains extremely challenging due to a wide range of practical concerns. In this paper we propose a novel recommendation framework, which combines the information from multiple sources and establishes a connection between the online knowledge and the event participation.
international conference on bioinformatics | 2017
Ye Yuan; Guangxu Xun; Kebin Jia; Aidong Zhang
With the advances in pervasive sensor technologies, physiological signals can be captured continuously to prevent the serious outcomes caused by epilepsy. Detection of epileptic seizure onset on collected multi-channel electroencephalogram (EEG) has attracted a lot of attention recently. Deep learning is a promising method to analyze large-scale unlabeled data. In this paper, we propose a multi-view deep learning model to capture brain abnormality from multi-channel epileptic EEG signals for seizure detection. Specifically, we first generate EEG spectrograms using short-time Fourier transform (STFT) to represent the time-frequency information after signal segmentation. Second, we adopt stacked sparse denoising autoencoders (SSDA) to learn multiple features in an unsupervised manner by considering both intra and inter correlation of EEG channels, denoted as intra-channel and cross-channel features, respectively. Third, we add an SSDA-based channel selection procedure using the proposed response rate to reduce the dimension of the intra-channel features. Finally, we concatenate the learned multi-features and apply a fully-connected SSDA model with a softmax classifier to jointly learn the cross-patient seizure detector in a supervised fashion. To evaluate the performance of the proposed model, we carry out experiments on a real-world benchmark EEG dataset and compare it with six baselines. Extensive experimental results demonstrate that the proposed learning model is able to extract latent features with meaningful interpretation, and hence is effective in detecting epileptic seizures.
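The spectrogram step in the abstract above relies on the short-time Fourier transform. The sketch below computes a magnitude spectrogram with a naive DFT in pure Python so that it has no dependencies. The function name `stft_magnitude`, the Hann window, and the window and hop sizes are assumptions for illustration; the abstract does not state the parameters the authors used, and a real pipeline would normally use an FFT routine.

```python
import cmath
import math
from typing import List, Sequence


def stft_magnitude(signal: Sequence[float], window_len: int, hop: int) -> List[List[float]]:
    """Magnitude spectrogram: one row per frame, window_len // 2 + 1 bins per row."""
    if window_len <= 0 or hop <= 0:
        raise ValueError("window_len and hop must be positive")
    # Periodic Hann window to reduce spectral leakage at frame edges.
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_len) for n in range(window_len)]
    frames = []
    for start in range(0, len(signal) - window_len + 1, hop):
        windowed = [signal[start + n] * hann[n] for n in range(window_len)]
        row = []
        for k in range(window_len // 2 + 1):
            # Naive DFT coefficient for frequency bin k (O(N^2), fine for a demo).
            coeff = sum(
                windowed[n] * cmath.exp(-2j * math.pi * k * n / window_len)
                for n in range(window_len)
            )
            row.append(abs(coeff))
        frames.append(row)
    return frames


# Toy signal: a sine with exactly 4 cycles per 32-sample window.
toy = [math.sin(2 * math.pi * 4 * n / 32) for n in range(64)]
spec = stft_magnitude(toy, window_len=32, hop=16)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

For a multi-channel recording, the same function would be applied to each channel separately, giving one time-frequency image per channel.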
BMC Medical Informatics and Decision Making | 2016
Guangxu Xun; Xiaowei Jia; Aidong Zhang
Background: Epileptic seizures are a serious health problem worldwide, and a large population suffers from them every year. If an algorithm could automatically detect seizures and deliver therapy to the patient or notify the hospital, that would be of great assistance. Analyzing the scalp electroencephalogram (EEG) is the most common way to detect the onset of a seizure. In this paper, we propose the context-learning based EEG analysis for seizure detection (Context-EEG) algorithm. Methods: The proposed method aims at extracting both the hidden inherent features within EEG fragments and the temporal features from EEG contexts. First, we segment the EEG signals into EEG fragments of fixed length. Second, we learn the hidden inherent features from each fragment with a sparse auto-encoder, which also reduces the dimensionality of the original data. Third, we translate each EEG fragment to an EEG word so that a continuous EEG signal is converted to a sequence of EEG words. Fourth, by analyzing the context information of EEG words, we learn the temporal features for EEG signals. Finally, we concatenate the hidden features and the temporal features to train a binary classifier which can be used to detect the onset of an epileptic seizure. Results: A total of 4302 EEG fragments from four different patients are used to train and test our model. An error rate of 22.93% is achieved by our model as a general, non-patient-specific seizure detector. The error rate of our model is on average 16.7% lower than that of the other baseline models. Receiver operating characteristic (ROC) curves and area under the curve (AUC) confirm the effectiveness of our model. Furthermore, we discuss the extracted features and how to reconstruct the original data based on the extracted features, as well as the parameter sensitivity. Conclusions: Given an EEG fragment, by extracting high-quality features (the hidden inherent features and temporal features) from the EEG signals, our Context-EEG model is able to detect the onset of a seizure with high accuracy in real time.
bioinformatics and biomedicine | 2015
Guangxu Xun; Xiaoyi Li; Marc R. Knecht; Paras N. Prasad; Mark T. Swihart; Tiffany R. Walsh; Aidong Zhang
There is a growing interest in identifying inorganic material affinity classes for peptide sequences due to the development of bionanotechnology and its wide applications. In particular, a selective model capable of learning cross-material affinity patterns can help us design peptide sequences with desired binding selectivity for one inorganic material over another. However, as a newly emerging topic, it poses several distinct challenges that limit the performance of many existing peptide sequence classification algorithms. In this paper, we propose a novel framework to identify affinity classes for peptide sequences across inorganic materials. After enlarging our dataset by simulating peptide sequences, we use a context learning based method to obtain the vector representation of each amino acid and each peptide sequence. By analyzing the structure and affinity class of each peptide sequence, we are able to capture the semantics of amino acids and peptide sequences in a vector space. In the last step, we train our classifier based on these vector features and the heuristic rules. The construction of our models gives us the potential to overcome the challenges of this task, and the empirical results show the effectiveness of our models.
knowledge discovery and data mining | 2017
Guangxu Xun; Yaliang Li; Jing Gao; Aidong Zhang
A text corpus typically contains two types of context information -- global context and local context. Global context carries topical information which can be utilized by topic models to discover topic structures from the text corpus, while local context can train word embeddings to capture semantic regularities reflected in the text corpus. This encourages us to exploit the useful information in both the global and the local context information. In this paper, we propose a unified language model based on matrix factorization techniques which 1) takes the complementary global and local context information into consideration simultaneously, and 2) models topics and learns word embeddings collaboratively. We empirically show that by incorporating both global and local context, this collaborative model can not only significantly improve the performance of topic discovery over the baseline topic models, but also learn better word embeddings than the baseline word embedding models. We also provide qualitative analysis that explains how the cooperation of global and local context information can result in better topic structures and word embeddings.
international conference on data mining | 2016
Guangxu Xun; Vishrawas Gopalakrishnan; Fenglong Ma; Yaliang Li; Jing Gao; Aidong Zhang
Discovering topics in short texts, such as news titles and tweets, has become an important task for many content analysis applications. However, due to the lack of rich context information in short texts, the performance of conventional topic models on short texts is usually unsatisfactory. In this paper, we propose a novel topic model for short text corpora using word embeddings. Continuous space word embeddings, which have been proven effective at capturing regularities in language, are incorporated into our model to provide additional semantics. Thus we model each short document as a Gaussian topic over word embeddings in the vector space. In addition, considering that background words in a short text are usually not semantically related, we introduce a discrete background mode over word types to complement the continuous Gaussian topics. We evaluate our model on news titles from data sources like abcnews, showing that our model is able to extract more coherent topics from short texts compared with the baseline methods and learn better topic representation for each short document.
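To make the idea of a Gaussian topic over word embeddings concrete, the sketch below scores an embedding under several Gaussians and picks the most likely one. It is a deliberate simplification: it assumes diagonal covariances and fixed, hand-picked topic parameters, and it does not reproduce the model or the inference procedure from the paper. All names and numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# A topic is represented here as (mean vector, per-dimension variances).
Topic = Tuple[Sequence[float], Sequence[float]]


def diag_gaussian_log_density(x: Sequence[float], mean: Sequence[float],
                              var: Sequence[float]) -> float:
    """Log density of x under a Gaussian with diagonal covariance."""
    if not (len(x) == len(mean) == len(var)):
        raise ValueError("x, mean and var must have the same length")
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        if vi <= 0:
            raise ValueError("variances must be positive")
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return log_p


def most_likely_topic(embedding: Sequence[float], topics: List[Topic]) -> int:
    """Index of the topic whose Gaussian gives the embedding the highest density."""
    scores = [diag_gaussian_log_density(embedding, m, v) for m, v in topics]
    return max(range(len(scores)), key=scores.__getitem__)


# Two hypothetical topics in a 2-dimensional embedding space.
toy_topics = [([0.0, 0.0], [1.0, 1.0]), ([3.0, 3.0], [1.0, 1.0])]
print(most_likely_topic([2.5, 2.9], toy_topics))
```

Words whose embeddings lie close together in the vector space receive high density under the same Gaussian, which is what lets a continuous topic group semantically related words.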
Bioinformatics | 2018
Vishrawas Gopalakrishnan; Kishlay Jha; Guangxu Xun; Hung Q. Ngo; Aidong Zhang
Motivation: The overwhelming amount of research articles in the domain of bio‐medicine might cause important connections to remain unnoticed. Literature Based Discovery is a sub‐field within biomedical text mining that peruses these articles to formulate high‐confidence hypotheses on possible connections between medical concepts. Although many alternate methodologies have been proposed over the last decade, they still suffer from scalability issues. The primary reason, apart from the dense inter‐connections between biological concepts, is the absence of information on the factors that lead to the edge‐formation. In this work, we formulate this problem as a collaborative filtering task and leverage a relatively new concept of word‐vectors to learn and mimic the implicit edge‐formation process. Along with a single‐class classifier, we prune the search‐space of redundant and irrelevant hypotheses to increase the efficiency of the system while maintaining, and in some cases even boosting, the overall accuracy. Results: We show that our proposed framework is able to prune up to 90% of the hypotheses while still retaining high recall in top‐K results. This level of efficiency enables the discovery algorithm to look for higher‐order hypotheses, something that was infeasible until now. Furthermore, the generic formulation allows our approach to be agile to perform both open and closed discovery. We also experimentally validate that the core data‐structures upon which the system bases its decision have a high concordance with the opinion of the experts. This, coupled with the ability to understand the edge formation process, provides us with interpretable results without any manual intervention. Availability and implementation: The relevant JAVA codes are available at: https://github.com/vishrawas/Medline‐Code_v2. Supplementary information: Supplementary data are available at Bioinformatics online.
IEEE Transactions on Knowledge and Data Engineering | 2017
Guangxu Xun; Xiaowei Jia; Vishrawas Gopalakrishnan; Aidong Zhang
Learning semantics based on context information has been studied in many research areas for decades. Context information can not only be used directly as the input data, but can also sometimes serve as auxiliary knowledge to improve existing models. This survey aims at providing a structured and comprehensive overview of the research on context learning. We summarize and group the existing literature into four categories, Explicit Analysis, Implicit Analysis, Neural Network Models, and Composite Models, based on the underlying techniques adopted by them. For each category, we discuss the basic idea and techniques, and also introduce how context information is utilized as the model input or incorporated into the model as auxiliary knowledge to enhance the performance or extend the domain of application. In addition, we discuss the advantages and disadvantages of each model from both the technical and practical points of view.