Guyu Hu
University of Science and Technology, Sana'a
Publication
Featured research published by Guyu Hu.
Neurocomputing | 2015
Dong Li; Guyu Hu; Yibing Wang; Zhisong Pan
Machine learning has been widely used in network traffic classification, with statistical features representing flows. However, conventional feature selection may not work well in the face of dynamic and complex traffic data. Multi-task learning has attracted wide attention, and one important form of it exploits the features shared across tasks through sparse models. We propose a fast multi-task sparse feature learning method that uses the non-convex capped-ℓ1,ℓ1 norm as the regularizer to learn a set of shared features in traffic data. Specifically, the non-convex multi-task feature learning model can learn features belonging to each task as well as the common features shared among tasks. We use the iterative shrinkage and thresholding (IST) algorithm to solve the problem, which has a closed-form solution for one of the crucial steps in each iteration. Experiments on real traffic data captured from a backbone network, as well as on synthetic data and other popular real-world data, show the effectiveness of the method compared with state-of-the-art methods.
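The closed-form step mentioned in the abstract can be illustrated with a minimal sketch. Note the substitution: this uses the plain convex ℓ1 penalty on a single task, not the paper's capped-ℓ1,ℓ1 multi-task regularizer, because ℓ1 is the simplest case in which the shrinkage step has a well-known closed form (soft thresholding). The function names `soft_threshold` and `ista` are illustrative and are not taken from the paper.

```python
def soft_threshold(v, t):
    """Closed-form proximal operator of t * |v| (plain l1, not capped-l1,l1)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(X, y, lam, step, iters=500):
    """Minimise 0.5 * ||Xw - y||^2 + lam * ||w||_1 by iterative shrinkage.

    X is a list of rows, y a list of targets. `step` should not exceed
    1 / L, where L is the largest eigenvalue of X^T X.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # Residual r = Xw - y.
        r = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
        # Gradient of the smooth least-squares part: X^T r.
        g = [sum(X[i][j] * r[i] for i in range(n)) for j in range(d)]
        # Gradient step followed by the closed-form shrinkage step.
        w = [soft_threshold(w[j] - step * g[j], step * lam) for j in range(d)]
    return w


# With an identity design matrix the solution is soft_threshold(y, lam).
weights = ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0, step=1.0)
```

In this toy case the small coefficient is driven exactly to zero, which is the sparsity effect that makes shrinkage methods useful for feature selection.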
Mathematical Problems in Engineering | 2016
Zhen Li; Zhisong Pan; Yanyan Zhang; Guopeng Li; Guyu Hu
Community detection is of great importance because it enables us to understand network structure and supports many real-world applications such as recommendation systems. Heterogeneous social networks, which contain multiple social relations and various user-generated content, make the community detection problem more complicated. In particular, social relations and user-generated content are regarded as link information and content information, respectively. Since the two types of information indicate a common community structure from different perspectives, it is better to mine them jointly to improve detection accuracy. Some detection algorithms utilizing both link and content information have been developed. However, most works take the private community structure of a single data source as the common one, and some methods spend extra time transforming the content data into link data instead of mining it directly. In this paper, we propose a framework based on regularized joint nonnegative matrix factorization (RJNMF) that utilizes link and content information jointly to enhance community detection accuracy. In the framework, we develop joint NMF to analyze link and content information simultaneously and introduce regularization to obtain the common community structure directly. Experimental results on real-world datasets show the effectiveness of our method.
Mathematical Problems in Engineering | 2015
Bo Jia; Dong Li; Zhisong Pan; Guyu Hu
Extreme learning machine (ELM) has attracted wide attention due to its faster learning speed compared with conventional models such as support vector machines (SVM) and back-propagation (BP) networks. However, like many other methods, ELM was originally proposed to handle vector patterns, while nonvector patterns in real applications, such as image data, remain to be explored. We propose the two-dimensional extreme learning machine (2DELM), based on the natural idea of dealing with matrix data directly. Unlike the original ELM, which handles vectors, 2DELM takes matrices as input features without vectorization. Empirical studies on several real image datasets show the efficiency and effectiveness of the algorithm.
bioRxiv | 2017
Dong Li; Zexuan Zhu; Zhisong Pan; Guyu Hu; Shan He
Active module identification has received much attention due to its ability to reveal the regulatory and signaling mechanisms of a given cellular response. Most existing algorithms identify active modules by extracting connected nodes with high activity scores from a graph. These algorithms do not consider other topological properties such as community structure, which may correspond to functional units. In this paper, we propose an active module identification algorithm based on a novel objective function that considers both network topology and node activity. This objective is formulated as a constrained quadratic programming problem, which is convex and can be solved by iterative methods. Furthermore, the framework is extended to multilayer dynamic PPI networks. Empirical results on single-layer and multilayer PPI networks show the effectiveness of the proposed algorithms. Availability: The package and code for reproducing all results and figures are available at https://github.com/fairmiracle/ModuleExtraction.
Neurocomputing | 2017
Yajun Yu; Zhisong Pan; Guyu Hu
Identification and classification of graph data is a hot research issue in pattern recognition. Conventional graph classification methods usually convert the graph data to a vector representation, which ignores the sparsity of graph data. In this paper, we propose a new graph classification algorithm based on sparse graph feature selection and extreme learning machine. The key to our method is using lasso to select sparse features, motivated by the sparsity of the feature space of the graph data; extreme learning machine (ELM) is then applied to the subsequent classification task because of its good performance. Extensive experimental results on a series of benchmark graph datasets validate the effectiveness of the proposed method.
bioRxiv | 2016
Dong Li; James B. Brown; Luisa Orsini; Zhisong Pan; Guyu Hu; Shan He
Gene co-expression network differential analysis is designed to help biologists understand gene expression patterns under different conditions. We have implemented an R package called MODA (Module Differential Analysis) for gene co-expression network differential analysis. Based on transcriptomic data, MODA can be used to estimate and construct condition-specific gene co-expression networks, and to identify differentially expressed subnetworks as conserved or condition-specific modules that are potentially associated with relevant biological processes. The usefulness of the method is demonstrated on synthetic data as well as on Daphnia magna gene expression data under different environmental stresses.
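The idea of condition-specific co-expression networks can be sketched in a few lines. This is not MODA's API (MODA is an R package, and its interface is not described here); it is a hypothetical Python illustration that assumes edges are defined by thresholding the absolute Pearson correlation between expression profiles. The function names, the gene labels, and the cutoff of 0.8 are all assumptions made for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def coexpression_edges(profiles, cutoff):
    """Gene pairs whose absolute correlation is at least `cutoff`."""
    genes = sorted(profiles)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(profiles[g], profiles[h])) >= cutoff:
                edges.add((g, h))
    return edges


def differential_edges(condition_a, condition_b, cutoff=0.8):
    """Edges that appear under exactly one of the two conditions."""
    ea = coexpression_edges(condition_a, cutoff)
    eb = coexpression_edges(condition_b, cutoff)
    return ea - eb, eb - ea


# Toy data: g1 co-varies with g2 under control, but with g3 under stress.
control = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 1, 3, 2]}
stress = {"g1": [1, 2, 3, 4], "g2": [4, 1, 3, 2], "g3": [2, 4, 6, 8]}
only_control, only_stress = differential_edges(control, stress)
```

Edges shared by both networks would correspond to conserved structure, while the two returned sets correspond to condition-specific structure.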
Mathematical Problems in Engineering | 2016
Liangliang Zhang; Longqi Yang; Guyu Hu; Zhisong Pan; Zhen Li
Link prediction is an important task in complex network analysis. Traditional link prediction methods are limited by network topology and a lack of node property information, which makes predicting links challenging. In this study, we address link prediction using a sparse Gaussian graphical model and demonstrate its theoretical and practical effectiveness. In theory, link prediction is executed by estimating the inverse covariance matrix of samples to overcome information limits. The proposed method was evaluated on four small and four large real-world datasets. The experimental results show that, compared to 13 mainstream similarity methods, the area under the curve (AUC) value obtained by the proposed method improved by an average of 3% and 12.5% on the small and large datasets, respectively. The method outperforms the baseline method, and its prediction accuracy is superior to that of mainstream methods even when using only 80% of the training set. The method also provides significantly higher AUC values when using only 60% of the training set on the Dolphin and Taro datasets. Furthermore, the proposed method achieves a lower error rate than mainstream methods on all datasets.
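The AUC figures above depend on how link scores are ranked. As a rough sketch of the evaluation only, the code below computes AUC in the way commonly used for link prediction: the probability that a held-out true link scores higher than a non-existent link, with ties counted as one half. The abstract does not spell out its exact protocol, so this definition is an assumption, and the common-neighbours score stands in as one example of a similarity baseline rather than the sparse Gaussian graphical model itself.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of training edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbours(adj, u, v):
    """Similarity score: number of neighbours shared by u and v."""
    return len(adj[u] & adj[v])


def link_prediction_auc(score, held_out, non_edges):
    """Fraction of (held-out link, non-edge) pairs ranked correctly.

    A tie between the two scores counts as half a correct ranking.
    """
    wins = ties = 0
    for e in held_out:
        for f in non_edges:
            if score(*e) > score(*f):
                wins += 1
            elif score(*e) == score(*f):
                ties += 1
    return (wins + 0.5 * ties) / (len(held_out) * len(non_edges))


# Toy network: training edges, one hidden true link, three non-edges.
train = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
adj = build_adjacency(train)
auc = link_prediction_auc(
    lambda u, v: common_neighbours(adj, u, v),
    held_out=[("a", "d")],
    non_edges=[("a", "e"), ("b", "e"), ("c", "e")],
)
```

Any scoring function with the same two-node signature, including one derived from an estimated inverse covariance matrix, could be passed in place of the common-neighbours score.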
Proceedings of the International Conference on Intelligent Science and Technology | 2018
Zhisong Pan; Guyu Hu; Dong Li
The study of multilayer networks has received much attention in network science in recent years. As a fundamental topic of network analysis, community detection has been intensively studied on single networks, and it is natural to extend it to multilayer networks. In this paper, we propose an aggregation approach to detect communities from multiple networks. This approach simply constructs a consensus graph from multiple networks and then applies existing community detection algorithms to this consensus graph. We show the rationale behind this operation through both theoretical validation and experiments on multilayer networks, including multilayer dynamic protein-protein interaction networks and a set of heterogeneous biological networks.
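The aggregate-then-detect pipeline can be sketched as follows. The abstract does not say how the consensus graph is built, so this example assumes a simple voting rule: an edge is kept if it appears in at least `min_layers` layers. Connected components are used as a deliberately trivial stand-in for "an existing community detection algorithm"; any standard method could be substituted at that step. All names here are illustrative.

```python
from collections import Counter


def consensus_graph(layers, min_layers):
    """Keep undirected edges that occur in at least `min_layers` layers."""
    counts = Counter()
    for edges in layers:
        # Normalise each edge so (u, v) and (v, u) count as the same edge,
        # and count each edge at most once per layer.
        counts.update({tuple(sorted(e)) for e in edges})
    return {e for e, c in counts.items() if c >= min_layers}


def connected_components(nodes, edges):
    """Stand-in community detector: connected components via DFS."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        components.append(sorted(comp))
    return components


# Three layers over the same five nodes.
layers = [
    [(1, 2), (2, 3), (3, 4), (4, 5)],
    [(2, 1), (2, 3), (4, 5)],
    [(1, 2), (1, 3), (4, 5), (3, 4)],
]
consensus = consensus_graph(layers, min_layers=3)
communities = connected_components({1, 2, 3, 4, 5}, consensus)
```

Lowering `min_layers` makes the consensus graph denser and merges communities, so the threshold acts as the main tuning knob in this simplified version.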
BMC Genomics | 2017
Dong Li; Zhisong Pan; Guyu Hu; Zexuan Zhu; Shan He
Background: Active modules are connected regions in a biological network that show significant changes in expression over particular conditions. The identification of such modules is important since it may reveal the regulatory and signaling mechanisms associated with a given cellular response. Results: In this paper, we propose a novel active module identification algorithm based on a memetic algorithm. We propose a novel encoding/decoding scheme to ensure the connectedness of the identified active modules. Based on this scheme, we also design and incorporate a local search operator into the memetic algorithm to improve its performance. Conclusion: The effectiveness of the proposed algorithm is validated on both small and large protein interaction networks.
bioRxiv | 2016
Dong Li; Shan He; Zhisong Pan; Guyu Hu
Motivation: Searching for active connected subgraphs in biological networks has proven important for identifying functional modules. Most existing active module identification methods need both network structural information and gene activity measures, typically requiring prior knowledge databases and high-throughput data. As a purely data-driven gene network, a weighted gene co-expression network (WGCN) can be constructed from expression profiles alone, so searching for modules on a WGCN has potential value. However, traditional clustering-based module detection on a WGCN covers all genes, which unavoidably introduces many uninformative ones when annotating modules; a more accurate subset of them is needed. Results: We propose a fine-grained method to identify active modules on the multi-layer weighted (co-expression gene) network, based on a continuous optimization approach (AMOUNTAIN). Multilayer networks are also handled under the unified framework, as a natural extension of the single-layer case. The effectiveness is validated on both synthetic and real-world data, and the software is provided as a user-friendly R package. Availability: Available at https://github.com/fairmiracle/AMOUNTAIN. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.