Rongjing Hu
Lanzhou University
Publication
Featured research published by Rongjing Hu.
ChinaGrid Annual Conference | 2012
Yan Zhang; Ruisheng Zhang; Qiuqiang Chen; Xiaopan Gao; Rongjing Hu; Ying Zhang; Guangcai Liu
Virtual screening involves massive computing tasks in which millions of molecules are docked against a target protein. Such data-intensive science faces the challenge of managing datasets of tens of terabytes, which creates a need for large-scale storage. Efficient querying and transfer of these large-scale datasets are further key requirements during the virtual screening process. A massive data storage solution is therefore expected to improve the efficiency of storing and accessing large numbers of molecules and their docking results, and to facilitate the data preparation and analysis phases of virtual screening. To address these requirements, we propose a novel storage solution for virtual screening based on Hadoop. HBase is used as a distributed database to persist the properties of the molecules and the docking results, and HDFS is used to store the molecule source files. A comparison of system performance is also presented. We conclude that the proposed storage solution can be considered an alternative approach to enabling efficient storage of and access to large-scale molecule and docking-result data in virtual screening research.
International Journal of Modern Physics C | 2017
Fan Yang; Ruisheng Zhang; Zhao Yang; Rongjing Hu; Mengtian Li; Yongna Yuan; Keqin Li
Identifying influential spreaders is crucial for developing strategies to control the spreading process on complex networks. Following the well-known K-Shell (KS) decomposition, several improved measures have been proposed. However, these measures cannot identify the most influential spreaders accurately. In this paper, we define a Local K-Shell Sum (LKSS) as the sum of the K-Shell indices of the neighbors within two hops of a given node. Based on the LKSS, we propose an Extended Local K-Shell Sum (ELKSS) centrality to rank spreaders. The ELKSS of a node is defined as the sum of the LKSS values of its nearest neighbors. Assuming that the spreading process on networks follows the Susceptible-Infectious-Recovered (SIR) model, we perform extensive simulations on a series of real networks to compare the performance of the ELKSS centrality with that of six other measures. The results show that the ELKSS centrality outperforms the six measures both in distinguishing the spreading ability of nodes and in identifying the most influential spreaders accurately.
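The LKSS and ELKSS definitions in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy graph, and the choice to exclude a node from its own two-hop neighborhood are assumptions made for illustration.

```python
from collections import deque

def k_shell(adj):
    """Return the K-Shell index of every node by iterative pruning."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # The next shell starts at the smallest residual degree.
        k = max(k, min(degree[v] for v in remaining))
        queue = deque(v for v in remaining if degree[v] <= k)
        while queue:
            v = queue.popleft()
            if v not in remaining:
                continue  # already pruned via an earlier queue entry
            remaining.discard(v)
            shell[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        queue.append(u)
    return shell

def within_two_hops(adj, v):
    """Neighbors at distance 1 or 2 from v (assumption: v itself is excluded)."""
    first = set(adj[v])
    second = set()
    for u in first:
        second.update(adj[u])
    return (first | second) - {v}

def lkss(adj, shell):
    """Local K-Shell Sum: sum of K-Shell indices within two hops."""
    return {v: sum(shell[u] for u in within_two_hops(adj, v)) for v in adj}

def elkss(adj, lk):
    """Extended LKSS: sum of the LKSS values of the nearest neighbors."""
    return {v: sum(lk[u] for u in adj[v]) for v in adj}

# Toy graph (hypothetical): a triangle a-b-c with a pendant node d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
shell = k_shell(adj)
lk = lkss(adj, shell)
el = elkss(adj, lk)
ranking = sorted(adj, key=el.get, reverse=True)
```

On this toy graph the triangle nodes fall in shell 2 and the pendant node in shell 1, and node c receives the largest ELKSS value, so it heads the ranking.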
Applied Soft Computing | 2017
Jiaxuan Wei; Ruisheng Zhang; Zhixuan Yu; Rongjing Hu; Jianxin Tang; Chun Gui; Yongna Yuan
Feature selection (FS) is an essential component of data mining and machine learning. Much research has been devoted to finding more effective methods that achieve high accuracy with fewer features, and this has become one of the most challenging problems in FS. Some algorithms have proven effective, such as binary particle swarm optimization (BPSO), the genetic algorithm (GA) and the support vector machine (SVM). BPSO is a metaheuristic algorithm that has been applied successfully in many fields, including FS. As a wrapper method for FS, however, BPSO-SVM is prone to premature convergence. In this paper, we present a novel mutation-enhanced BPSO-SVM algorithm for feature selection that adjusts the memory of the local and global optimum (LGO) and increases the particles’ mutation probability, in order to overcome premature convergence and obtain high-quality features. Experimental results on the Sonar, LSVT and DLBCL datasets indicate that the proposed algorithm improves accuracy and reduces the number of selected features compared with existing modified BPSO algorithms and GA.
Current Computer-Aided Drug Design | 2016
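As a rough illustration of the idea described above, the following Python sketch runs a plain binary PSO with an added bit-flip mutation step. It is a generic sketch, not the authors' algorithm: the LGO memory adjustment is not modeled, the SVM wrapper is replaced by a hypothetical `toy_fitness`, and all parameter values are assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mutate(bits, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return [1 - b if rng.random() < p_mut else b for b in bits]

def bpso(fitness, n_features, n_particles=10, n_iter=30,
         w=0.7, c1=1.5, c2=1.5, v_max=4.0, p_mut=0.05, seed=0):
    """Maximize fitness(mask) over binary feature masks."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # clamp velocity
                # Sigmoid transfer: velocity sets the probability of a 1 bit.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            # Mutation step: random bit flips help particles leave local optima.
            pos[i] = mutate(pos[i], p_mut, rng)
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy: features 0 and 1 are
    informative, and every selected feature carries a small cost."""
    return 2 * (mask[0] + mask[1]) - 0.5 * sum(mask)

best_mask, best_fit = bpso(toy_fitness, n_features=8)
```

In a wrapper setting, `toy_fitness` would be replaced by a function that trains a classifier on the selected columns and returns its validation accuracy, possibly with a penalty on the number of features.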
Ruisheng Zhang; Juan Li; Jingjing Lu; Rongjing Hu; Yongna Yuan; Zhili Zhao
Compound selectivity prediction plays an important role in identifying potential compounds that bind to the target of interest with high affinity. However, there is still a shortage of efficient and accurate computational approaches for analyzing and predicting compound selectivity. In this paper, we propose two methods to improve compound selectivity prediction. First, we employ an improved multitask learning method in Neural Networks (NNs), which not only incorporates both activity and selectivity for other targets, but also uses a probabilistic classifier with logistic regression. Second, we further improve compound selectivity prediction by applying multitask learning in Deep Belief Networks (DBNs), which can build a distributed representation model and improve the generalization of the shared tasks. In addition, we assign different weights to the auxiliary tasks that are related to the primary selectivity prediction task. Compared with related work, our methods greatly improve the accuracy of compound selectivity prediction; in particular, multitask learning in DBNs with modified weights achieves the best performance.
Advanced Data Mining and Applications | 2013
Ruisheng Zhang; Guangcai Liu; Rongjing Hu; Jiaxuan Wei; Juan Li
Molecular docking is one of the main techniques in virtual screening. During a molecular docking run, the docking time varies widely from molecule to molecule because of their different chemical structures. This variation can overload certain nodes, thereby reducing the data processing capacity of the whole distributed molecular docking system. A reasonable and efficient data grouping strategy is therefore essential in such a system. In this paper, molecular structural similarity is studied in depth, and a similarity-based data grouping method is proposed. Building on the work in Database Management System for Virtual Screening, the method uses the computational chemistry software Chemistry Development Kit and cluster analysis methods to process the chemical molecule data. Finally, we implement and deploy the data grouping method on the Hadoop distributed platform. The experimental results show that this data grouping method can improve the efficiency of molecular docking.
International Journal of Modern Physics B | 2018
Mengtian Li; Ruisheng Zhang; Rongjing Hu; Fan Yang; Yabing Yao; Yongna Yuan
Identifying influential spreaders is a crucial problem whose solution can help authorities control the spreading process in complex networks. Based on the classical degree centrality (DC), several improved measures have been presented. However, these measures cannot rank spreaders accurately. In this paper, we first calculate the sum of the degrees of the nearest neighbors of a given node. Based on this sum, we propose a novel centrality named clustered local-degree (CLD), which combines the sum with the clustering coefficients of nodes to rank spreaders. Assuming that the spreading process in networks follows the susceptible–infectious–recovered (SIR) model, we perform extensive simulations on a series of real networks to compare the performance of the CLD centrality with that of six other measures. The results show that the CLD centrality performs competitively in distinguishing the spreading ability of nodes, and achieves the best performance in identifying influential spreaders accurately.
Knowledge-Based Systems | 2018
Qidong Liu; Ruisheng Zhang; Rongjing Hu; Guangjing Wang; Zhenghai Wang; Zhili Zhao
Path-based clustering algorithms usually generate clusters by optimizing a criterion function. Most state-of-the-art optimization methods give a solution close to the global optimum. By analyzing the minimax distance, we find that cluster centers have the minimum density in their own clusters. Inspired by this, we propose an improved path-based clustering algorithm (IPC) that mines the cluster centers of the dataset. Because it is difficult to mine these cluster centers directly, IPC finds them by a process of elimination. The algorithm can achieve the global optimum within O(n^2). Experimental results on synthetic datasets show that IPC not only recognizes all kinds of clusters regardless of their shapes, sizes and densities, but is also robust against noise and outliers in the data. More importantly, IPC needs only one parameter (i.e., the number of clusters). Experimental results on real datasets show that IPC outperforms the clustering algorithms it is compared with.
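The minimax distance mentioned above is the quantity that path-based clustering builds on: the distance between two points is the smallest possible value of the largest hop over all paths connecting them. The Python sketch below computes it with a simple Floyd-Warshall-style recurrence. It illustrates only the distance, not the IPC algorithm itself, and this cubic-time version is a deliberately simple assumption rather than the O(n^2) procedure the abstract refers to; the function name and sample points are hypothetical.

```python
import math

def minimax_distances(points):
    """All-pairs minimax path distances on the complete Euclidean graph.

    d[i][j] ends up as the minimum, over all paths from i to j,
    of the largest single hop on that path.
    """
    n = len(points)
    d = [[math.dist(points[i], points[j]) for j in range(n)]
         for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Routing through k costs the larger of the two legs.
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d

# Three closely spaced points and one distant point on a line (hypothetical data).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
d = minimax_distances(points)
```

Here points 0 and 2 are 2 units apart in Euclidean terms, but their minimax distance is 1 because the path through point 1 never takes a hop longer than 1. Reaching point 3 always requires crossing the gap of 8, which is why minimax distances separate elongated or chained clusters well.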
International Journal of Modern Physics C | 2018
Chun Gui; Ruisheng Zhang; Zhili Zhao; Jiaxuan Wei; Rongjing Hu
To deal with the stochasticity of center node selection and the instability of community detection in the label propagation algorithm, this paper proposes an improved algorithm named label propagation algorithm based on community belonging degree (LPA-CBD), which employs the community belonging degree to determine the number and the centers of communities. In LPA-CBD, the initial community is identified from the nodes with the maximum degree and is then optimized or expanded according to the community belonging degree. Once the rough community structure of the network has been obtained, the remaining nodes are labeled using the label propagation algorithm. Experimental results on 10 real-world networks and three synthetic networks show that LPA-CBD achieves a reasonable number of communities, better accuracy and higher modularity than four other prominent algorithms. Moreover, the proposed algorithm not only has lower complexity and higher community detection quality, but also improves the stability of the original label propagation algorithm.
International Journal of Modern Physics C | 2017
Yabing Yao; Ruisheng Zhang; Fan Yang; Yongna Yuan; Qingshuang Sun; Yu Qiu; Rongjing Hu
In complex networks, existing link prediction methods primarily focus on the internal structural information derived from single-layer networks. However, the role of interlayer information is rarely recognized in multiplex networks, which provide more diverse structural features than single-layer networks. In fact, the structural properties and functions of one layer can affect those of other layers in multiplex networks. In this paper, the effect of interlayer structural properties on link prediction performance is investigated in multiplex networks. By utilizing both intralayer and interlayer information, we propose a novel “Node Similarity Index” based on “Layer Relevance” (NSILR) of multiplex networks for link prediction. The performance of the NSILR index is validated on each layer of seven multiplex networks from real-world systems. Experimental results show that the NSILR index can significantly improve prediction performance compared with traditional methods, which consider only the intralayer information. Furthermore, the more relevant the layers are, the greater the performance gain.
International Journal of Modern Physics C | 2017
Yabing Yao; Ruisheng Zhang; Fan Yang; Yongna Yuan; Rongjing Hu; Zhili Zhao
As a significant problem in complex networks, link prediction aims to find missing and future links between unconnected node pairs by estimating the likelihood that a potential link exists. It plays an important role in understanding the evolution mechanism of networks and has broad applications in practice. To improve prediction performance, a variety of structural similarity-based methods that rely on different topological features have been put forward. As one such topological feature, the path information between node pairs is used to calculate node similarity. However, many path-dependent methods neglect the different contributions of the individual paths between a pair of nodes. In this paper, a local weighted path (LWP) index is proposed to differentiate the contributions of paths. To quantify the path weight in the prediction procedure, the LWP index considers the effect of the link degrees of intermediate links and the connectivity influence of intermediate nodes on paths. Experimental results on 12 real-world networks show that the LWP index outperforms seven other prediction baselines.
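For context on path-based similarity, the Python sketch below computes the standard Local Path (LP) index, which scores a node pair by its number of length-2 walks plus a small multiple of its number of length-3 walks. This is an unweighted baseline of the kind the abstract says neglects the different contributions of paths. It is not the LWP index, whose weighting formula is not given here; the graph and the `epsilon` value are assumptions.

```python
def local_path_index(adj, x, y, epsilon=0.01):
    """Local Path similarity: (A^2)_xy + epsilon * (A^3)_xy."""
    # Walks of length 2 correspond to common neighbors of x and y.
    walks2 = len(adj[x] & adj[y])
    # Walks of length 3 have the form x - u - v - y.
    walks3 = sum(1 for u in adj[x] for v in adj[u] if y in adj[v])
    return walks2 + epsilon * walks3

def rank_candidates(adj, epsilon=0.01):
    """Score every unconnected pair and sort by decreasing similarity."""
    nodes = sorted(adj)
    pairs = [(x, y) for i, x in enumerate(nodes) for y in nodes[i + 1:]
             if y not in adj[x]]
    return sorted(pairs, key=lambda p: local_path_index(adj, *p, epsilon),
                  reverse=True)

# Toy graph (hypothetical): a and c share neighbor b and are also
# joined by the longer path a - d - e - c.
adj = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "e"},
    "d": {"a", "e"},
    "e": {"d", "c"},
}
score_ac = local_path_index(adj, "a", "c")
candidates = rank_candidates(adj)
```

Every length-2 walk counts equally in this baseline, as does every length-3 walk. A weighted index in the spirit of the abstract would instead scale each path by properties of its intermediate links and nodes.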