Xiabing Zhou | Researchain

Publication

Featured research published by Xiabing Zhou.

IEEE Transactions on Parallel and Distributed Systems | 2015

Influence Maximization on Large-Scale Mobile Social Network: A Divide-and-Conquer Method

Guojie Song; Xiabing Zhou; Yu Wang; Kunqing Xie

With the proliferation of mobile devices and wireless technologies, mobile social network systems are increasingly available. A mobile social network plays an essential role in spreading information and influence in the form of “word-of-mouth”. A fundamental issue is to find a subset of influential individuals in a mobile social network such that targeting them initially (e.g., to adopt a new product) will maximize the spread of the influence (further adoptions of the new product). Unfortunately, the problem of finding the most influential nodes is NP-hard. It has been shown that a greedy algorithm with provable approximation guarantees can give a good approximation; however, it is computationally expensive, if not prohibitive, to run the greedy algorithm on a large mobile social network. In this paper, a divide-and-conquer strategy with a parallel computing mechanism is adopted. We first propose an algorithm called the Community-based Greedy algorithm for mining top-K influential nodes. It encompasses two components: dividing the large-scale mobile social network into several communities by taking into account information diffusion, and selecting communities in which to find influential nodes by dynamic programming. Then, to further improve the performance, we parallelize the influence propagation based on communities and consider the influence propagation crossing communities. We also give a precision analysis to show the approximation guarantees of our models. Experiments on real large-scale mobile social networks show that the proposed methods are much faster than previous algorithms while maintaining high accuracy.
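For orientation, the plain greedy baseline that the abstract describes as too expensive on large networks can be sketched as follows. This is a minimal illustration, not the paper's Community-based Greedy algorithm: it assumes an independent-cascade diffusion model with a single uniform activation probability, and the names `simulate_spread`, `expected_spread`, and `greedy_seeds` are hypothetical.

```python
import random


def simulate_spread(graph, seeds, prob, rng):
    """Run one independent-cascade simulation; return the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # A newly active node gets one chance to activate each inactive neighbor.
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, prob, runs, rng):
    """Monte Carlo estimate of the expected spread of a seed set."""
    return sum(simulate_spread(graph, seeds, prob, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, prob=0.1, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the largest estimated spread."""
    rng = random.Random(seed)
    nodes = sorted(set(graph) | {v for nbrs in graph.values() for v in nbrs})
    chosen = []
    for _ in range(k):
        best_node, best_value = None, -1.0
        for node in nodes:
            if node in chosen:
                continue
            value = expected_spread(graph, chosen + [node], prob, runs, rng)
            if value > best_value:
                best_node, best_value = node, value
        chosen.append(best_node)
    return chosen
```

Each of the k rounds re-estimates the spread for every remaining candidate, so the cost grows with k, the number of nodes, and the number of simulation runs. That cost is what motivates restricting the search to communities.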

International Symposium on Neural Networks | 2015

Improving deep neural network ensembles using reconstruction error

Wenhao Huang; Haikun Hong; Kaigui Bian; Xiabing Zhou; Guojie Song; Kunqing Xie

Ensemble learning of neural networks is a learning paradigm in which ensembles of several neural networks show improved generalization capabilities that outperform those of single networks. For deep learning of multi-layer neural networks, ensemble learning is still applicable. In addition, characteristics of deep neural networks can provide potential opportunities to improve the performance of traditional neural network ensembles. In this paper, we propose an ensemble criterion for deep neural networks that is based on the reconstruction error, and present two strategies to solve the most important issues in ensemble learning of neural networks: component dataset sampling and output averaging. Component training datasets are selected according to the reconstruction error instead of random bootstrap sampling or re-weighting. Moreover, for each testing instance, we can compute the reconstruction error yielded by the sub-model simultaneously with the output. The reconstruction error is used as the weight in output averaging. From the perspectives of prediction interval and confidence interval, we demonstrate that a smaller reconstruction error can ensure a smaller prediction interval. We also incorporate the well-known structure ensemble approach “Dropout” into the proposed approach to achieve the best performance. We conduct experiments on classification and regression datasets to validate the effectiveness of our approach.
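The output-averaging idea can be illustrated with a short sketch. The abstract does not give the exact weighting formula, so the inverse-error weighting below is an assumption, and `error_weighted_average` is a hypothetical name.

```python
def error_weighted_average(predictions, reconstruction_errors, eps=1e-12):
    """Combine sub-model outputs for one test instance.

    Assumption: each sub-model is weighted by the inverse of its
    reconstruction error, so a smaller error means a larger weight.
    """
    if not predictions or len(predictions) != len(reconstruction_errors):
        raise ValueError("need one reconstruction error per prediction")
    # eps guards against division by zero for a perfect reconstruction.
    inverse = [1.0 / (err + eps) for err in reconstruction_errors]
    total = sum(inverse)
    weights = [w / total for w in inverse]
    combined = sum(w * p for w, p in zip(weights, predictions))
    return combined, weights
```

With equal errors this reduces to the plain average used by conventional ensembles.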

International Conference on Machine Learning and Applications | 2015

Learning Common Metrics for Homogenous Tasks in Traffic Flow Prediction

Haikun Hong; Xiabing Zhou; Wenhao Huang; Xingxing Xing; Fei Chen; Yu Lei; Kaigui Bian; Kunqing Xie

Nearest-neighbor-based nonparametric regression is a classic data-driven method for traffic flow prediction in intelligent transportation systems (ITS). The performance of these models depends heavily on the similarity or distance metric used to search for nearest neighbors. In recent years, metric learning algorithms have been developed to learn distance metrics from data. In real-world transportation applications, multiple forecasting tasks are set, since there are many road sections and detector points in the traffic network. Previous works tend to learn either one global metric used for all tasks or a separate local metric for each task, which may lead to under-fitting or over-fitting, respectively. To balance these two kinds of methods and improve the generalization of learned metrics, we propose a common metric learning algorithm under the intuition that homogenous tasks tend to have similar local metrics. The learned common metrics are then used in common metric KNN (CM-KNN) for traffic flow prediction. Experimental results show that the common metrics learned by our algorithm are reasonable and that the CM-KNN method for traffic flow prediction outperforms other competing methods.
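To make the dependence on the distance metric concrete, here is a minimal k-nearest-neighbor regression sketch. It is not CM-KNN: the per-feature weights stand in for a learned metric (a diagonal metric is assumed purely for illustration), and `weighted_distance` and `knn_predict` are hypothetical names.

```python
import math


def weighted_distance(x, y, weights):
    """Distance under a diagonal metric: each squared feature difference is scaled by a weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def knn_predict(train_x, train_y, query, k, weights=None):
    """Predict the target for `query` as the mean target of its k nearest training points."""
    if weights is None:
        # All-ones weights give the ordinary Euclidean distance.
        weights = [1.0] * len(query)
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: weighted_distance(train_x[i], query, weights),
    )
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)
```

Changing the weights changes which historical samples count as neighbors, and therefore the forecast, which is why the choice of metric matters so much for this family of models.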

International Conference on Intelligent Transportation Systems | 2015

Hybrid Multi-metric K-Nearest Neighbor Regression for Traffic Flow Prediction

Haikun Hong; Wenhao Huang; Xingxing Xing; Xiabing Zhou; Hongyu Lu; Kaigui Bian; Kunqing Xie

Traffic flow prediction is a fundamental component of Intelligent Transportation Systems (ITS). Nearest-neighbor-based nonparametric regression is a classic data-driven method for traffic flow prediction. Modern data collection technologies provide the opportunity to represent various features of this nonlinear complex system, but they also make it challenging to fuse multiple sources of data. Firstly, classic Euclidean-distance-based models for traffic flow prediction, which treat each feature with equal weight, are not effective in a multi-source, high-dimensional feature space. Secondly, traditional handcrafted feature engineering by experts is tedious and error-prone. Thirdly, traffic conditions in real-life situations are too complex to measure with only one distance metric. In this paper, we propose a hybrid multi-metric k-nearest neighbor method (HMMKNN) for traffic flow prediction, which can capture the intrinsic features in the data and reduce the semantic gap between domain knowledge and handcrafted feature engineering. Experimental results demonstrate that multi-source data fusion helps to improve the performance of traffic parameter prediction and that HMMKNN outperforms the traditional Euclidean-based k-NN under various configurations. Furthermore, visualization of the clustering results after feature transformation suggests that the learned metrics are more reasonable.

International Conference on Intelligent Transportation Systems | 2015

Traffic Flow Decomposition and Prediction Based on Robust Principal Component Analysis

Xingxing Xing; Xiabing Zhou; Haikun Hong; Wenhao Huang; Kaigui Bian; Kunqing Xie

Research on traffic data analysis is becoming more widespread and important. One of the key challenges is how to accurately decompose the high-dimensional, noisy observed traffic flow matrix into sub-matrices that correspond to different classes of traffic flow, which builds a foundation for traffic flow prediction, abnormal data detection, and missing data imputation. In traditional research, Principal Component Analysis (PCA) is usually used for traffic matrix analysis. However, the traffic matrix is usually corrupted by large-volume anomalies, so the resulting principal components are significantly skewed from those in the anomaly-free case. In this paper, we introduce Robust Principal Component Analysis (robust PCA) for the decomposition. It can mine more accurate and robust underlying temporal and spatial characteristics of traffic flow under all kinds of fluctuations. We performed a comparative experimental analysis of robust PCA against the PCA-based method on a real-life dataset and obtained better decomposition performance. On this dataset, the results show that with robust PCA most of the large-volume anomalies are short-lived and well isolated in the residual traffic matrix, whereas PCA fails to isolate them. In traffic flow prediction experiments based on the decomposition, the results based on robust PCA outperform those based on PCA and on the simple average. This provides adequate evidence that robust PCA is more appropriate for traffic flow matrix analysis. Robust PCA shows promising ability to improve the accuracy and reliability of traffic flow analysis.

Web-Age Information Management | 2015

Mining Dependencies Considering Time Lag in Spatio-Temporal Traffic Data

Xiabing Zhou; Haikun Hong; Xingxing Xing; Wenhao Huang; Kaigui Bian; Kunqing Xie

Learning dependency structure is meaningful for characterizing causal or statistical relationships. Traditional dependency learning algorithms use only data with the same time stamp across variables. However, in many real-world applications, such as traffic systems and climate, time lag is a key feature of hidden temporal dependencies and plays an essential role in interpreting the cause of discovered temporal dependencies. In this paper, we propose a method for mining dependencies by considering the time lag. The proposed approach is based on a decomposition of the coefficients into products of two-level hierarchical coefficients, where one level represents the feature level and the other represents the time level. Specifically, we capture the prior information of time lag in spatio-temporal traffic data. We construct a probabilistic formulation by applying probabilistic priors to these hierarchical coefficients, and devise an expectation-maximization (EM) algorithm to learn the model parameters. We evaluate our model on both synthetic and real-world highway traffic datasets. Experimental results show the effectiveness of our method.
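Why same-time-stamp data can miss a dependency is easy to show with a toy example. The sketch below is not the paper's hierarchical EM model; it swaps in a plain lagged Pearson correlation as the simplest possible illustration, and `pearson` and `best_lag` are hypothetical names.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_lag(source, target, max_lag):
    """Correlate source[t] with target[t + lag] for each lag; return the strongest lag and all scores."""
    scores = {}
    for lag in range(max_lag + 1):
        xs = source[: len(source) - lag] if lag else source
        ys = target[lag:]
        scores[lag] = pearson(xs, ys)
    strongest = max(scores, key=lambda lag: abs(scores[lag]))
    return strongest, scores
```

If a downstream series simply repeats an upstream one two steps later, the lag-0 correlation understates the relationship, while the lag-2 correlation recovers it fully.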

Fuzzy Systems and Knowledge Discovery | 2015

Short-term traffic flow forecasting: Multi-metric KNN with related station discovery

Haikun Hong; Wenhao Huang; Xiabing Zhou; Sizhen Du; Kaigui Bian; Kunqing Xie

Nonparametric regression is a classic method for short-term traffic flow forecasting in Intelligent Transportation Systems (ITS). Feature space construction and distance metric selection are two important parts of nonparametric regression. Few previous works have taken both aspects into account together. In addition, how to use information from related stations at network scale is key to improving the performance of ITS. In this paper, we propose a novel three-stage framework based on KNN to handle the issues above for short-term traffic flow forecasting. In the first stage, the related origin stations and destination stations of the target task are discovered from the whole traffic network. Then, in the second stage, a particular distance metric is learned for each target task. Finally, an extended multi-metric k-nearest neighbor regression model is built in the third stage. Experimental results on a real-world traffic dataset show that our multi-metric KNN model with Lasso outperforms the traditional KNN model and that the feature construction method is effective.

Neurocomputing | 2017

Discovering spatio-temporal dependencies based on time-lag in intelligent transportation data

Xiabing Zhou; Haikun Hong; Xingxing Xing; Kaigui Bian; Kunqing Xie; Mingliang Xu

Learning spatio-temporal dependency structure is meaningful for characterizing causal or statistical relationships. In many real-world applications, dependency structure is often characterized by the time lag between variables. For example, in traffic systems and climate, time lag is a key feature of hidden temporal dependencies and plays an essential role in interpreting the cause of discovered temporal dependencies. However, traditional dependency learning algorithms use only data with the same time stamp across variables. In this paper, we propose a method for mining dependencies by considering the time lag. The proposed approach is based on a decomposition of the coefficients into products of two-level hierarchical coefficients, where one level represents the feature level and the other represents the time level. Specifically, we capture the prior information of time lag in intelligent transportation data. We construct a probabilistic formulation by applying probabilistic priors to these hierarchical coefficients, and devise an expectation-maximization (EM) algorithm to learn the model parameters. We evaluate our model on both synthetic and real-world highway traffic datasets. Experimental results show the effectiveness of our method.

International Journal of Pattern Recognition and Artificial Intelligence | 2016

Structure Feature Learning Method for Incomplete Data

Xiabing Zhou; Xingxing Xing; Lei Han; Haikun Hong; Kaigui Bian; Kunqing Xie

Learning with incomplete data remains challenging in many real-world applications, especially when the data is high-dimensional and dynamic. Many imputation-based algorithms have been proposed to handle incomplete data; these algorithms use statistics of the historical information to remedy the missing parts. However, these methods rarely use the structural information existing in the data, which is very helpful for sharing information between the complete entries and the missing ones. For example, in traffic systems, group information and temporal smoothness exist in the data structure. In this paper, we propose to incorporate this structural information and develop a structural feature learning method for learning with incomplete data (SFLIC). The SFLIC model adopts a fused-Lasso-based regularizer and a group-Lasso-style regularizer to enlarge the data sharing along both the temporal smoothness level and the feature group level, filling the gaps where data entries are missing. The SFLIC objective is a nonsmooth function of the model parameters, and we adopt the smoothing proximal gradient (SPG) method to solve it efficiently. We evaluate our model on both synthetic and real-world highway traffic datasets. Experimental results show that our method outperforms the state-of-the-art methods.
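The two regularizers named in the abstract have standard textbook forms, sketched below. How SFLIC combines and weights them is not stated here, so this shows only the generic penalties; `fused_lasso_penalty` and `group_lasso_penalty` are hypothetical names.

```python
import math


def fused_lasso_penalty(coeffs):
    """Sum of absolute differences between consecutive coefficients (rewards temporal smoothness)."""
    return sum(abs(b - a) for a, b in zip(coeffs, coeffs[1:]))


def group_lasso_penalty(coeffs, groups):
    """Sum of Euclidean norms over index groups (rewards switching whole groups on or off)."""
    return sum(math.sqrt(sum(coeffs[i] ** 2 for i in group)) for group in groups)
```

Both penalties involve absolute values or unsquared norms, which is what makes the overall objective nonsmooth and motivates a smoothing-based solver.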

International Symposium on Neural Networks | 2015

Probabilistic dynamic causal model for temporal data

Xiabing Zhou; Wenhao Huang; Ni Zhang; Weisong Hu; Sizhen Du; Guojie Song; Kunqing Xie

Learning temporal causal structures between time series is one of the key tools for analyzing time series data. Most previous works focus on learning static temporal causal relationships. However, in many real-world applications, such as climate and transportation systems, the causal structures vary dramatically over time. In this paper, we propose a probabilistic dynamic causal (PDC) model based on Lasso-Granger to uncover the dynamic temporal dependencies. Specifically, the PDC model infers the varying states of the temporal data and the causal structure of each state in one unified model. We devise an expectation-maximization (EM) algorithm to infer the model parameters. Furthermore, to address the smoothness of state changes between adjacent time points, we extend the PDC model with a regularization term that encourages states at adjacent times to be similar. Though it may slightly decrease the precision on training data, it improves the generalization capability of the model. We conduct experiments on a synthetic dataset as well as two real-world datasets of climate and traffic to evaluate the effectiveness of the PDC model. Experimental results show that the proposed model is effective in discovering the dynamic causal factors of Particulate Matter 2.5 (PM2.5) and traffic spatial causalities.
