Kevin S. Xu
University of Michigan
Publications
Featured research published by Kevin S. Xu.
Data Mining and Knowledge Discovery | 2014
Kevin S. Xu; Mark Kliger; Alfred O. Hero
In many practical applications of clustering, the objects to be clustered evolve over time, and a clustering result is desired at each time step. In such applications, evolutionary clustering typically outperforms traditional static clustering by producing clustering results that reflect long-term trends while being robust to short-term variations. Several evolutionary clustering algorithms have recently been proposed, often by adding a temporal smoothness penalty to the cost function of a static clustering method. In this paper, we introduce a different approach to evolutionary clustering by accurately tracking the time-varying proximities between objects followed by static clustering. We present an evolutionary clustering framework that adaptively estimates the optimal smoothing parameter using shrinkage estimation, a statistical approach that improves a naïve estimate using additional information. The proposed framework can be used to extend a variety of static clustering algorithms, including hierarchical, k-means, and spectral clustering, into evolutionary clustering algorithms. Experiments on synthetic and real data sets indicate that the proposed framework outperforms static clustering and existing evolutionary clustering algorithms in many scenarios.
IEEE Journal of Selected Topics in Signal Processing | 2014
Kevin S. Xu; Alfred O. Hero
Significant efforts have gone into the development of statistical models for analyzing data in the form of networks, such as social networks. Most existing work has focused on modeling static networks, which represent either a single time snapshot or an aggregate view over time. There has been recent interest in statistical modeling of dynamic networks, which are observed at multiple points in time and offer a richer representation of many complex phenomena. In this paper, we present a state-space model for dynamic networks that extends the well-known stochastic blockmodel for static networks to the dynamic setting. We fit the model in a near-optimal manner using an extended Kalman filter (EKF) augmented with a local search. We demonstrate that the EKF-based algorithm performs competitively with a state-of-the-art algorithm based on Markov chain Monte Carlo sampling but is significantly less computationally demanding.
International Conference on Social Computing | 2013
Kevin S. Xu; Alfred O. Hero
Significant efforts have gone into the development of statistical models for analyzing data in the form of networks, such as social networks. Most existing work has focused on modeling static networks, which represent either a single time snapshot or an aggregate view over time. There has been recent interest in statistical modeling of dynamic networks, which are observed at multiple points in time and offer a richer representation of many complex phenomena. In this paper, we propose a state-space model for dynamic networks that extends the well-known stochastic blockmodel for static networks to the dynamic setting. We then propose a procedure to fit the model using a modification of the extended Kalman filter augmented with a local search. We apply the procedure to analyze a dynamic social network of email communication.
International Conference on Social Computing | 2011
Kevin S. Xu; Mark Kliger; Alfred O. Hero
The study of communities in social networks has attracted considerable interest from many disciplines. Most studies have focused on static networks, and in doing so, have neglected the temporal dynamics of the networks and communities. This paper considers the problem of tracking communities over time in dynamic social networks. We propose a method for community tracking using an adaptive evolutionary clustering framework. We apply the method to reveal the temporal evolution of communities in two real data sets. In addition, we obtain a statistic that can be used for identifying change points in the network.
International Conference on Acoustics, Speech, and Signal Processing | 2010
Kevin S. Xu; Mark Kliger; Alfred O. Hero
Many practical applications of clustering involve data collected over time. In these applications, evolutionary clustering can be applied to the data to track changes in clusters with time. In this paper, we consider an evolutionary version of spectral clustering that applies a forgetting factor to past affinities between data points and aggregates them with current affinities. We propose to use an adaptive forgetting factor and provide a method to automatically choose this forgetting factor at each time step. We evaluate the performance of the proposed method through experiments on synthetic and real data and find that, with an adaptive forgetting factor, we are able to obtain improved clustering performance compared to a fixed forgetting factor.
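The aggregation of past and current affinities described above can be sketched as follows. This is a minimal illustration, assuming the aggregation is a convex combination weighted by a fixed forgetting factor; the function name `smooth_affinities` and the parameter `alpha` are hypothetical, and the adaptive choice of the forgetting factor proposed in the paper is not reproduced here.

```python
def smooth_affinities(affinities, alpha):
    """Recursively blend each affinity matrix with the smoothed past.

    affinities: list of square matrices (lists of lists), one per time step.
    alpha: fixed forgetting factor in [0, 1] (an assumption of this sketch);
           alpha = 0 ignores the past, alpha = 1 ignores new data.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    smoothed = []
    previous = None
    for current in affinities:
        if previous is None:
            # The first time step has no history, so it is used as-is.
            blended = [row[:] for row in current]
        else:
            # Convex combination of the smoothed past and the current affinities.
            blended = [
                [alpha * p + (1.0 - alpha) * c for p, c in zip(prev_row, cur_row)]
                for prev_row, cur_row in zip(previous, current)
            ]
        smoothed.append(blended)
        previous = blended
    return smoothed
```

Each smoothed matrix could then be passed to any static spectral clustering routine in place of the raw affinities for that time step.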
Data Mining and Knowledge Discovery | 2013
Kevin S. Xu; Mark Kliger; Alfred O. Hero
Many real-world networks, including social and information networks, are dynamic structures that evolve over time. Such dynamic networks are typically visualized using a sequence of static graph layouts. In addition to providing a visual representation of the network structure at each time step, the sequence should preserve the mental map between layouts of consecutive time steps to allow a human to interpret the temporal evolution of the network. In this paper, we propose a framework for dynamic network visualization in the on-line setting where only present and past graph snapshots are available to create the present layout. The proposed framework creates regularized graph layouts by augmenting the cost function of a static graph layout algorithm with a grouping penalty, which discourages nodes from deviating too far from other nodes belonging to the same group, and a temporal penalty, which discourages large node movements between consecutive time steps. The penalties increase the stability of the layout sequence, thus preserving the mental map. We introduce two dynamic layout algorithms within the proposed framework, namely dynamic multidimensional scaling and dynamic graph Laplacian layout. We apply these algorithms on several data sets to illustrate the importance of both grouping and temporal regularization for producing interpretable visualizations of dynamic networks.
IEEE Signal Processing Workshop on Statistical Signal Processing | 2011
Kevin S. Xu; Mark Kliger; Alfred O. Hero
The analysis of network data is of interest to many disciplines, ranging from sociology to computer science. Recent interest has shifted from static networks to dynamic networks, which evolve over time. A fundamental problem in the analysis of dynamic networks is tracking long-term trends, which are obscured by short-term variations. In this paper, we propose a method for minimum mean-squared error tracking of dynamic networks using a recursive shrinkage estimation framework that accounts for the spatial correlation in the network. Unlike model-based tracking methods such as the Kalman filter, the proposed method does not require knowledge about the network dynamics. We demonstrate that the proposed method is able to track dynamic networks effectively through experiments on simulated and real networks.
SIAM International Conference on Data Mining | 2016
Yan Li; Kevin S. Xu; Chandan K. Reddy
Survival analysis aims to predict the occurrence of specific events of interest at future time points. The presence of incomplete observations due to censoring brings unique challenges in this domain and differentiates survival analysis techniques from other standard regression methods. In many applications where the distribution of the survival times can be explicitly modeled, parametric survival regression is a better alternative to the commonly used Cox proportional hazards model for this problem of censored regression. However, parametric survival regression suffers from model overfitting in high-dimensional scenarios. In this paper, we propose a unified model for regularized parametric survival regression for an arbitrary survival distribution. We employ a generalized linear model to approximate the negative log-likelihood and use the elastic net as a sparsity-inducing penalty to effectively deal with high-dimensional data. The proposed model is then formulated as a penalized iteratively reweighted least squares problem and solved using a cyclical coordinate descent-based method. We demonstrate the performance of our proposed model on various high-dimensional real-world microarray gene expression benchmark datasets. Our experimental results indicate that the proposed model produces more accurate estimates compared to the other competing state-of-the-art methods.
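The inner step of such a scheme, an elastic-net-penalized weighted least squares problem solved by cyclical coordinate descent, can be sketched as below. This is not the authors' implementation: it assumes the common objective (1/2n) Σ w_i (y_i − x_i·β)² + λ[α‖β‖₁ + (1 − α)/2 ‖β‖₂²], omits the intercept, and does not include the outer loop that recomputes the weights and working responses from a survival likelihood. The names `elastic_net_wls`, `lam`, and `alpha` are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_wls(X, y, weights, lam, alpha, n_sweeps=100):
    """Cyclical coordinate descent for elastic-net weighted least squares.

    X: list of rows (lists of floats); y: responses; weights: per-sample weights.
    lam: overall penalty strength; alpha: L1/L2 mix (1 = lasso, 0 = ridge).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0    # weighted correlation of feature j with the partial residual
            denom = 0.0  # weighted second moment of feature j
            for i in range(n):
                # Prediction using every coefficient except the j-th.
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += weights[i] * X[i][j] * (y[i] - partial)
                denom += weights[i] * X[i][j] ** 2
            rho /= n
            denom = denom / n + lam * (1.0 - alpha)
            # A feature with zero weighted variance and no ridge term stays at 0.
            beta[j] = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
    return beta
```

The soft-thresholding step is what sets small coefficients exactly to zero, which is the source of the sparsity mentioned in the abstract.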
IEEE International Conference on Cloud Computing Technology and Science | 2016
Ruthwik R. Junuthula; Kevin S. Xu; Vijay Devabhaktuni
The task of predicting future relationships in a social network, known as link prediction, has been studied extensively in the literature. Many link prediction methods have been proposed, ranging from common neighbors to probabilistic models. Recent work by Yang et al. has highlighted several challenges in evaluating link prediction accuracy. In dynamic networks where edges are both added and removed over time, the link prediction problem is more complex and involves predicting both newly added and newly removed edges. This results in new challenges in the evaluation of dynamic link prediction methods, and the recommendations provided by Yang et al. are no longer applicable, because they do not address edge removal. In this paper, we investigate several metrics currently used for evaluating accuracies of dynamic link prediction methods and demonstrate why they can be misleading in many cases. We provide several recommendations on evaluating dynamic link prediction accuracy, including separation into two categories of evaluation. Finally, we propose a unified metric to characterize link prediction accuracy effectively using a single number.
IEEE Transactions on Neural Networks | 2016
Ko Jen Hsiao; Kevin S. Xu; Jeff Calder; Alfred O. Hero
We consider the problem of identifying patterns in a data set that exhibit anomalous behavior, often referred to as anomaly detection. Similarity-based anomaly detection algorithms detect abnormally large amounts of similarity or dissimilarity, e.g., as measured by the nearest neighbor Euclidean distances between a test sample and the training samples. In many application domains, there may not exist a single dissimilarity measure that captures all possible anomalous patterns. In such cases, multiple dissimilarity measures can be defined, including nonmetric measures, and one can test for anomalies by scalarizing using a nonnegative linear combination of them. If the relative importance of the different dissimilarity measures is not known in advance, as in many anomaly detection applications, the anomaly detection algorithm may need to be executed multiple times with different choices of weights in the linear combination. In this paper, we propose a method for similarity-based anomaly detection using a novel multicriteria dissimilarity measure, the Pareto depth. The proposed Pareto depth analysis (PDA) anomaly detection algorithm uses the concept of Pareto optimality to detect anomalies under multiple criteria without having to run an algorithm multiple times with different choices of weights. The proposed PDA approach is provably better than using linear combinations of the criteria, and shows superior performance on experiments with synthetic and real data sets.
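The notion of depth under Pareto optimality can be illustrated with a small non-dominated sorting routine. This is a sketch under assumptions, not the PDA algorithm itself: it treats each input as a vector of dissimilarities to be minimized and assigns it the index of the Pareto front on which it lies after successively peeling off non-dominated points. The function names are hypothetical, and how depths are turned into an anomaly score is left out.

```python
def dominates(a, b):
    """True if a is no worse than b in every criterion and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_depths(points):
    """Return, for each point, the 1-based index of the Pareto front it lies on.

    points: list of equal-length tuples of dissimilarities (smaller is better).
    """
    depths = [0] * len(points)
    remaining = set(range(len(points)))
    depth = 0
    while remaining:
        depth += 1
        # The current front holds every remaining point that nothing else dominates.
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            depths[i] = depth
        remaining -= front
    return depths
```

For example, `pareto_depths([(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)])` returns `[1, 1, 1, 2, 3]`: the first three points are mutually non-dominated, `(3, 3)` is dominated only by `(2, 2)`, and `(5, 5)` is dominated by every other point. No weighting of the two criteria is needed to obtain this ordering.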