
Linhong Zhu

University of Southern California

Publication

Featured research published by Linhong Zhu.

international conference on management of data | 2010

Finding maximal cliques in massive networks by H*-graph

James Cheng; Yiping Ke; Ada Wai-Chee Fu; Jeffrey Xu Yu; Linhong Zhu

Maximal clique enumeration (MCE) is a fundamental problem in graph theory and has important applications in many areas such as social network analysis and bioinformatics. The problem is extensively studied; however, the best existing algorithms require memory space linear in the size of the input graph. This has become a serious concern in view of the massive volume of today's fast-growing network graphs. Since MCE requires random access to different parts of a large graph, it is difficult to divide the graph into smaller parts and process one part at a time, because either the result may be incorrect and incomplete, or merging the results from different parts incurs a huge cost. We propose a novel notion, H*-graph, which defines the core of a network and extends to encompass the neighborhood of the core for MCE computation. We propose the first external-memory algorithm for MCE (ExtMCE) that uses the H*-graph to bound the memory usage. We prove both the correctness and completeness of the result computed by ExtMCE. Extensive experiments verify that ExtMCE efficiently processes large networks that cannot fit in memory. We also show that the H*-graph captures important properties of the network; thus, updating the maximal cliques in the H*-graph retains the most essential information, with a low update cost, when it is infeasible to update the entire network.
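
As a point of reference for the problem this abstract describes, a standard in-memory approach to MCE can be sketched as follows. This is a minimal Bron-Kerbosch enumeration with pivoting, not the ExtMCE algorithm or the H*-graph construction; the function name `maximal_cliques` and the adjacency-dictionary input format are assumptions made for illustration. It holds the whole graph in memory, which is the limitation the abstract sets out to remove.

```python
from typing import Dict, FrozenSet, List, Set


def maximal_cliques(adj: Dict[int, Set[int]]) -> List[FrozenSet[int]]:
    """Enumerate all maximal cliques of an undirected graph (in memory).

    adj maps each vertex to the set of its neighbours (hypothetical format).
    """
    cliques: List[FrozenSet[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        # r: current clique, p: candidates that can extend r,
        # x: vertices already explored (used to reject non-maximal cliques).
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates,
        # so that fewer branches need to be explored.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# A triangle 1-2-3 with a pendant edge 3-4.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
found = set(maximal_cliques(example))
```

On the small example graph, `found` contains the triangle and the pendant edge as the two maximal cliques.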

ACM Transactions on Database Systems | 2011

Finding maximal cliques in massive networks

James Cheng; Yiping Ke; Ada Wai-Chee Fu; Jeffrey Xu Yu; Linhong Zhu

Maximal clique enumeration is a fundamental problem in graph theory and has important applications in many areas such as social network analysis and bioinformatics. The problem is extensively studied; however, the best existing algorithms require memory space linear in the size of the input graph. This has become a serious concern in view of the massive volume of today's fast-growing networks. We propose a general framework for designing external-memory algorithms for maximal clique enumeration in large graphs. The general framework enables maximal clique enumeration to be processed recursively in small subgraphs of the input graph, thus allowing in-memory computation of maximal cliques without the costly random disk access. We prove that the set of cliques obtained by the recursive local computation is both correct (i.e., globally maximal) and complete. The subgraph to be processed each time is defined based on a set of base vertices that can be flexibly chosen to achieve different purposes. We discuss the selection of the base vertices to fully utilize the available memory in order to minimize I/O cost in static graphs, and for update maintenance in dynamic graphs. We also apply our framework to design an external-memory algorithm for maximum clique computation in a large graph.

knowledge discovery and data mining | 2012

Fast algorithms for maximal clique enumeration with limited memory

James Cheng; Linhong Zhu; Yiping Ke; Shumo Chu

Maximal clique enumeration (MCE) is a long-standing problem in graph theory and has numerous important applications. Though extensively studied, most existing algorithms become impractical when the input graph is too large and is disk-resident. We first propose an efficient partition-based algorithm for MCE that addresses the problem of processing large graphs with limited memory. We then further reduce the high cost of CPU computation of MCE by a careful nested partition based on a cost model. Finally, we parallelize our algorithm to further reduce the overall running time. We verified the efficiency of our algorithms by experiments in large real-world graphs.

international conference on management of data | 2014

Tripartite graph clustering for dynamic sentiment analysis on social media

Linhong Zhu; Aram Galstyan; James Cheng; Kristina Lerman

The growing popularity of social media (e.g., Twitter) allows users to easily share information with each other and influence others by expressing their own sentiments on various subjects. In this work, we propose an unsupervised tri-clustering framework, which analyzes both user-level and tweet-level sentiments through co-clustering of a tripartite graph. A compelling feature of the proposed framework is that the quality of sentiment clustering of tweets, users, and features can be mutually improved by joint clustering. We further investigate the evolution of user-level sentiments and latent feature vectors in an online framework and devise an efficient online algorithm to sequentially update the clustering of tweets, users and features with newly arrived data. The online framework not only provides better quality of both dynamic user-level and tweet-level sentiment analysis, but also improves the computational and storage efficiency. We verified the effectiveness and efficiency of the proposed approaches on the November 2012 California ballot Twitter data.

international conference on social computing | 2013

The Role of Social Media in the Discussion of Controversial Topics

Laura M. Smith; Linhong Zhu; Kristina Lerman; Zornitsa Kozareva

In recent years, social media has revolutionized how people communicate and share information. Twitter and other blogging sites have seen an increase in political and social activism. Previous studies on the behaviors of users in politics have focused on electoral candidates and election results. Our paper investigates the role of social media in discussing and debating controversial topics. We apply sentiment analysis techniques to classify the position (for, against, neutral) expressed in a tweet about a controversial topic and use the results in our study of user behavior. Our findings suggest that Twitter is primarily used for spreading information to like-minded people rather than debating issues. Users are quicker to rebroadcast information than to address a communication by another user. Individuals typically take a position on an issue prior to posting about it and are not likely to change their tweeting opinion.

IEEE Transactions on Knowledge and Data Engineering | 2016

Scalable Temporal Latent Space Inference for Link Prediction in Dynamic Social Networks

Linhong Zhu; Dong Guo; Junming Yin; Greg Ver Steeg; Aram Galstyan

We propose a temporal latent space model for link prediction in dynamic social networks, where the goal is to predict links over time based on a sequence of previous graph snapshots. The model assumes that each user lies in an unobserved latent space, and interactions are more likely to occur between similar users in the latent space representation. In addition, the model allows each user to gradually move its position in the latent space as the network structure evolves over time. We present a global optimization algorithm to effectively infer the temporal latent space. Two alternative optimization algorithms with local and incremental updates are also proposed, allowing the model to scale to larger networks without compromising prediction accuracy. Empirically, we demonstrate that our model, when evaluated on a number of real-world dynamic networks, significantly outperforms existing approaches for temporal link prediction in terms of both scalability and predictive power.

We study the temporal link prediction problem, where, given past interactions, our goal is to predict new interactions. We propose a dynamic link prediction method based on nonnegative matrix factorization. This method assumes that interactions are more likely between users that are similar to each other in the latent space representation. We propose a global optimization algorithm to effectively learn the temporal latent space with a quadratic convergence rate and bounded error. In addition, we propose two alternative algorithms with local and incremental updates, which provide much better scalability without deteriorating prediction accuracy. We evaluate our model on a number of real-world dynamic networks and demonstrate that our model significantly outperforms existing approaches for temporal link prediction in terms of both scalability and predictive power.

knowledge discovery and data mining | 2016

Latent Space Model for Road Networks to Predict Time-Varying Traffic

Dingxiong Deng; Cyrus Shahabi; Ugur Demiryurek; Linhong Zhu; Rose Yu; Yan Liu

Real-time traffic prediction from high-fidelity spatiotemporal traffic sensor datasets is an important problem for intelligent transportation systems and sustainability. However, it is challenging due to the complex topological dependencies and high dynamism associated with changing road conditions. In this paper, we propose a Latent Space Model for Road Networks (LSM-RN) to address these challenges holistically. In particular, given a series of road network snapshots, we learn the attributes of vertices in latent spaces which capture both topological and temporal properties. As these latent attributes are time-dependent, they can estimate how traffic patterns form and evolve. In addition, we present an incremental online algorithm which sequentially and adaptively learns the latent attributes from the temporal graph changes. Our framework enables real-time traffic prediction by 1) exploiting real-time sensor readings to adjust/update the existing latent spaces, and 2) training as data arrives and making predictions on-the-fly. By conducting extensive experiments with a large volume of real-world traffic sensor data, we demonstrate the superiority of our framework for real-time traffic prediction on large road networks over competitors as well as baseline graph-based LSMs.

Information Systems | 2011

Structure and attribute index for approximate graph matching in large graphs

Linhong Zhu; Wee Keong Ng; James Cheng

The increasing popularity of graph data in various domains has led to a renewed interest in developing efficient graph matching techniques, especially for processing large graphs. In this paper, we study the problem of approximate graph matching in a large attributed graph. Given a large attributed graph and a query graph, we compute a subgraph of the large graph that best matches the query graph. We propose a novel structure-aware and attribute-aware index to process approximate graph matching in a large attributed graph. We first construct an index on the similarity of the attributed graph, by partitioning the large search space into smaller subgraphs based on structure similarity and attribute similarity. Then, we construct a connectivity-based index to give a concise representation of inter-partition connections. We use the index to find a set of best matching paths. From these best matching paths, we compute the best matching answer graph using a greedy algorithm. Experimental results on real datasets demonstrate the efficiency of both index construction and query processing. We also show that our approach attains high-quality query answers.

advances in social networks analysis and mining | 2013

Graph-based informative-sentence selection for opinion summarization

Linhong Zhu; Sheng Gao; Sinno Jialin Pan; Haizhou Li; Dingxiong Deng; Cyrus Shahabi

In this paper, we propose a new framework for opinion summarization based on sentence selection. Our goal is to assist users to get helpful opinion suggestions from reviews by only reading a short summary with few informative sentences, where the quality of summary is evaluated in terms of both aspect coverage and viewpoints preservation. More specifically, we formulate the informative-sentence selection problem in opinion summarization as a community-leader detection problem, where a community consists of a cluster of sentences towards the same aspect of an entity. The detected leaders of the communities can be considered as the most informative sentences of the corresponding aspect, while informativeness of a sentence is defined by its informativeness within both its community and the document it belongs to. Review data from six product domains from Amazon.com are used to verify the effectiveness of our method for opinion summarization.

Geoinformatica | 2016

Task selection in spatial crowdsourcing from worker's perspective

Dingxiong Deng; Cyrus Shahabi; Ugur Demiryurek; Linhong Zhu

With the progress of mobile devices and wireless broadband, a new eMarket platform, termed spatial crowdsourcing, is emerging, which enables workers (aka the crowd) to perform a set of spatial tasks (i.e., tasks related to a geographical location and time) posted by a requester. In this paper, we study a version of the spatial crowdsourcing problem in which the workers autonomously select their tasks, called the worker selected tasks (WST) mode. Towards this end, given a worker and a set of tasks, each of which is associated with a location and an expiration time, we aim to find a schedule for the worker that maximizes the number of performed tasks. We first prove that this problem is NP-hard. Subsequently, for a small number of tasks, we propose two exact algorithms based on dynamic programming and branch-and-bound strategies. Since the exact algorithms cannot scale to a large number of tasks and/or a limited amount of resources on mobile platforms, we propose different approximation algorithms. Finally, to strike a compromise between efficiency and accuracy, we present a progressive algorithm. We conducted a thorough experimental evaluation with both real-world and synthetic data on desktop and mobile platforms to compare the performance and accuracy of our proposed approaches.
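
The scheduling problem in this abstract can be illustrated with a small exact solver. The sketch below is a generic dynamic program over subsets of tasks, not the authors' algorithm; it assumes Euclidean distance, unit travel speed, zero service time, and that a task counts as performed if the worker arrives no later than its expiration time. The names `max_tasks` and `Task` are hypothetical. Its running time grows exponentially with the number of tasks, consistent with the abstract's remark that exact methods suit only small inputs.

```python
from math import dist, inf
from typing import List, Tuple

Point = Tuple[float, float]
Task = Tuple[Point, float]  # (location, expiration time); hypothetical format


def max_tasks(start: Point, tasks: List[Task]) -> int:
    """Return the maximum number of tasks one worker can perform in time."""
    n = len(tasks)
    # best[mask][j]: earliest arrival time at task j after performing exactly
    # the tasks in mask, with j performed last (inf if infeasible).
    best = [[inf] * n for _ in range(1 << n)]
    for j, (loc, deadline) in enumerate(tasks):
        arrival = dist(start, loc)  # unit speed: travel time equals distance
        if arrival <= deadline:
            best[1 << j][j] = arrival

    answer = 0
    for mask in range(1, 1 << n):
        for j in range(n):
            t = best[mask][j]
            if t == inf:
                continue
            answer = max(answer, bin(mask).count("1"))
            # Try to extend the schedule with one more unperformed task k.
            for k in range(n):
                if mask & (1 << k):
                    continue
                loc, deadline = tasks[k]
                arrival = t + dist(tasks[j][0], loc)
                if arrival <= deadline and arrival < best[mask | (1 << k)][k]:
                    best[mask | (1 << k)][k] = arrival
    return answer


# Two tasks lie along the worker's path; the third expires before it is reachable.
demo_tasks = [((1.0, 0.0), 1.0), ((2.0, 0.0), 2.0), ((-5.0, 0.0), 3.0)]
demo_result = max_tasks((0.0, 0.0), demo_tasks)
```

In the demo, the worker can reach the first two tasks before they expire but not the third, so `demo_result` is 2.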
