Shu Tao | Researchain

Publication

Featured research published by Shu Tao.

network operations and management symposium | 2012

Workload characterization and prediction in the cloud: A multiple time series approach

Arijit Khan; Xifeng Yan; Shu Tao; Nikos Anerousis

Cloud computing promises high scalability, flexibility and cost-effectiveness to satisfy emerging computing requirements. To efficiently provision computing resources in the cloud, system administrators need to be able to characterize and predict the workload on Virtual Machines (VMs). In this paper, we use data traces obtained from a real data center to develop such capabilities. First, we search for repeatable workload patterns by exploring cross-VM workload correlations resulting from the dependencies among applications running on different VMs. Treating workload data samples as time series, we develop a co-clustering technique to identify groups of VMs that frequently exhibit correlated workload patterns, and also the time periods in which these VM groups are active. Then, we introduce a method based on Hidden Markov Modeling (HMM) to characterize the temporal correlations in the discovered VM clusters and to predict variations of workload patterns. The experimental results show that our method not only helps to better understand group-level workload characteristics, but also makes more accurate predictions of workload changes in a cloud.
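As a rough illustration of the first step described in this abstract (looking for cross-VM workload correlations), the sketch below computes pairwise Pearson correlation between workload time series and reports strongly correlated VM pairs. It is not the paper's co-clustering technique or its HMM; the VM names, the sample values, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def correlated_pairs(traces, threshold=0.9):
    """Return VM pairs whose workload correlation is at least `threshold`."""
    pairs = []
    for a, b in combinations(sorted(traces), 2):
        if pearson(traces[a], traces[b]) >= threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical CPU-utilization samples (percent) for three VMs.
traces = {
    "vm-web": [10, 20, 30, 40, 50],
    "vm-db": [12, 22, 33, 41, 52],
    "vm-batch": [50, 40, 30, 20, 10],
}
```

Calling `correlated_pairs(traces)` on this toy data groups `vm-db` with `vm-web`, while `vm-batch` moves in the opposite direction and is left out.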

international conference on management of data | 2011

Neighborhood based fast graph search in large networks

Arijit Khan; Nan Li; Xifeng Yan; Ziyu Guan; Supriyo Chakraborty; Shu Tao

Search over complex social and information networks is becoming important in a variety of applications. At the core of these applications lies a common and critical problem: given a labeled network and a query graph, how to efficiently search for the query graph in the target network. The presence of noise and incomplete knowledge about the structure and content of the target network make it unrealistic to find an exact match. Rather, it is more appealing to find the top-k approximate matches. In this paper, we propose a neighborhood-based similarity measure that avoids costly graph isomorphism and edit distance computation. Under this new measure, we prove that subgraph similarity search is NP-hard, while graph similarity match is polynomial. By studying the principles behind this measure, we found an information propagation model that is able to convert a large network into a set of multidimensional vectors, where sophisticated indexing and similarity search algorithms are available. The proposed method, called Ness (Neighborhood Based Similarity Search), is appropriate for graphs with low automorphism and high noise, which are common in many social and information networks. Ness is not only efficient, but also robust against structural noise and information loss. Empirical results show that it can quickly and accurately find high-quality matches in large networks, with negligible cost.
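The idea of converting a labeled network into per-node vectors can be sketched as follows. This is a minimal illustration, not the Ness algorithm itself: it assumes that a node at distance d contributes `alpha ** d` to the strength of its label, and it compares vectors by the label strength that a query node expects but a target node lacks. The function names, the decay factor `alpha`, and the hop limit are assumptions made for the example.

```python
from collections import defaultdict, deque


def neighborhood_vector(graph, labels, source, hops=2, alpha=0.5):
    """Map each label to its decayed strength within `hops` of `source`.

    A node at distance d contributes alpha**d to its label, so nearby
    labels weigh more than distant ones. The source itself is skipped.
    """
    vector = defaultdict(float)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == hops:
            continue  # do not expand past the hop limit
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                vector[labels[neighbor]] += alpha ** distance[neighbor]
                queue.append(neighbor)
    return dict(vector)


def vector_cost(query_vec, target_vec):
    """Total label strength that the query expects but the target lacks."""
    return sum(
        max(0.0, strength - target_vec.get(label, 0.0))
        for label, strength in query_vec.items()
    )


# Hypothetical path graph a - b - c - d with one label per node.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
labels = {"a": "x", "b": "y", "c": "x", "d": "z"}
vec_a = neighborhood_vector(graph, labels, "a")
```

Because every node is reduced to a fixed-form vector, candidate matches can be compared with cheap vector arithmetic instead of subgraph isomorphism tests, which is the property the abstract highlights.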

knowledge discovery and data mining | 2008

Efficient ticket routing by resolution sequence mining

Qihong Shao; Yi Chen; Shu Tao; Xifeng Yan; Nikos Anerousis

IT problem management calls for quick identification of resolvers to reported problems. The efficiency of this process depends heavily on ticket routing: transferring a problem ticket among various expert groups in search of the right resolver to the ticket. To achieve efficient ticket routing, a wise decision needs to be made at each step of ticket transfer to determine which expert group is likely to be, or to lead to, the resolver. In this paper, we address the possibility of improving ticket routing efficiency by mining ticket resolution sequences alone, without accessing ticket content. To demonstrate this possibility, a Markov model is developed to statistically capture the right decisions that have been made toward problem resolution, where the order of the Markov model is carefully chosen according to the conditional entropy obtained from ticket data. We also design a search algorithm, called Variable-order Multiple active State search (VMS), that generates ticket transfer recommendations based on our model. The proposed framework is evaluated on a large set of real-world problem tickets. The results demonstrate that VMS significantly improves on human decisions: problem resolvers can often be identified with fewer ticket transfers.
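To make the sequence-mining idea concrete, here is a minimal sketch that fits a first-order Markov model to past resolution sequences and recommends the most likely next expert group. It is a simplification: the paper chooses the model order from conditional entropy and uses the VMS search algorithm, neither of which is reproduced here. The group names and ticket history are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transfer_model(sequences):
    """Estimate P(next group | current group) from resolution sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    model = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        model[current] = {g: c / total for g, c in next_counts.items()}
    return model


def recommend(model, current, k=1):
    """Return up to k most likely next groups, ties broken alphabetically."""
    candidates = model.get(current, {})
    ranked = sorted(candidates.items(), key=lambda item: (-item[1], item[0]))
    return [group for group, _ in ranked[:k]]


# Hypothetical ticket history: each list is the ordered groups a ticket visited.
history = [
    ["helpdesk", "network", "firewall"],
    ["helpdesk", "network", "firewall"],
    ["helpdesk", "database"],
    ["network", "firewall"],
]
model = fit_transfer_model(history)
```

With this history, `recommend(model, "helpdesk")` suggests `network`, since two of the three tickets that left the helpdesk went there next. Note that the model never reads ticket text, matching the abstract's sequence-only setting.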

knowledge discovery and data mining | 2010

Generative models for ticket resolution in expert networks

Gengxin Miao; Louise E. Moser; Xifeng Yan; Shu Tao; Yi Chen; Nikos Anerousis

Ticket resolution is a critical, yet challenging, aspect of the delivery of IT services. A large service provider needs to handle, on a daily basis, thousands of tickets that report various types of problems. Many of those tickets bounce among multiple expert groups before being transferred to the group with the right expertise to solve the problem. Finding a methodology that reduces such bouncing and hence shortens ticket resolution time is a long-standing challenge. In this paper, we present a unified generative model, the Optimized Network Model (ONM), that characterizes the lifecycle of a ticket, using both the content and the routing sequence of the ticket. ONM uses maximum likelihood estimation to represent how the information contained in a ticket is used by human experts to make ticket routing decisions. Based on ONM, we develop a probabilistic algorithm to generate ticket routing recommendations for new tickets in a network of expert groups. Our algorithm calculates all possible routes to potential resolvers and makes globally optimal recommendations, in contrast to existing classification methods that make static and locally optimal recommendations. Experiments show that our method significantly outperforms existing solutions.

very large data bases | 2008

EasyTicket: a ticket routing recommendation engine for enterprise problem resolution

Qihong Shao; Yi Chen; Shu Tao; Xifeng Yan; Nikos Anerousis

Managing problem tickets is a key issue in the IT service industry. A large service provider may handle thousands of problem tickets from its customers on a daily basis. The efficiency of processing these tickets depends heavily on ticket routing: transferring problem tickets among expert groups in search of the right resolver to the ticket. Although many ticket management systems are available, ticket routing in these systems is still manually operated by support personnel. In this demo, we introduce EasyTicket, a ticket routing recommendation engine that helps automate this process. By mining ticket history data, we model an enterprise social network that represents the functional relationships among various expert groups in ticket routing. Based on this network, our system then provides routing recommendations for new tickets. Our experimental studies on 1.4 million real-world problem tickets show that, on average, EasyTicket can improve the efficiency of ticket routing by 35%.

business process management | 2010

Content-aware resolution sequence mining for ticket routing

Peng Sun; Shu Tao; Xifeng Yan; Nikos Anerousis; Yi Chen

Ticket routing is key to the efficiency of IT problem management. Due to the complexity of many reported problems, problem tickets typically need to be routed among various expert groups to search for the right resolver. In this paper, we study the problem of using historical ticket data to make smarter routing recommendations for new tickets, so as to improve the efficiency of ticket routing in terms of the Mean number of Steps To Resolve (MSTR) a ticket. Previous studies on this problem have focused on mining ticket resolution sequences to generate more informed routing recommendations. In this work, we enhance the existing sequence-only approach by further mining the text content of tickets. Through extensive studies on real-world problem tickets, we find that neither resolution sequence nor ticket content alone is sufficient to deliver the greatest reduction in MSTR, while a hybrid approach that mines resolution sequences in a content-aware manner proves to be the most effective. We therefore propose such an approach, which first analyzes the content of a new ticket and identifies a set of semantically relevant tickets, and then creates a weighted Markov model from the resolution sequences of these tickets to generate routing recommendations. Our experiments show that the proposed approach achieves significantly better results than both sequence-only and content-only solutions.
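The MSTR metric named in this abstract can be illustrated with a short sketch. It assumes that one step is one transfer between expert groups, so a ticket that visited n groups took n - 1 steps; the paper may count steps differently. The ticket sequences are hypothetical.

```python
def mean_steps_to_resolve(sequences):
    """Average number of transfers per ticket.

    Each sequence lists the expert groups a ticket visited in order,
    ending with the resolver. A step is assumed to be one transfer,
    so a sequence of n groups counts as n - 1 steps.
    """
    if not sequences:
        raise ValueError("at least one resolution sequence is required")
    total_steps = sum(len(seq) - 1 for seq in sequences)
    return total_steps / len(sequences)


# Hypothetical resolution sequences before and after better routing.
manual_routing = [
    ["helpdesk", "network", "firewall"],
    ["helpdesk", "database", "network", "firewall"],
    ["helpdesk", "database"],
]
recommended_routing = [
    ["helpdesk", "firewall"],
    ["helpdesk", "firewall"],
    ["helpdesk", "database"],
]

mstr_manual = mean_steps_to_resolve(manual_routing)
mstr_recommended = mean_steps_to_resolve(recommended_routing)
```

A routing method is better under this metric when it lowers the value, as in the drop from `mstr_manual` to `mstr_recommended` on this toy data.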

knowledge discovery and data mining | 2014

Analyzing expert behaviors in collaborative networks

Huan Sun; Mudhakar Srivatsa; Shulong Tan; Yang Li; Lance M. Kaplan; Shu Tao; Xifeng Yan

Collaborative networks are composed of experts who cooperate with each other to complete specific tasks, such as resolving problems reported by customers. A task is posted and subsequently routed in the network from one expert to another until it is resolved. When an expert cannot solve a task, his routing decision (i.e., where to transfer the task) is critical since it can significantly affect the completion time of the task. In this work, we attempt to deduce the cognitive process of task routing, and model the decision making of experts as a generative process where a routing decision is made based on mixed routing patterns. In particular, we observe an interesting phenomenon: an expert tends to transfer a task to someone whose knowledge is neither too similar to nor too different from his own. Based on this observation, an expertise-difference-based routing pattern is developed. We formalize multiple routing patterns by taking into account both rational and random analysis of tasks, and present a generative model to combine them. For a held-out set of tasks, our model not only explains their real routing sequences very well, but also accurately predicts their completion time. Under three different quality measures, our method significantly outperforms all the alternatives with more than 75% accuracy gain. In practice, with the help of our model, hypotheses on how to improve a collaborative network can be tested quickly and reliably, thereby significantly easing performance improvement of collaborative networks.

international world wide web conferences | 2012

Understanding task-driven information flow in collaborative networks

Gengxin Miao; Shu Tao; Winnie Cheng; Randy Moulic; Louise E. Moser; David Lo; Xifeng Yan

Collaborative networks are a special type of social network formed by members who collectively achieve specific goals, such as fixing software bugs and resolving customers' problems. In such networks, information flow among members is driven by the tasks assigned to the network, and by the expertise of its members to complete those tasks. In this work, we analyze real-life collaborative networks to understand their common characteristics and how information is routed in these networks. Our study shows that collaborative networks exhibit significantly different properties compared with other complex networks. Collaborative networks have truncated power-law node degree distributions and other organizational constraints. Furthermore, the number of steps along which information is routed follows a truncated power-law distribution. Based on these observations, we developed a network model that can generate synthetic collaborative networks subject to certain structure constraints. Moreover, we developed a routing model that emulates task-driven information routing conducted by human beings in a collaborative network. Together, these two models can be used to study the efficiency of information routing for different types of collaborative networks -- a problem that is important in practice yet difficult to solve without the method proposed in this paper.

knowledge discovery and data mining | 2012

Latent association analysis of document pairs

Gengxin Miao; Ziyu Guan; Louise E. Moser; Xifeng Yan; Shu Tao; Nikos Anerousis; Jimeng Sun

This paper presents Latent Association Analysis (LAA), a generative model that analyzes the topics within two document sets simultaneously, as well as the correlations between the two topic structures, by considering the semantic associations among document pairs. LAA defines a correlation factor that represents the connection between two documents, and considers the topic proportion of paired documents based on this factor. Words in the documents are assumed to be randomly generated by particular topic assignments and topic-to-word probability distributions. The paper also presents a new ranking algorithm, based on LAA, that can be used to retrieve target documents that are potentially associated with a given source document. The ranking algorithm uses the latent factor in LAA to rank target documents by the strength of their semantic associations with the source document. We evaluate the LAA algorithm with real datasets, specifically, the IT-Change and the IT-Solution document sets from the IBM IT service environment and the Symptom-Treatment document sets from Google Health. Experimental results demonstrate that the LAA algorithm significantly outperforms existing algorithms.

Archive | 2012

Reliable Ticket Routing in Expert Networks

Gengxin Miao; Louise E. Moser; Xifeng Yan; Shu Tao; Yi Chen; Nikos Anerousis

Problem ticket resolution is an important aspect of the delivery of IT services. A large service provider needs to handle, on a daily basis, thousands of tickets that report various types of problems. Many of those tickets bounce among multiple expert groups before being transferred to the group with the expertise to solve the problem. Finding a methodology that can automatically make reliable ticket routing decisions and that reduces such bouncing and, hence, shortens ticket resolution time is a long-standing challenge. Reliable ticket routing forwards the ticket to an expert who either can solve the problem reported in the ticket, or can reach an expert who can resolve the ticket. In this chapter, we present a unified generative model, the Optimized Network Model (ONM), that characterizes the lifecycle of a ticket, using both the content and the routing sequence of the ticket. ONM uses maximum likelihood estimation to capture reliable ticket transfer profiles on each edge of an expert network. These transfer profiles reflect how the information contained in a ticket is used by human experts to make ticket routing decisions. Based on ONM, we develop a probabilistic algorithm to generate reliable ticket routing recommendations for new tickets in a network of expert groups. Our algorithm calculates all possible routes to potential resolvers and makes globally optimal recommendations, in contrast to existing classification methods that make static and locally optimal recommendations.
