Zhongmei Yao
Texas A&M University
Publication
Featured research published by Zhongmei Yao.
International Conference on Network Protocols | 2006
Zhongmei Yao; Derek Leonard; Xiaoming Wang; Dmitri Loguinov
Previous analytical results on the resilience of unstructured P2P systems have not explicitly modeled heterogeneity of user churn (i.e., difference in online behavior) or the impact of in-degree on system resilience. To overcome these limitations, we introduce a generic model of heterogeneous user churn, derive the distribution of the various metrics observed in prior experimental studies (e.g., lifetime distribution of joining users, joint distribution of session time of alive peers, and residual lifetime of a randomly selected user), derive several closed-form results on the transient behavior of in-degree, and eventually obtain the joint in/out degree isolation probability as a simple extension of the out-degree model.
IEEE/ACM Transactions on Networking | 2007
Derek Leonard; Zhongmei Yao; Vivek Rai; Dmitri Loguinov
To model P2P networks that are commonly faced with high rates of churn and random departure decisions by end-users, this paper investigates the resilience of random graphs to lifetime-based node failure and derives the expected delay before a user is forcefully isolated from the graph and the probability that this occurs within his/her lifetime. Using these metrics, we show that systems with heavy-tailed lifetime distributions are more resilient than those with light-tailed (e.g., exponential) distributions and that for a given average degree, k-regular graphs exhibit the highest level of fault tolerance. As a practical illustration of our results, each user in a system with n = 100 billion peers, 30-minute average lifetime, and 1-minute node-replacement delay can stay connected to the graph with probability 1 - 1/n using only 9 neighbors. This is in contrast to 37 neighbors required under previous modeling efforts. We finish the paper by observing that many P2P networks are almost surely (i.e., with probability 1 - o(1)) connected if they have no isolated nodes and derive a simple model for the probability that a P2P system partitions under churn.
IEEE/ACM Transactions on Networking | 2008
Derek Leonard; Zhongmei Yao; Xiaoming Wang; Dmitri Loguinov
In this paper, we analyze the problem of network disconnection in the context of large-scale P2P networks and examine how both static and dynamic patterns of node failure affect the resilience of such graphs. We start by applying classical results from random graph theory to show that a large variety of deterministic and random P2P graphs almost surely (i.e., with probability 1-o(1)) remain connected under random failure if and only if they have no isolated nodes. This simple, yet powerful, result subsequently allows us to derive in closed-form the probability that a P2P network develops isolated nodes, and therefore partitions, under both types of node failure. We finish the paper by demonstrating that our models match simulations very well and that dynamic P2P systems are extremely resilient under node churn as long as the neighbor replacement delay is much smaller than the average user lifetime.
IEEE/ACM Transactions on Networking | 2009
Zhongmei Yao; Xiaoming Wang; Derek Leonard; Dmitri Loguinov
Previous analytical studies of unstructured P2P resilience have assumed exponential user lifetimes and only considered age-independent neighbor replacement. In this paper, we overcome these limitations by introducing a general node-isolation model for heavy-tailed user lifetimes and arbitrary neighbor-selection algorithms. Using this model, we analyze two age-biased neighbor-selection strategies and show that they significantly improve the residual lifetimes of chosen users, which dramatically reduces the probability of user isolation and graph partitioning compared with uniform selection of neighbors. In fact, the second strategy based on random walks on age-proportional graphs demonstrates that, for lifetimes with infinite variance, the system monotonically increases its resilience as its age and size grow. Specifically, we show that the probability of isolation converges to zero as these two metrics tend to infinity. We finish the paper with simulations in finite-size graphs that demonstrate the effect of this result in practice.
International Conference on Network Protocols | 2005
Derek Leonard; Zhongmei Yao; Xiaoming Wang; Dmitri Loguinov
In this paper, we analyze the problem of network disconnection in the context of large-scale P2P networks and understand how both static and dynamic patterns of node failure affect the resilience of such graphs. We start by applying classical results from random graph theory to show that a large variety of deterministic and random P2P graphs almost surely (i.e., with probability 1-o(1)) remain connected under random failure if and only if they have no isolated nodes. This simple, yet powerful, result subsequently allows us to derive in closed-form the probability that a P2P network develops isolated nodes, and therefore partitions, under both types of node failure. We finish the paper by demonstrating that our models match simulations very well and that dynamic P2P systems are extremely resilient under node churn as long as the neighbor replacement delay is much smaller than the average user lifetime.
IEEE International Conference on Computer Communications | 2007
Zhongmei Yao; Xiaoming Wang; Derek Leonard; Dmitri Loguinov
Previous analytical studies [12], [18] of unstructured P2P resilience have assumed exponential user lifetimes and only considered age-independent neighbor replacement. In this paper, we overcome these limitations by introducing a general node-isolation model for heavy-tailed user lifetimes and arbitrary neighbor-selection algorithms. Using this model, we analyze two age-biased neighbor-selection strategies and show that they significantly improve the residual lifetimes of chosen users, which dramatically reduces the probability of user isolation and graph partitioning compared to uniform selection of neighbors. In fact, the second strategy based on random walks on age-weighted graphs demonstrates that for lifetimes with infinite variance, the system monotonically increases its resilience as its age and size grow. Specifically, we show that the probability of isolation converges to zero as these two metrics tend to infinity. We finish the paper with simulations in finite-size graphs that demonstrate the effect of this result in practice.
Web Intelligence | 2003
Zhongmei Yao; Ben Choi
We propose a new bidirectional hierarchical clustering system for addressing challenges of Web mining. The key feature of the approach is that it aims to maximize the intra-cluster similarity in the bottom-up cluster-merging phase and ensures that the inter-cluster similarity is minimized in the top-down refinement phase. This two-pass approach achieves better clustering than existing one-pass approaches. We also propose a new cluster-merging criterion that allows more than two clusters to be merged in each step and a new measure of similarity that takes into consideration not only the inter-connectivity between clusters but also the internal connectivity within the clusters. Together, these reduce the average complexity of creating the final hierarchical structure of clusters from O(n^2) to O(n). The hierarchical structure represents a semantic structure between concepts of clusters and is directly applicable to the future of semantic net.
International Journal of Intelligent Information Technologies | 2007
Zhongmei Yao; Ben Choi
Clustering is well suited for Web mining because it automatically organizes Web pages into categories, each of which contains Web pages with similar contents. However, one problem in clustering is the lack of general methods to automatically determine the number of categories or clusters. For the Web domain, there has so far been no such method suitable for Web page clustering. To address this problem, we discovered a constant factor that characterizes the Web domain, based on which we propose a new method for automatically determining the number of clusters in Web page datasets. We also propose a new Bidirectional Hierarchical Clustering algorithm, which arranges individual Web pages into clusters and then arranges the clusters into larger clusters and so on until the average inter-cluster similarity approaches the constant factor. Combining the new constant factor with the new algorithm, we have developed a clustering system suitable for mining the Web.
International Conference on Computer Communications | 2012
Derek Leonard; Zhongmei Yao; Xiaoming Wang; Dmitri Loguinov
Intrusion Detection Systems (IDS) have become ubiquitous in the defense against virus outbreaks, malicious exploits of OS vulnerabilities, and botnet proliferation. As attackers frequently rely on host scanning for reconnaissance leading to penetration, IDS is often tasked with detecting scans and preventing them. However, it is currently unknown how likely an IDS is to detect a given Internet-wide scan pattern and whether there exist sufficiently fast scan techniques that can remain virtually undetectable at large scale. To address these questions, we propose a simple analytical model for the window-expiration rules of popular IDS tools (i.e., Snort and Bro) and utilize a variation of the Chen-Stein theorem to derive the probability that they detect some of the commonly used scan permutations. Using this analysis, we also prove the existence of stealth-optimal scan patterns, examine their performance, and contrast it with that of well-known techniques.
IEEE Transactions on Parallel and Distributed Systems | 2011
Zhongmei Yao; Dmitri Loguinov
Several models of user churn, resilience, and link lifetime have recently appeared in the literature; however, these results do not directly apply to classical Distributed Hash Tables (DHTs) in which neighbor replacement occurs not only when current users die, but also when new users arrive into the system, and where replacement choices are often restricted to the successor of the failed zone in the DHT space. To understand neighbor churn in such networks, which we call switching DHTs, this paper proposes a simple, yet accurate, model for capturing link dynamics in structured P2P systems and obtains the distribution of link lifetimes for fairly generic DHTs. Similar to, our results show that deterministic networks (e.g., Chord, CAN) unfortunately do not extract much benefit from heavy-tailed user lifetimes since link durations are dominated by small remaining lifetimes of newly arriving users that replace the more reliable existing neighbors. We also examine link lifetimes in randomized DHTs equipped with multiple choices for each link and show that selecting the best neighbor in these scenarios is rather complicated as it depends on the desired load balancing, link resilience, and overhead. We offer insight into the various selection algorithms, their performance, and possibilities for improvement.