Beixing Deng
Tsinghua University
Publications
Featured research published by Beixing Deng.
International Conference on Distributed Computing Systems Workshops | 2011
Tianyi Wang; Yang Chen; Zengbin Zhang; Tianyin Xu; Long Jin; Pan Hui; Beixing Deng; Xing Li
By keeping the graph small while preserving the properties of the original social graph, graph sampling provides an efficient yet inexpensive solution for social network analysis. The challenge is how to create a small but representative sample out of a massive social graph with millions or even billions of nodes. Several sampling algorithms have been proposed in previous studies, but a fair evaluation and comparison among them is lacking. In this paper, we analyze the state-of-the-art graph sampling algorithms and evaluate their performance on some widely recognized graph properties of directed graphs using large-scale social network datasets. We evaluate not only the commonly used node degree distribution but also the clustering coefficient, which quantifies how well connected the neighbors of a node are. Through the comparison, we find that none of the algorithms obtains satisfactory sampling results for both of these properties, and that the performance of each algorithm differs considerably across different kinds of datasets.
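The two properties named in the abstract above can be illustrated with a minimal sketch. It assumes an undirected view of the graph stored as a dictionary of neighbor sets; the paper itself works on directed graphs and its exact definitions are not given here, so the function names and the toy graph are illustrative only.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbors that are themselves linked."""
    neighbors = adj[node] - {node}
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pair to test
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)

def degree_distribution(adj):
    """Map each degree value to the fraction of nodes having that degree."""
    counts = {}
    for neighbors in adj.values():
        counts[len(neighbors)] = counts.get(len(neighbors), 0) + 1
    total = len(adj)
    return {degree: count / total for degree, count in counts.items()}

# Hypothetical toy graph: triangle a-b-c plus a pendant node d attached to a.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

A sampler could be judged by computing both quantities on the sample and on the full graph and comparing them.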
IEEE Transactions on Network and Service Management | 2011
Yang Chen; Xiao Wang; Cong Shi; Eng Keong Lua; Xiaoming Fu; Beixing Deng; Xing Li
Network coordinate (NC) systems provide a lightweight and scalable way of predicting the distances, i.e., round-trip latencies, among Internet hosts. Most existing NC systems embed hosts into a low-dimensional Euclidean space. Unfortunately, the persistent occurrence of Triangle Inequality Violation (TIV) on the Internet largely limits the distance prediction accuracy of those NC systems. Some alternative systems aim to handle the persistent TIV; however, their prediction accuracy is only comparable to that of Euclidean distance based NC systems. In this paper, we propose an NC system called Phoenix, which is based on the matrix factorization model. Phoenix assigns a weight to each reference NC and trusts the NCs with higher weight values more than the others. This weight-based mechanism can substantially reduce the impact of error propagation. Using representative aggregate data sets and a newly measured dynamic data set collected from the Internet, our simulations show that Phoenix achieves significantly higher prediction accuracy than other NC systems. We also show that Phoenix quickly converges to a steady state, performs well under host churn, handles the drift of the NCs successfully by using regularization, and is robust against measurement anomalies. Phoenix thus achieves scalable yet accurate end-to-end distance monitoring. In addition, we study how well an NC system can characterize the TIV property on the Internet by introducing two new quantitative metrics, RERPL and AERPL. We show that Phoenix is able to characterize TIV better than other existing NC systems.
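A small sketch may help show why a matrix factorization model can accommodate TIVs where a Euclidean embedding cannot. It assumes the common dot-product formulation in which each host holds an outgoing and an incoming vector; the vectors below are made up for illustration, and Phoenix's weighting and regularization are not reproduced here.

```python
def predict_distance(outgoing, incoming):
    """Predicted distance as the dot product of two coordinate vectors."""
    return sum(x * y for x, y in zip(outgoing, incoming))

def violates_triangle_inequality(d_ab, d_bc, d_ac):
    """True if the direct path A-C is longer than the detour via B."""
    return d_ac > d_ab + d_bc

# Hypothetical non-negative coordinates for three hosts A, B and C.
outgoing = {"A": (1.0, 0.0), "B": (0.0, 1.0)}
incoming = {"B": (1.0, 0.0), "C": (10.0, 1.0)}

d_ab = predict_distance(outgoing["A"], incoming["B"])  # 1.0
d_bc = predict_distance(outgoing["B"], incoming["C"])  # 1.0
d_ac = predict_distance(outgoing["A"], incoming["C"])  # 10.0
```

Here the predicted distances themselves violate the triangle inequality, which no set of Euclidean coordinates could express.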
IET Communications | 2009
Yang Chen; Yongqiang Xiong; Xiaohui Shi; Jiwen Zhu; Beixing Deng; Xing Li
A network coordinates (NC) system is an efficient mechanism for Internet distance prediction with scalable measurements. The intrinsic cause of the unsatisfactory accuracy of simulation-based NC algorithms is identified, and Pharos, a fully decentralised and hierarchical scheme, is proposed to solve this problem. Pharos leverages multiple coordinate sets at different distance scales, with the right scale being chosen for each prediction. We evaluate the performance of the Pharos system with the King data set and latency data from PlanetLab, and compare it with the representative NC system, Vivaldi. The experimental results show that Pharos greatly outperforms Vivaldi in Internet distance prediction without adding any significant overhead. Our extensive evaluation results also demonstrate that Pharos can significantly improve performance in distributed Internet applications, such as overlay multicast and server selection.
International Conference on Information Networking | 2008
Weiyu Wu; Yang Chen; Xinyi Zhang; Xiaohui Shi; Lin Cong; Beixing Deng; Xing Li
As the substrate of structured peer-to-peer systems, the distributed hash table (DHT) plays a key role in P2P routing infrastructures. A traditional DHT does not consider the location of nodes when assigning identifiers, which results in high end-to-end latency on DHT-based overlay networks. In this paper, we propose a locality-aware DHT design called LDHT, which exploits network locality in DHT-based systems. Instead of the uniform random node identifiers of a traditional DHT, nodes in LDHT are assigned locality-aware identifiers according to their autonomous system numbers (ASNs). As a result, each node has more nearby neighbors than faraway neighbors in the overlay. We evaluate the performance of LDHT on several typical DHT protocols and on various topologies. The results show that LDHT substantially reduces the end-to-end latency of traditional DHT protocols without introducing additional overhead. The results also indicate that LDHT fits different kinds of DHT protocols and can work effectively on all structured P2P systems, including Chord, Symphony and Kademlia.
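One way to picture ASN-based identifier assignment is sketched below. The abstract does not give LDHT's actual identifier layout, so the 160-bit width, the 32-bit ASN prefix, the SHA-1 suffix and the function names are all assumptions made for illustration.

```python
import hashlib

ID_BITS = 160   # assumed identifier width (as in Chord or Kademlia)
ASN_BITS = 32   # assumed number of leading bits reserved for the ASN

def locality_aware_id(asn, address):
    """Build an ID whose high bits are the ASN and low bits a hash of the address."""
    if not 0 <= asn < 2 ** ASN_BITS:
        raise ValueError("ASN out of range")
    digest = hashlib.sha1(address.encode("utf-8")).digest()
    # Keep only the top (ID_BITS - ASN_BITS) bits of the 160-bit hash.
    suffix = int.from_bytes(digest, "big") >> ASN_BITS
    return (asn << (ID_BITS - ASN_BITS)) | suffix

def xor_distance(id_a, id_b):
    """Kademlia-style XOR distance between two identifiers."""
    return id_a ^ id_b

# Hypothetical nodes: two in one AS, one in another.
node_a = locality_aware_id(4538, "10.0.0.1:4000")
node_b = locality_aware_id(4538, "10.0.0.2:4000")
node_c = locality_aware_id(7018, "192.0.2.7:4000")
```

Under this layout, nodes in the same AS share an identifier prefix, so they are closer to each other in the identifier space than to nodes in other ASes.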
Global Communications Conference | 2007
Yang Chen; Yongqiang Xiong; Xiaohui Shi; Beixing Deng; Xing Li
A network coordinates (NC) system is an efficient mechanism for Internet distance prediction with limited measurements. In this paper, we identify the intrinsic cause of the inadequate accuracy of simulation-based NC algorithms. We then propose Pharos, a fully decentralized and hierarchical scheme, to remedy this problem. Pharos leverages multiple coordinate sets at different distance scales, with the right scale being chosen for each prediction. We evaluate the performance of the Pharos system with the King data set and latency data from PlanetLab, and compare it with the representative NC system, Vivaldi. The experimental results show that Pharos considerably outperforms Vivaldi without adding any significant overhead.
International Conference on Mobile Systems, Applications, and Services | 2011
Long Jin; Yang Chen; Pan Hui; Cong Ding; Tianyi Wang; Athanasios V. Vasilakos; Beixing Deng; Xing Li
Online Social Networks (OSNs) have become dramatically popular, and the study of social graphs attracts the interest of a large number of researchers. One critical challenge is the huge size of the social graph, which makes graph analysis or even data crawling incredibly time consuming, and sometimes impossible to complete. Thus, graph sampling algorithms have been introduced to obtain a smaller subgraph that reflects the properties of the original graph well. Breadth-First Sampling (BFS) is widely used in graph sampling, but it is biased towards high-degree vertices during the sampling process. Metropolis-Hasting Random Walk (MHRW), which was proposed to obtain unbiased samples of the social graph, requires the graph to be well connected. In this paper, we propose a vertex sampling algorithm called Albatross Sampling (AS), which introduces a random jump strategy into MHRW during the sampling process. The embedded random jump makes the sampling procedure more flexible and prevents it from being trapped in a locally well-connected part of the graph. Our evaluation shows that, on both tightly and loosely connected graphs, AS performs significantly better than MHRW and BFS. On the one hand, AS estimates the degree distribution with a much lower Normalized Mean Square Error (NMSE) under the same resource budget. On the other hand, AS requires a much smaller resource budget to reach an acceptable estimate of the degree distribution.
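The idea of embedding a random jump into MHRW can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes an undirected graph given as a dictionary of neighbor lists, assumes that a uniformly random node can be drawn for the jump, and counts the budget in walk steps; `jump_prob` and the other names are hypothetical.

```python
import random

def albatross_sample(adj, start, budget, jump_prob=0.02, seed=0):
    """MHRW over `adj` that jumps to a uniformly random node with `jump_prob`."""
    rng = random.Random(seed)
    nodes = list(adj)
    current = start
    visited = [current]
    while len(visited) < budget:
        if rng.random() < jump_prob or not adj[current]:
            # Random jump: leave the current region entirely.
            current = rng.choice(nodes)
        else:
            candidate = rng.choice(adj[current])
            # Metropolis-Hastings acceptance removes the bias towards high degree.
            accept = min(1.0, len(adj[current]) / len(adj[candidate]))
            if rng.random() < accept:
                current = candidate
        visited.append(current)  # a rejected move re-samples the current node
    return visited

# Hypothetical graph with two disconnected triangles: {0, 1, 2} and {3, 4, 5}.
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}
```

With `jump_prob=0` the walk reduces to plain MHRW and can never leave the triangle it starts in, whereas a positive jump probability lets it reach the other component.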
International IFIP TC Networking Conference | 2009
Yang Chen; Xiao Wang; Xiaoxiao Song; Eng Keong Lua; Cong Shi; Xiaohan Zhao; Beixing Deng; Xing Li
A network coordinate (NC) system allows efficient Internet distance prediction with scalable measurements. Most NC systems are based on embedding hosts into a low-dimensional Euclidean space. Unfortunately, the accuracy of the predicted distances is badly hurt by the persistent occurrence of Triangle Inequality Violation (TIV) in measured Internet distances. IDES is a dot product based NC system that can tolerate the constraints of TIVs. However, it cannot guarantee that the predicted distance is non-negative, and its prediction accuracy is close to that of Euclidean distance based NC systems. In this paper, we propose Phoenix, an accurate, practical and decentralized NC system. It adopts a weighted model adjustment to achieve better prediction accuracy while ensuring that the predicted distances are positive and usable. Our extensive Internet trace based simulation shows that Phoenix can achieve higher prediction accuracy than other representative NC systems. Furthermore, Phoenix converges fast and is robust against measurement anomalies.
ACM Special Interest Group on Data Communication | 2010
Tianyi Wang; Yang Chen; Zengbin Zhang; Peng Sun; Beixing Deng; Xing Li
Microblogging services, such as Twitter, are among the most important online social networks (OSNs). Unlike OSNs such as Facebook, the topology of a microblogging service is a directed graph i...
Global Communications Conference | 2011
Zhuo Chen; Yang Chen; Yibo Zhu; Cong Ding; Beixing Deng; Xing Li
Network Coordinate (NC) systems provide an efficient and scalable mechanism to estimate latencies among hosts. However, many popular algorithms like Vivaldi suffer greatly from the existence of Triangle Inequality Violations (TIVs). Two-layer systems like Pharos and hierarchical Vivaldi have been proposed to remedy the impact of TIVs. They divide the whole space into several location-based clusters and run NC systems on both a global layer and a local layer. However, the two-layer model is only able to optimize the intra-cluster links, which relate to a limited portion of TIV triangles. In this paper, we propose a new NC system, Tarantula, which divides the space in a novel way. By categorizing the TIVs into three classes, we show that Tarantula handles a much larger portion of existing TIVs than two-layer systems. Moreover, we present two techniques to further strengthen the Tarantula system: 1) relating the updating step size in the Vivaldi algorithm used in Tarantula to the ground-truth latency so as to improve the prediction for short links; 2) Dynamic Cluster Optimization, which dynamically adjusts the clustering of hosts. Our experimental results show that Tarantula significantly outperforms Pharos and Vivaldi in terms of estimation accuracy. When different NC systems are applied to server selection and detour finding, Tarantula again performs best.
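For context, the step size mentioned above refers to the coordinate update at the heart of Vivaldi. The sketch below shows a basic Vivaldi-style update with a constant step size `delta`; it assumes plain Euclidean coordinates without a height component, and it does not reproduce Tarantula's latency-dependent step size, whose exact form is not given in the abstract.

```python
import math

def vivaldi_update(xi, xj, rtt, delta=0.25):
    """Move host i's coordinate so its distance to xj gets closer to `rtt`."""
    dist = math.dist(xi, xj)
    if dist == 0.0:
        # Coincident points: pick an arbitrary direction along the first axis.
        direction = [1.0] + [0.0] * (len(xi) - 1)
    else:
        # Unit vector pointing from xj towards xi.
        direction = [(a - b) / dist for a, b in zip(xi, xj)]
    error = rtt - dist  # positive when the hosts are predicted too close
    return [a + delta * error * d for a, d in zip(xi, direction)]

# Hypothetical measurement: predicted distance 5 ms, measured RTT 10 ms.
updated = vivaldi_update([0.0, 0.0], [3.0, 4.0], rtt=10.0)
```

In this example host i moves away from host j, and the predicted distance grows from 5 to 6.25, a quarter of the way towards the measured value.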
International Conference on Computer Communications | 2009
Guodong Wang; Yang Chen; Lei Shi; Eng Keong Lua; Beixing Deng; Xing Li
The anycast paradigm has been widely adopted by Internet applications to find nearby resources. The current IP anycast implementation suffers from poor scalability. To overcome this, the paper proposes Proxima, a network coordinates (NC) based application-layer infrastructure that is capable of providing a lightweight and flexible anycast service. Proxima accurately chooses the best receiver from a group of candidates according to the Round Trip Time (RTT) for a given application. Our experimental results demonstrate the excellent performance of Proxima in a real server selection scenario.
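Receiver selection from network coordinates can be sketched in a few lines. The example assumes Euclidean coordinates whose pairwise distance approximates RTT in milliseconds; the abstract does not say which NC model Proxima uses, and the names and coordinates below are hypothetical.

```python
import math

def predicted_rtt(coord_a, coord_b):
    """Estimate RTT as the Euclidean distance between two coordinates."""
    return math.dist(coord_a, coord_b)

def select_receiver(client_coord, candidates):
    """Return the name of the candidate with the lowest predicted RTT."""
    return min(candidates, key=lambda name: predicted_rtt(client_coord, candidates[name]))

# Hypothetical coordinates for a client and three candidate servers.
client = (0.0, 0.0)
servers = {
    "server-1": (30.0, 40.0),   # predicted RTT 50
    "server-2": (6.0, 8.0),     # predicted RTT 10
    "server-3": (-20.0, 0.0),   # predicted RTT 20
}
best = select_receiver(client, servers)
```

No probing of the candidates is needed at selection time, which is what keeps such a service lightweight.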