Jiahai Yang | Researchain

Publication

Featured research published by Jiahai Yang.

acm special interest group on data communication | 2011

CNGI-CERNET2: an IPv6 deployment in China

Jianping Wu; Jessie Hui Wang; Jiahai Yang

Research on and promotion of the next-generation Internet have drawn the attention of researchers in many countries. In the USA, the FIND initiative takes a clean-slate approach. In the EU, the EIFFEL think tank concludes that both clean-slate and evolutionary approaches are needed. In China, researchers and the country are enthusiastic about the promotion and immediate deployment of IPv6 because of the imminent problem of IPv4 address exhaustion. In 2003, China launched a strategic programme called China Next Generation Internet (CNGI). China expects Chinese industry to be better positioned on future Internet technologies and services than it was for the first generation. With the support of a CNGI grant, the China Education and Research Network (CERNET) started to build an IPv6-only network, CNGI-CERNET2. It currently provides IPv6 access service for students and staff in many Chinese universities. In this article, we introduce the CNGI programme, the architecture of CNGI-CERNET2, and some aspects of CNGI-CERNET2's deployment and operation, such as transition, security, charging, and roaming service.

international conference on computer communications | 2011

On the scalability of router forwarding tables: Nexthop-Selectable FIB aggregation

Qing Li; Dan Wang; Mingwei Xu; Jiahai Yang

In recent years, the core-net routing table, e.g., the Forwarding Information Base (FIB), has been growing at an alarming speed, and this has become a major concern for Internet Service Providers. One effective solution to this routing scalability problem, which requires only upgrades on individual routers, is FIB aggregation. Intrinsically, IP prefixes with numerical prefix matching and the same next hop can be aggregated. Previous studies commonly assume that each IP prefix has one corresponding next hop, i.e., towards one optimal path. In this paper, we argue that a packet can be delivered to its destination through a path other than the one optimal path. Based on this observation, we propose, for the first time, Nexthop-Selectable FIB Aggregation, which is fundamentally different from all previous aggregation schemes. IP prefixes are aggregated if they have numerical prefix matching and share one common next hop. Consequently, IP prefixes that previously could not be aggregated, for lack of the same next hop, can now be aggregated, and we achieve a substantially higher aggregation ratio. In this paper, we provide a systematic study of this Nexthop-Selectable FIB Aggregation problem. We present several practical choices for building the sets of selectable next hops for the IP prefixes. To maximize the aggregation, we formulate the problem as an optimization problem. We show that the problem can be solved by dynamic programming. While the straightforward application of dynamic programming has exponential complexity, we propose a novel algorithm that is O(N). We then develop an optimal online algorithm with constant running time. We evaluate our algorithms through a comprehensive set of simulations using BRITE and RIBs collected from RouteViews. Our evaluation shows that we can reduce the FIB size by more than an order of magnitude.
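
To make the aggregation idea concrete, the following is a minimal sketch, not the paper's dynamic-programming or online algorithm. It assumes prefixes are written as bit strings and that each prefix carries a set of acceptable next hops; the function name `aggregate` and the greedy bottom-up sibling merge are illustrative choices only. Two sibling prefixes are merged into their parent when their next-hop sets intersect, and the merged entry keeps the intersection.

```python
def aggregate(fib):
    """Greedy sibling merge (illustrative, not the paper's optimal algorithm).

    fib maps a bit-string prefix (e.g. "001") to a set of acceptable next hops.
    """
    table = {prefix: set(hops) for prefix, hops in fib.items()}
    changed = True
    while changed:
        changed = False
        # Visit longer prefixes first so that merges can cascade upward.
        for prefix in sorted(table, key=len, reverse=True):
            if prefix not in table or not prefix:
                continue  # already merged away, or the zero-length prefix
            parent = prefix[:-1]
            sibling = parent + ("1" if prefix[-1] == "0" else "0")
            # Skip if the sibling is missing or the parent has its own entry.
            if sibling not in table or parent in table:
                continue
            common = table[prefix] & table[sibling]
            if common:
                del table[prefix], table[sibling]
                table[parent] = common
                changed = True
    return table


# Hypothetical FIB: hop names "A", "B", "C" are made up for the example.
fib = {"000": {"A", "B"}, "001": {"B"}, "01": {"B", "C"}, "1": {"C"}}
print(aggregate(fib))
```

In this made-up table the four entries collapse to two ("0" via B and "1" via C). If each prefix were restricted to a single next hop (A, B, and C respectively), no sibling pair would share a hop and nothing could be merged.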

global communications conference | 1999

A scalable, Web-based architecture for hierarchical network management

Jiahai Yang; Peiyu Wang; Jianping Wu

We present a hierarchical, Web- and platform-based network management architecture for resolving problems of scalability, management efficiency, and manager autonomy in large, multiple domain networks. The proposed architecture consists of multiple domain managers and a manager of managers, each responsible for a different management domain, and each can run independently or cooperatively. The structure of the SuperDomain, an implemented network management system, based on this architecture is outlined. The manager-to-manager communication mechanism based on the extensible SNMP Trap feature is discussed.

network operations and management symposium | 2012

PCA-subspace method — Is it good enough for network-wide anomaly detection

Bin Zhang; Jiahai Yang; Jianping Wu; Donghong Qin; Lei Gao

The PCA-subspace method has been proposed for network-wide anomaly detection. Normal subspace contamination remains a great challenge for PCA, although some methods have been proposed to reduce it. In this paper, we apply the PCA-subspace method to six months of Origin-Destination (OD) flow data from the Abilene network. The results show that normal subspace contamination is mainly caused by anomalies from a few of the strongest OD flows and seems unavoidable for the subspace method. By further comparing the anomalies detected by the subspace method with manually tagged anomalies from each OD flow, we find that the anomalies detected by the subspace method are mainly caused by anomalies from medium and a few large OD flows, while most anomalies of minor OD flows are buried in the abnormal subspace and are hard to detect with the PCA-subspace method. We analyze why those anomalies go undetected by the subspace method and suggest using the normal subspace to detect anomalies caused by a few of the strongest OD flows, and further dividing the abnormal subspace to detect more anomalies from minor OD flows. The goal of this paper is, on the one hand, to address limitations neglected by prior work and further improve the subspace method and, on the other hand, to call for novel detection methods for network-wide traffic.
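
As a rough illustration of the subspace idea, the sketch below projects each time bin of an OD-flow matrix onto a normal subspace and scores it by its squared residual (squared prediction error). It is a simplified sketch under stated assumptions: the normal subspace is spanned by the first principal component only, that component is found by plain power iteration, and the data are synthetic. The function names `top_component` and `spe_scores` are hypothetical and are not taken from the paper.

```python
import math


def top_component(rows, iters=100):
    """Leading eigenvector of X^T X by power iteration (rows must be centered)."""
    m = len(rows[0])
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        proj = [sum(x * y for x, y in zip(r, v)) for r in rows]
        w = [sum(p * r[j] for p, r in zip(proj, rows)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def spe_scores(traffic):
    """Squared residual of each time bin after removing the first component."""
    n, m = len(traffic), len(traffic[0])
    means = [sum(row[j] for row in traffic) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in traffic]
    v = top_component(centered)
    scores = []
    for row in centered:
        score = sum(x * y for x, y in zip(row, v))
        residual = [x - score * y for x, y in zip(row, v)]
        scores.append(sum(x * x for x in residual))
    return scores


# Synthetic data: three OD flows follow one shared pattern over eight time bins,
# and a spike is injected into the third flow at time bin 5.
traffic = [[s, 2 * s, s] for s in range(8)]
traffic[5][2] += 3
scores = spe_scores(traffic)
print(scores.index(max(scores)))  # time bin with the largest residual
```

Because the injected spike is part of the data the component is fitted on, it tilts the component slightly, which is a small-scale version of the contamination effect the abstract describes.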

international conference on distributed computing systems | 2009

Selective Protection: A Cost-Efficient Backup Scheme for Link State Routing

Meijia Hou; Dan Wang; Mingwei Xu; Jiahai Yang

In recent years, there have been substantial demands to reduce packet loss in the Internet. Among the schemes proposed, finding backup paths in advance is considered an effective method to reduce the reaction time. Very commonly, a backup path is chosen to be the most disjoint path from the primary path or, at the network level, backup paths are computed for all links (e.g., IPRFF). The validity of this straightforward choice rests on two assumptions: 1) all the links may fail with equal probability; and 2) given today's high protection requirements, leaving links unprotected or sharing links between the primary and backup paths simply looks weird. Nevertheless, many research studies have confirmed that the vulnerability of links in the Internet is far from equal. In addition, we have seen that full protection schemes may introduce high costs. In this paper, we argue that such approaches may not be cost effective. We first analyze the failure characteristics based on real-world traces from CERNET2, the China Education and Research Network 2. We observe that the failure probabilities of the links are heavy-tailed, i.e., a small set of links caused most of the failures. We thus propose a selective protection scheme. We carefully analyze the implementation details and the overhead of general backup path schemes in the Internet today. We formulate an optimization problem where the routing performance (in terms of network-level availability) should be guaranteed and the backup cost should be minimized. This cost is special as it involves computation overhead. Consequently, we propose a novel Critical-Protection Algorithm, which is itself fast. We evaluate our scheme systematically, using real-world topologies and randomly generated topologies. We show significant gain compared with the full protection scheme, even when the network availability requirement is 99.99%.
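
The heavy-tail observation suggests why protecting only a few links can go a long way. The sketch below is not the Critical-Protection Algorithm; it is a simple greedy stand-in that assumes per-link failure counts are available and picks the most failure-prone links until a target share of observed failures is covered. The function name `links_to_protect`, the link labels, and the counts are all hypothetical.

```python
def links_to_protect(failure_counts, target_fraction):
    """Pick the most failure-prone links until target_fraction of failures is covered."""
    total = sum(failure_counts.values())
    chosen, covered = [], 0
    ranked = sorted(failure_counts.items(), key=lambda kv: kv[1], reverse=True)
    for link, count in ranked:
        if covered >= target_fraction * total:
            break
        chosen.append(link)
        covered += count
    return chosen


# Made-up failure counts with a heavy tail: one link dominates.
counts = {"a-b": 60, "b-c": 25, "c-d": 10, "d-e": 5}
print(links_to_protect(counts, 0.8))
```

With these made-up counts, two of the four links account for 85 of the 100 failures, so protecting them alone meets an 80 percent coverage target.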

international conference on communication technology | 2011

IP traffic classification based on machine learning

Donghong Qin; Jiahai Yang; Jiamian Wang; Bin Zhang

With the rapid development of the Internet, many network applications (e.g., P2P) use dynamic ports and encryption, which makes the traditional port-based and payload-based classification methods ineffective. Hence, it is important to find more effective ones. Machine learning (ML) techniques currently provide a promising alternative for IP traffic classification. In this work, we use an ML-based classification method to identify the classes of unknown flows using payload-independent statistical features such as packet length and arrival interval. To improve the efficiency of the classification methods, feature reduction techniques are further adopted to refine the selected features into the best group of features. Finally, we compare and evaluate the ML classification algorithms on the BRASIL data source in terms of three metrics: overall accuracy, average precision, and average recall. Our experiments show that the decision-tree algorithm is the best ML algorithm for IP traffic classification and can be used to build a real-time classification system.
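
The three evaluation metrics named above can be computed as in the following sketch. It assumes that "average" precision and recall mean the unweighted mean over classes (macro-averaging), which the abstract does not state; the function name `evaluate` and the sample labels are hypothetical.

```python
def evaluate(true_labels, predicted_labels):
    """Return (overall accuracy, macro-averaged precision, macro-averaged recall)."""
    pairs = list(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels) | set(predicted_labels))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls = [], []
    for c in classes:
        true_pos = sum(t == c and p == c for t, p in pairs)
        predicted_c = sum(p == c for _, p in pairs)
        actual_c = sum(t == c for t, _ in pairs)
        # A class that is never predicted (or never occurs) contributes 0.
        precisions.append(true_pos / predicted_c if predicted_c else 0.0)
        recalls.append(true_pos / actual_c if actual_c else 0.0)
    return accuracy, sum(precisions) / len(classes), sum(recalls) / len(classes)


# Made-up flow labels for illustration.
truth = ["p2p", "p2p", "web", "web"]
guess = ["p2p", "web", "web", "web"]
print(evaluate(truth, guess))
```

Macro-averaging gives rare traffic classes the same weight as common ones, which is one reason to report it alongside overall accuracy.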

Journal of Parallel and Distributed Computing | 2016

Joint scheduling of MapReduce jobs with servers

Xiao Ling; Yi Yuan; Dan Wang; Jiangchuan Liu; Jiahai Yang

MapReduce-like frameworks have achieved tremendous success for large-scale data processing in data centers. A key feature distinguishing MapReduce from previous parallel models is that it interleaves parallel and sequential computation. Past schemes on general parallel models, and especially their theoretical bounds, are therefore unlikely to apply to MapReduce directly. There are many recent studies on MapReduce job and task scheduling. These studies assume that the servers are assigned in advance. In current data centers, multiple MapReduce jobs of different importance levels run together. In this paper, we investigate a scheduling problem for MapReduce that takes server assignment into consideration as well. We formulate a MapReduce server-job organizer problem (MSJO) and show that it is NP-complete. We develop a 3-approximation algorithm and a fast heuristic design. Moreover, we propose a novel fine-grained practical algorithm for the general MapReduce-like task scheduling problem. Finally, we evaluate our algorithms through both simulations and experiments on Amazon EC2 with an implementation in Hadoop. The results confirm the superiority of our algorithms.

We investigate a scheduling problem for MapReduce-like frameworks by taking server assignment into consideration. We formulate the MapReduce server-job organizer problem (MSJO) and show that it is NP-complete. We propose a 3-approximation algorithm and a fast heuristic design to address the MSJO problem. We implement our algorithms and some state-of-the-art algorithms on Amazon EC2 by deploying schedulers in Hadoop. Comprehensive simulations and experiments show that our algorithm outperforms other classical strategies.

Computer Communications | 2014

A study of traffic from the perspective of a large pure IPv6 ISP

Fuliang Li; Changqing An; Jiahai Yang; Jianping Wu; Hui Zhang

Our understanding of IPv6 traffic cannot keep up with its growth. Unraveling the characteristics of traffic is essential for network scale expansion, network technology selection, network management and security enhancement. In this paper, we conduct a comprehensive study of IPv6 traffic based on packet-level traces from a nationwide pure IPv6 network, CERNET2, and track user behaviors in 6TUNET, one of the largest campus networks in CERNET2, by binding IP addresses to user names. We first analyze the usage and development of the IPv6 network, especially user behaviors and new technologies, e.g., the efficiency of the fine-grained source address validation technology that is widely deployed in CERNET2. Then we investigate the distribution of the aggregate traffic; the results reveal that the traffic distribution is highly skewed among protocols, ports, applications and hosts. We pay particular attention to dominant protocols, ports, applications and hosts, as well as special protocols of the IPv6 network, e.g., the usage of extension headers, which supplement the simplified basic header of IPv6. Finally, we model the skewness in the traffic distribution and present the dynamics of the traffic from the aspects of traffic prediction and inference. Based on the analysis, we obtain comprehensive knowledge of IPv6 traffic which, we believe, can provide an experimental basis for IPv6 network operators and researchers.

network operations and management symposium | 2012

What's going on in Chinese IPv6 world

Lei Gao; Jiahai Yang; Hui Zhang; Donghong Qin; Bin Zhang

With IPv4 addresses quickly dwindling, the Internet is being forced to evolve. During the long-term transition from IPv4 to IPv6, what is going on in the IPv6 world remains unknown to network operators and researchers. In this paper, we propose a heuristic algorithm to identify P2P traffic accurately and implement traffic classification based on NetFlow v9 exports to illustrate what applications Chinese IPv6 users are really running. Additionally, we present a detailed study of P2P traffic over IPv6 and advise ISPs to localize P2P traffic at the AS level for future IPv6 traffic management and network resource planning. We leave modeling traffic behavior and deeper classification of IPv6 traffic as future work.

asia-pacific network operations and management symposium | 2011

MCST: Anomaly detection using feature stability for packet-level traffic

Bin Zhang; Jiahai Yang; Jianping Wu; Donghong Qin; Lei Gao

In this paper, we present a statistical analysis of six traffic features based on entropy and distinct feature number at the packet level. We find that, although these traffic features are unstable and show seasonal patterns like traffic volume over a long period, they are stable and consistent with a Gaussian distribution over a short time period. However, this equilibrium property is violated by some anomalies. Based on this observation, we propose a Multi-dimensional Clustering method for Short-time scale Traffic (MCST) to classify abnormal and normal traffic. We compare our new method to the well-known wavelet technique. The detection results on synthetic anomaly traffic show that MCST detects low-rate attacks better than the wavelet-based method, and the detection results on real traffic demonstrate that MCST can detect more anomalies with a low false alarm rate.
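
The two per-feature statistics mentioned above, entropy and distinct feature number, can be sketched as follows. This is not the MCST clustering method itself: the deviation check is a plain k-sigma test assumed here only to illustrate how a short-term Gaussian equilibrium might be seen to break. The function names, the threshold `k`, and the sample values are all hypothetical.

```python
import math
from collections import Counter


def entropy_and_distinct(values):
    """Shannon entropy (bits) and number of distinct values in one time window."""
    counts = Counter(values)
    n = len(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy, len(counts)


def deviates(history, current, k=3.0):
    """True if current lies more than k standard deviations from the recent mean."""
    mean = sum(history) / len(history)
    std = math.sqrt(sum((x - mean) ** 2 for x in history) / len(history))
    return abs(current - mean) > k * std


# Made-up window of source addresses and made-up recent entropy values.
window = ["2001:db8::1", "2001:db8::1", "2001:db8::2", "2001:db8::2"]
print(entropy_and_distinct(window))
print(deviates([2.0, 2.1, 1.9, 2.0], 0.5))
```

A sharp drop in source-address entropy, for example, would indicate that traffic has concentrated on a few senders within the window.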
