Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where H. Jonathan Chao is active.

Publication


Featured research published by H. Jonathan Chao.


Conference on Computer Communications Workshops | 2011

Use of devolved controllers in data center networks

Adrian Sai-Wah Tam; Kang Xi; H. Jonathan Chao

In a data center network, for example, it is common to use controllers to manage resources in a centralized manner. Centralized control, however, imposes a scalability problem. In this paper, we investigate the use of multiple independent controllers, instead of a single omniscient controller, to manage resources. Each controller looks after only a portion of the network, but together the controllers cover the whole network, which resolves the scalability problem. We use flow allocation as an example to show how this approach can manage bandwidth use in a distributed manner. The focus is on how to assign components of the network to the controllers so that (1) each controller needs to look after only a small part of the network, and (2) at least one controller can answer any request. We outline a way to configure the controllers to fulfill these requirements as a proof that the use of devolved controllers is feasible. We also discuss several issues related to such an implementation.
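The coverage requirement can be illustrated with a toy assignment. The round-robin split and all names below are purely hypothetical, not the paper's actual configuration scheme:

```python
# Hypothetical sketch: partition switches among devolved controllers so that
# every switch is covered by at least one controller (requirement 2), while
# each controller manages only a fraction of the network (requirement 1).

def assign_switches(switches, num_controllers):
    """Round-robin assignment: controller i gets every num_controllers-th switch."""
    regions = {i: [] for i in range(num_controllers)}
    for idx, sw in enumerate(switches):
        regions[idx % num_controllers].append(sw)
    return regions

def covering_controller(regions, switch):
    """Find a controller able to answer a request touching `switch`."""
    for ctrl, region in regions.items():
        if switch in region:
            return ctrl
    raise LookupError(f"no controller covers {switch}")

switches = [f"s{i}" for i in range(8)]
regions = assign_switches(switches, 3)
# Every switch maps to at least one controller, so any request is answerable.
assert all(covering_controller(regions, sw) is not None for sw in switches)
```

Any partition works as long as the regions jointly cover every switch; the interesting part, per the abstract, is choosing assignments that also keep each region small.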


Computer Networks | 2014

Improving the performance of load balancing in software-defined networks through load variance-based synchronization

Zehua Guo; Mu Su; Yang Xu; Zhemin Duan; Luo Wang; Shufeng Hui; H. Jonathan Chao

Software-Defined Networking (SDN) is a new network technology that decouples the control plane logic from the data plane and uses a programmable software controller to manage network operation and the state of network components. In an SDN network, a logically centralized controller uses a global network view to conduct management and operation of the network. The centralized control of the SDN network presents a tremendous opportunity for network operators to refactor the control plane and to improve the performance of applications. For the application of load balancing, the logically centralized controller conducts Real-time Least loaded Server selection (RLS) for multiple domains where new flows pass by for the first time. The function of RLS is to forward new flows to the least loaded server in the entire network. However, in a large-scale SDN network, the logically centralized controller usually consists of multiple distributed controllers. Existing multiple-controller state synchronization schemes are based on Periodic Synchronization (PS), which can cause undesirable situations: frequent synchronizations may result in high synchronization overhead at the controllers, while state desynchronization among controllers during the interval between two consecutive synchronizations can lead to forwarding loops and black holes. In this paper, we propose a new type of controller state synchronization scheme, Load Variance-based Synchronization (LVS), to improve load-balancing performance in the multi-controller, multi-domain SDN network. Compared with PS-based schemes, LVS-based schemes conduct state synchronizations among controllers only when the load of a specific server or domain exceeds a certain threshold, which significantly reduces the synchronization overhead of the controllers. Simulation results show that LVS achieves loop-free forwarding and good load-balancing performance with much less synchronization overhead than existing schemes.
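The difference between the two trigger policies can be sketched roughly as follows. The loads and threshold are made-up example values, not the paper's model:

```python
# Illustrative sketch (not the paper's implementation): a controller decides
# whether to synchronize server-load state with its peers. Periodic
# Synchronization (PS) fires on a fixed schedule; Load Variance-based
# Synchronization (LVS) fires only when some server's load exceeds a
# threshold, cutting synchronization overhead.

def ps_should_sync(step, period):
    """Periodic Synchronization: fire every `period` steps regardless of load."""
    return step % period == 0

def lvs_should_sync(server_loads, threshold):
    """Load Variance-based Synchronization: fire only on a load excursion."""
    return max(server_loads) > threshold

loads_over_time = [
    [10, 12, 11],   # balanced: LVS stays quiet
    [10, 30, 11],   # one server overloaded: LVS triggers a sync
    [12, 13, 11],   # balanced again
]
THRESHOLD = 25
lvs_syncs = sum(lvs_should_sync(loads, THRESHOLD) for loads in loads_over_time)
ps_syncs = sum(ps_should_sync(step, 1) for step in range(len(loads_over_time)))
assert lvs_syncs == 1 and ps_syncs == 3   # far fewer synchronizations under LVS
```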


Computer Networks | 2000

A differentiated services architecture for multimedia streaming in next generation Internet

Yiwei Thomas Hou; Dapeng Wu; Bo Li; Takeo Hamada; Ishfaq Ahmad; H. Jonathan Chao

This paper presents a Differentiated Services (DiffServ or DS) architecture for multimedia streaming applications. Specifically, we define two types of services in the context of the Assured Forwarding (AF) per-hop behavior (PHB) that are differentiated in terms of reliability of packet delivery: the High Reliable (HR) service and the Less Assured (LA) service. We propose a novel node mechanism called Selective Pushout with Random Early Detection (SPRED) that simultaneously achieves the following four objectives: (1) a core router does not maintain any per-flow state information (i.e., it is core-stateless); (2) the packet sequence within each flow is not re-ordered at a node; (3) packets from the HR service are delivered more reliably than packets from the LA service at a node during congestion; and (4) packets from TCP traffic are dropped randomly to avoid global synchronization during congestion. We show that SPRED is a generalized buffer management algorithm encompassing both tail-dropping and Random Early Detection (RED), and that it combines the best features of the pushout (PO), RED, and RED with In/Out (RIO) mechanisms. Simulation results demonstrate that, under the same link speed and network topology, network nodes employing our DiffServ architecture deliver substantial performance improvements over the current Best Effort (BE) Internet architecture for multimedia streaming applications.
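For reference, the RED component that SPRED generalizes computes a drop probability that ramps linearly between two queue-length thresholds. A minimal sketch with illustrative parameter values (SPRED's selective pushout of LA packets is not reproduced here):

```python
# Hedged sketch of classic RED, the building block SPRED extends:
# the drop probability grows linearly with the average queue length
# between a minimum and a maximum threshold.

def red_drop_probability(avg_queue, min_th, max_th, max_p):
    if avg_queue < min_th:
        return 0.0          # queue short: never drop
    if avg_queue >= max_th:
        return 1.0          # queue persistently long: always drop
    # linear ramp between the two thresholds, capped at max_p
    return max_p * (avg_queue - min_th) / (max_th - min_th)

assert red_drop_probability(5, 10, 30, 0.1) == 0.0
assert red_drop_probability(20, 10, 30, 0.1) == 0.05
assert red_drop_probability(35, 10, 30, 0.1) == 1.0
```

Randomizing drops this way is what avoids the TCP global synchronization mentioned in objective (4).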


International Conference on Computer Communications | 2012

Block permutations in Boolean Space to minimize TCAM for packet classification

Rihua Wei; Yang Xu; H. Jonathan Chao

Packet classification is one of the major challenges in designing high-speed routers and firewalls as it involves sophisticated multi-dimensional searching. Ternary Content Addressable Memory (TCAM) has been widely used to implement packet classification thanks to its parallel search capability and constant processing speed. However, TCAM-based packet classification has the well-known range expansion problem, resulting in a huge waste of TCAM entries. In this paper, we propose a novel technique called Block Permutation (BP) to compress the packet classification rules stored in TCAMs. The compression is achieved by performing block-based permutations on the rules represented in Boolean Space. We develop an efficient heuristic approach to find the permutations for compression and design its hardware implementation. Experiments on ClassBench classifiers and ISP classifiers show that the proposed BP technique can reduce TCAM entries by 53.99% on average.
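The range expansion problem the paper targets can be seen with a small standard example: covering the 4-bit range [1, 14] with ternary prefixes takes six TCAM entries instead of one. The naive prefix splitting below is the textbook conversion, not the paper's Block Permutation technique:

```python
# Demonstration of TCAM range expansion: a TCAM stores ternary prefixes,
# so an arbitrary range must be split into several aligned power-of-two
# blocks, each emitted as one TCAM entry.

def range_to_prefixes(lo, hi, bits):
    """Cover [lo, hi] with aligned power-of-two blocks, as ternary strings."""
    if lo > hi:
        return []
    # grow the largest aligned power-of-two block that starts at lo and fits
    size = 1
    while lo % (size * 2) == 0 and lo + size * 2 - 1 <= hi:
        size *= 2
    prefix_len = bits - (size.bit_length() - 1)
    if prefix_len == 0:
        return ["*" * bits]          # one block covers the whole space
    pattern = format(lo >> (bits - prefix_len), f"0{prefix_len}b") \
        + "*" * (bits - prefix_len)
    return [pattern] + range_to_prefixes(lo + size, hi, bits)

# The 4-bit range [1, 14] expands into 6 TCAM entries:
assert range_to_prefixes(1, 14, 4) == ["0001", "001*", "01**", "10**", "110*", "1110"]
```

Techniques like the paper's Boolean-space permutations aim to reshape rules so fewer such entries are wasted.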


International Conference on Communications | 2013

Intelligent virtual machine placement for cost efficiency in geo-distributed cloud systems

Kuan Yin Chen; Yang Xu; Kang Xi; H. Jonathan Chao

An important challenge of running large-scale cloud services in a geo-distributed cloud system is minimizing the overall operating cost. The operating cost of such a system has two major components: the electricity cost and the wide-area-network (WAN) communication cost. While the WAN communication cost is minimized when all virtual machines (VMs) are placed in one datacenter, the high workload at one location requires extra power for cooling facilities and results in worse power usage effectiveness (PUE). In this paper, we develop a model to capture the intrinsic trade-off between electricity and WAN communication costs, and we formulate the optimal VM placement problem, which is NP-hard due to its binary and quadratic nature. While exhaustive search is not feasible for large-scale scenarios, heuristics that minimize only one of the two cost terms yield less-optimized results. We propose a cost-aware two-phase metaheuristic algorithm, Cut-and-Search, that approximates the best trade-off point between the two cost terms. We evaluate Cut-and-Search by simulating it over multiple cloud service patterns. The results show that the operating cost has great potential for improvement via optimal VM placement. Cut-and-Search achieves a highly optimized trade-off point within reasonable computation time, outperforming random placement by 50% and the partial-optimizing heuristics by 10-20%.
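The trade-off being modeled can be sketched with a toy cost function. The prices, the PUE exponent, and the two-VM instance are all illustrative, not the paper's model or data:

```python
# Hedged sketch of the trade-off: total operating cost = electricity cost
# + WAN communication cost. Electricity grows super-linearly with a
# datacenter's load (concentration worsens PUE); WAN cost is paid only
# for traffic between VMs placed in different datacenters.

def operating_cost(placement, traffic, price_kwh, wan_price, pue_exp=1.3):
    # electricity: each datacenter's cost grows as load**pue_exp
    dc_load = {}
    for vm, dc in placement.items():
        dc_load[dc] = dc_load.get(dc, 0) + 1
    electricity = sum(price_kwh * load ** pue_exp for load in dc_load.values())
    # WAN: charged only for VM pairs split across datacenters
    wan = sum(vol * wan_price
              for (a, b), vol in traffic.items()
              if placement[a] != placement[b])
    return electricity + wan

traffic = {("vm1", "vm2"): 10}
together = operating_cost({"vm1": "dc1", "vm2": "dc1"}, traffic, 1.0, 0.5)
split = operating_cost({"vm1": "dc1", "vm2": "dc2"}, traffic, 1.0, 0.5)
# Co-locating avoids WAN cost but pays a PUE penalty; here WAN cost dominates.
assert split > together
```

The binary placement variables and the pairwise (quadratic) WAN term are what make the full optimization NP-hard, as the abstract notes.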


International Conference on Computer Communications | 2010

FlashTrie: Hash-based Prefix-Compressed Trie for IP Route Lookup Beyond 100Gbps

Masanori Bando; H. Jonathan Chao

It is becoming apparent that the next-generation IP route lookup architecture needs to achieve speeds of 100 Gbps and beyond while supporting both IPv4 and IPv6 with fast real-time updates to accommodate ever-growing routing tables. Some of the proposed multibit-trie-based schemes, such as Tree Bitmap, have been used in today's high-end routers. However, their large data structures often require multiple external memory accesses for each route lookup. A pipelining technique is widely used to achieve high-speed lookup, at the cost of using many external memory chips, and pipelining also often leads to poor memory load-balancing. In this paper, we propose a new IP route lookup architecture called FlashTrie that overcomes the shortcomings of the multibit-trie-based approach. We use a hash-based membership query to limit off-chip memory accesses to one per lookup and to balance memory utilization among the memory modules. We also develop a new data structure called Prefix-Compressed Trie that reduces the size of a bitmap by more than 80%. Our simulation and implementation results show that FlashTrie can achieve 160-Gbps worst-case throughput while simultaneously supporting 2 M prefixes for IPv4 and 279 k prefixes for IPv6 using one FPGA chip and four DDR3 SDRAM chips. FlashTrie also supports incremental real-time updates.
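The membership-query idea can be loosely sketched as an on-chip filter consulted before the single off-chip read. The toy filter, hashing, and route table below are illustrative, not FlashTrie's actual data structure:

```python
# Rough sketch of a hash-based membership query: a compact on-chip bit
# array answers "might this prefix exist?" so that off-chip memory is
# touched at most once per lookup.

import hashlib

class MembershipFilter:
    """On-chip bit array probed before touching off-chip memory."""
    def __init__(self, size=64):
        self.size = size
        self.bits = [False] * size

    def _slot(self, key):
        digest = hashlib.md5(key.encode()).digest()
        return digest[0] % self.size

    def add(self, key):
        self.bits[self._slot(key)] = True

    def might_contain(self, key):
        # may false-positive, never false-negative
        return self.bits[self._slot(key)]

offchip = {"192.168.0.0/16": "port3"}   # imagined external-DRAM route table
onchip = MembershipFilter()
for prefix in offchip:
    onchip.add(prefix)

def lookup(prefix):
    # The filter screens out most misses on-chip; a hit costs one off-chip read.
    if not onchip.might_contain(prefix):
        return None
    return offchip.get(prefix)

assert lookup("192.168.0.0/16") == "port3"
```

A real design would query one filter per prefix length to find the longest match, which is where the single-access guarantee pays off.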


IEEE/ACM Transactions on Networking | 2012

FlashTrie: beyond 100-Gb/s IP route lookup using hash-based prefix-compressed trie

Masanori Bando; Yi Li Lin; H. Jonathan Chao

It is becoming apparent that the next-generation IP route lookup architecture needs to achieve speeds of 100 Gb/s and beyond while supporting IPv4 and IPv6 with fast real-time updates to accommodate ever-growing routing tables. Some of the proposed multibit-trie-based schemes, such as Tree Bitmap, have been used in today's high-end routers. However, their large data structures often require multiple external memory accesses for each route lookup. A pipelining technique is widely used to achieve high-speed lookup, at the cost of using many external memory chips, and pipelining also often leads to poor memory load-balancing. In this paper, we propose a new IP route lookup architecture called FlashTrie that overcomes the shortcomings of multibit-trie-based approaches. We use a hash-based membership query to limit off-chip memory accesses per lookup and to balance memory utilization among the memory modules. By compacting the data structure size, the lookup depth of each level can be increased. We also develop a new data structure called Prefix-Compressed Trie that reduces the size of a bitmap by more than 80%. Our simulation and implementation results show that FlashTrie can achieve 80-Gb/s worst-case throughput while simultaneously supporting 2 M prefixes for IPv4 and 318 k prefixes for IPv6 with one lookup engine and two Double-Data-Rate (DDR3) SDRAM chips. When implementing five lookup engines on a state-of-the-art field-programmable gate array (FPGA) chip and using 10 DDR3 memory chips, we expect FlashTrie to achieve 1-Gpps (packets per second) throughput, equivalent to 400 Gb/s for IPv4 and 600 Gb/s for IPv6. FlashTrie also supports incremental real-time updates.


Archive | 2013

A Petabit Bufferless Optical Switch for Data Center Networks

Kang Xi; Yu-Hsiang Kao; H. Jonathan Chao

As critical infrastructures of the Internet, data centers have evolved to include hundreds of thousands of servers in a single facility to support data- and computing-intensive applications. For such large-scale systems, it becomes a great challenge to design an interconnection network that provides high capacity, low complexity, and low latency. The traditional approach is to build a hierarchical packet network using switches and routers, but this approach has scalability problems in terms of wiring, control, and latency. We tackle the challenge by designing a novel switch architecture that supports direct interconnection of a huge number of server racks and provides Petabit switching capacity. Our design combines the best features of electronics and optics. Exploiting recent advances in optics, we propose to build a bufferless optical switch fabric consisting of interconnected arrayed waveguide grating routers (AWGRs) and tunable wavelength converters (TWCs). The optical fabric is integrated with electronic buffering and control to perform high-speed switching with nanosecond-level reconfiguration overhead. In particular, our architecture reduces the wiring complexity from O(N) to O(sqrt(N)). We design a practical and scalable scheduling algorithm that achieves high throughput under various traffic loads. We also discuss implementation issues to justify the feasibility of this design. Simulation results show that our design achieves good throughput and delay performance.


ACM Special Interest Group on Data Communication | 2014

CAB: a reactive wildcard rule caching system for software-defined networks

Bo Yan; Yang Xu; Hongya Xing; Kang Xi; H. Jonathan Chao

Software-Defined Networking (SDN) enables flexible flow control by caching policy rules at OpenFlow switches. Compared with exact-match rule caching, wildcard rule caching better preserves the flow table space at switches. However, one of the challenges for wildcard rule caching is the dependency between rules, which arises from caching wildcard rules that overlap in field space with different priorities. Failure to handle rule dependency may lead to wrong matching decisions for newly arrived flows, or may introduce high storage overhead in flow table memory. In this paper, we propose a wildcard rule caching system for SDN named CAching in Buckets (CAB). The main idea of CAB is to partition the field space into logical structures called buckets and to cache buckets along with all of their associated rules. With CAB, we resolve the rule dependency problem with small storage overhead. Compared to previous schemes, CAB reduces flow setup requests by an order of magnitude, cuts control bandwidth in half, and significantly reduces the average flow setup time.
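CAB's bucket idea can be sketched in one dimension. The rules, field range, and bucket width below are toy values, and real CAB partitions a multi-dimensional field space:

```python
# Simplified sketch of CAching in Buckets: on a cache miss, install the
# flow's bucket together with every rule overlapping that bucket, so
# later lookups inside the bucket match with correct priorities and the
# rule dependency problem is avoided.

BUCKET_WIDTH = 16  # partition a 1-D field [0, 64) into 4 buckets

# (priority, (lo, hi), action): lower number = higher priority
rules = [(1, (0, 40), "drop"), (2, (20, 64), "allow")]

def bucket_of(value):
    return value // BUCKET_WIDTH

def rules_for_bucket(b):
    lo, hi = b * BUCKET_WIDTH, (b + 1) * BUCKET_WIDTH
    # cache every rule overlapping the bucket, preserving priority order
    return sorted(r for r in rules if r[1][0] < hi and r[1][1] > lo)

cache = {}
def classify(value):
    b = bucket_of(value)
    if b not in cache:            # one controller request per bucket, not per flow
        cache[b] = rules_for_bucket(b)
    for prio, (lo, hi), action in cache[b]:
        if lo <= value < hi:
            return action
    return "default"

assert classify(30) == "drop"     # both rules overlap here; priority 1 wins
assert classify(50) == "allow"
assert len(cache) == 2            # two buckets cached, dependencies intact
```

Because every rule overlapping a bucket is cached together, an overlapping lower-priority rule can never wrongly match a packet whose higher-priority rule was left uncached.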


Networks on Chips | 2010

Design of High-Radix Clos Network-on-Chip

Yu Hsiang Kao; Najla Alfaraj; Ming Yang; H. Jonathan Chao

Many high-radix Network-on-Chip (NOC) topologies have been proposed to improve network performance as the number of processing elements (PEs) on a chip keeps growing. We believe the Clos Network-on-Chip (CNOC) is the most promising, with its low average hop count and good load-balancing characteristics. In this paper, we propose (1) a high-radix router architecture with a Virtual Output Queue (VOQ) buffer structure and the Packet Mode Dual Round-Robin Matching (PDRRM) scheduling algorithm to achieve high speed and high throughput in CNOC, and (2) a heuristic floor-planning algorithm to minimize the power consumption caused by long wires. Experimental results show that the throughput of a 64-node 3-stage CNOC under uniform traffic increases from 62% to 78% by replacing the baseline routers with PDRRM VOQ routers. We also compared CNOC with other NOC topologies and found that, with the new design techniques, CNOC has the highest throughput, lowest zero-load latency, and best power efficiency.
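One iteration of plain dual round-robin matching, the arbitration flavor PDRRM builds on, can be sketched as follows. The VOQ occupancy is a made-up example, and the packet-mode extension is not shown:

```python
# Minimal sketch of one dual round-robin matching step: each input
# requests one output starting from its round-robin pointer; each output
# grants one requesting input from its own pointer; matched pointers
# advance past the match to keep arbitration fair.

def drrm_step(voq, in_ptr, out_ptr, n):
    # Request phase: each input picks its first non-empty VOQ from its pointer.
    requests = {}
    for i in range(n):
        for off in range(n):
            j = (in_ptr[i] + off) % n
            if voq[i][j] > 0:
                requests.setdefault(j, []).append(i)
                break
    # Grant phase: each output grants one requester starting from its pointer.
    match = {}
    for j, reqs in requests.items():
        for off in range(n):
            i = (out_ptr[j] + off) % n
            if i in reqs:
                match[i] = j
                in_ptr[i] = (j + 1) % n   # advance pointers past the match
                out_ptr[j] = (i + 1) % n
                break
    return match

voq = [[1, 0], [1, 0]]  # both inputs hold a packet for output 0
match = drrm_step(voq, [0, 0], [0, 0], 2)
assert match == {0: 0}  # output 0 grants input 0; input 1 waits a round
```

Separate VOQs per output are what prevent head-of-line blocking, which is why the abstract pairs the VOQ buffer structure with the matching algorithm.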

Collaboration


Dive into H. Jonathan Chao's collaborations.

Top Co-Authors


Zehua Guo

University of Minnesota
