Hao Che

University of Texas at Arlington

Publication

Featured research published by Hao Che.

IEEE Journal on Selected Areas in Communications | 2002

Hierarchical Web caching systems: modeling, design and experimental results

Hao Che; Ye Tung; Zhijun Wang

This paper aims at finding fundamental design principles for hierarchical Web caching. An analytical modeling technique is developed to characterize an uncooperative two-level hierarchical caching system where the least recently used (LRU) algorithm is locally run at each cache. With this modeling technique, we are able to identify a characteristic time for each cache, which plays a fundamental role in understanding the caching processes. In particular, a cache can be viewed roughly as a low-pass filter with its cutoff frequency equal to the inverse of the characteristic time. Documents with access frequencies lower than this cutoff frequency have good chances to pass through the cache without cache hits. This viewpoint enables us to take any branch of the cache tree as a tandem of low-pass filters at different cutoff frequencies, which further results in the finding of two fundamental design principles. Finally, to demonstrate how to use the principles to guide the caching algorithm design, we propose a cooperative hierarchical Web caching architecture based on these principles. Both model-based and real trace simulation studies show that the proposed cooperative architecture results in more than 50% memory saving and substantial central processing unit (CPU) power saving for the management and update of cache entries compared with the traditional uncooperative hierarchical caching architecture.
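The characteristic-time view can be made concrete with a small numerical sketch. The abstract gives no formula, so the code below assumes the widely cited form of this approximation: requests for document i arrive as an independent Poisson process with rate `rates[i]`, the characteristic time T solves sum_i (1 - exp(-rate_i * T)) = cache size, and the hit probability of document i is 1 - exp(-rate_i * T). The function names, the Zipf-like popularity, and the cache size are illustrative choices, not taken from the paper.

```python
import math


def characteristic_time(rates, capacity, tol=1e-9):
    """Solve sum_i (1 - exp(-rate_i * T)) = capacity for T by bisection.

    Assumes every rate is positive and 0 < capacity < len(rates).
    """
    if any(r <= 0 for r in rates):
        raise ValueError("all request rates must be positive")
    if not 0 < capacity < len(rates):
        raise ValueError("capacity must lie strictly between 0 and len(rates)")

    def expected_occupancy(t):
        # Expected number of distinct documents requested within a window t.
        return sum(1.0 - math.exp(-r * t) for r in rates)

    lo, hi = 0.0, 1.0
    while expected_occupancy(hi) < capacity:  # grow the bracket
        hi *= 2.0
    while hi - lo > tol * hi:  # bisect to relative tolerance
        mid = 0.5 * (lo + hi)
        if expected_occupancy(mid) < capacity:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def hit_probabilities(rates, capacity):
    """Return (T, per-document hit probabilities) under the assumed model."""
    t_c = characteristic_time(rates, capacity)
    return t_c, [1.0 - math.exp(-r * t_c) for r in rates]


# Illustrative workload: 1000 documents, Zipf-like popularity, total rate 1.
n_docs = 1000
weights = [1.0 / (rank + 1) for rank in range(n_docs)]
total_weight = sum(weights)
rates = [w / total_weight for w in weights]

t_c, hits = hit_probabilities(rates, capacity=100)
cutoff_frequency = 1.0 / t_c  # the "low-pass filter" cutoff from the abstract
```

Under this model, a document whose request rate is below `cutoff_frequency` has a hit probability below 1 - 1/e, which is the sense in which such documents tend to pass through the cache without hits.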

Performance Evaluation | 2006

The LCD interconnection of LRU caches and its analysis

Nikolaos Laoutaris; Hao Che; Ioannis Stavrakakis

In a multi-level cache such as those used for web caching, a hit at level l leads to the caching of the requested object in all intermediate caches on the reverse path (levels l - 1, ..., 1). This paper shows that a simple modification to this de facto behavior, in which only the l - 1 level cache gets to store a copy, can lead to significant performance gains. The modified caching behavior is called Leave Copy Down (LCD); it has the merit of being able to avoid the amplification of replacement errors and also the unnecessary repetitious caching of the same objects at multiple levels. Simulation results against other cache interconnections show that when LCD is applied under typical web workloads, it reduces the average hit distance. We construct an approximate analytic model for the case of LCD interconnection of LRU caches and use it to gain a better insight as to why the LCD interconnection yields an improved performance.
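The difference between the default behavior and LCD can be sketched with a toy simulation of a linear chain of LRU caches. This is a minimal illustration, not the paper's simulator: the class and function names, the label "LCE" for the copy-everywhere default, the convention that a miss at every level counts as a hit at level `levels + 1` (the origin), and all workload parameters are assumptions made here.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity LRU cache that stores keys only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def lookup(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # refresh recency on a hit
            return True
        return False

    def store(self, key):
        self.items[key] = True
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


def serve(caches, key, policy):
    """Serve one request; caches[0] is level 1, nearest the client.

    Returns the hit level, with len(caches) + 1 standing for the origin.
    """
    hit_level = len(caches) + 1
    for level, cache in enumerate(caches, start=1):
        if cache.lookup(key):
            hit_level = level
            break
    if policy == "LCE":    # copy into every cache below the hit
        targets = list(range(1, hit_level))
    elif policy == "LCD":  # copy only into the cache one level down
        targets = [hit_level - 1] if hit_level > 1 else []
    else:
        raise ValueError("unknown policy: " + policy)
    for level in targets:
        caches[level - 1].store(key)
    return hit_level


def average_hit_distance(policy, levels=3, capacity=20, n_objects=500,
                         n_requests=20000, seed=1):
    """Mean hit level over a Zipf-like request stream (hypothetical workload)."""
    rng = random.Random(seed)
    weights = [1.0 / (rank + 1) for rank in range(n_objects)]
    caches = [LRUCache(capacity) for _ in range(levels)]
    requests = rng.choices(range(n_objects), weights=weights, k=n_requests)
    return sum(serve(caches, key, policy) for key in requests) / n_requests
```

Calling `average_hit_distance("LCE")` and `average_hit_distance("LCD")` with the same seed compares the two policies on an identical request stream. Under LCD an object needs one request per level to reach the client-side cache, so rarely requested objects never displace popular ones there.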

International Conference on Computer Communications | 2001

Analysis and design of hierarchical Web caching systems

Hao Che; Zhijun Wang; Ye Tung

This paper aims at finding fundamental design principles for hierarchical Web caching. An analytical modeling technique is developed to characterize an uncooperative two-level hierarchical caching system where the least recently used (LRU) algorithm is locally run at each cache. With this modeling technique, we are able to identify a characteristic time for each cache, which plays a fundamental role in understanding the caching processes. In particular, a cache can be viewed roughly as a lowpass filter with its cutoff frequency equal to the inverse of the characteristic time. Documents with access frequencies lower than this cutoff frequency will have good chances to pass through the cache without cache hits. This viewpoint enables us to take any branch of the cache tree as a tandem of lowpass filters at different cutoff frequencies, which further results in the finding of two fundamental design principles. Finally, to demonstrate how to use the principles to guide the caching algorithm design, we propose a cooperative hierarchical Web caching architecture based on these principles. The simulation study shows that the proposed cooperative architecture results in 50% saving of the cache resource compared with the traditional uncooperative hierarchical caching architecture.

IEEE Transactions on Computers | 2008

DRES: Dynamic Range Encoding Scheme for TCAM Coprocessors

Hao Che; Zhijun Wang; Kai Zheng; Bin Liu

One of the most critical resource management issues in the use of ternary content-addressable memory (TCAM) for packet classification/filtering is how to effectively support filtering rules with ranges, known as range matching. In this paper, the dynamic range encoding scheme (DRES) is proposed to significantly improve the TCAM storage efficiency for range matching. Unlike the existing range encoding schemes requiring additional hardware support, DRES uses the TCAM coprocessor itself to assist range encoding. Hence, DRES can be readily programmed in a network processor using a TCAM coprocessor for packet classification. A salient feature of DRES is its ability to allow a subset of ranges to be encoded and, hence, to have full control over the range code size. This advantage allows DRES to exploit the TCAM structure to maximize the TCAM storage efficiency. DRES is a comprehensive solution, including a dynamic range selection algorithm, a search key encoding scheme, a range encoding scheme, and a dynamic encoded range update algorithm. Although the dynamic range selection algorithm running in the software allows optimal selection of ranges to be encoded to fully utilize the TCAM storage, the dynamic encoded range update algorithm allows the TCAM database to be updated lock free without interrupting the TCAM database lookup process. DRES is evaluated based on real-world databases and the results show that DRES can reduce the TCAM storage expansion ratio from 6.20 to 1.23. The performance analysis of DRES based on a probabilistic model demonstrates that DRES significantly improves the TCAM storage efficiency for a wide spectrum of range distributions.
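To see why range matching inflates TCAM storage, it helps to look at the conventional baseline that range encoding schemes aim to improve on: expanding each range into a set of ternary prefixes. The sketch below implements that standard prefix expansion only. It is not DRES, and the function name and output format (strings of `0`, `1`, and `*`) are choices made for this illustration.

```python
def range_to_prefixes(lo, hi, width):
    """Cover the integer range [lo, hi] exactly with ternary prefixes.

    Each prefix is a string of length `width` over '0', '1' and '*',
    with wildcards only at the low-order end, as one TCAM entry would hold.
    """
    if not 0 <= lo <= hi < (1 << width):
        raise ValueError("need 0 <= lo <= hi < 2**width")
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = (lo & -lo) if lo else (1 << width)
        # ... and does not run past hi.
        while size > hi - lo + 1:
            size >>= 1
        wild = size.bit_length() - 1  # number of trailing '*' bits
        fixed = width - wild          # number of leading fixed bits
        head = format(lo >> wild, "0{}b".format(fixed)) if fixed else ""
        prefixes.append(head + "*" * wild)
        lo += size
    return prefixes


# A 16-bit port range such as "greater than 1023" needs 6 entries,
# while the range [1, 65534] needs 30.
high_ports = range_to_prefixes(1024, 65535, 16)
worst_case = range_to_prefixes(1, 65534, 16)
```

With this baseline, a rule containing two range fields expands into the product of the two prefix counts, which is the kind of storage expansion the abstract's expansion ratio measures.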

IEEE Transactions on Computers | 2004

CoPTUA: Consistent Policy Table Update Algorithm for TCAM without locking

Zhijun Wang; Hao Che; Mohan Kumar; Sajal K. Das

Due to deterministic and fast lookup performance, ternary content addressable memory (TCAM) has recently been gaining popularity in general policy filtering (PF) for packet classification in high-speed networks. However, the PF table update poses significant challenges for efficient use of TCAM. To avoid erroneous and inconsistent rule matching, the traditional approach is to lock the PF table during the rule update period, but table locking has a negative impact on data path processing. In this paper, we propose a novel scheme, called Consistent Policy Table Update Algorithm (CoPTUA), for TCAM. Instead of minimizing the number of rule moves to reduce the locking time, CoPTUA maintains a consistent PF table throughout the update process, thus eliminating the need for locking the PF table while ensuring correctness of rule matching. Our analysis and simulation show that, even for a PF table with 100,000 rules, an arbitrary number of rules can be updated simultaneously within 1 second in the worst case, provided that 2 percent of the PF table entries are empty. Thus, CoPTUA enforces any new rule in less than 1 second for practical PF table size with high memory utilization and without impacting data path processing.

IEEE Transactions on Parallel and Distributed Systems | 2004

A Scalable Asynchronous Cache Consistency Scheme (SACCS) for mobile environments

Zhijun Wang; Sajal K. Das; Hao Che; Mohan Kumar

In the literature, there exist two types of cache consistency maintenance algorithms for mobile computing environments: stateless and stateful. In a stateless approach, the server is unaware of the cache contents at a mobile user (MU). Even though stateless approaches employ simple database management schemes, they lack scalability and the ability to support user disconnectedness and mobility. On the other hand, a stateful approach is scalable for large database systems at the cost of nontrivial overhead due to server database management. We propose a novel algorithm, called Scalable Asynchronous Cache Consistency Scheme (SACCS), which inherits the positive features of both stateless and stateful approaches. SACCS provides a weak cache consistency for unreliable communication (e.g., wireless mobile) environments with small stale cache hit probability. It is also a highly scalable algorithm with minimum database management overhead. The properties are accomplished through the use of flag bits at the server cache (SC) and MU cache (MUC), an identifier (ID) in MUC for each entry after its invalidation, and estimated time-to-live (TTL) for each cached entry, as well as rendering of all valid entries of MUC to uncertain state when an MU wakes up. The stale cache hit probability is analyzed and also simulated under the Rayleigh fading model of error-prone wireless channels. Comprehensive simulation results show that the performance of SACCS is superior to those of other existing stateful and stateless algorithms in both single and multicell mobile environments.

IEEE Transactions on Computers | 2006

DPPC-RE: TCAM-based distributed parallel packet classification with range encoding

Kai Zheng; Hao Che; Zhijun Wang; Bin Liu; Xin Zhang

Packet classification has been a critical data path function for many emerging networking applications. An interesting approach is the use of ternary content addressable memory (TCAM) to achieve deterministic, high-speed packet classification performance. However, apart from its high cost and power consumption, the traditional single TCAM-based solution has difficulty keeping up with fast-growing line rates, because the clock rate of memory technology in general grows slowly. Moreover, the TCAM storage efficiency is largely affected by the need to support rules with ranges or range matching. In this paper, a distributed TCAM scheme that exploits chip-level-parallelism is proposed to greatly improve the throughput performance. This scheme seamlessly integrates with a range encoding scheme which not only solves the range matching problem, but also ensures a balanced high throughput performance. A thorough theoretical worst-case analysis of throughput, processing delay, and power consumption, as well as the experimental results show that the proposed solution can achieve scalable throughput performance matching up to OC768 line rate or higher. The added TCAM storage overhead is found to be reasonably small for the five real-world classifiers studied.

IEEE/ACM Transactions on Networking | 2007

End-to-end optimal algorithms for integrated QoS, traffic engineering, and failure recovery

Bernardo A. Movsichoff; Constantino M. Lagoa; Hao Che

This paper addresses the problem of optimal quality of service (QoS), traffic engineering (TE) and failure recovery (FR) in computer networks by introducing novel algorithms that only use source inferrable information. More precisely, optimal data rate adaptation and load balancing laws are provided which are applicable to networks where multiple paths are available and multiple classes of service (CoS) are to be provided. Different types of multiple paths are supported, including point-to-point multiple paths, point-to-multipoint multiple paths, and multicast trees. In particular, it is shown that the algorithms presented only need a minimal amount of information for optimal control, i.e., whether a path is congested or not. Hence, the control laws provided in this paper allow source inferred congestion detection without the need for explicit congestion feedback from the network. The proposed approach is applicable to utility functions of a very general form and endows the network with the important property of robustness with respect to node/link failures; i.e., upon the occurrence of such a failure, the presented control laws reroute traffic away from the inoperative node/link and converge to the optimal allocation for the "reduced" network. The proposed control laws set the foundation for the development of highly scalable feature-rich traffic control protocols at the IP, transport, or higher layers with provable global stability and convergence properties.

International Conference on Computer Communications | 2005

TCAM-based distributed parallel packet classification algorithm with range-matching solution

Kai Zheng; Hao Che; Zhijun Wang; Bin Liu

Packet classification (PC) has been a critical data path function for many emerging networking applications. An interesting approach is the use of TCAM to achieve deterministic, high-speed PC. However, apart from its high cost and power consumption, PC based on the traditional single TCAM solution has difficulty keeping up with fast-growing line rates, because the clock rate of memory technology in general grows slowly. Moreover, the TCAM storage efficiency is largely affected by the need to support rules with ranges, or range matching. In this paper, a distributed TCAM scheme that exploits chip-level-parallelism is proposed to greatly improve the PC throughput. This scheme seamlessly integrates with a range encoding scheme, which not only solves the range matching problem but also ensures a balanced high throughput performance. Using commercially available TCAM chips, the proposed scheme achieves PC performance of more than 100 million packets per second (Mpps), matching OC768 (40 Gbps) line rate.
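The relation between a line rate and the classification rate it demands is simple arithmetic, sketched below. The abstract states only the 40 Gbps line rate and the 100 Mpps figure. The packet sizes used here are assumptions for illustration, and the calculation ignores framing overhead.

```python
def packets_per_second(line_rate_bps, packet_bytes):
    """Packets per second needed to fill a link with fixed-size packets."""
    return line_rate_bps / (packet_bytes * 8)


def packet_bytes_at(line_rate_bps, pps):
    """Packet size at which a given packet rate exactly fills the link."""
    return line_rate_bps / (pps * 8)


OC768_BPS = 40e9  # 40 Gbps, as stated in the abstract

# Assumed packet sizes, not taken from the paper.
rate_40_byte = packets_per_second(OC768_BPS, 40)  # 125 million packets/s
rate_64_byte = packets_per_second(OC768_BPS, 64)  # 78.125 million packets/s

# 100 Mpps fills a 40 Gbps link when packets average 50 bytes.
break_even_bytes = packet_bytes_at(OC768_BPS, 100e6)
```

The smaller the packets, the more lookups per second the classifier must sustain, so the worst case for a TCAM-based design is a stream of minimum-size packets.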

IEEE Journal on Selected Areas in Communications | 2005

Decentralized optimal traffic engineering in connectionless networks

Bernardo A. Movsichoff; Constantino M. Lagoa; Hao Che

This work addresses the problem of optimal traffic engineering in a connectionless autonomous system. Based on nonlinear control theory, the approach taken in this work provides a family of optimal adaptation laws. These laws enable each node in the network to independently distribute traffic among any given set of next hops in an optimal way, as measured by a given global utility function of a general form. This optimal traffic distribution is achieved with minimum information exchange between neighboring nodes. Furthermore, this approach not only allows for optimal multiple forwarding paths but also enables multiple classes of service, e.g., classes of service defined in the differentiated services architecture. Moreover, the proposed decentralized control scheme enables optimal traffic redistribution in the case of link failures. Suboptimal control laws are also presented in an effort to reduce the computational burden imposed on the nodes of the network. Finally, an implementation of these laws with currently available technology is discussed.
