Wen-Hsiang Hu
University of California, Irvine
Publication
Featured research published by Wen-Hsiang Hu.
parallel, distributed and network-based processing | 2011
Chifeng Wang; Wen-Hsiang Hu; Nader Bagherzadeh
Aggressive scaling of transistors allows integration of hundreds of processors on a chip. However, on-chip interconnects carrying signals between different blocks will be the bottleneck for system performance and reliability. To tackle this problem, we developed an on-chip communication infrastructure based on a network-on-chip architecture and developed a hybrid mechanism to transfer data among IP cores by taking advantage of both wired and wireless communications. By using on-chip antennas, one can provide on-chip wireless communication to transfer data across long distances and minimize transfer latency and energy dissipation accordingly. A wireless network-on-chip architecture was designed and evaluated, and the experimental results showed significant improvement in transfer latency, network throughput and energy dissipation.
Journal of Systems Architecture | 2011
Chifeng Wang; Wen-Hsiang Hu; Seung Eun Lee; Nader Bagherzadeh
This paper proposes a novel Network-on-Chip architecture that not only enhances network transmission performance while maintaining a feasible implementation cost, but also provides a power-efficient solution for interconnection network scenarios. A diagonally-linked mesh NoC that uses wormhole packet switching provides a high-performance NoC platform that meets both cost and power consumption requirements. The proposed architecture uses an adaptive quasi-minimal routing algorithm, whose flexibility and adaptiveness improve average latency and saturation traffic load. Based on these features, a congestion-aware routing algorithm is proposed to balance traffic load so as to alleviate congestion caused by high-throughput network activities. Simulation results show that saturation load is improved dramatically for various traffic patterns. Implementation results also show that employing diagonal links is a more area-efficient method for improving network performance than using large buffers. The congestion-aware router is shown to require negligible cost overhead while providing better throughput. Finally, simulation results also reveal that the proposed architecture outperforms traditional mesh networks in power consumption.
international symposium on computer architecture | 2010
Chifeng Wang; Wen-Hsiang Hu; Nader Bagherzadeh
This paper proposes a novel congestion-aware Network-on-Chip (NoC) architecture that not only enhances network transmission performance while maintaining a feasible implementation cost, but also improves overall network throughput in various traffic scenarios. A congestion control scheme consisting of dynamic input arbitration and adaptive routing path selection is proposed to balance traffic load distribution so as to alleviate congestion caused by heavy network activities. Simulation results show that throughput is improved dramatically while maintaining superior latency performance for various traffic patterns. Cost evaluation results also show that the congestion-aware router requires negligible cost overhead but provides better throughput for both mesh and diagonally-linked mesh NoC platforms.
The Journal of Supercomputing | 2015
Wen-Hsiang Hu; Chifeng Wang; Nader Bagherzadeh
Network-on-chip (NoC) architecture is regarded as a solution for future on-chip interconnects. However, the performance advantages of conventional NoC architectures are limited by the long latency and high power consumption due to multi-hop long-distance communication among processing elements. To solve these limitations, we employed on-chip wireless communication as express links for transferring data so that transfer latency can be reduced. A hybrid NoC architecture utilizing both wired and wireless communication approaches is proposed in this paper. We also devised a deadlock-free routing algorithm that is able to make efficient use of the incorporated wireless links. Moreover, simulated annealing optimization techniques were applied to find optimal locations for wireless routers. Cycle-accurate simulation results showed a significant improvement in transfer latency. Area and power consumption analysis demonstrates the feasibility of our proposed NoC architecture.
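The placement step mentioned in the abstract above can be illustrated with a minimal simulated annealing sketch. It is not the authors' implementation. The cost function (average hop count), the assumption that any two wireless routers reach each other in one hop, the cooling schedule, and all function names are hypothetical choices made for illustration. Simulated annealing is a heuristic, so the result is a good placement rather than a provably optimal one.

```python
import math
import random
from itertools import product


def manhattan(a, b):
    """Hop count between two routers over wired mesh links only."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def average_hops(n, wireless):
    """Average hop count on an n x n mesh with the given wireless routers.

    Assumption: any two distinct wireless routers are one hop apart.
    """
    nodes = list(product(range(n), repeat=2))
    total = 0
    for s in nodes:
        for d in nodes:
            if s == d:
                continue
            best = manhattan(s, d)  # wired-only path
            for a in wireless:
                for b in wireless:
                    if a != b:  # wired to a, wireless hop, wired to d
                        best = min(best, manhattan(s, a) + 1 + manhattan(b, d))
            total += best
    return total / (len(nodes) * (len(nodes) - 1))


def anneal_placement(n, k, steps=200, t0=1.0, alpha=0.98, seed=0):
    """Search for k wireless router positions that lower the average hop count."""
    rng = random.Random(seed)
    nodes = list(product(range(n), repeat=2))
    current = rng.sample(nodes, k)
    cur_cost = average_hops(n, current)
    best, best_cost = list(current), cur_cost
    temp = t0
    for _ in range(steps):
        # Neighbour move: relocate one wireless router to an unused position.
        candidate = list(current)
        free = [v for v in nodes if v not in current]
        candidate[rng.randrange(k)] = rng.choice(free)
        cand_cost = average_hops(n, candidate)
        delta = cand_cost - cur_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = candidate, cand_cost
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
        temp *= alpha  # geometric cooling
    return best, best_cost
```

Calling `anneal_placement(5, 3)` returns a list of three router coordinates and the corresponding average hop count, which can be compared against `average_hops(5, [])` for the wired-only mesh.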
Journal of Computer and System Sciences | 2013
Chifeng Wang; Wen-Hsiang Hu; Nader Bagherzadeh
Adaptive routing algorithms have been employed in interconnection networks to improve network throughput and provide better fault tolerance characteristics. However, they can harm performance by disturbing any inherent global load balance through greedy local decisions. This paper proposes a novel scalable load-balancing congestion-aware Network-on-Chip (NoC) architecture that not only enhances network transmission performance while maintaining a feasible implementation cost, but also improves overall network throughput for various traffic scenarios. A congestion control scheme consisting of dynamic input arbitration and adaptive routing path selection is proposed to balance global traffic load distribution so as to alleviate congestion caused by heavy network activities. Furthermore, faulty-link information can be broadcast by existing congestion management control signals to prevent packets from routing through defective areas in order to eliminate potential heavy congestion situations around these regions. Experimental results show that throughput is improved dramatically while maintaining superior latency performance for various traffic patterns. Compared to a baseline router, the proposed congestion management mechanism requires negligible cost overhead but provides better throughput for both mesh and diagonally-linked mesh NoC platforms.
Microprocessors and Microsystems | 2012
Chifeng Wang; Wen-Hsiang Hu; Nader Bagherzadeh
Integration of hundreds of processors on a chip will become practical thanks to ultra-deep submicron VLSI technology. As wiring delay becomes a bottleneck for a scalable design, on-chip interconnects become critical issues for system performance and reliability. To mitigate the impact of long wiring delay, wireless on-chip communication infrastructures featuring hybrid mechanisms exploiting both wired and wireless communications have been proposed. By shortening long-distance transmission latency and lowering wired network overhead, overall system transfer latency and energy dissipation are improved accordingly. However, unbalanced traffic management reduces the benefit of high-speed wireless links so that wired networks also degrade accordingly. To extract the best performance for these hybrid networks, an intelligent congestion-aware router design is needed to balance traffic load and eliminate congestion delay. A sophisticated routing scheme accommodating more throughput and featuring a lightweight congestion-aware mechanism, which uses the number of blocking buffers as a guideline to avoid serious congestion, was designed and evaluated. Modified 7-port routers achieve better performance and the proposed routing algorithm successfully eliminates congestion scenarios and efficiently balances traffic loads. This is the first work to exchange congestion information locally and globally to improve network utilization and transmission quality. The experimental results showed significant improvement in transfer latency, network throughput and power efficiency with moderate hardware cost overhead.
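The idea of using the number of blocking buffers as a congestion guideline can be sketched as a simple output-port choice. The abstract above does not give the paper's actual routing rule, so the function name, the port names, and the tie-breaking policy below are assumptions made for illustration.

```python
def select_output_port(candidates, blocked_buffers):
    """Pick the admissible output port with the fewest blocked buffers.

    candidates: ports allowed by the routing function, in preference order
                (hypothetical port names such as "east" or "wireless").
    blocked_buffers: mapping from port to the number of blocked buffers
                     reported by the router behind that port; a missing
                     entry is treated as zero (assumption).
    Ties are resolved in favour of the earlier candidate.
    """
    return min(candidates, key=lambda port: blocked_buffers.get(port, 0))


# A packet that may leave through either port avoids the more congested one.
choice = select_output_port(["east", "north"], {"east": 3, "north": 1})
```

Here `choice` is `"north"`, because the neighbour behind the north port reports fewer blocked buffers.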
symposium on computer architecture and high performance computing | 2009
Wen-Hsiang Hu; Jun Ho Bahn; Nader Bagherzadeh
Low Density Parity Check (LDPC) code is an error correction code that can achieve performance close to the Shannon limit and is inherently suitable for parallel implementation. It has been widely adopted in various communication standards such as DVB-S2, WiMAX, and Wi-Fi. However, the irregular message exchange pattern is a major challenge in LDPC decoder implementation. In addition, in an era in which diverse applications are integrated in a single system, a flexible, scalable, efficient and cost-effective implementation of the LDPC decoder is highly preferable. In this paper, we propose a multi-processor platform based on a network-on-chip (NoC) interconnect as a solution to these problems. By decoding LDPC codes in a distributed and cooperative way, the memory bottleneck commonly seen in LDPC decoder design is eliminated. Simulation results from long LDPC codes with various code rates show that our approach achieves good scalability and speedups.
international soc design conference | 2009
Wen-Hsiang Hu; Chun-Yi Chen; Nader Bagherzadeh
Low Density Parity Check (LDPC) code is an error correction code that has near Shannon limit performance and is inherently suitable for parallel implementation. It has been widely used in several communication standards such as DVB-S2, WiMAX, and Wi-Fi. To address the need for supporting various LDPC codes in an era where diverse applications are integrated onto a single system, a multi-processor based implementation of the LDPC decoder was proposed. However, the heavy message exchange among processors limits the expected performance. In this paper, we present a partitioning algorithm based on graph spectral clustering to reduce the data communication during the decoding process. In our experiments, our approach decreased the amount of inter-processor communication by 33% to 52% compared to the original sequential mapping approach. Together with the more balanced computation load from our algorithm, an improvement of up to 85% in the overall decoding time was observed.
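As a rough illustration of graph spectral partitioning, the sketch below splits a small graph in two using the sign of the Fiedler vector (the eigenvector of the second-smallest eigenvalue of the graph Laplacian). This is generic spectral bisection, not the partitioning algorithm from the paper. The power-iteration solver, the function names, and the toy graph are assumptions, and the graph is assumed to be connected with at least two nodes.

```python
import math


def fiedler_vector(adj, iterations=500):
    """Approximate the Fiedler vector of a connected graph by power iteration.

    adj: adjacency list, adj[i] is the list of neighbours of node i.
    """
    n = len(adj)
    degree = [len(neigh) for neigh in adj]
    shift = 2 * max(degree)  # upper bound on the largest Laplacian eigenvalue
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]  # project out the constant eigenvector
        # y = (shift * I - L) x, with (L x)_i = degree_i * x_i - sum of neighbour values
        y = [(shift - degree[i]) * x[i] + sum(x[j] for j in adj[i]) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


def spectral_bisect(adj):
    """Assign each node to part 0 or 1 by the sign of its Fiedler entry."""
    return [0 if v < 0 else 1 for v in fiedler_vector(adj)]


# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the single edge 2-3.
toy = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
parts = spectral_bisect(toy)
```

On the toy graph the two triangles end up in different parts, so only the single bridging edge crosses the cut. In a decoder mapping, nodes would stand for the variable and check nodes of the code graph and cut edges for messages exchanged between processors.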
parallel, distributed and network-based processing | 2010
Chifeng Wang; Wen-Hsiang Hu; Seung Eun Lee; Nader Bagherzadeh
This paper proposes a novel Network-on-Chip (NoC) architecture that not only enhances network transmission performance while maintaining a feasible implementation cost, but also provides a power-efficient solution for interconnection network scenarios. A diagonally-linked mesh (DMesh) NoC that uses wormhole packet switching provides a high-performance NoC platform that meets both cost and power consumption requirements. The proposed architecture uses an adaptive quasi-minimal routing algorithm, whose flexibility and adaptiveness allow DMesh to improve average latency and saturation traffic load. In addition, implementation results show that employing diagonal links is a more area-efficient way to improve network performance than using large buffers. Simulation results also reveal that DMesh networks outperform traditional Mesh networks in power consumption.
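One way to see why diagonal links help is to compare minimal hop counts. The sketch below assumes that every DMesh router links to its four diagonal neighbours in addition to the four mesh neighbours, so the minimal hop count becomes the Chebyshev distance instead of the Manhattan distance. It models topology only, not the routing algorithm, buffering, or power, and the function names are hypothetical.

```python
def mesh_hops(src, dst):
    """Minimal hops in a plain 2D mesh: Manhattan distance."""
    return abs(dst[0] - src[0]) + abs(dst[1] - src[1])


def dmesh_hops(src, dst):
    """Minimal hops when diagonal links are available (assumption): Chebyshev distance."""
    return max(abs(dst[0] - src[0]), abs(dst[1] - src[1]))


def average_hops(n, hops):
    """Average minimal hop count over all ordered pairs of distinct nodes in an n x n grid."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    return sum(hops(s, d) for s, d in pairs) / len(pairs)
```

For example, a packet travelling from (0, 0) to (3, 2) needs 5 hops in the plain mesh and 3 hops when diagonal links are used, and `average_hops(4, dmesh_hops)` is lower than `average_hops(4, mesh_hops)`.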
parallel, distributed and network-based processing | 2012
Wen-Hsiang Hu; Chifeng Wang; Nader Bagherzadeh
Network-on-chip (NoC) architecture is regarded as a solution for future on-chip interconnects. However, the performance advantages of conventional NoC architectures are limited by the long latency and high power consumption due to multi-hop long-distance communication among processing elements. To solve these limitations, we employed on-chip wireless communication as express links for transferring data so that transfer latency can be reduced. A hybrid NoC architecture utilizing both wired and wireless communication approaches is proposed in this paper. We also devised a deadlock-free routing algorithm that is able to make efficient use of the incorporated wireless links. Moreover, simulated annealing optimization techniques were applied to find optimal locations for wireless routers. Cycle-accurate simulation results showed a significant improvement in transfer latency. Area and power consumption analysis demonstrates the feasibility of our proposed NoC architecture.