Publication


Featured research published by Chih-Hao Chao.


networks on chips | 2010

Traffic- and Thermal-Aware Run-Time Thermal Management Scheme for 3D NoC Systems

Chih-Hao Chao; Kai-Yuan Jheng; Hao-Yu Wang; Jia-Cheng Wu; An-Yeu Wu

Three-dimensional network-on-chip (3D NoC), the combination of NoC and die-stacking 3D IC technology, is motivated by the prospect of lower latency, lower power consumption, and higher network bandwidth. However, the length of the heat conduction path and the power density per unit area increase as more dies are stacked vertically. NoC routers have a thermal impact comparable to that of processors and contribute significantly to overall chip temperature. High temperature makes the system more vulnerable in terms of performance, power, reliability, and cost. To ensure thermal safety while limiting the performance impact of temperature regulation, we propose a traffic- and thermal-aware run-time thermal management (RTM) scheme. The scheme is composed of proactive downward routing and reactive vertical throttling. Based on a validated traffic-thermal mutual-coupling co-simulator, our experiments show that the proposed scheme is effective. The proposed RTM can be combined with thermal-aware mapping techniques, with the potential for higher run-time thermal safety.


networks on chips | 2007

A New Binomial Mapping and Optimization Algorithm for Reduced-Complexity Mesh-Based On-Chip Network

Wein-Tsung Shen; Chih-Hao Chao; Yu-Kuang Lien; An-Yeu Wu

This paper presents an efficient binomial IP mapping and optimization algorithm (BMAP) to reduce the hardware cost of on-chip network (OCN) infrastructure. The complexity of BMAP is O(N^2 log(N)). Based on our OCN system synthesis flow, the proposed algorithm provides more economical network component mapping than traditional OCN mapping algorithms. The experimental results show that total traffic on the network is reduced by 37% and the average network hop count is reduced by 46%. With further optimization, hardware efficiency is enhanced, so the total hardware cost of the network infrastructure is reduced to 51%~85%.


international symposium on vlsi design, automation and test | 2010

Traffic-thermal mutual-coupling co-simulation platform for three-dimensional Network-on-Chip

Kai-Yuan Jheng; Chih-Hao Chao; Hao-Yu Wang; An-Yeu Wu

Thermal issues are among the major challenges in three-dimensional (3D) IC research. Network-on-Chip (NoC) has been viewed as a practical communication infrastructure for 3D IC. To facilitate such research, an accurate and non-proprietary environment for simulating NoC traffic and temperature is necessary. In this paper, we present a traffic-thermal mutual-coupling co-simulation platform for 3D NoC. The translation error is eliminated, and therefore our co-simulation has no accuracy loss on mutual coupling. Our simulation results, validated with a commercial tool, show that the temperature error of our platform is between −1 and 4 K. The simulation results also show the thermal profile of 3D NoC, in which the temperature is imbalanced even under balanced traffic. Hence, the proposed platform can be used for 3D thermal-aware design, 3D dynamic thermal management technology, and other related research in the future.


IEEE Transactions on Computers | 2008

Traffic-Balanced Routing Algorithm for Irregular Mesh-Based On-Chip Networks

Shu-Yen Lin; Chun-Hsiang Huang; Chih-Hao Chao; Keng-Hsien Huang; An-Yeu Wu

On-chip networks (OCNs) have been proposed to address the increasing scale and complexity of nanoscale multicore VLSI designs. Irregular meshes are an important issue because IPs of different sizes may be supported by various vendors. To solve routing problems in irregular meshes, modified routing algorithms that detour around oversized IPs (OIPs) are needed. However, directly applying fault-tolerant routing algorithms may cause two serious problems: 1) heavy traffic loads around OIPs and 2) unbalanced traffic loads in irregular meshes. In this paper, we propose an OIP avoidance prerouting (OAPR) algorithm to solve these problems. The proposed OAPR spreads traffic loads evenly over the network and shortens the average packet path. Therefore, networks using OAPR have lower latency and higher throughput than those using fault-tolerant routing algorithms. In our experiments, four different cases are simulated to demonstrate that the proposed OAPR improves sustainable throughput by 13.3 percent to 100 percent over two previous fault-tolerant routing algorithms. Moreover, the hardware overhead of OAPR is less than 1 percent of the cost of a whole router. Hence, the proposed OAPR algorithm has good performance and is practical for irregular mesh-based OCNs.


international symposium on vlsi design, automation and test | 2009

Fault-tolerant router with built-in self-test/self-diagnosis and fault-isolation circuits for 2D-mesh based chip multiprocessor systems

Shu-Yen Lin; Wen-Chung Shen; Chan-Cheng Hsu; Chih-Hao Chao; An-Yeu Wu

A fault-tolerant router design (20-path router, 20PR) is proposed to reduce the impact of faulty routers in 2D-mesh based chip multiprocessor systems. In our experiments, OCNs using 20PRs reduce unreachable packets by 75.65% to 85.01% and latency by 7.78% to 26.59% in comparison with OCNs using generic XY routers.
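
The abstract uses generic XY routers as its baseline. As a rough illustration of why such routers produce unreachable packets, the sketch below implements standard dimension-order (XY) routing on a 2D mesh and checks whether the single fixed path crosses a faulty router. The function names and the coordinate representation are assumptions made for this example; it does not model the 20-path router itself.

```python
def xy_route(src, dst):
    """Return the routers visited from src to dst (inclusive) under XY routing."""
    x, y = src
    path = [(x, y)]
    # Phase 1: move along the X dimension until the destination column is reached.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: move along the Y dimension until the destination row is reached.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def is_unreachable(src, dst, faulty):
    """XY routing has exactly one path, so any faulty router on it blocks the packet."""
    return any(hop in faulty for hop in xy_route(src, dst))


path = xy_route((0, 0), (2, 1))
blocked = is_unreachable((0, 0), (2, 1), faulty={(1, 0)})
```

Because the path is fully determined by the source and destination, one faulty router on it is enough to make the packet undeliverable, which is the situation a fault-tolerant router is meant to reduce.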


signal processing systems | 2010

Efficient parallelized particle filter design on CUDA

Min-An Chao; Chun-Yuan Chu; Chih-Hao Chao; An-Yeu Wu

Particle filtering is widely used in numerous nonlinear applications which require reconfigurability, fast prototyping, and online parallel signal processing. The emerging computing platform, CUDA, may be regarded as the most appealing platform for such implementation. However, the literature has not yet explored how to utilize CUDA for particle filters. This paper aims to provide two design techniques, A) finite-redraw importance-maximizing (FRIM) prior editing and B) localized resampling, for efficient implementation of particle filters on CUDA, which can be verified to reduce global operations and provide significant speedup. The modifications to the algorithm and architectural mapping are evaluated with conceptual and quantitative analysis. In the classic bearings-only tracking experiments, the proposed design is 5.73 times faster than the direct implementation on GeForce 9400m.
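
The abstract does not spell out how localized resampling works. The sketch below is one plausible reading, written in plain Python rather than CUDA: particles are split into fixed-size groups (for example, one group per thread block) and each group is resampled independently, so no cumulative sum over all particles is required. The group partitioning, the use of multinomial resampling, and all function names are assumptions for illustration, not the paper's method.

```python
import random


def resample_group(particles, weights, rng):
    """Multinomial resampling inside one group, proportional to weight."""
    if sum(weights) == 0:
        # Degenerate group with no usable weights: keep it unchanged.
        return list(particles)
    return rng.choices(particles, weights=weights, k=len(particles))


def localized_resample(particles, weights, group_size, rng=None):
    """Resample each consecutive group of `group_size` particles on its own."""
    rng = rng or random.Random()
    resampled = []
    for start in range(0, len(particles), group_size):
        end = start + group_size
        resampled.extend(
            resample_group(particles[start:end], weights[start:end], rng)
        )
    return resampled


particles = list(range(8))
weights = [0, 1, 0, 0, 0, 0, 0, 1]
result = localized_resample(particles, weights, group_size=4, rng=random.Random(0))
```

Each group only reads its own weights, which is what would let the groups run as independent parallel work items; the trade-off is that particles never migrate between groups in this simplified form.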


nature and biologically inspired computing | 2010

Regional ACO-based routing for load-balancing in NoC systems

Hsien-Kai Hsin; En-Jui Chang; Chih-Hao Chao; An-Yeu Wu

Ant Colony Optimization (ACO) is a problem-solving technique inspired by research on the behavior of real-world ant colonies. In the domain of Network-on-Chip (NoC), ACO-based adaptive routing has been applied to achieve load balancing effectively with historical information. However, the cost of the ACO network pheromone table is too high, and this overhead grows fast as the NoC scales. To fix this problem, it is essential to model the ACO algorithm with more careful consideration of the system architecture, the available hardware resources, and an appropriate transformation from the ant colony metaphor. In this paper, we analyze the NoC network characteristics and identify the corresponding issues of implementing ACO on NoC. We propose a Regional ACO-based routing (RACO) with static and dynamic regional table forming techniques to reduce the cost of the table, share pheromone information, and adopt a look-ahead model for further load balancing. The experimental results show that RACO can be implemented with less memory, a smaller cost increase on scaling, and better load-balancing performance compared to traditional ACO-based routing.
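
To make the scaling concern concrete, the following back-of-the-envelope sketch counts pheromone entries per router. It assumes a square 2D mesh, four output channels, one entry per destination and channel, and square regions that share a single entry per channel. These are simplifying assumptions for illustration only and do not reproduce RACO's actual static or dynamic table-forming rules.

```python
def full_table_entries(mesh_width, channels=4):
    """Entries per router when every node in the mesh is a separate destination."""
    return mesh_width * mesh_width * channels


def regional_table_entries(mesh_width, region_width, channels=4):
    """Entries per router when destinations are grouped into square regions."""
    regions_per_side = -(-mesh_width // region_width)  # ceiling division
    return regions_per_side ** 2 * channels


small = full_table_entries(8)
large = full_table_entries(16)
regional = regional_table_entries(16, region_width=4)
```

Under these assumptions, doubling the mesh width quadruples the full table, while the regional table grows only with the number of regions.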


signal processing systems | 2008

Location-Constrained Particle Filter human positioning and tracking system

Chih-Hao Chao; Chun-Yuan Chu; An-Yeu Wu

This paper proposes a Location-Constrained Particle Filter (LC-PF) for Radio Signal Strength Indication (RSSI) based indoor localization systems. With the proposed LC-PF, the RSSI fluctuation problem can be restrained. The proposed methods include location-constrained importance weight updating (LC-WU) and a location-constrained propagation model (LC-model). LC-WU eliminates particles in prohibited regions based on the geolocation of the map. The LC-model propagates particles based on different turning probabilities in different regions. These two methods can be applied separately or jointly. The proposed LC-PF improves average accuracy by 2.48 m over the basic PF, a 68% error reduction, and achieves 2.07 m accuracy with 90% confidence.
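
The LC-WU step described above can be sketched as a weight mask followed by renormalization. The map representation (a predicate that reports whether a position lies in a prohibited region), the fallback when every particle is prohibited, and the function names are assumptions for this example rather than details taken from the paper.

```python
def lc_weight_update(particles, weights, is_prohibited):
    """Zero the weights of particles in prohibited regions, then renormalize."""
    masked = [
        0.0 if is_prohibited(p) else w
        for p, w in zip(particles, weights)
    ]
    total = sum(masked)
    if total == 0.0:
        # Assumed fallback: if every particle is prohibited, keep the old weights.
        return list(weights)
    return [w / total for w in masked]


def in_wall(position):
    """Hypothetical map: a wall occupies 2.0 <= x <= 3.0 (coordinates in meters)."""
    return 2.0 <= position[0] <= 3.0


particles = [(1.0, 1.0), (2.5, 1.0), (4.0, 1.0)]
weights = [0.2, 0.6, 0.2]
updated = lc_weight_update(particles, weights, in_wall)
```

Here the middle particle sits inside the wall, so its weight is removed and the remaining probability mass is shared by the two feasible particles.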


ACM Transactions on Embedded Computing Systems | 2013

Transport-layer-assisted routing for runtime thermal management of 3D NoC systems

Chih-Hao Chao; Kun-Chih Chen; Tsu-Chu Yin; Shu-Yen Lin; An-Yeu Wu

To ensure thermal safety and to avoid performance degradation from temperature regulation in 3D NoC, we propose a new temperature-traffic control framework. The framework contains the vertical throttling-based runtime thermal management (VT-RTM) scheme and the transport-layer assisted routing (TLAR) scheme. The VT-RTM scheme increases the cooling speed and maintains high availability. The TLAR scheme sustains the throughput of the nonstationary irregular mesh network. In our experiments, the VT-RTM scheme reduces cooling time by 84% and achieves 98% network availability; the overall performance impact is around 8% of traditional schemes. The TLAR scheme reduces average latency by 35∼% and improves sustainable throughput by 76%.


international conference on green circuits and systems | 2010

ACO-based Cascaded Adaptive Routing for traffic balancing in NoC systems

En-Jui Chang; Chih-Hao Chao; Kai-Yuan Jheng; Hsien-Kai Hsin; An-Yeu Wu

Ant Colony Optimization (ACO) is a bio-inspired algorithm extensively applied to optimization problems. The performance of Network-on-Chip (NoC) is generally dominated by traffic distribution and routing. By using pheromone to provide more precise network information for path selection, ACO-based adaptive routing has higher potential to overcome unbalanced and unpredictable traffic loads. On the other hand, the implementation cost of ACO is in general too high, because network information is stored in pheromone memory, which is a routing table of all destination-channel pairs. We propose an ACO-based Cascaded Adaptive Routing (ACO-CAR) that combines two features: 1) table reforming, which eliminates redundant information about far destinations from the full routing table, and 2) adaptive searching of the cascaded point for more precise network information. Our experimental results show that ACO-CAR has lower latency and higher saturation throughput, and can be implemented with 19.05% of the memory of a full routing table.
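
As a generic illustration of how a pheromone table can drive path selection, the sketch below keeps one row of pheromone values per destination, picks an output channel with probability proportional to its pheromone, and reinforces a channel that performed well. The channel labels, the reward value, and the normalization rule are assumptions for this example; they show textbook ACO-style selection, not the ACO-CAR table reforming or cascaded-point search.

```python
import random


def choose_channel(pheromone_row, rng):
    """Pick an output channel with probability proportional to its pheromone."""
    channels = list(pheromone_row)
    weights = [pheromone_row[c] for c in channels]
    return rng.choices(channels, weights=weights, k=1)[0]


def reinforce(pheromone_row, channel, reward=0.1):
    """Raise the pheromone of `channel`, then renormalize the row to sum to 1."""
    pheromone_row[channel] += reward
    total = sum(pheromone_row.values())
    for c in pheromone_row:
        pheromone_row[c] /= total


# One table row: pheromone per output channel for a single destination.
row = {"N": 0.25, "E": 0.25, "S": 0.25, "W": 0.25}
reinforce(row, "E")
picked = choose_channel(row, random.Random(0))
```

A full table holds one such row per destination, which is why the memory cost is tied to the number of destination-channel pairs.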

Collaboration

Top Co-Authors

An-Yeu Wu, National Taiwan University
Chun-Yuan Chu, National Taiwan University
Min-An Chao, National Taiwan University
En-Jui Chang, National Taiwan University
Hsien-Kai Hsin, National Taiwan University
Kai-Yuan Jheng, National Taiwan University
Kun-Chih Chen, National Taiwan University
Hao-Yu Wang, National Taiwan University
An-Yeu Andy Wu, National Taiwan University