Daihan Wang
Keio University
Publication
Featured research published by Daihan Wang.
Asia and South Pacific Design Automation Conference | 2008
Hiroki Matsutani; Michihiro Koibuchi; Daihan Wang; Hideharu Amano
Because on-chip routers in networks-on-chip (NoCs) play a key role in communication between cores, they must always be ready for packet injection even when some of the cores are in standby mode, which gives routers a larger standby power than cores. Run-time power gating of individual channels in a router is an attractive way to reduce the standby power of a chip without affecting on-chip communication. However, a state transition between sleep and active modes incurs a performance penalty, and turning a power switch on or off dissipates overhead energy, so a short sleep period can increase power consumption rather than reduce it. In this paper, we propose a sleep control method based on look-ahead routing that detects the arrival of packets two hops ahead, so as to hide the wake-up delay and reduce the number of short sleeps of channels. Simulation results using real application traces show that the proposed method conceals a wake-up delay of less than five cycles and saves more leakage power than the original naive method.
Networks on Chips | 2008
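The trade-off described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function names, the per-cycle leakage figure, the switch overhead, and the look-ahead lead time are all hypothetical parameters chosen for illustration. It shows why a sleep shorter than the break-even length costs energy, and how an early wake-up signal can hide part or all of the wake-up delay.

```python
def net_energy_saved(sleep_cycles, leakage_per_cycle, switch_overhead):
    """Leakage energy saved during one sleep period, minus the energy
    spent turning the power switch off and on again (hypothetical units)."""
    return sleep_cycles * leakage_per_cycle - switch_overhead


def break_even_cycles(leakage_per_cycle, switch_overhead):
    """Sleep length at which the saved leakage equals the switch overhead."""
    return switch_overhead / leakage_per_cycle


def exposed_wakeup_stall(wakeup_delay, lookahead_cycles):
    """Cycles a packet still has to wait if the wake-up signal is raised
    `lookahead_cycles` before the packet reaches the sleeping channel."""
    return max(0, wakeup_delay - lookahead_cycles)


# Assumed numbers: 1.0 energy unit leaked per cycle, 10.0 units per on/off pair.
print(break_even_cycles(1.0, 10.0))       # sleeps shorter than this lose energy
print(net_energy_saved(5, 1.0, 10.0))     # short sleep: negative saving
print(net_energy_saved(20, 1.0, 10.0))    # long sleep: positive saving
print(exposed_wakeup_stall(4, 5))         # delay fully hidden by the early signal
```

Under these assumed numbers, any sleep shorter than ten cycles has a negative net saving, which is the "short-term sleep" case the abstract warns about.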
Hiroki Matsutani; Michihiro Koibuchi; Daihan Wang; Hideharu Amano
In this paper, we introduce the use of slow-silent virtual channels to reduce the switching power of on-chip networks while keeping the leakage power small. Adding virtual channels to a network improves the throughput until the bandwidth of each link is saturated. This enables us to reduce the switching power of on-chip networks by decreasing their operating frequency and supply voltage. However, adding virtual channels increases both the leakage power and the area of routers because of their large buffers, so run-time power gating is applied to individual virtual channels to eliminate this problem. We evaluate the performance of slow-silent virtual channels by using real application traces, and their power consumption (switching and leakage) is evaluated based on the detailed design of a virtual-channel router placed and routed with a 90 nm technology. The evaluation results show that a network with three or four virtual channels achieves the best energy efficiency under uniform traffic. For neighboring communications, a network with two virtual channels is better than networks with more virtual channels, because the performance improvement from no virtual channels to two virtual channels is the largest, and the frequency and supply voltage can also be reduced substantially in these cases.
Field-Programmable Logic and Applications | 2008
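The reason lowering frequency and supply voltage reduces switching power can be sketched with the standard first-order CMOS relation, in which dynamic power is proportional to V²·f. This is a textbook approximation, not the power model used in the paper; the function name and the scaling factors below are hypothetical.

```python
def relative_switching_power(voltage_scale, frequency_scale):
    """Switching power relative to the baseline, using the first-order
    approximation P_dynamic ∝ V^2 * f (capacitance and activity unchanged)."""
    return voltage_scale ** 2 * frequency_scale


# Assumed example: extra virtual channels let the network keep its
# throughput at half the clock frequency and 80% of the supply voltage.
print(relative_switching_power(0.8, 0.5))
```

The sketch ignores leakage, which is the cost the abstract addresses separately with power gating of individual virtual channels.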
Daihan Wang; Hiroki Matsutani; Hideharu Amano; Michihiro Koibuchi
While the regular 2-D mesh topology has been used for most networks-on-chip (NoCs) on FPGAs, the spatially biased traffic of some applications makes customization feasible. A link removal strategy that customizes the routers in an NoC is proposed for reconfigurable systems in order to minimize the required amount of hardware. Based on pre-analyzed traffic information, links that carry little communication are removed to reduce the hardware cost while maintaining sufficient performance. Two policies are proposed to avoid deadlocks, and they achieve better performance than up*/down* routing on the irregular topology that results from link removal. In the image recognition application susan, the proposed method saves 30% of the hardware amount without performance degradation.
Field-Programmable Logic and Applications | 2007
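The basic idea of removing lightly used links can be sketched as a greedy pass over pre-analyzed traffic. This is only an illustration under simplifying assumptions, not the method of the paper: links are treated as undirected, the only constraint checked is that the network stays connected, and the deadlock-avoidance policies and performance checks mentioned in the abstract are left out. All names and traffic numbers are hypothetical.

```python
from collections import defaultdict


def is_connected(nodes, links):
    """Return True if every node is reachable from the first one
    over the given undirected links."""
    nodes = list(nodes)
    if not nodes:
        return True
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(nodes)


def remove_light_links(nodes, traffic, max_removed):
    """Greedily drop the links with the least traffic, skipping any link
    whose removal would disconnect the network."""
    kept = set(traffic)
    removed = []
    for link in sorted(traffic, key=traffic.get):
        if len(removed) >= max_removed:
            break
        candidate = kept - {link}
        if is_connected(nodes, candidate):
            kept = candidate
            removed.append(link)
    return kept, removed


# Assumed 2x2 mesh with made-up traffic volumes per link.
nodes = [0, 1, 2, 3]
traffic = {(0, 1): 100, (1, 3): 5, (2, 3): 80, (0, 2): 1}
kept, removed = remove_light_links(nodes, traffic, max_removed=2)
print(removed)
```

In this example only the lightest link is removed, because dropping any further link would isolate a node, even though the budget allows two removals.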
Daihan Wang; Hiroki Matsutani; Hideharu Amano; Michihiro Koibuchi
A temporal-correlation-based port combination algorithm that customizes the router design in a network-on-chip (NoC) is proposed for reconfigurable systems in order to minimize the required amount of hardware. Given the traffic characteristics of the target application and the expected hardware reduction rate, the algorithm automatically produces a port combination plan for the network. Because the port combination technique largely preserves the topology, it does not affect the design of other layers, such as task mapping and scheduling. The algorithm is much more efficient than the algorithm without temporal correlation. For the multimedia stream processing application, the algorithm saves 55% of the hardware amount without performance degradation, while the algorithm without temporal correlation suffers a 30% performance loss.
Embedded and Real-Time Computing Systems and Applications | 2011
Daihan Wang; Michihiro Koibuchi; Tomohiro Yoneda; Hiroki Matsutani; Hideharu Amano
Network-on-Chip (NoC) is considered a promising approach to implementing many-core systems, and a large number of on-chip router optimizations have been proposed. In this paper, we propose to dynamically adjust the link width of each port on a router so that it is optimized for spatially biased traffic. Unlike previous NoC optimization approaches, in which the optimization is mostly performed at the NoC design step, the proposed method achieves dynamic link-width optimization at run-time.
IEICE Transactions on Information and Systems | 2007
Daihan Wang; Hiroki Matsutani; Michihiro Koibuchi; Hideharu Amano
A temporal-correlation-based port combination algorithm that customizes the router design in a Network-on-Chip (NoC) is proposed for reconfigurable systems in order to minimize the required amount of hardware. Given the traffic characteristics of the target application and the expected hardware reduction rate, the algorithm automatically produces a port combination plan for the network. Because the port combination technique largely preserves the topology, including the two-surface layout, it does not affect the design of other layers, such as task mapping and scheduling. The algorithm is much more efficient than the algorithm without temporal correlation. For the multimedia stream processing application, the algorithm saves 55% of the hardware amount without performance degradation, while the algorithm without temporal correlation suffers a 30% performance loss.
ERSA | 2006
Daihan Wang; Hiroki Matsutani; Masato Yoshimi; Michihiro Koibuchi; Hideharu Amano
IEICE technical report. Dependable computing | 2011
Chammika Mannakkara; Daihan Wang; Vijay Holimath; Tomohiro Yoneda
IEICE Technical Report. RECONF, Reconfigurable Systems | 2006
Daihan Wang; Hiroki Matsutani; Masato Yoshimi; Michihiro Koibuchi; Hideharu Amano
IEICE technical report. Dependable computing | 2011
Daihan Wang; Chammika Mannakkara; Vijay Holimath; Tomohiro Yoneda