Publication


Featured research published by Shu-Yen Lin.


IEEE Transactions on Computers | 2008

Traffic-Balanced Routing Algorithm for Irregular Mesh-Based On-Chip Networks

Shu-Yen Lin; Chun-Hsiang Huang; Chih-Hao Chao; Keng-Hsien Huang; An-Yeu Wu

On-chip networks (OCNs) have been proposed to cope with the increasing scale and complexity of nanoscale multicore VLSI designs. Irregular meshes are an important concern because IPs of different sizes may be supplied by various vendors. To solve routing problems in irregular meshes, modified routing algorithms that detour around oversized IPs (OIPs) are needed. However, directly applying fault-tolerant routing algorithms may cause two serious problems: 1) heavy traffic loads around OIPs and 2) unbalanced traffic loads in irregular meshes. In this paper, we propose an OIP avoidance prerouting (OAPR) algorithm to solve these problems. The proposed OAPR spreads traffic loads evenly across the network and shortens the average packet path. Therefore, networks using the OAPR have lower latency and higher throughput than those using fault-tolerant routing algorithms. In our experiments, four different cases are simulated to demonstrate that the proposed OAPR improves sustainable throughput by 13.3 percent to 100 percent over two previous fault-tolerant routing algorithms. Moreover, the hardware overhead of the OAPR is less than 1 percent of the cost of a whole router. Hence, the proposed OAPR algorithm performs well and is practical for irregular mesh-based OCNs.
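To make the detour problem concrete, here is a minimal sketch of finding a shortest path around an oversized IP on a 2D mesh. It is a plain breadth-first search, not the OAPR algorithm from the paper; the function name `shortest_detour`, the 5x5 mesh, and the 2x2 OIP placement are all illustrative assumptions. It also ignores deadlock avoidance and traffic balancing, which a real on-chip routing algorithm has to handle.

```python
from collections import deque

def shortest_detour(width, height, blocked, src, dst):
    """Breadth-first search for a shortest path on a 2D mesh, skipping blocked tiles.

    Returns the path as a list of (x, y) tiles, or None if no path exists.
    """
    if src in blocked or dst in blocked:
        return None
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in blocked and nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical 5x5 mesh with a 2x2 oversized IP in the middle.
oip = {(2, 1), (3, 1), (2, 2), (3, 2)}
path = shortest_detour(5, 5, oip, (0, 1), (4, 1))
print(len(path) - 1, "hops:", path)
```

Without the OIP the two tiles would be 4 hops apart; the blocked region forces a longer route, which is the kind of path lengthening and local traffic concentration the abstract describes.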


International Symposium on VLSI Design, Automation and Test | 2009

Fault-tolerant router with built-in self-test/self-diagnosis and fault-isolation circuits for 2D-mesh based chip multiprocessor systems

Shu-Yen Lin; Wen-Chung Shen; Chan-Cheng Hsu; Chih-Hao Chao; An-Yeu Wu

A fault-tolerant router design, the 20-path router (20PR), is proposed to reduce the impact of faulty routers in 2D-mesh based chip multiprocessor systems. In our experiments, OCNs using 20PRs reduce unreachable packets by 75.65% to 85.01% and latency by 7.78% to 26.59% compared with OCNs using generic XY routers.
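For context on the baseline, the sketch below shows generic XY (dimension-order) routing and how a single faulty router makes some source/destination pairs unreachable. It does not model the 20PR design; the helper names `xy_route` and `unreachable_fraction`, and the mesh size and fault location in the usage line, are assumptions chosen for illustration.

```python
def xy_route(src, dst):
    """Dimension-order (XY) routing: travel along X first, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def unreachable_fraction(size, faulty):
    """Fraction of healthy source/destination pairs whose XY path crosses a faulty router."""
    nodes = [(x, y) for x in range(size) for y in range(size) if (x, y) not in faulty]
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    blocked = sum(1 for s, d in pairs if any(n in faulty for n in xy_route(s, d)))
    return blocked / len(pairs)

# Hypothetical 4x4 mesh with one faulty router.
print(unreachable_fraction(4, {(1, 2)}))
```

Because XY routing offers exactly one path per pair, every pair whose path touches the faulty router is lost, which is why a router that can bypass faults has room to reduce unreachable packets.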


IEEE Transactions on Parallel and Distributed Systems | 2013

Topology-Aware Adaptive Routing for Nonstationary Irregular Mesh in Throttled 3D NoC Systems

Kun-Chih Chen; Shu-Yen Lin; Hui-Shun Hung; An-Yeu Andy Wu

Three-dimensional network-on-chip (3D NoC) has been proposed to solve the complex on-chip communication issues in future 3D multicore systems. However, the thermal problems of 3D NoC are more serious than those of 2D NoC because of chip stacking. To keep the temperature below a certain thermal limit, the thermal-emergent routers are usually throttled, and the topology of the 3D NoC then becomes a Nonstationary Irregular Mesh (NSI-Mesh). To ensure successful packet delivery in the NSI-Mesh, several routing algorithms have been proposed in previous works. However, the network still suffers from extreme traffic imbalance among the lateral and vertical logic layers. In this paper, we propose Topology-Aware Adaptive Routing (TAAR) to balance the traffic load for NSI-Mesh in 3D NoC. TAAR has three routing modes, which can be dynamically adjusted based on the topology status of the routing path. In addition to increasing routing flexibility, TAAR also increases both vertical and lateral path diversity to balance the traffic load. Compared with related adaptive routing methods, the experimental results show that the proposed TAAR can reduce traffic loads in the bottom logic layer by 19 to 295 percent and improve network throughput by around 7.7 to 380 percent. According to our proposed VLSI architecture, TAAR needs less than 24.8 percent hardware overhead compared with previous works.


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2014

Path-Congestion-Aware Adaptive Routing With a Contention Prediction Scheme for Network-on-Chip Systems

En-Jui Chang; Hsien-Kai Hsin; Shu-Yen Lin; An-Yeu Wu

Network-on-chip systems can achieve higher performance than bus systems for chip multiprocessor systems. However, as the complexity of the network increases, the channel and switch congestion problems become major performance bottlenecks. An effective adaptive routing algorithm can help minimize path congestion through load balancing. However, conventional adaptive routing schemes only use channel-based information to detect the congestion status. Without switch-based information, channel-based information alone can hardly reveal the real congestion status along the routing path. Therefore, in this paper, we remodel the path congestion information to show hidden spatial congestion information and improve the effectiveness of routing path selection. We propose a path-congestion-aware adaptive routing (PCAR) scheme based on the following techniques: 1) a path-congestion-aware selection strategy that simultaneously considers switch congestion and channel congestion, and 2) a contention prediction technique that uses the rate of change in the buffer level to predict possible switch contention. The experimental results show that the proposed PCAR scheme can achieve a high saturation throughput with an improvement of 15.4%-48.7% compared to existing routing schemes. The proposed PCAR method also includes a VLSI architecture, which has higher area efficiency with an improvement of 16%-35.7% compared with the other router designs.
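The two techniques listed in the abstract can be illustrated with a rough sketch. This is not the PCAR scheme itself: the `PortState` fields, the linear extrapolation in `predicted_level`, and the unweighted sum used as the cost in `select_port` are all assumptions made for illustration, and the numbers are made up.

```python
from dataclasses import dataclass

@dataclass
class PortState:
    channel_occupancy: int  # flits queued on the output channel (assumed metric)
    buffer_prev: int        # downstream switch buffer level one sample ago
    buffer_curr: int        # downstream switch buffer level now

def predicted_level(state, horizon=1):
    """Extrapolate the buffer level linearly from its rate of change."""
    rate = state.buffer_curr - state.buffer_prev
    return max(0, state.buffer_curr + rate * horizon)

def select_port(ports):
    """Pick the candidate port with the lowest combined channel and predicted switch cost."""
    return min(
        ports,
        key=lambda name: ports[name].channel_occupancy + predicted_level(ports[name]),
    )

# Hypothetical candidates: east looks better on channel occupancy alone,
# but its downstream buffer is filling quickly.
ports = {
    "east": PortState(channel_occupancy=2, buffer_prev=3, buffer_curr=6),
    "north": PortState(channel_occupancy=4, buffer_prev=5, buffer_curr=4),
}
print(select_port(ports))
```

A purely channel-based selector would pick `east` here; adding the predicted switch buffer level flips the choice, which is the kind of hidden congestion the abstract refers to.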


International Symposium on VLSI Design, Automation and Test | 2011

Traffic- and thermal-aware routing for throttled three-dimensional Network-on-Chip systems

Shu-Yen Lin; Tzu-Chu Yin; Hao-Yu Wang; An-Yeu Wu

Thermal issues are one of the major challenges in three-dimensional (3D) IC research. Network-on-Chip (NoC) has been viewed as a practical communication infrastructure for 3D ICs. In this paper, we propose an adaptive routing algorithm, Traffic- and Throttling-Awareness Routing (TTAR), to address the traffic congestion caused by throttling under transient-temperature control. TTAR can balance the network traffic and detour around the throttled tiles. The experimental results show that TTAR does not seriously affect heat dissipation and achieves throughput improvements of 8% to 680% over previous routing algorithms at a 50-cycle average latency.


IEEE Transactions on Computers | 2015

Regional ACO-Based Cascaded Adaptive Routing for Traffic Balancing in Mesh-Based Network-on-Chip Systems

En-Jui Chang; Hsien-Kai Hsin; Chih-Hao Chao; Shu-Yen Lin; An-Yeu Andy Wu

The regular topology of mesh-based network-on-chip (NoC) provides a flexible and scalable architecture for chip multiprocessor (CMP) systems. However, as the complexity of the network increases, routing problems become performance bottlenecks. In the field of wide area networks (WANs), ant colony optimization (ACO) has been applied to adaptive routing to improve performance and achieve load balancing. Nevertheless, if ACO is applied directly to NoC systems, its implementation cost is excessively high. To overcome this problem, ACO-based adaptive routing must be reformulated while considering both router cost and NoC efficiency. This work proposes the regional ACO-based cascaded adaptive routing (RACO-CAR) scheme with the following techniques: 1) table elimination by removing redundant information, 2) table sharing by grouping pheromone information to merge table content, and 3) cascaded routing that assigns traffic to different uncongested regions to balance traffic. Our experimental results demonstrate that the RACO-CAR scheme improves saturation throughput by 3.9 to 36.84 percent compared with existing adaptive routing schemes. The implementation cost of the RACO-CAR router is only 37.4 percent of that of the ACO-based router with a full routing table. Therefore, the proposed RACO-CAR scheme has high area efficiency, defined as saturation throughput divided by the total cost of the router.
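As background for the pheromone tables mentioned above, the following is a minimal sketch of a generic ACO-style routing table entry for one destination, using an AntNet-style normalized reinforcement rule. It is not the RACO-CAR scheme and does not show table elimination, table sharing, or cascaded routing; the function names and the reward value are illustrative assumptions.

```python
def choose_port(pheromone):
    """Greedy selection: forward through the port with the most pheromone."""
    return max(pheromone, key=pheromone.get)

def reinforce(pheromone, port, reward=0.2):
    """Reward one port and decay the others so the entries still sum to 1.

    Chosen port: p + reward * (1 - p). Every other port: p * (1 - reward).
    """
    updated = {}
    for name, value in pheromone.items():
        if name == port:
            updated[name] = value + reward * (1.0 - value)
        else:
            updated[name] = value * (1.0 - reward)
    return updated

# Hypothetical table entry for a single destination with two candidate ports.
table = {"east": 0.5, "north": 0.5}
table = reinforce(table, "north", reward=0.2)
print(table, choose_port(table))
```

A full table keeps one such entry per destination in every router, which hints at why the abstract treats table size as the main implementation cost to reduce.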


ACM Transactions on Embedded Computing Systems | 2013

Transport-layer-assisted routing for runtime thermal management of 3D NoC systems

Chih-Hao Chao; Kun-Chih Chen; Tsu-Chu Yin; Shu-Yen Lin; An-Yeu Wu

To ensure thermal safety and to avoid performance degradation from temperature regulation in 3D NoC, we propose a new temperature-traffic control framework. The framework contains the vertical throttling-based runtime thermal management (VT-RTM) scheme and the transport-layer assisted routing (TLAR) scheme. The VT-RTM scheme increases the cooling speed and maintains high availability. The TLAR scheme sustains the throughput of the nonstationary irregular mesh network. In our experiments, the VT-RTM scheme reduces cooling time by 84% and achieves 98% network availability; the overall performance impact is around 8% of traditional schemes. The TLAR scheme reduces average latency by 35∼% and improves sustainable throughput by 76%.


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2014

DArT: A Component-Based DRAM Area, Power, and Timing Modeling Tool

Hsiu-Chuan Shih; Pei-Wen Luo; Jen-Chieh Yeh; Shu-Yen Lin; Ding-Ming Kwai; Shih-Lien Lu; Andre Schaefer; Cheng-Wen Wu

DRAM renovation calls for a holistic architecture exploration to cope with bandwidth growth and latency reduction needs. In this paper, we present DRAM area power timing (DArT), a DRAM area, power, and timing modeling tool, for array assembly and interface customization. Through proper design abstraction, our component-based modeling approach provides increased flexibility and higher accuracy, making DArT suitable for DRAM architecture exploration and performance estimation. We validate the accuracy of DArT with respect to the physical layout and circuit simulation of an industrial 68 nm commodity DRAM device as a reference. The experimental results show that the maximum deviations from the reference design, in terms of area, timing, and power, are 3.2%, 4.92%, and 1.73%, respectively. For an architectural projection by porting it to a 45 nm process, the maximum deviations are 3.4%, 3.42%, and 8.57%, respectively. The combination of modeling performance, flexibility, and accuracy of DArT allows us to easily explore new DRAM architectures in the future, including 3-D stacked DRAM.
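A maximum deviation of this kind can be computed as in the sketch below, assuming it means the largest relative error between a modeled value and its reference over a set of components. The abstract does not give the exact definition, so the formula, the function name, and the sample numbers are assumptions for illustration only.

```python
def max_relative_deviation(estimates, references):
    """Largest |estimate - reference| / |reference| over paired values, in percent."""
    if len(estimates) != len(references):
        raise ValueError("estimates and references must have the same length")
    return max(abs(e - r) / abs(r) for e, r in zip(estimates, references)) * 100.0

# Made-up modeled and reference values for three hypothetical components.
modeled = [103.0, 98.0, 50.5]
reference = [100.0, 100.0, 50.0]
print(f"{max_relative_deviation(modeled, reference):.2f}%")
```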


International Symposium on VLSI Design, Automation and Test | 2013

Design of thermal management unit with vertical throttling scheme for proactive thermal-aware 3D NoC systems

Kun-Chih Chen; Shu-Yen Lin; An-Yeu Wu

The three-dimensional Network-on-Chip (3D NoC) has been proposed to solve complex on-chip communication issues. However, the thermal problems are exacerbated by the larger power density and the heterogeneous thermal conductance of the different silicon layers of a 3D NoC. To regulate the system temperature, Dynamic Thermal Management (DTM) techniques are triggered when the device is thermal-emergent. However, such reactive DTM schemes result in significant system performance degradation. In this paper, we propose a proactive DTM with vertical throttling (PDTM-VT) scheme, which is controlled by the distributed Thermal Management Unit (TMU) of each NoC node. Based on the expected temperature obtained from the proposed thermal prediction model, the TMU can control the temperature of a thermal-emergent device early. The experimental results show that the proposed thermal prediction model has less than 0.25% prediction error against actual temperature measurements within 50 ms. In addition, the PDTM-VT can reduce the number of thermal-emergent nodes by 11.84% to 23.18% and improve network throughput by 0.47% to 47.90%.


International Symposium on Circuits and Systems | 2009

A Scalable built-in self-test/self-diagnosis architecture for 2D-mesh based chip multiprocessor systems

Shu-Yen Lin; Chan-Cheng Hsu; An-Yeu Wu

In this paper, we propose a scalable built-in self-test/self-diagnosis architecture, Surrounding Test Ring (STR), to detect and locate faulty FIFOs and faulty MUXs in 2D-mesh based CMP systems. The proposed STR achieves 97.79% fault coverage in FIFOs and MUXs and requires 388 to 2886 test cycles, depending on the testing method and mesh size.

Collaboration


Dive into Shu-Yen Lin's collaboration.

Top Co-Authors

An-Yeu Wu

National Taiwan University

Kun-Chih Chen

National Taiwan University

Chih-Hao Chao

National Taiwan University

En-Jui Chang

National Taiwan University

Hsien-Kai Hsin

National Taiwan University

Hui-Shun Hung

National Taiwan University
