Yi-Jung Chen
National Chi Nan University
Publication
Featured research published by Yi-Jung Chen.
International Conference on Computer Design | 2012
Chung-Hsiang Lin; De-Yu Shen; Yi-Jung Chen; Chia-Lin Yang; Michael Wang
DRAMs are used as the main memory in most computing systems today. Studies show that DRAMs contribute a significant part of overall system power consumption. One of the main challenges in low-power DRAM design is the inevitable refresh process. Due to process variation, memory cells exhibit retention time variations. Current DRAMs use a single worst-case refresh period. Prolonging refresh intervals introduces retention errors. Previous works adopt conventional ECC (Error Correcting Code) to correct retention errors, but these approaches introduce significant area and energy overheads. In this paper, we propose a novel error correction framework for retention errors in DRAMs, called SECRET (Selective Error Correction for Refresh Energy reducTion). The key observation we make is that retention errors can be treated as hard errors rather than soft errors, and only a few DRAM cells have large leakage. Therefore, instead of equipping all memory cells with error correction capability as existing ECC schemes do, we only allocate error correction information to cells that are leaky under a given refresh interval. Our SECRET framework contains two parts: an off-line phase to identify memory cells with retention errors given a target error rate, and a low-overhead error correction mechanism. The experimental results show that the proposed SECRET framework can reduce refresh power by 87.2% and overall DRAM power by 18.57% with negligible area and performance overheads.
Journal of Systems Architecture | 2009
Yi-Jung Chen; Chia-Lin Yang; Yen-Sheng Chang
Network-on-Chip (NoC) has been proposed to overcome the complex on-chip communication problem of System-on-Chip (SoC) design in deep sub-micron technologies. A complete NoC design requires exploration of both hardware and software architectures. The hardware architecture includes the selection of Processing Elements (PEs) of multiple types and their topology. The software architecture covers the allocation of tasks to PEs and the scheduling of tasks and their communications. To find the best hardware design for the target tasks, both hardware and software architectures need to be considered simultaneously. Previous works on NoC design have concentrated on solving only one or two design parameters at a time. In this paper, we propose a hardware-software co-synthesis algorithm for a heterogeneous NoC architecture. The design goal is to minimize energy consumption while meeting the real-time requirements commonly seen in embedded applications. The proposed algorithm is based on Simulated Annealing (SA). To compare the solution quality and efficiency of the proposed algorithm, we also implement a branch-and-bound algorithm and an iterative algorithm to solve the hardware-software co-synthesis problem of a heterogeneous NoC. With the given synthetic task sets, the experimental results show that the proposed SA-based algorithm achieves a near-optimal solution in a reasonable time, while the branch-and-bound algorithm takes a very long time to find the optimal solution, and the iterative algorithm fails to achieve good solution quality. When the co-synthesis algorithms are applied to a real-world application with a PE library that has little variation in PE performance and energy consumption, the iterative algorithm achieves solution quality comparable to that of the proposed SA-based algorithm.
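As a rough illustration of the simulated-annealing idea described above, the following sketch maps tasks onto a small heterogeneous PE set, minimizing energy while penalizing deadline misses. It is not the paper's algorithm: the PE library (`EXEC_TIME`, `ENERGY`), the penalty-based cost function, the single-task move, and the cooling parameters are all assumptions made for this example, and communication, topology, and task dependencies are ignored.

```python
import math
import random

# Hypothetical PE library (assumed values): per-task time and energy by PE type.
EXEC_TIME = {"fast": 1.0, "slow": 3.0}
ENERGY = {"fast": 5.0, "slow": 1.0}


def evaluate(mapping, pe_types):
    """Return (total energy, makespan); tasks on the same PE run back to back."""
    busy = [0.0] * len(pe_types)
    energy = 0.0
    for pe in mapping:
        kind = pe_types[pe]
        busy[pe] += EXEC_TIME[kind]
        energy += ENERGY[kind]
    return energy, max(busy)


def cost(mapping, pe_types, deadline, penalty=100.0):
    """Energy plus a penalty proportional to how far the makespan exceeds the deadline."""
    energy, makespan = evaluate(mapping, pe_types)
    return energy + penalty * max(0.0, makespan - deadline)


def anneal(num_tasks, pe_types, deadline, seed=0, steps=5000, t0=10.0, alpha=0.999):
    """Simulated annealing over task-to-PE mappings; returns (best mapping, best cost)."""
    rng = random.Random(seed)
    current = [0] * num_tasks  # start with every task on PE 0
    current_cost = cost(current, pe_types, deadline)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(steps):
        # Neighbor move: reassign one randomly chosen task to a random PE.
        candidate = list(current)
        candidate[rng.randrange(num_tasks)] = rng.randrange(len(pe_types))
        cand_cost = cost(candidate, pe_types, deadline)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= alpha  # geometric cooling
    return best, best_cost


if __name__ == "__main__":
    pes = ["fast", "fast", "slow", "slow"]
    mapping, total = anneal(6, pes, deadline=3.0)
    print(mapping, total, evaluate(mapping, pes))
```

Folding the deadline into the cost as a penalty keeps the search space unconstrained, which is one common way to handle real-time requirements in annealing; a full co-synthesis flow would also have to vary the PE selection and topology.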
Compilers, Architecture, and Synthesis for Embedded Systems | 2007
Jaw-Wei Chi; Chia-Lin Yang; Yi-Jung Chen; Jian-Jia Chen
Leakage energy consumption is an increasingly important issue as the technology continues to shrink. Since on-chip caches constitute a major portion of the processor's transistor budget, several leakage control policies have been proposed to reduce cache leakage. However, these policies introduce performance unpredictability and are therefore not suitable for hard real-time applications, which require that the timing constraint be met in all cases. In this paper, we propose the first approach to apply existing low-leakage circuit techniques to hard real-time applications. The proposed timing-aware cache leakage control mechanism exploits task slack time to turn cache lines into the low-leakage state provided that the timing constraint is met. The experimental results show that the proposed cache leakage control policy achieves leakage reduction comparable to that of the leakage control policy that aggressively turns cache lines into low-leakage modes without considering the timing constraint.
ACM Symposium on Applied Computing | 2007
Wei-Hsuan Hung; Yi-Jung Chen; Chia-Lin Yang; Yen-Sheng Chang; Alan P. Su
Network-on-Chip (NoC) has been proposed to overcome the complex on-chip communication problem of SoC (System-on-Chip) design in deep submicron technologies. A complete NoC design requires exploration of both hardware and software architectures. The hardware architecture includes the selection of PEs (Processing Elements) of multiple types and their topology. The software architecture covers the allocation of tasks to PEs and the scheduling of tasks and their communications. To find the best hardware design for the target tasks, both hardware and software architectures need to be considered simultaneously. Previous works on NoC design have concentrated on solving for only one or two design parameters at a time. In this paper, we propose a hardware-software co-synthesis algorithm for a heterogeneous NoC architecture. The design goal is to minimize energy consumption while meeting the real-time requirements commonly seen in embedded applications.
International Conference on Computer Aided Design | 2012
Yi-Jung Chen; Chia-Lin Yang; Jian-Jia Chen
Stacking DRAMs on processing cores with Through-Silicon Vias (TSVs) provides abundant bandwidth and enables a distributed memory interface design. To achieve the best balance between performance and cost in an application-specific system, the distributed memory interface should be tailored to the target applications. In this paper, we propose the first distributed memory interface synthesis framework for application-specific Network-on-Chips (NoCs) with 3D-stacked DRAMs. To maximize the performance of a selected hardware configuration, the proposed framework co-synthesizes the hardware configuration of the distributed memory interface and the software configuration, e.g., task mapping and data assignment. Since TSVs have an adverse impact on chip cost and yield, the goal of the framework is to minimize the number of TSVs provided that the user-defined performance constraint is met.
IEEE Transactions on Computers | 2011
Yi-Jung Chen; Chia-Lin Yang; Jaw-Wei Chi; Jian-Jia Chen
Leakage energy consumption is an increasingly important issue as the technology continues to shrink. Existing leakage reduction techniques for hard real-time systems utilize slack to turn off a CPU completely. However, turning a processor on and off involves high performance and energy overheads. Hence, a hard real-time system is very likely to have unutilized slack if only the CPU shutdown technique is used to reduce leakage. Architecture-level shutdown techniques have much lower overheads than turning off a CPU; therefore, they can be utilized in a hard real-time system to further reduce CPU leakage. However, existing architecture-level shutdown techniques cause unpredictable performance degradation and are therefore unsuitable for a hard real-time system that must meet the timing constraint in all cases. This paper is the first attempt to bridge this gap. It focuses on cache leakage reduction and proposes the first Timing-Aware Cache Leakage Control (TACLC) mechanism. TACLC exploits system slack to turn cache lines into low-leakage states provided that the timing constraint is met. The experimental results demonstrate that TACLC effectively utilizes system slack to reduce cache leakage. For systems with low CPU utilization, TACLC achieves leakage reduction comparable to that of the leakage control policy that aggressively turns cache lines into low-leakage modes while neglecting the timing constraint.
IEEE Transactions on Computers | 2017
Che-Wei Chang; Geng-You Chen; Yi-Jung Chen; Chia-Wei Yeh; Pei Yin Eng; Ana Cheung; Chia-Lin Yang
Given the needs of data-intensive web services and cloud computing applications, storage centers play an important role in serving the demanded data access while jointly considering low cost, qualified performance, and good scalability. To manage peak workloads with performance requirements for read/write latencies, overprovisioning storage nodes is common but increases total cost as well as power consumption. Recently, due to their growing capacity and dropping price, NAND-flash-based Solid-State Drives (SSDs) have become an attractive storage solution in datacenters. In this work, we exploit the write heterogeneity in Multi-Level-Cell (MLC) NAND flash memory to meet the Service-Level Objectives (SLOs) of applications and to avoid storage overprovisioning. In MLC NAND flash memory, a memory cell can be programmed as a Single-Level Cell (SLC) or a multi-level cell at runtime, and SLC writes have shorter latency at the cost of consuming more capacity. The proposed SLO-aware morphable SSD design seeks to meet the SLO requirement by deciding the write mode of each write request while minimizing the number of SLC writes. Experimental results show that the proposed design meets the SLO requirement for all of the tested I/O traces with less than 2.8 percent extra erase counts on average, while conventional MLC SSDs require up to 2.375 times storage overprovisioning to meet the SLO requirement.
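The per-request write-mode decision described above can be illustrated with a simple greedy policy. This is a minimal sketch under stated assumptions, not the design evaluated in the paper: the latency constants, the single serialized channel, and the function name `choose_write_modes` are all hypothetical.

```python
# Hypothetical program latencies in microseconds (assumed values, not from the paper).
SLC_LATENCY_US = 300.0
MLC_LATENCY_US = 900.0


def choose_write_modes(arrivals_us, slo_us):
    """Pick a write mode per request on one serialized channel.

    Requests are served in arrival order. MLC is preferred (it consumes less
    capacity); SLC is used only when an MLC write would exceed the latency SLO.
    """
    modes = []
    free_at = 0.0  # time at which the channel becomes idle
    for arrival in arrivals_us:
        start = max(arrival, free_at)
        if start + MLC_LATENCY_US - arrival <= slo_us:
            mode, latency = "MLC", MLC_LATENCY_US
        else:
            mode, latency = "SLC", SLC_LATENCY_US
        free_at = start + latency
        modes.append(mode)
    return modes


if __name__ == "__main__":
    # A burst of three writes followed by an isolated write.
    print(choose_write_modes([0.0, 100.0, 200.0, 5000.0], slo_us=1500.0))
    # prints ['MLC', 'SLC', 'SLC', 'MLC']
```

In this toy model the burst forces the second and third writes into SLC mode because queueing delay would push an MLC write past the SLO, while the isolated writes stay in MLC mode. A greedy rule like this does not guarantee the SLO under sustained overload, since even SLC writes can queue past the limit.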
ACM Transactions on Architecture and Code Optimization | 2015
Chung-Hsiang Lin; De-Yu Shen; Yi-Jung Chen; Chia-Lin Yang; Cheng-Yuan Michael Wang
DRAMs are used as the main memory in most computing systems today. Studies show that DRAMs contribute a significant part of overall system power consumption. One of the main challenges in low-power DRAM design is the inevitable refresh process. Due to process variation, memory cells exhibit retention time variations. Current DRAMs use a single refresh period determined by the cell with the largest leakage. Since prolonging refresh intervals introduces retention errors, a set of previous works adopt conventional error-correcting code (ECC) to correct retention errors. However, these approaches introduce significant area and energy overheads. In this article, we propose a novel error correction framework for retention errors in DRAMs, called SECRET (selective error correction for refresh energy reduction). The key observations we make are that retention errors are hard errors rather than soft errors, and only a few DRAM cells have large leakage. Therefore, instead of equipping all memory cells with error correction capability as existing ECC schemes do, we only allocate error correction information to cells that are leaky under a given refresh interval. Our SECRET framework contains two parts: an offline phase to identify memory cells with retention errors given a target error rate, and a low-overhead error correction mechanism. The experimental results show that, among all test cases performed, the proposed SECRET framework can reduce refresh power by up to 87.2% and overall DRAM power by up to 18.57% with negligible area and performance overheads.
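The two-part structure described above (offline identification of leaky cells, then correction only for those cells) can be illustrated with a toy model. This is a sketch under stated assumptions rather than the SECRET mechanism itself: the retention-time table, the spare-bit store, and the names `find_leaky_cells` and `SelectiveCorrectionMemory` are hypothetical, and the target error rate is ignored.

```python
def find_leaky_cells(retention_ms, refresh_interval_ms):
    """Offline phase: cells whose retention time is shorter than the refresh interval."""
    return {addr for addr, t in retention_ms.items() if t < refresh_interval_ms}


class SelectiveCorrectionMemory:
    """Toy one-bit-per-cell memory that keeps correction data only for leaky cells."""

    def __init__(self, size, leaky):
        self.cells = [0] * size
        self.leaky = set(leaky)
        self.spare = {}  # correction information, allocated for leaky cells only

    def write(self, addr, bit):
        self.cells[addr] = bit
        if addr in self.leaky:
            self.spare[addr] = bit  # mirror the bit in reliable spare storage

    def decay(self):
        """Model one prolonged refresh interval: leaky cells lose their charge."""
        for addr in self.leaky:
            self.cells[addr] = 0

    def read(self, addr):
        # Known-leaky cells are treated as hard errors and served from spare storage.
        if addr in self.leaky:
            return self.spare.get(addr, 0)
        return self.cells[addr]


if __name__ == "__main__":
    retention = {0: 64.0, 1: 300.0, 2: 100.0, 3: 500.0}  # assumed retention times (ms)
    leaky = find_leaky_cells(retention, refresh_interval_ms=256.0)
    mem = SelectiveCorrectionMemory(4, leaky)
    for a in range(4):
        mem.write(a, 1)
    mem.decay()
    print(sorted(leaky), mem.cells, [mem.read(a) for a in range(4)])
    # prints [0, 2] [0, 1, 0, 1] [1, 1, 1, 1]
```

Because the leaky cells are known in advance, the correction store scales with the number of leaky cells rather than with the whole array, which is the intuition behind avoiding full ECC coverage.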
ACM SIGAPP Applied Computing Review | 2016
Yi-Jung Chen; Chia-Lin Yang; Pin-Sheng Lin; Yi-Chang Lu
Chip-Multiprocessors (CMPs) with 3D-stacked DRAMs are promising for solving the memory wall problem, but the high power density makes 3D ICs frequently operate at or near the thermal limit. The system hot spot of a CMP with 3D-stacked DRAMs is usually in the DRAM layers farthest from the heat sink, where heat from both DRAMs and cores accumulates. Existing thermal management schemes for 3D ICs all perform thermal control on cores only, because lowering the power level of cores also lowers DRAM access frequency. However, as the power consumption of a single DRAM access increases with the number of DRAM stacks and the width of the vertical links, instantaneous DRAM accesses may easily overheat the system. So, in addition to lowering the access frequency of DRAMs, reducing the power consumption per DRAM access is also crucial. In this paper, we characterize the thermal and performance behavior of the target architecture when the voltage and frequency levels of cores and DRAMs are synergistically controlled. We also evaluate the thermal and performance behavior of existing thermal control methods that can be applied to the target architecture. The insights provided by the characterizations presented in this paper are important for developing an effective thermal management policy for CMPs with 3D-stacked DRAMs. Our results show that synergistically controlling the voltage-frequency levels of cores and DRAMs does achieve higher thermal efficiency than controlling cores only.
Research in Adaptive and Convergent Systems | 2015
Yi-Jung Chen; Chia-Lin Yang; Ping-Sheng Lin; Yi-Chang Lu
Chip-Multiprocessors (CMPs) with 3D-stacked DRAMs are promising for solving the memory wall problem, but the high power density makes 3D ICs frequently operate at or near the thermal limit. The system hot spot of a CMP with 3D-stacked DRAMs is usually in the DRAM layers farthest from the heat sink, where heat from both DRAMs and cores accumulates. Existing thermal management schemes for 3D ICs all perform thermal control on cores only, because lowering the power level of cores also lowers DRAM access frequency. However, as the power consumption of a single DRAM access increases with the number of DRAM stacks and the width of the vertical links, instantaneous DRAM accesses may easily overheat the system. So, in addition to lowering the access frequency of DRAMs, reducing the power consumption per DRAM access is also crucial. In this paper, we characterize the thermal and performance behavior of the target architecture when the voltage and frequency levels of cores and DRAMs are synergistically controlled. We also evaluate the thermal and performance behavior of existing thermal control methods that can be applied to the target architecture. The insights provided by the characterizations presented in this paper are important for developing an effective thermal management policy for CMPs with 3D-stacked DRAMs. Our results show that synergistically controlling the voltage-frequency levels of cores and DRAMs does achieve higher thermal efficiency than controlling cores only.