Qingfeng Zhuge | Researchain

Publication

Featured researches published by Qingfeng Zhuge.

design, automation, and test in europe | 2011

Towards energy efficient hybrid on-chip Scratch Pad Memory with non-volatile memory

Jingtong Hu; Chun Jason Xue; Qingfeng Zhuge; Wei-Che Tseng; Edwin Hsing-Mean Sha

Scratch Pad Memory (SPM), a software-controlled on-chip memory, has been widely adopted in many embedded systems due to its small area and low power consumption. As technology scaling reaches the sub-micron level, leakage energy consumption is surpassing dynamic energy consumption and becoming a critical issue. In this paper, we propose a novel hybrid SPM which consists of non-volatile memory (NVM) and SRAM to take advantage of the ultra-low leakage power consumption and high density of NVM as well as the efficient writes of SRAM. A novel dynamic data allocation algorithm is proposed to make use of the full potential of both NVM and SRAM. According to the experimental results, with the help of the proposed algorithm, the novel hybrid SPM architecture can reduce memory access time by 18.17%, dynamic energy by 24.29%, and leakage power by 37.34% on average compared with a pure SRAM based SPM with the same size area.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011

Write Activity Minimization for Nonvolatile Main Memory Via Scheduling and Recomputation

Jingtong Hu; Wei-Che Tseng; Chun Jason Xue; Qingfeng Zhuge; Yingchao Zhao; Edwin Hsing-Mean Sha

Nonvolatile memories such as Flash memory, phase change memory (PCM), and magnetic random access memory (MRAM) have many desirable characteristics for embedded systems to employ them as main memory. However, there are two common challenges we need to answer before we can apply nonvolatile memory as main memory practically. First, nonvolatile memory has limited write/erase cycles compared to DRAM. Second, a write operation is slower than a read operation on nonvolatile memory. These two challenges can be answered by reducing the number of write activities on nonvolatile main memory. In this paper, we propose two optimization techniques, write-aware scheduling and recomputation, to minimize write activities on nonvolatile memory. With the proposed techniques, we can both speed up the completion time of programs and extend nonvolatile memory's lifetime. The experimental results show that the proposed techniques can reduce the number of write activities on nonvolatile memory by 55.71% on average. Thus, the lifetime of nonvolatile memory is extended to 2.5 times as long as before on average. The completion time of programs can be reduced by 56.67% on systems with NOR Flash memory and by 47.63% on systems with NAND Flash memory on average.

symposium on application specific processors | 2010

Minimizing write activities to non-volatile memory via scheduling and recomputation

Jingtong Hu; Chun Jason Xue; Wei-Che Tseng; Qingfeng Zhuge; Edwin Hsing-Mean Sha

Non-volatile memories, such as flash memory, Phase Change Memory (PCM), and Magnetic Random Access Memory (MRAM), have many desirable characteristics for embedded DSP systems to employ them as main memory. These characteristics include low cost, shock resistance, non-volatility, power economy, and high density. However, there are two common challenges we need to answer before we can apply non-volatile memory as main memory practically. First, non-volatile memory has limited write/erase cycles compared to DRAM. Second, a write operation is slower than a read operation on non-volatile memory. These two challenges can be answered by reducing the number of write activities on non-volatile main memory. In this paper, we propose two optimization techniques, write-aware scheduling and recomputation, to minimize write activities on non-volatile memory. With the proposed techniques, we can both speed up the completion time of programs and extend non-volatile memory's lifetime. The experimental results show that the proposed techniques can reduce the number of write activities on non-volatile memory by 55.71% on average. Thus, the lifetime of non-volatile memory is extended to 2.5 times as long as before on average. The completion time of programs can be reduced by 55.32% on systems with NOR flash memory and by 40.69% on systems with NAND flash memory on average.

ACM Transactions on Embedded Computing Systems | 2003

Code size reduction technique and implementation for software-pipelined DSP applications

Qingfeng Zhuge; Bin Xiao; Edwin Hsing-Mean Sha

The software pipelining technique is extensively used to exploit instruction-level parallelism of loops, but it also significantly expands the code size. For embedded systems with very limited on-chip memory resources, code size becomes one of the most important optimization concerns. This paper presents the theoretical foundation of code size reduction for software-pipelined loops based on the retiming concept. We propose a general Code-size REDuction technique (CRED) for various kinds of processors. Our CRED algorithms integrate the code size reduction with software pipelining. The experimental results show the effectiveness of the CRED technique on both code size reduction and code size/performance trade-off space exploration.
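The code expansion mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the CRED technique itself: the two stages `stage_a` and `stage_b` and both loop functions are hypothetical names chosen for illustration. The pipelined version overlaps stage B of one iteration with stage A of the next, which requires an extra prologue and epilogue around the loop kernel, and that extra code is the size overhead.

```python
def stage_a(x):
    """First stage of the loop body (hypothetical)."""
    return x * 2


def stage_b(y):
    """Second stage of the loop body (hypothetical)."""
    return y + 1


def plain_loop(xs):
    """Original loop: each iteration runs stage A, then stage B."""
    out = []
    for x in xs:
        out.append(stage_b(stage_a(x)))
    return out


def pipelined_loop(xs):
    """Software-pipelined form of the same loop.

    The kernel finishes iteration i-1 (stage B) alongside the start of
    iteration i (stage A). The prologue and epilogue are the extra code
    that software pipelining adds around the kernel.
    """
    if not xs:
        return []
    out = []
    # Prologue: start iteration 0.
    a = stage_a(xs[0])
    # Kernel: stage B of iteration i-1 and stage A of iteration i.
    for i in range(1, len(xs)):
        b = stage_b(a)
        a = stage_a(xs[i])
        out.append(b)
    # Epilogue: finish the last iteration.
    out.append(stage_b(a))
    return out
```

Both functions compute the same values; only the amount of code differs, and that difference grows with the number of overlapped stages.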

international conference on parallel processing | 2011

Optimal Data Allocation for Scratch-Pad Memory on Embedded Multi-core Systems

Yibo Guo; Qingfeng Zhuge; Jingtong Hu; Meikang Qiu; Edwin Hsing-Mean Sha

Multi-core systems have been a popular design for high-performance embedded systems. Scratch Pad Memory (SPM), a software-controlled on-chip memory, has been widely adopted in many embedded systems due to its small area and low energy consumption. Existing data allocation algorithms either cannot achieve optimal results or take exponential time to complete. In this paper, we propose a polynomial-time algorithm to solve the data allocation problem on multi-core systems with exclusive data copy. According to the experimental results, the proposed optimal data allocation method alone reduces the time cost of memory accesses by 16.45% on average compared with a greedy algorithm. The proposed data allocation algorithm can also reduce the energy cost significantly.
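The gap between greedy and optimal allocation can be seen in a much simpler setting than the one the paper addresses. The sketch below is not the paper's algorithm: it assumes a single SPM, treats allocation as a 0/1 knapsack, and uses a hypothetical `allocate_spm` function with made-up block sizes and access-time savings.

```python
def allocate_spm(blocks, capacity):
    """Choose data blocks to place in a single SPM (0/1 knapsack DP).

    blocks:   list of (name, size, saving), where saving is the access
              time saved by keeping the block in the SPM (hypothetical).
    capacity: SPM size, in the same integer units as block sizes.
    Returns (total_saving, tuple_of_chosen_names).
    """
    # best[c] holds the best (saving, chosen names) using at most c units.
    best = [(0, ()) for _ in range(capacity + 1)]
    for name, size, saving in blocks:
        # Iterate capacity downward so each block is used at most once.
        for c in range(capacity, size - 1, -1):
            candidate = best[c - size][0] + saving
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + (name,))
    return best[capacity]


# Hypothetical workload: a greedy pick of the largest saving takes "a"
# (saving 30) and then has no room left, while "b" and "c" together
# fit in 4 units and save 32.
blocks = [("a", 3, 30), ("b", 2, 14), ("c", 2, 18)]
total_saving, chosen = allocate_spm(blocks, 4)
```

The dynamic program runs in time proportional to the number of blocks times the SPM capacity, so it is polynomial only in the numeric capacity value; the multi-core, exclusive-copy problem in the abstract needs the paper's own formulation.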

international conference on parallel and distributed systems | 2005

Minimizing Energy via Loop Scheduling and DVS for Multi-Core Embedded Systems

Ying Chen; Zili Shao; Qingfeng Zhuge; Chun Xue; Bin Xiao; Edwin Hsing-Mean Sha

Low energy consumption is extremely important in real-time embedded systems, and scheduling is one of the techniques used to reduce it. In this paper, we propose loop scheduling algorithms for minimizing energy based on rotation scheduling and DVS (dynamic voltage and frequency scaling) for real-time multi-core embedded systems. The experimental results show that our algorithms perform better than list scheduling and pure ILP (integer linear programming) scheduling with DVS.
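Why DVS saves energy when a schedule leaves slack can be shown with a small sketch. It is not the paper's scheduling algorithm: it assumes the common simplified model in which dynamic energy per cycle scales with the square of the supply voltage and execution time is cycles divided by frequency. The `LEVELS` table, `energy`, and `pick_level` are hypothetical names with normalized, made-up values.

```python
# Hypothetical (frequency, voltage) operating points, normalized to 1.0.
LEVELS = [(1.0, 1.0), (0.75, 0.75), (0.5, 0.5)]


def energy(cycles, voltage):
    """Dynamic energy under the simplified model E = cycles * V^2."""
    return cycles * voltage * voltage


def pick_level(cycles, deadline, levels=LEVELS):
    """Return the lowest-energy (freq, voltage) that meets the deadline.

    Execution time is modeled as cycles / freq. Returns None when no
    operating point finishes in time.
    """
    feasible = [(f, v) for f, v in levels if cycles / f <= deadline]
    if not feasible:
        return None
    return min(feasible, key=lambda fv: energy(cycles, fv[1]))
```

With 100 cycles of work and a deadline of 200 time units, the slowest point still meets the deadline and uses a quarter of the full-speed energy under this model. A scheduler that shortens the schedule creates this kind of slack, which is why scheduling and voltage scaling are combined.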

IEEE Transactions on Signal Processing | 2004

Efficient variable partitioning and scheduling for DSP processors with multiple memory modules

Qingfeng Zhuge; Edwin Hsing-Mean Sha; Bin Xiao; Chantana Chantrapornchai

Multiple on-chip memory modules are attractive to many high-performance digital signal processing (DSP) applications. This architectural feature supports higher memory bandwidth by allowing multiple data memory accesses to be executed in parallel. However, making effective use of multiple memory modules remains difficult. The performance gain in this kind of architecture strongly depends on variable partitioning and scheduling techniques. In this paper, we propose a graph model known as the variable independence graph (VIG) and algorithms to tackle the variable partitioning problem. Our results show that VIG is more effective than the interference graph for solving the variable partitioning problem. Then, we present a scheduling algorithm known as the rotation scheduling with variable repartition (RSVR) to improve the schedule lengths efficiently on a multiple memory module architecture. This algorithm adjusts the variable partitions during scheduling and generates a compact schedule based on retiming and software pipelining. The experimental results show that the average improvement on schedule lengths is 44.8% by using RSVR with VIG. We also propose a design space exploration algorithm using RSVR to find the minimum number of memory modules and functional units satisfying a schedule length requirement. The algorithm produces more feasible solutions with an equal or smaller number of functional units compared with the method using the interference graph.

system on chip conference | 2010

Optimal scheduling to minimize non-volatile memory access time with hardware cache

Wei-Che Tseng; Chun Jason Xue; Qingfeng Zhuge; Jingtong Hu; Edwin Hsing-Mean Sha

In power and size sensitive embedded systems, flash memory and phase change memory are replacing DRAM as the main memory. Unfortunately, these technologies are limited by their endurance and long write latencies. To minimize the main memory access time, we optimally schedule tasks by an ILP formulation that can be generally applied to other main memory technologies, including DRAM. We also present a heuristic, Wander Scheduling, to solve larger instances in a reasonable amount of time. Our experimental results show that when compared with list scheduling, Wander Scheduling can reduce memory access times by an average of 40.73% and increase the lifetime of flash and phase change memory by 82.56%.

ACM Transactions on Design Automation of Electronic Systems | 2006

Loop scheduling with timing and switching-activity minimization for VLIW DSP

Zili Shao; Bin Xiao; Chun Xue; Qingfeng Zhuge; Edwin Hsing-Mean Sha

In embedded systems, high-performance DSP needs to be performed not only with high-data throughput but also with low-power consumption. This article develops an instruction-level loop-scheduling technique to reduce both execution time and bus-switching activities for applications with loops on VLIW architectures. We propose an algorithm, SAMLS (Switching-Activity Minimization Loop Scheduling), to minimize both schedule length and switching activities for applications with loops. In the algorithm, we obtain the best schedule from the ones that are generated from an initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show that our algorithm can reduce both schedule length and bus-switching activities. Compared with the work of Lee et al. [2003], SAMLS shows an average 11.5% reduction in schedule length and an average 19.4% reduction in bus-switching activities.

embedded and ubiquitous computing | 2006

Efficient algorithm of energy minimization for heterogeneous wireless sensor network

Meikang Qiu; Chun Xue; Zili Shao; Qingfeng Zhuge; Meilin Liu; Edwin Hsing-Mean Sha

Energy and delay are critical issues for wireless sensor networks since most sensors are equipped with non-rechargeable batteries that have limited lifetime. Due to the uncertainties in execution time of some tasks, this paper models each varied execution time as a probabilistic random variable and incorporates applications' performance requirements to solve the MAP (Mode Assignment with Probability) problem. Using probabilistic design, we propose an optimal algorithm to minimize the total energy consumption while satisfying the timing constraint with a guaranteed confidence probability. The experimental results show that our approach achieves significant energy savings over previous work. For example, our algorithm achieves an average improvement of 32.6% on total energy consumption.
