Publication


Featured research published by Wenrui Gong.


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2007

Ant Colony Optimizations for Resource- and Timing-Constrained Operation Scheduling

Gang Wang; Wenrui Gong; Brian DeRenzi; Ryan Kastner

Operation scheduling (OS) is a fundamental problem in mapping an application to a computational device. It takes a behavioral application specification and produces a schedule to minimize either the completion time or the computing resources required to meet a given deadline. The OS problem is NP-hard; thus, effective heuristic methods are necessary to provide high-quality solutions. We present novel OS algorithms using the ant colony optimization approach for both timing-constrained scheduling (TCS) and resource-constrained scheduling (RCS) problems. The algorithms use a unique hybrid approach by combining the MAX-MIN ant system metaheuristic with traditional scheduling heuristics. We compiled a comprehensive testing benchmark set from real-world applications in order to verify the effectiveness and efficiency of our proposed algorithms. For TCS, our algorithm achieves better results than force-directed scheduling on almost all test cases, with a maximum 19.5% reduction in the number of resources. For RCS, our algorithm outperforms a number of different list-scheduling heuristics with better stability and generates better results with up to 14.7% improvement. Our algorithms outperform the simulated annealing method for both scheduling problems in terms of quality, computing time, and stability.
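The abstract above names the MAX-MIN ant system but does not spell out its update rule. As a rough illustration only, the sketch below shows the generic MAX-MIN pheromone update: all trails evaporate, only the best solution is reinforced, and every trail is clamped to a fixed interval. The trail encoding as (operation, control step) pairs, the function name mmas_update, and all parameter values are assumptions made for this example, not details taken from the paper.

```python
def mmas_update(pheromone, best_solution, best_cost,
                rho=0.1, tau_min=0.01, tau_max=5.0):
    """Generic MAX-MIN ant system pheromone update (illustrative sketch).

    pheromone:     dict mapping a decision to its trail strength
    best_solution: iterable of decisions used by the best ant
    best_cost:     cost of that solution (lower is better)
    """
    deposit = 1.0 / best_cost          # better solutions deposit more
    chosen = set(best_solution)
    updated = {}
    for decision, tau in pheromone.items():
        tau = (1.0 - rho) * tau        # evaporation on every trail
        if decision in chosen:
            tau += deposit             # reinforcement for the best ant only
        # Clamping is what distinguishes MAX-MIN from the basic ant system.
        updated[decision] = min(tau_max, max(tau_min, tau))
    return updated


# Hypothetical encoding: (operation, control step) pairs.
trails = {("a", 0): 1.0, ("a", 1): 1.0, ("b", 1): 1.0, ("b", 2): 0.005}
best = [("a", 0), ("b", 1)]
new_trails = mmas_update(trails, best, best_cost=2.0)
```

The lower bound keeps rarely used decisions selectable, which is one common explanation for the stability that MAX-MIN variants tend to show.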


Journal of Low Power Electronics | 2005

Algorithm/Architecture Co-exploration for Designing Energy Efficient Wireless Channel Estimator

Yan Meng; Wenrui Gong; Ryan Kastner; Timothy Sherwood

Wireless networks are making the vision of ubiquitous computing a reality: users will be able to connect anytime and anywhere from anything. To achieve this vision, the next generation of wireless devices must learn about, and adapt to, the transmission environment through a process called channel estimation. In this paper, we describe a cross-cutting approach to explore the design space to solve the channel estimation problem on reconfigurable devices. In particular, we focus on the matching pursuit (MP) algorithm, which is a fast and accurate iterative algorithm for multipath channel estimation. Our methodology models modern reconfigurable devices as an array of Block RAM-level operation blocks ("BLOBs"), which act as flexible data paths. With the model, we describe design techniques and tradeoffs, resulting in novel optimizations at every level in building an energy-efficient MP core, from the theory and algorithms to the bit level. We present results from our design space exploration over a number of different parameters, including high-level characteristics of the application, data and computation partitioning schemes, and module- and bit-level low-power techniques. The results demonstrate the effectiveness and efficiency of our approach to building a high-speed and low-power channel estimator. The total power saving is 25.4%. We further show that the local, distributed computation is, on average, 145% faster, with minimum cost in power dissipation, than the global, centralized computation.
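For readers unfamiliar with matching pursuit, the following is a minimal sketch of the generic greedy algorithm, not the hardware core described in the abstract. It assumes a dictionary of unit-norm atoms; in channel estimation these would typically be delayed copies of a known waveform, but here a toy dictionary of unit impulses stands in for them. The names dot, matching_pursuit, atoms, and received are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy matching pursuit over unit-norm atoms (illustrative sketch).

    Returns (coeffs, residual), where coeffs maps atom index to the
    accumulated coefficient and residual is what remains unexplained.
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(n_iter):
        # Correlate the residual with every atom and keep the strongest match.
        projections = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(projections[i]))
        c = projections[best]
        if c == 0:
            break  # nothing left to explain
        coeffs[best] = coeffs.get(best, 0.0) + c
        # Remove the selected atom's contribution from the residual.
        residual = [r - c * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Toy dictionary: each atom is a unit impulse at one delay.
atoms = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
received = [0.0, 0.9, 0.0, -0.4]   # two paths, at delays 1 and 3
coeffs, residual = matching_pursuit(received, atoms, n_iter=2)
```

Each iteration is dominated by the correlation step, which is independent per atom. That independence is one reason the algorithm lends itself to partitioned, parallel computation.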


Design Automation Conference | 2006

Design space exploration using time and resource duality with the ant colony optimization

Gang Wang; Wenrui Gong; Brian DeRenzi; Ryan Kastner

Design space exploration during high-level synthesis is often conducted through ad hoc probing of the solution space using some scheduling algorithm. This is not only time-consuming but also very dependent on the designer's experience. We propose a novel design exploration method that exploits the duality between the time- and resource-constrained scheduling problems. Our exploration automatically constructs a high-quality time/area tradeoff curve in a fast, effective manner. It uses the MAX-MIN ant colony optimization to solve both the time- and resource-constrained scheduling problems. We switch between the time- and resource-constrained algorithms to quickly traverse the design space. Compared to using force-directed scheduling exhaustively at every time step, our algorithm significantly improves solution quality (an average 17.3% reduction in resource counts) with similar run time on a comprehensive benchmark suite constructed from classic and real-life samples. Our algorithms scale well over different applications and problem sizes.


International Conference on Computer Aided Design | 2006

On the use of Bloom filters for defect maps in nanocomputing

Gang Wang; Wenrui Gong; Ryan Kastner

While the exact manufacturing process for nanoscale computing devices is uncertain, it is abundantly clear that future technology nodes will see an increase in defect rates. Therefore, it is of paramount importance to construct new architectures and design methodologies that can tolerate large numbers of defects. Defect maps are a necessity in future design flows, and research on their practical construction is essential. In this work, we study the use of Bloom filters as a data structure for defect maps. We show that Bloom filters provide the right tradeoff between accuracy and space efficiency. In particular, they can help simplify the nanosystem design flow by embedding defect information within the nanosystem delivered by the manufacturers. We develop a novel nanoscale memory design that uses this concept. It does not rely on a voting strategy, and utilizes the device redundancy more effectively than existing approaches.


Great Lakes Symposium on VLSI | 2005

Instruction scheduling using MAX-MIN ant system optimization

Gang Wang; Wenrui Gong; Ryan Kastner

Instruction scheduling is a fundamental step in mapping an application to a computational device. It takes a behavioral application specification and produces a schedule for the instructions onto a collection of processing units. The objective is to minimize the completion time of the given application while effectively utilizing the computational resources. The instruction scheduling problem is NP-hard; thus, effective heuristic methods are necessary to provide a high-quality scheduling solution. In this paper, we present a novel instruction scheduling algorithm using the MAX-MIN ant system optimization approach. The algorithm utilizes a unique hybrid approach by combining the ant system metaheuristic with list scheduling, where the local and global heuristics are dynamically adjusted to the input application in an iterative manner. Compared with force-directed scheduling and a number of different list scheduling heuristics, our algorithm generates better results over all the tested benchmarks with better stability. Furthermore, by solving the test samples optimally using an ILP formulation, we show that our algorithm consistently achieves a near-optimal solution.
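The list scheduling component mentioned in the abstract can be illustrated with a small sketch. The version below is a textbook resource-constrained list scheduler, assuming unit-latency operations, a single pool of identical processing units, and an acyclic dependency graph; these simplifications, the function name list_schedule, and the example graph are assumptions for illustration. The priority values are passed in as a parameter, which is where a metaheuristic such as the ant system could supply its own ordering.

```python
def list_schedule(deps, priority, num_units):
    """Resource-constrained list scheduling (illustrative sketch).

    deps:      dict mapping each operation to a list of its predecessors
    priority:  dict mapping each operation to a number (higher goes first)
    num_units: identical processing units available per cycle
    Returns a dict mapping each operation to its start cycle.
    Assumes unit latency and an acyclic dependency graph.
    """
    start = {}
    remaining = set(deps)
    cycle = 0
    while remaining:
        # An operation is ready once all predecessors finished in earlier cycles.
        ready = [op for op in remaining
                 if all(p in start and start[p] < cycle for p in deps[op])]
        # Highest priority first; ties broken by name for determinism.
        ready.sort(key=lambda op: (-priority[op], op))
        for op in ready[:num_units]:
            start[op] = cycle
            remaining.discard(op)
        cycle += 1
    return start


# Hypothetical data-flow graph; priority is the longest path to the sink.
deps = {"a": [], "b": [], "c": [], "d": ["a", "b"], "e": ["c", "d"]}
priority = {"a": 3, "b": 3, "c": 2, "d": 2, "e": 1}
schedule = list_schedule(deps, priority, num_units=2)
```

The quality of the result depends entirely on the priority function, which is why replacing a fixed heuristic with priorities that adapt to the input application can pay off.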


Design, Automation, and Test in Europe | 2006

Layout Driven Data Communication Optimization for High Level Synthesis

Ryan Kastner; Wenrui Gong; Xin Hao; Forrest Brewer; Adam Kaplan; P. Brisbane; Majid Sarrafzadeh

High-level synthesis transformations play a major part in shaping the properties of the final circuit. However, most optimizations are performed without much knowledge of the final circuit layout. In this paper, we present a physically aware design flow for mapping high-level application specifications to a synthesizable register-transfer-level hardware description. We study the problem of optimizing the data communication of the variables in the application specification. Our algorithm uses floorplan information to guide the optimization. We develop a simple, yet effective, incremental floorplanner to handle the perturbations caused by the data communication optimization. We show that the proposed techniques can reduce the wirelength of the final design, while maintaining a legal floorplan with the same area as the initial floorplan.


International Conference on Computer Aided Design | 2005

Storage assignment during high-level synthesis for configurable architectures

Wenrui Gong; Gang Wang; Ryan Kastner

Modern, high-performance configurable architectures integrate on-chip, distributed block RAM modules to provide ample data storage. Synthesizing applications to these complex systems requires an effective and efficient approach to conduct data partitioning and storage assignment. In this paper, we present a data and iteration space partitioning solution that focuses on minimizing remote memory accesses or, equivalently, maximizing the local computation. Using the same code but different data partitionings, we can achieve faster clock frequencies, without increasing the number of cycles, by simply minimizing global memory accesses. Other optimization techniques like scalar replacement, prefetching, and buffer insertion can further minimize remote accesses and lead to an average 4.8× speedup in overall runtime.


Field-Programmable Custom Computing Machines | 2006

Defect-Tolerant Nanocomputing Using Bloom Filters

Gang Wang; Wenrui Gong; Ryan Kastner

The authors propose a novel defect-tolerant design methodology using Bloom filters for defect mapping in nanoscale computing devices. It is a general approach that can be used for any permanent defects incurred during the manufacturing process. The redundant design methodology does not rely on a voting strategy; thus, it utilizes the device redundancy more effectively than existing approaches. Additionally, the method does not produce false positives in defect identification, i.e., it will not report a defective device as functional. Moreover, it is very space-efficient and can be programmed to fit different scales and characteristics of the underlying nanoscale devices used in the system.
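The guarantee described above follows from a basic property of Bloom filters: an inserted element is always found again. If the filter stores the addresses of defective devices, a defective device can never be reported as functional, although a functional device may occasionally be flagged as defective. The sketch below illustrates this under assumptions of its own: the class name DefectMap, the SHA-256 based hashing, the integer device addresses, and the filter sizes are all hypothetical and are not taken from the paper.

```python
import hashlib


class DefectMap:
    """Bloom filter holding addresses of defective devices (illustrative sketch)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, address):
        # Derive num_hashes bit positions by salting one cryptographic hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{address}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def mark_defective(self, address):
        for pos in self._positions(address):
            self.bits[pos] = 1

    def may_be_defective(self, address):
        # True for every marked address; may also be True for some others.
        return all(self.bits[pos] for pos in self._positions(address))


defect_map = DefectMap(num_bits=256, num_hashes=3)
defects = [3, 17, 42]
for address in defects:
    defect_map.mark_defective(address)

# Only devices the filter clears are used; defective ones are never among them.
usable = [a for a in range(64) if not defect_map.may_be_defective(a)]
```

The cost of the one-sided error is that a few working devices may be skipped, so the filter size and hash count trade storage against wasted redundancy.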


Journal of Embedded Computing | 2006

Application partitioning on programmable platforms using the ant colony optimization

Gang Wang; Wenrui Gong; Ryan Kastner


Archive | 2005

Data Partitioning for Reconfigurable Architectures with Distributed Block RAM

Wenrui Gong; Gang Wang; Ryan Kastner

Collaboration


Dive into Wenrui Gong's collaboration.

Top Co-Authors

Gang Wang
University of California

Brian DeRenzi
University of California

Yan Meng
University of California

Adam Kaplan
University of California

Forrest Brewer
University of California

P. Brisbane
University of California

Xin Hao
University of California