Renqiu Huang
University of Cincinnati
Publication
Featured research published by Renqiu Huang.
international parallel and distributed processing symposium | 2004
Renqiu Huang; Ranga Vemuri
Summary form only given. Incorporating physical information into earlier architectural and logic synthesis stages is highly desirable since it allows more realistic exploration of the design space and the generation of solutions with predictable metrics. We present a forward-looking synthesis methodology in which we weigh all nets in the control data flow graph (CDFG) according to their criticality. We cluster operations in the CDFG into macros while satisfying logical and physical constraints. We perform relational placement on these macros. We have evaluated the proposed approach on a set of benchmark designs by comparing it with the results of a traditional synthesis flow. The results show that our methodology achieves up to 26% improvement in clock frequency without any area overhead, and an average 12.7% improvement in critical path delay with little or no place-and-route time overhead.
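The abstract does not say how net criticality is computed. As a minimal sketch of one common way to do it, the Python below weights each CDFG edge by the longest path passing through it, normalized by the critical path length. The function name `net_criticality`, the per-operation delays, and the weighting formula are all illustrative assumptions, not the authors' method.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def net_criticality(delay, edges):
    """Weight each edge (u, v) by longest-path-through-edge / critical-path length.

    delay: dict mapping operation name -> delay (hypothetical units)
    edges: list of (u, v) data dependencies in an acyclic graph
    NOTE: this weighting is an assumed stand-in, not the paper's formula.
    """
    preds = {n: set() for n in delay}
    succs = {n: set() for n in delay}
    for u, v in edges:
        preds[v].add(u)
        succs[u].add(v)

    # Topological order: every node appears after all of its predecessors.
    order = list(TopologicalSorter(preds).static_order())

    # Longest path ending at n (including n's own delay).
    arrive = {}
    for n in order:
        arrive[n] = delay[n] + max((arrive[p] for p in preds[n]), default=0)

    # Longest path starting at n (including n's own delay).
    tail = {}
    for n in reversed(order):
        tail[n] = delay[n] + max((tail[s] for s in succs[n]), default=0)

    critical = max(arrive.values())
    return {(u, v): (arrive[u] + tail[v]) / critical for u, v in edges}


# Hypothetical four-operation graph: a and b feed c, and c feeds d.
delays = {"a": 2, "b": 1, "c": 3, "d": 1}
deps = [("a", "c"), ("b", "c"), ("c", "d")]
weights = net_criticality(delays, deps)
```

Edges on the critical path get a weight of 1.0 and edges with slack get smaller weights, which a clustering or placement step could then use to keep critical nets short.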
international conference on vlsi design | 2005
Renqiu Huang; Ranga Vemuri
An important application of dynamically and partially reconfigurable computing platforms is in dynamic task allocation and execution. On-line synthesis, on-line placement and on-line routing are the three essential steps in implementing an incoming task on the FPGA during run-time. Whereas there has been some research in on-line placement, on-line synthesis has received relatively little attention. We present what is believed to be the first on-line synthesis methodology for partially reconfigurable FPGAs. In on-line synthesis, the time for synthesis should be kept low while ensuring the placeability of the synthesized design on the FPGA in the available empty area and meeting the performance requirements. We ensure placeability by considering and maintaining the available area on the FPGA surface as a collection of maximal empty rectangles. The proposed synthesizer allocates the FPGA resources adaptively and is incremental in nature. The algorithm is designed to be linear in the number of operations to ensure its on-line usage. Our experimental results demonstrate the advantages of the proposed approach.
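The placeability idea can be illustrated with a small sketch: if the free area is kept as a list of maximal empty rectangles, a rectangular task fits whenever at least one rectangle is large enough in both dimensions. The `Rect` type, the `find_fit` helper, and the best-fit (smallest area) tie-breaking rule below are hypothetical choices for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    """An empty rectangle on the FPGA surface, in logic-block units (assumed)."""
    x: int
    y: int
    w: int
    h: int


def find_fit(empty_rects: List[Rect], task_w: int, task_h: int) -> Optional[Rect]:
    """Return the smallest-area empty rectangle that can hold the task, or None.

    Best-fit selection is an assumption; any rectangle with w >= task_w and
    h >= task_h would prove placeability.
    """
    candidates = [r for r in empty_rects if r.w >= task_w and r.h >= task_h]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.w * r.h)


# Hypothetical free area: three overlapping maximal empty rectangles.
empty = [Rect(0, 0, 4, 10), Rect(0, 6, 12, 4), Rect(8, 0, 4, 10)]

tall_task = find_fit(empty, 3, 8)    # fits in a 4x10 column
wide_task = find_fit(empty, 10, 3)   # fits only in the 12x4 strip
too_big = find_fit(empty, 6, 6)      # no single rectangle is large enough
```

Each query is a single pass over the rectangle list, which is in keeping with the low synthesis-time requirement the abstract describes. Updating the rectangle set after a placement is a separate step not shown here.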
international conference on computer aided design | 2004
Renqiu Huang; Ranga Vemuri
In this paper, a cluster-based FPGA is proposed. The proposed FPGA has a hybrid interconnect structure which takes advantage of both mesh and tree topologies. We analyze the area and performance of the proposed FPGA in terms of the needed switches by comparing them with those of conventional FPGAs. We evaluate the proposed architecture on a series of benchmark designs. The experimental results show that the proposed model can significantly reduce the routing area, achieve high performance and admit more implementations of various designs at the price of a modest increase in the switches required for that architecture.
field-programmable logic and applications | 2004
Renqiu Huang; Manish Handa; Ranga Vemuri
Dynamically reconfigurable devices allow run-time reconfiguration to permit execution of incoming tasks or task fragments. One of the important issues in run-time reconfiguration is the fragmentation of the device area as the reconfigurable blocks are allocated and released when tasks are placed, executed and deleted. Due to those scattered, unused resources, an incoming application may not be placeable or routable. A cluster-based reconfigurable FPGA architecture is proposed to alleviate this difficulty. We present an assessment of the proposed architecture. We develop a fast evaluation tool to simulate on-line placement and routing effects on a run-time reconfigurable platform. The simulation results show the efficiency of the proposed architecture in relieving the fragmentation problem at the price of a modest increase in the number of switches.
great lakes symposium on vlsi | 2006
Renqiu Huang; Ranga Vemuri
Without adequate awareness of the trade-offs between different resources, it is extremely difficult for system synthesis tools to achieve high performance solutions when mapping applications to FPGA-based computing engines. In this paper, we present an automatic synthesis methodology which attacks both memory and logic assignments by interacting with behavioral synthesis. The problem is formulated as part of the heuristic algorithm by exploiting application specific information and organizing possible data structures and computations for data-intensive applications. We have evaluated the proposed framework on a set of DSP benchmarks and a real multimedia application by generating register-transfer level (RTL) implementations. The results show that, with our proposed techniques, the synthesized designs can obtain significant (average of 34.8%) performance improvements over conventional synthesis approaches.
ieee computer society annual symposium on vlsi | 2005
Renqiu Huang; Ranga Vemuri
Mesh interconnect can be efficiently utilized while tree networks encourage short routing distances. In this paper, we present the property analysis of a cluster-based interconnect model on a set of key architectural parameters. The evaluations show that the analyzed structure is not only insensitive to the various parameters for the formulation of that architecture, but also admits more designs with potentially high performance improvement.
field-programmable logic and applications | 2005
Renqiu Huang; Ranga Vemuri
In this abstract, we have presented our research efforts toward the integration of physical synthesis with high level synthesis. By incorporating physical considerations into high level specification, we restrict computations and communications to geographic proximities while preserving the quality of the final result to a large extent within the limited resources of FPGAs. We believe that the proposed methodology provides possible directions for synthesis unification of high level abstraction and lower level implementation, and is on the right track towards achieving a well-balanced synthesis result (or even a globally optimum mapping, which is the long-run objective of PAHLS).
field-programmable logic and applications | 2003
Renqiu Huang; Tommy Cheung; Ted Chi-Wah Kok
This paper investigates an analysis tool for the routing resources in FPLD architecture design. The developed tool can assess the performance of a given architecture specified by the physical configuration of logic blocks and the switch box topology. Two problems are mainly considered in this paper: given an architecture, the terminal distribution of each switch box is first determined via probabilistic assumptions, then the sizes of the universal switch boxes required for successful routing are evaluated. The estimations are validated by comparing them with the results obtained in a previously published experimental study on FPGA benchmark circuits. Moreover, our result confirms that the universal switch block is a good candidate for FPLD design.
ERSA | 2004
Jawad Khan; Jayanthi Rajagopalan; Renqiu Huang; Ranga Vemuri
Physical aware high level synthesis and interconnect for fpgas | 2006
Ranga Vemuri; Renqiu Huang