Iyad Ouaiss
University of Cincinnati
Publication
Featured research published by Iyad Ouaiss.
international parallel processing symposium | 1998
Iyad Ouaiss; Sriram Govindarajan; Vinoo Srinivasan; Meenakshi Kaul; Ranga Vemuri
This paper presents an integrated design system called SPARCS (Synthesis and Partitioning for Adaptive Reconfigurable Computing Systems) for automatically partitioning and synthesizing designs for reconfigurable boards with multiple field-programmable gate arrays (FPGAs). The SPARCS system accepts design specifications at the behavior level, in the form of task graphs. The system contains a temporal partitioning tool to temporally divide and schedule the tasks on the reconfigurable architecture, a spatial partitioning tool to map the tasks to individual FPGAs, and a high-level synthesis tool to synthesize efficient register-transfer level designs for each set of tasks destined to be downloaded on each FPGA. Commercial logic and layout synthesis tools are used to complete logic synthesis, placement, and routing for each FPGA design segment. A distinguishing feature of the SPARCS system is the tight integration of the partitioning and synthesis tools to accurately predict and control design performance and resource utilization. This paper presents an overview of SPARCS and the various algorithms used in the system, along with a brief description of how a JPEG-like image compression algorithm is mapped to a multi-FPGA board using SPARCS.
design automation conference | 1999
Meenakshi Kaul; Ranga Vemuri; Sriram Govindarajan; Iyad Ouaiss
We present an automated temporal partitioning and loop transformation approach for developing dynamically reconfigurable designs starting from behavior-level specifications. An Integer Linear Programming (ILP) model is formulated to achieve near-optimal latency designs. We also present a loop restructuring method to achieve maximum throughput for a class of DSP applications. This restructuring transformation is performed on the temporally partitioned behavior and results in near-optimal throughput. We discuss efficient memory mapping and address generation techniques for the synthesis of reconfigurable designs. A case study on the Joint Photographic Experts Group (JPEG) image compression algorithm demonstrates the effectiveness of our approach.
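The idea of temporal partitioning can be illustrated with a small sketch. The code below is not the ILP model from the paper; it swaps in a simple greedy heuristic, and the task names, areas, and the single `capacity` figure are hypothetical. It places each task in the earliest configuration (segment) that comes no earlier than its predecessors' segments and still has area left.

```python
def temporal_partition(tasks, deps, capacity):
    """Greedy stand-in for temporal partitioning (not an ILP).

    tasks:    dict task name -> area estimate (hypothetical units)
    deps:     dict task name -> list of predecessor task names
    capacity: area available in one configuration of the device
    Returns a list of segments, each a list of task names.
    """
    segment_of = {}   # task name -> index of the segment it was placed in
    used = []         # used[i] = area already occupied in segment i
    segments = []     # segments[i] = task names placed in segment i
    remaining = set(tasks)
    while remaining:
        # Tasks whose predecessors have all been placed are ready.
        ready = sorted(t for t in remaining
                       if all(p in segment_of for p in deps.get(t, [])))
        if not ready:
            raise ValueError("dependency cycle in task graph")
        for t in ready:
            if tasks[t] > capacity:
                raise ValueError(f"task {t} does not fit in one configuration")
            # A task may not execute before any of its predecessors.
            earliest = max((segment_of[p] for p in deps.get(t, [])), default=0)
            for i in range(earliest, len(segments)):
                if used[i] + tasks[t] <= capacity:
                    break
            else:
                segments.append([])
                used.append(0)
                i = len(segments) - 1
            segments[i].append(t)
            used[i] += tasks[t]
            segment_of[t] = i
            remaining.remove(t)
    return segments


# Hypothetical JPEG-like pipeline with made-up area estimates.
tasks = {"dct": 60, "quant": 30, "zigzag": 20, "huffman": 50}
deps = {"quant": ["dct"], "zigzag": ["quant"], "huffman": ["zigzag"]}
segments = temporal_partition(tasks, deps, capacity=100)
```

With these numbers the first two tasks share one configuration and the last two share a second one. An ILP formulation would search for the latency-optimal partition rather than the first feasible one.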
design, automation, and test in europe | 2001
Iyad Ouaiss; Ranga Vemuri
One step in the synthesis for FPGA-based Reconfigurable Computers (RCs) involves mapping the design data structures onto the physical memory banks available in the hardware. The advent of Xilinx Virtex-style FPGAs and of hierarchical memory schemes on reconfigurable boards introduced an added complexity to this mapping. The new RC boards offer a wealth of memory banks, many of them on-chip (such as the BlockRAMs available in the Virtex architecture) and many of them offering a variable number of ports and several depth/width configurations. Along with the external RAMs, a hierarchy of memories with varying access performance is available in a reconfigurable computer. It becomes critical to perform a good mapping to achieve optimal design performance. This paper presents an automatic memory mapping methodology that takes into account the number of words and word size of design data segments and physical memory banks, the number of ports on the banks, the access latency of the banks, the proximity of the banks to the processing unit, and life cycle analysis of data segments; it also incorporates configuration selection from the multiple configurations available in BlockRAMs of Virtex series FPGAs. In the case of multiple processing elements on board, the paper also provides a framework in which the task of memory mapping interacts with spatial partitioning to provide the best implementation.
field-programmable custom computing machines | 1998
Sriram Govindarajan; Iyad Ouaiss; Meenakshi Kaul; Vinoo Srinivasan; Ranga Vemuri
The SPARCS system is an integrated partitioning and synthesis environment for reconfigurable architectures. In this paper, we use the Joint Photographic Experts Group (JPEG) image compression algorithm as a design example to demonstrate the effectiveness of dynamic reconfiguration achieved using SPARCS. We present a typical design process using the SPARCS system consisting of temporal partitioning, spatial partitioning, and design synthesis. The results, obtained on a commercial RC architecture, show that the multiply-reconfigured version of the JPEG compression algorithm achieves reasonable improvement in execution times compared to the one-time configured version.
field-programmable custom computing machines | 2004
Hassan Al Atat; Iyad Ouaiss
The trend in new state-of-the-art FPGAs is to have large amounts of on-chip embedded memory blocks. These memory blocks are used to hold the input/output data for various applications. Existing register binding techniques in high-level synthesis aim at minimizing the storage requirements of circuits by sharing registers among variables, thus minimizing the number of registers required for a specific design. In this paper, a new technique is proposed that makes use of the existing embedded memory blocks and maps variables to these blocks. The proposed memory binding approach gives a considerable performance increase over the existing register binding techniques. The memory binding technique resulted in up to 57% savings in the total chip area (number of logic cells/elements occupied on the FPGA) over the old register binding techniques for a small resource bag and up to 6% savings for a large resource bag.
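As background for the register binding baseline mentioned above, here is a minimal sketch of lifetime-based register sharing using a left-edge style first-fit pass. It is only an illustration of the general idea, not the memory binding technique proposed in the paper; the variable names and lifetimes are hypothetical.

```python
def left_edge_binding(lifetimes):
    """Bind variables to registers so that overlapping lifetimes never share.

    lifetimes: dict variable name -> (start, end), live on the half-open
               interval [start, end) in control steps.
    Returns dict variable name -> register index.
    """
    free_at = []   # free_at[r] = control step at which register r is free again
    binding = {}
    # Visit variables in order of increasing start time (the "left edge").
    for name, (start, end) in sorted(lifetimes.items(), key=lambda kv: kv[1]):
        for r, t in enumerate(free_at):
            if t <= start:          # register r is idle by the time we need it
                free_at[r] = end
                binding[name] = r
                break
        else:                       # no idle register: allocate a new one
            free_at.append(end)
            binding[name] = len(free_at) - 1
    return binding


# Hypothetical lifetimes for four variables.
lifetimes = {"a": (0, 2), "b": (1, 3), "c": (2, 4), "d": (3, 5)}
binding = left_edge_binding(lifetimes)
```

Here four variables fit in two registers because `a` and `c`, and `b` and `d`, are never live at the same time.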
design, automation, and test in europe | 2000
Iyad Ouaiss; Ranga Vemuri
In a multi-FPGA synthesis system, ideally the designer has only an abstract view of the board architecture. This abstract modeling of the underlying reconfigurable computer poses complex challenges to the synthesis and partitioning tools. Since the design specification is not constrained by the number of memory segments on the board or the number of pins between FPGAs, it is difficult for the CAD tools to transform the design into one that maps onto the multi-FPGA board. This paper describes an arbitration mechanism that bridges the abstraction gap between the implicit design and the reconfigurable architecture. Since this mechanism allows such architecture abstraction between the design and the board, it becomes easier to port a design from one target architecture to another. This arbitration mechanism introduces very little overhead in terms of area and delay. It has been used in data-dominated applications; in this paper, a fast Fourier transform (FFT) is shown as an illustrative example.
Computer Networks | 2010
Wissam Fawaz; Iyad Ouaiss; Ken Chen
This article addresses the ubiquitous topic of quality of service (QoS) aware connection provisioning in wavelength-routed WDM optical networks. The impact of the connection setup time of an optical connection has not been adequately addressed in the open literature. As such, this paper presents a novel approach that uses the optical connection setup time as a service differentiator during connection provisioning. The proposed approach utilizes the Earliest Deadline First (EDF) queueing algorithm to achieve deadline-based connection setup management, with the deadline being the setup time requirement of an optical connection. The proposed EDF-based approach would allow the network operator to improve the QoS perceived by the end clients. Performance of this novel scheme is analyzed by accurately calculating various parameters, such as the fraction of connections provisioned on time (i.e., prior to deadline expiration) and the average time it takes to successfully set up a connection. In addition, the presented approach is validated by a simulation that analyzes the performance of the proposed connection setup scheme in the specific context of the National Science Foundation Network (NSFNET). The obtained results show that a deadline-based setup strategy can minimize blocking probability while achieving QoS differentiation.
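The EDF queueing discipline itself is easy to sketch. The example below is a minimal illustration, not the paper's scheme: the class name, request identifiers, and the policy of discarding requests whose deadline has already passed are all assumptions made for the example.

```python
import heapq


class EDFSetupQueue:
    """Pending connection requests, served earliest setup deadline first."""

    def __init__(self):
        self._heap = []
        self._counter = 0   # tie-breaker: equal deadlines keep arrival order

    def add(self, request_id, deadline):
        heapq.heappush(self._heap, (deadline, self._counter, request_id))
        self._counter += 1

    def next_request(self, now):
        """Pop the pending request with the earliest unexpired deadline.

        Requests whose deadline is already in the past are dropped
        (an assumed policy for this sketch). Returns None when empty.
        """
        while self._heap:
            deadline, _, request_id = heapq.heappop(self._heap)
            if deadline >= now:
                return request_id
        return None


queue = EDFSetupQueue()
queue.add("A", deadline=5.0)
queue.add("B", deadline=2.0)
queue.add("C", deadline=9.0)
# At time 3.0, request B has already missed its deadline.
served = [queue.next_request(now=3.0) for _ in range(3)]
```

In this run `B` is dropped as late, `A` is served before `C` because its deadline is earlier, and the third call finds the queue empty.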
reconfigurable computing and fpgas | 2005
Annie Avakian; Iyad Ouaiss
When variables are assigned to registers or memories in FPGAs, multiplexers are needed for correct operation of the design. These multiplexers are needed at the inputs of registers or memories if different functional units write to the same storage unit. Since in FPGAs the area covered by multiplexers is large compared with the area of the overall design, reducing the area of the multiplexers can reduce the overall area occupied by a design. Reducing the area of a design is essential to efficiently utilize the logic area of the FPGAs. This paper proposes a solution that applies simulated annealing after binding variables to storage elements. This solution optimizes the assignment of variables onto registers when standard techniques such as clique partitioning are used, and onto on-chip memory banks when two different memory binding techniques are used. The savings obtained in terms of multiplexer area reach 27% with an average of 16%; moreover, the overall logic area savings reach 17% with an average of 7%.
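A post-binding annealing pass of this kind might look like the sketch below. The cost function (number of multiplexer inputs, counted as distinct writers per register when there is more than one), the single-variable move, the cooling schedule, and all names and data are assumptions made for illustration; the paper's actual cost model and moves may differ.

```python
import math
import random


def overlaps(a, b):
    """True if half-open lifetimes a and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


def mux_cost(binding, writer):
    """Assumed cost: mux inputs summed over registers with several writers."""
    sources = {}
    for var, reg in binding.items():
        sources.setdefault(reg, set()).add(writer[var])
    return sum(len(s) for s in sources.values() if len(s) > 1)


def is_legal(binding, lifetimes, var, reg):
    """var may move to reg only if no variable already there overlaps it."""
    return all(not overlaps(lifetimes[var], lifetimes[other])
               for other, r in binding.items() if r == reg and other != var)


def anneal(binding, lifetimes, writer, n_regs,
           steps=2000, t0=2.0, alpha=0.995, seed=0):
    rng = random.Random(seed)
    current = dict(binding)
    cost = mux_cost(current, writer)
    best, best_cost = dict(current), cost
    temp = t0
    names = sorted(current)
    for _ in range(steps):
        var, reg = rng.choice(names), rng.randrange(n_regs)
        if reg != current[var] and is_legal(current, lifetimes, var, reg):
            old = current[var]
            current[var] = reg
            delta = mux_cost(current, writer) - cost
            # Accept improvements always, worsening moves with Boltzmann odds.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                cost += delta
                if cost < best_cost:
                    best, best_cost = dict(current), cost
            else:
                current[var] = old   # reject the move
        temp = max(temp * alpha, 1e-3)
    return best, best_cost


# Hypothetical design: four variables, two functional units, three registers.
lifetimes = {"a": (0, 2), "b": (1, 3), "c": (2, 4), "d": (3, 5)}
writer = {"a": "ALU", "b": "MUL", "c": "MUL", "d": "ALU"}
initial = {"a": 0, "b": 1, "c": 0, "d": 1}
best, best_cost = anneal(initial, lifetimes, writer, n_regs=3)
```

The pass keeps the best binding seen, so the result is never worse than the initial binding under the assumed cost, and every accepted move preserves lifetime compatibility.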
IEEE Transactions on Signal Processing | 2010
Samer S. Saab; Jad G. Hobeika; Iyad Ouaiss
Communication applications are increasingly relying on spread-spectrum techniques requiring the use of different types of pseudorandom noise generators (PRNGs). Such generators typically produce periodic deterministic signals, with key attributes of PRNGs being: signals produced have long periods, a large number of weakly correlated signals is produced with compatible spectral properties, most of the signal power of generated signals is contained in the desired frequency band, and arbitrary band selectivity of produced signals. Random generators can also be used for band jamming, with key attributes for band jamming being: most of the signal power is contained in the desired frequency band, arbitrary band selectivity, and a considerably flat power spectral density within the selected band. In this paper, a novel PRNG approach is proposed that can be used in several applications, including spread-spectrum techniques, as well as in band jamming. The signals produced by the proposed generator are based on a linear combination of continuous-time composite sinusoidal functions. Numerical examples are included in order to illustrate the performance of the proposed generator.
international parallel and distributed processing symposium | 2001
Iyad Ouaiss; Ranga Vemuri
Synthesizing designs for FPGA-based reconfigurable systems involves the task of mapping variables and data structures of the application onto RAMs of the reconfigurable board. The variety in types and performance of onboard and on-chip RAMs, their proximity to the processing units, and the interconnection scheme of the reconfigurable system, all contribute to an intricate memory mapping problem. An intelligent memory assignment minimizes the total latency of the design and the interconnection requirements due to memory accesses. A complete Integer Linear Programming (ILP) formulation of the problem results in an optimized memory mapping; however, the formulation is complex and takes a very long time to produce a solution. In order to efficiently solve the problem, the concept of global/detailed memory mapping is introduced in this paper. An ILP formulation of the global mapping process is described. This formulation is simpler and faster than the complete formulation, and it leaves the task of detailed mapping to a post-ILP tool that does not affect the optimality of the memory assignment. As a result, larger designs can be handled at a faster rate and more constraints can be introduced to the formulation.
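To make the mapping problem concrete, the sketch below assigns data segments to memory banks so that total access latency is minimized under capacity limits. It is not the paper's ILP formulation: it uses exhaustive search in place of an ILP solver, ignores ports, interconnect, and lifetimes, and all segment and bank figures are hypothetical.

```python
from itertools import product


def map_segments(segments, banks):
    """Exhaustive stand-in for an ILP memory mapping.

    segments: dict name -> (size in words, number of accesses)
    banks:    dict name -> (capacity in words, access latency in cycles)
    Returns (assignment, cost) minimizing sum(accesses * latency),
    or (None, None) if no assignment respects the capacities.
    """
    seg_names = sorted(segments)
    bank_names = sorted(banks)
    best, best_cost = None, None
    for choice in product(bank_names, repeat=len(seg_names)):
        # Capacity constraint: words placed in a bank must fit in it.
        used = {b: 0 for b in bank_names}
        for s, b in zip(seg_names, choice):
            used[b] += segments[s][0]
        if any(used[b] > banks[b][0] for b in bank_names):
            continue
        # Objective: total latency spent on memory accesses.
        cost = sum(segments[s][1] * banks[b][1]
                   for s, b in zip(seg_names, choice))
        if best_cost is None or cost < best_cost:
            best, best_cost = dict(zip(seg_names, choice)), cost
    return best, best_cost


# Hypothetical data segments and a two-level memory hierarchy.
segments = {"coeffs": (256, 1000), "frame": (4096, 200), "lut": (128, 500)}
banks = {"blockram": (512, 1), "sram": (8192, 3)}
assignment, cost = map_segments(segments, banks)
```

The two small, frequently accessed segments land in the fast on-chip bank and the large frame buffer goes to external SRAM. Exhaustive search grows exponentially with the number of segments, which is the kind of scaling problem a simplified global formulation is meant to ease.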