Proshanta Saha | Researchain

Publication

Featured research published by Proshanta Saha.

field-programmable custom computing machines | 2007

Software/Hardware Co-Scheduling for Reconfigurable Computing Systems

Proshanta Saha; Tarek A. El-Ghazawi

We report the results of an FPGA implementation of double precision floating-point division with IEEE rounding. We achieve a total latency (i.e., cycles times clock period) that is 2.6 times smaller than the latency of the fastest previous implementation on FPGAs. The amount of hardware, on the other hand, is comparable to commercial cores. The division circuit is based on Goldschmidt's algorithm. All IEEE rounding modes are supported and are implemented using dewpoint rounding. The precision of the initial approximation of the reciprocal is 14 bits. To save hardware and reduce the critical path, a half-sized 62x30 Booth radix-8 multiplier is used. This multiplier can receive both the multiplicand and the multiplier in carry-save representation. The division circuit is partitioned into four pipeline stages, has a latency of 11 cycles, and may restart a new double precision division operation after 8 cycles. Synthesis results of an implementation (not including the computation of the initial approximation of the reciprocal and the exponent path) guarantee a clock frequency of 131 MHz on an Altera Stratix II using 3592 ALMs. The implementation was successfully tested with over 10 million random vectors as well as over a million hard-to-round vectors.

A formal methodology for automatic hardware-software partitioning and co-scheduling between the microprocessor (μP) and the FPGA has not yet been established. Current work in automatic task partitioning and scheduling for reconfigurable systems strictly addresses the FPGA hardware and does not take advantage of the synergy between the microprocessor and the FPGA. In this work, we consider the problem of co-scheduling task graphs on reconfigurable systems. The target systems have an execution model which allows any subtask that can run on the FPGA to also run on the microprocessor, and allows reconfigurability of the FPGA (subject to area, performance, resource, and timing constraints). In this paper, we introduce a new heuristic algorithm for such hardware/software co-scheduling, ReCoS. It will be shown that the proposed algorithm provides up to an order of magnitude improvement in scheduling and execution times when compared with hardware/software co-schedulers found in the embedded systems area, after adapting them for reconfigurable computing.
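
The execution model described in this abstract (every subtask may run on either the microprocessor or the FPGA, and the FPGA can be reconfigured at a cost) can be illustrated with a small greedy list scheduler. This is only a sketch under stated assumptions and is not the ReCoS algorithm: the `Task` fields, the single `reconfig_time` constant, and the earliest-finish-time rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    sw_time: float      # assumed run time on the microprocessor
    hw_time: float      # assumed run time on the FPGA once it is configured
    deps: tuple = ()    # names of tasks that must finish first


def co_schedule(tasks, reconfig_time):
    """Place each task on the device where it would finish earliest.

    `tasks` must already be in topological order. The FPGA pays
    `reconfig_time` whenever the task to run differs from the one
    whose configuration is currently loaded.
    """
    finish = {}                          # task name -> finish time
    free = {"cpu": 0.0, "fpga": 0.0}     # time at which each device is idle
    loaded = None                        # configuration currently on the FPGA
    schedule = []
    for t in tasks:
        ready = max((finish[d] for d in t.deps), default=0.0)
        cpu_start = max(ready, free["cpu"])
        cpu_end = cpu_start + t.sw_time
        fpga_start = max(ready, free["fpga"])
        cost = 0.0 if loaded == t.name else reconfig_time
        fpga_end = fpga_start + cost + t.hw_time
        if cpu_end <= fpga_end:          # ties go to the microprocessor
            device, start, end = "cpu", cpu_start, cpu_end
        else:
            device, start, end = "fpga", fpga_start, fpga_end
            loaded = t.name
        free[device] = end
        finish[t.name] = end
        schedule.append((t.name, device, start, end))
    return schedule


graph = [
    Task("a", sw_time=10, hw_time=2),
    Task("b", sw_time=3, hw_time=2, deps=("a",)),
    Task("c", sw_time=8, hw_time=1, deps=("a",)),
]
for row in co_schedule(graph, reconfig_time=4):
    print(row)
```

With a small reconfiguration cost the sketch spreads work across both devices; raising `reconfig_time` far above the software run times pushes every task back onto the microprocessor, which is the trade-off a co-scheduler has to weigh.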

field-programmable logic and applications | 2007

Automatic Software Hardware Co-Design for Reconfigurable Computing Systems

Proshanta Saha

A formal methodology for automatic hardware-software partitioning and co-scheduling between the microprocessor (μP) and the field programmable gate array (FPGA) has not yet been established. Current work in automatic task partitioning and scheduling for reconfigurable systems strictly addresses the FPGA hardware and does not take advantage of the synergy between the microprocessor and the FPGA. In this research, we consider the problem of formalizing a co-scheduling methodology and develop a set of intuitive tools to assist users in realizing the full potential of an RC architecture. Scheduling is critical for efficient resource utilization and achieving speedup in high performance reconfigurable computers (HPRC). The primary targets of this research are reconfigurable computing (RC) systems that have both microprocessors and FPGAs.

southern conference programmable logic | 2007

Extending Embedded Computing Scheduling Algorithms for Reconfigurable Computing Systems

Proshanta Saha; Tarek A. El-Ghazawi

Current work on automatic task partitioning and scheduling for reconfigurable computing (RC) systems strictly addresses the FPGA hardware and does not take advantage of the synergy between the microprocessor (μP) and the FPGA. Partitioning between the μP and the FPGA remains a manual and laborious effort, as a formal methodology for automatic hardware-software partitioning has not been established. Related fields such as heterogeneous computing (HC) and embedded computing (EC) have an extensive body of work on scheduling for heterogeneous processors. Unlike the HC scheduling algorithms, the EC algorithms take into account the differences in computational capabilities of each processing element. In this work, we adapt EC scheduling algorithms for RC systems and show that adapting the algorithms alone is not sufficient to take advantage of the reconfigurable hardware. We introduce new heuristic algorithms based on EC scheduling algorithms and show that they provide up to an order of magnitude improvement in scheduling and execution times.

national aerospace and electronics conference | 2008

Strategic Challenges for Application Development Productivity in Reconfigurable Computing

Saumil G. Merchant; Brian Holland; Casey Reardon; Alan D. George; Herman Lam; Greg Stitt; Melissa C. Smith; Nahid Alam; Ivan Gonzalez; Esam El-Araby; Proshanta Saha; Tarek A. El-Ghazawi; Harald Simmler

Performance and versatility requirements arising from escalating fabrication costs and design complexities are making reconfigurable computing technologies increasingly advantageous on the roadmap towards many-core technologies. This reformation in device architectures is necessitating a critical reformation in application design methods to bridge the widening semantic gap between design productivity and execution efficiency. This paper explores the strategic challenges in FPGA design methodologies and evaluates potential solutions and their impact on future DoD applications and users. A new research initiative, strategic infrastructure for reconfigurable computing applications (SIRCA), has also been proposed as a potential new DARPA program to address the FPGA productivity problem.

parallel computing | 2008

Portable library development for reconfigurable computing systems: A case study

Proshanta Saha; Esam El-Araby; Miaoqing Huang; Mohamed Taher; Sergio Lopez-Buedo; Tarek A. El-Ghazawi; Chang Shu; Kris Gaj; Alan Michalski; Duncan A. Buell

Portable libraries of highly optimized hardware cores can significantly reduce the development time of reconfigurable computing applications. This paper presents the tradeoffs and challenges in the design of such libraries. A set of library development guidelines is provided, which has been validated with the RCLib case study. RCLib is a set of portable libraries with over 100 cores, targeting a wide range of applications. RCLib portability has been verified in three major high-performance reconfigurable computing architectures: SRC6, Cray XD1 and SGI RC100. Compared to full-software implementations, applications using RCLib hardware acceleration cores show speedups ranging from one to four orders of magnitude.

international workshop on high-performance reconfigurable computing technology and applications | 2008

Hardware task scheduling optimizations for reconfigurable computing

Miaoqing Huang; Harald Simmler; Proshanta Saha; Tarek A. El-Ghazawi

Reconfigurable computers (RC) can provide significant performance improvement for domain applications. However, wide acceptance of today's RCs among domain scientists is hindered by the complexity of design tools and the required hardware design experience. Recent developments in hardware/software co-design methodologies for these systems provide ease of use, but they are not comparable in performance to manual co-design. This paper aims at improving the overall performance of hardware tasks assigned to the FPGA. In particular, the analysis of inter-task communication and of data dependencies among tasks is used to reduce the number of configurations and to minimize the communication overhead and task processing time. This work leverages algorithms developed in the RC and reconfigurable hardware (RH) domains for efficient use of hardware resources to propose two algorithms, weight-based scheduling (WBS) and highest priority first-next fit (HPF-NF). However, traditional resource-based scheduling alone is not sufficient to reduce the performance bottleneck; therefore, a more comprehensive algorithm is necessary. The reduced data movement scheduling (RDMS) algorithm is proposed to address dependency analysis and inter-task communication optimizations. Simulation shows that, compared to WBS and HPF-NF, RDMS reduces the number of FPGA configurations needed to schedule randomly generated graphs with heavy-weight nodes by 30% and 11% respectively. Additionally, the proof-of-concept implementation of a complex 13-node example task graph on the SGI RC100 reconfigurable computer shows that RDMS not only trims the number of necessary configurations from 6 to 4 but also reduces communication overhead by 48% and hardware processing time by 33%.
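
To make the notion of "number of FPGA configurations" concrete, the sketch below packs hardware tasks into successive configurations using a generic next-fit rule in priority order. It is an illustration only, not the paper's HPF-NF, WBS, or RDMS: the `(name, area, priority)` tuple layout and the `fpga_area` capacity are hypothetical, and data dependencies and inter-task communication (the focus of RDMS) are ignored.

```python
def next_fit_configurations(tasks, fpga_area):
    """Group tasks into FPGA configurations with a next-fit rule.

    `tasks` is a list of (name, area, priority) tuples. Tasks are taken
    in descending priority; a new configuration is opened whenever the
    next task no longer fits in the current one.
    """
    ordered = sorted(tasks, key=lambda t: t[2], reverse=True)
    configs, current, used = [], [], 0
    for name, area, _priority in ordered:
        if area > fpga_area:
            raise ValueError(f"task {name} does not fit on the FPGA")
        if used + area > fpga_area:
            configs.append(current)      # close the current configuration
            current, used = [], 0
        current.append(name)
        used += area
    if current:
        configs.append(current)
    return configs


hw_tasks = [("a", 60, 5), ("b", 50, 4), ("c", 40, 3), ("d", 30, 2)]
configs = next_fit_configurations(hw_tasks, fpga_area=100)
print(len(configs), configs)
```

Because next-fit never revisits an earlier configuration, it can leave area unused; a scheduler that also considers which tasks exchange data could group them differently and need fewer reconfigurations.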

international conference on formal methods and models for co-design | 2007

A Methodology for Automating Co-Scheduling for Reconfigurable Computing Systems

Proshanta Saha; Tarek A. El-Ghazawi

A formal methodology for automatic hardware-software partitioning and co-scheduling between the microprocessor (μP) and the FPGA has not yet been established. Current work in automatic task partitioning and scheduling for reconfigurable systems strictly addresses the FPGA hardware and does not take advantage of the synergy between the microprocessor and the FPGA. In this work, we consider the problem of co-scheduling task graphs on reconfigurable systems. The target systems have an execution model which allows any subtask that can run on the FPGA to also run on the microprocessor, and allows reconfigurability of the FPGA (subject to area, performance, resource, and timing constraints). In this paper, we introduce a methodology for automatic co-scheduling using a proposed heuristic algorithm for hardware/software co-scheduling, ReCoS. It will be shown that the proposed algorithm provides up to an order of magnitude improvement in scheduling and execution times when compared with hardware/software co-schedulers found in related fields such as embedded systems, heterogeneous systems, and reconfigurable hardware systems.

national aerospace and electronics conference | 2008

Classification of Application Development for FPGA-Based Systems

Ivan Gonzalez; Esam El-Araby; Proshanta Saha; Tarek A. El-Ghazawi; Harald Simmler; Saumil G. Merchant; Brian Holland; Casey Reardon; Alan D. George; Herman Lam; Greg Stitt; Nahid Alam; Melissa C. Smith

Field-programmable gate arrays (FPGAs) have been used to accelerate DoD-related applications with promising performance. However, current development tools require significant hardware knowledge and are not amenable to the increasing complexity of FPGA-based systems. The application requirements are expected to change dramatically for future use cases and will require a well-defined development methodology. This paper presents the results of an extensive survey and study of current FPGA tools. A classification of DoD use cases and FPGA tools is provided. This classification describes the current status of the available tools and identifies current tool limitations for DoD use cases.

acs/ieee international conference on computer systems and applications | 2007

Applications of Heterogeneous Computing in Hardware/Software Co-Scheduling

Proshanta Saha; Tarek A. El-Ghazawi

Current work on automatic task partitioning and scheduling for reconfigurable computing (RC) systems strictly addresses the field programmable gate array (FPGA) hardware and does not take advantage of the synergy between the microprocessor and the FPGA. Partitioning between the microprocessor and the FPGA is often a manual and laborious effort, as a formal methodology for automatic hardware-software partitioning for RC systems has not yet been established. Related fields such as heterogeneous computing (HC) and embedded computing (EC) have an extensive body of work on scheduling for heterogeneous processors. In this work, we adapt HC scheduling algorithms for RC systems and show that adapting the algorithms alone is not sufficient to take advantage of the reconfigurable hardware. In many cases, the HC heuristic algorithms do not generate the efficient schedules necessary to take advantage of the synergy between the microprocessor and the FPGA. We introduce new heuristic algorithms based on HC scheduling algorithms and show that they provide up to an order of magnitude improvement in execution time.

field programmable gate arrays | 2005

Reconfigurable computers: an empirical analysis (abstract only)

Tarek A. El-Ghazawi; Kris Gaj; Nikitas A. Alexandridis; Allen Michalski; Devrim Fidanci; Mohamed Taher; Esam El-Araby; Esmail Chitalwala; Proshanta Saha

Reconfigurable computers are parallel systems that are designed around multiple general-purpose processors and multiple field programmable gate array (FPGA) chips. These systems can leverage the synergism between conventional processors and FPGAs to provide low-level hardware functionality at the same level of programmability as general-purpose computers. In this work we conduct an experimental study using one of the state-of-the-art reconfigurable computers and a representative set of applications to assess the field, uncover the challenges, propose solutions, and conceive a realistic evolution path. We consider issues of concern including performance/cost. We also consider productivity in the sense of development, compiling, running, and system reliability. It will be shown that for some applications, the performance/cost can be orders of magnitude better than that of conventional computers. It will also be shown that programming such machines may still require some hardware knowledge, similar to the hardware knowledge that computer programmers must acquire to write scalable programs.
