
Publication


Featured research published by Maya Gokhale.


Field-Programmable Custom Computing Machines | 2000

Stream-oriented FPGA computing in the Streams-C high level language

Maya Gokhale; Jeffrey M. Arnold; Mirek Kalinowski

Stream-oriented processing is an important methodology used in FPGA-based parallel processing. Characteristics of stream-oriented computing include high-data-rate flow of one or more data sources; fixed-size, small stream payloads (one byte to one word); compute-intensive operations, usually low-precision fixed point, on the data stream; access to small local memories holding coefficients and other constants; and occasional synchronization between computational phases. We describe language constructs, compiler technology, and hardware/software libraries embodying the Streams-C system, which has been developed to support stream-oriented computation on FPGA-based parallel computers. The language is implemented as a small set of library functions callable from a C language program. The Streams-C compiler synthesizes hardware circuits for multiple FPGAs as well as a multi-threaded software program for the control processor. Our system includes a functional simulation environment based on POSIX threads, allowing the programmer to simulate the collection of parallel processes and their communication at the functional level. Finally, we present an application written both in Streams-C and hand-coded in VHDL. Compared to the hand-crafted design, the Streams-C-generated circuit takes 3x the area and runs at 1/2 the clock rate. In terms of time to market, the hand-coded design took a month to develop by an experienced hardware developer. The Streams-C design took a couple of days, for a productivity increase of 10x.


IEEE Computer | 1991

Building and using a highly parallel programmable logic array

Maya Gokhale; William Holmes; Andrew Kopser; Sara Lucas; Ronald Minnich; Douglas Sweely; Daniel P. Lopresti

A two-slot addition called Splash, which enables a Sun workstation to outperform a Cray-2 on certain applications, is discussed. Following an overview of the Splash design and programming, hardware development is described. The development of the logic description generator is examined in detail. Splash's runtime environment is described, and an example application, that of sequence comparison, is given.


Field-Programmable Custom Computing Machines | 1998

NAPA C: compiling for a hybrid RISC/FPGA architecture

Maya Gokhale

Hybrid architectures combining conventional processors with configurable logic resources enable efficient coordination of control with datapath computation. With integration of the two components on a single device, loop control and data-dependent branching can be handled by the conventional processor, while regular datapath computation occurs on the configurable hardware. This paper describes a novel pragma-based approach to programming such hybrid devices. The NAPA C language provides pragma directives so that the programmer (or an automatic partitioner) can specify where data is to reside and where computation is to occur with statement-level granularity. The NAPA C compiler, targeting National Semiconductor's NAPA1000 chip, performs semantic analysis of the pragma-annotated program and co-synthesizes a conventional program executable combined with a configuration bit stream for the adaptive logic. Compiler optimizations include synthesis of hardware pipelines from pipelineable loops.


Field-Programmable Custom Computing Machines | 1998

The NAPA adaptive processing architecture

Charle' R. Rupp; Mark Landguth; Tim Garverick; Edson Gomersall; Harry Holt; Jeffrey M. Arnold; Maya Gokhale

The National Adaptive Processing Architecture (NAPA) is a major effort to integrate the resources needed to develop teraops-class computing systems based on the principles of adaptive computing. The primary goals for this effort include: (1) the development of an example NAPA component which achieves an order-of-magnitude cost/performance improvement compared to traditional FPGA-based systems, (2) the creation of a rich but effective application development environment for NAPA systems based on the ideas of compile-time functional partitioning, and (3) significant improvement of the base infrastructure for effective research in reconfigurable computing. This paper emphasizes the technical aspects of the architecture that achieve the first goal while illustrating key architectural concepts motivated by the second and third goals.


Field-Programmable Custom Computing Machines | 2006

Hardware/Software Approach to Molecular Dynamics on Reconfigurable Computers

Ronald Scrofano; Maya Gokhale; Frans Trouw; Viktor K. Prasanna

With advances in reconfigurable hardware, especially field-programmable gate arrays (FPGAs), it has become possible to use reconfigurable hardware to accelerate complex applications, such as those in scientific computing. There has been a resulting development of reconfigurable computers: computers which have both general-purpose processors and reconfigurable hardware, as well as memory and high-performance interconnection networks. In this paper, we study the acceleration of molecular dynamics simulations using reconfigurable computers. We describe how we partition the application between software and hardware and then model the performance of several alternatives for the task mapped to hardware. We describe an implementation of one of these alternatives on a reconfigurable computer and demonstrate that for two real-world simulations, it achieves a 2x speed-up over the software baseline. We then compare our design and results to those of prior efforts and explain the advantages of the hardware/software approach, including flexibility.


Field-Programmable Custom Computing Machines | 1993

FPGA computing in a data parallel C

Maya Gokhale; Ron Minnich

The authors demonstrate a new technique for automatically synthesizing digital logic from a high level algorithmic description in a data parallel language. The methodology has been implemented using the Splash 2 reconfigurable logic arrays for programs written in Data-parallel Bit-serial C (dbC). The translator generates a VHDL description of a SIMD processor array with one or more processors per Xilinx 4010 FPGA. The instruction set of each processor is customized to the dbC program being processed. In addition to the usual arithmetic operations, nearest neighbor communication, host-to-processor communication, and global reductions are supported.


The Journal of Supercomputing | 2003

Experience with a Hybrid Processor: K-Means Clustering

Maya Gokhale; Janette Frigo; Kevin McCabe; James Theiler; Christophe Wolinski; Dominique Lavenier

We discuss hardware/software co-processing on a hybrid processor for a compute- and data-intensive multispectral imaging algorithm, k-means clustering. The experiments are performed on two models of the Altera Excalibur board, the first using the soft IP core 32-bit NIOS 1.1 RISC processor, and the second with the hard IP core ARM processor. In our experiments, we compare performance of the sequential k-means algorithm with three different accelerated versions. We consider granularity and synchronization issues when mapping an algorithm to a hybrid processor. Our results show that a speedup of 11.8x is achieved by migrating computation to the Excalibur ARM hardware/software platform, compared to software alone on a gigahertz Pentium III. Speedup on the Excalibur NIOS is limited by the communication cost of transferring data from external memory through the processor to the customized circuits. This limitation is overcome on the Excalibur ARM, in which dual-port memories, accessible to both the processor and configurable logic, have the biggest performance impact of all the techniques studied.


Field-Programmable Logic and Applications | 2005

Trident: an FPGA compiler framework for floating-point algorithms

Justin L. Tripp; Kristopher D. Peterson; Christine Ahrens; Jeffrey D. Poznanovic; Maya Gokhale

Trident is a compiler for floating-point algorithms written in C, producing circuits in reconfigurable logic that exploit the parallelism available in the input description. Trident automatically extracts parallelism and pipelines loop bodies using conventional compiler optimizations and scheduling techniques. Trident also provides an open framework for experimentation, analysis, and optimization of floating-point algorithms on FPGAs, and the flexibility to easily integrate custom floating-point libraries.


IEEE Transactions on Circuits and Systems | 2007

Reliability Analysis of Large Circuits Using Scalable Techniques and Tools

Debayan Bhaduri; Sandeep K. Shukla; Paul S. Graham; Maya Gokhale

The rapid development of CMOS and non-CMOS nanotechnologies has opened up new possibilities and introduced new challenges for circuit design. One of the main challenges is in designing reliable circuits from defective nanoscale devices. Hence, there is a need to develop methodologies to accurately evaluate circuit reliability. In recent years, a number of reliability evaluation methodologies based on probabilistic model checking, probabilistic transfer matrices, probabilistic gate models, etc., have been proposed. Scalability has been a concern in the applicability of these methodologies to the reliability analysis of large circuits. In this paper, we develop a general, scalable technique for these reliability evaluation methodologies. Specifically, an algorithm is developed for the model checking-based methodology and implemented in a tool called Scalable, Extensible Tool for Reliability Analysis (SETRA). SETRA integrates the scalable model checking-based algorithm into the conventional computer-aided design circuit design flow. The paper also discusses ways to modify the scalable algorithm for the other reliability estimation methodologies and plug them into SETRA's extensible framework. Our preliminary experiments show how SETRA can be used effectively to evaluate and compare the robustness of different circuit designs.


Field-Programmable Custom Computing Machines | 1997

High level compilation for fine grained FPGAs

Maya Gokhale; D. Gomersall

The authors present an integrated tool set to generate highly optimized hardware computation blocks from a C language subset. By starting with a C language description of the algorithm, they address the problem of making FPGA processors accessible to programmers as opposed to hardware designers. Their work is specifically targeted to fine grained FPGAs such as the National Semiconductor CLAy™ FPGA family. Such FPGAs exhibit extremely high performance on regular data path circuits, which are more prevalent in computationally oriented hardware applications. Dense packing of data path functional elements makes it possible to fit the computation on one or a small number of chips, and the use of local routing resources makes it possible to clock the chip at a high rate. By developing a lower level tool suite that exploits the regular, geometric nature of fine grained FPGAs, and mapping the compiler output to this tool suite, they greatly improve performance over traditional high level synthesis to fine grained FPGAs.

Collaboration


Dive into Maya Gokhale's collaborations.

Top Co-Authors

James Theiler (Los Alamos National Laboratory)
Janette Frigo (Los Alamos National Laboratory)
Paul S. Graham (Los Alamos National Laboratory)
Justin L. Tripp (Los Alamos National Laboratory)
Reid B. Porter (Los Alamos National Laboratory)
John J. Szymanski (Los Alamos National Laboratory)
Kevin McCabe (Los Alamos National Laboratory)