Balkrishna Ramkumar | Researchain

Publication

Featured research published by Balkrishna Ramkumar.

IEEE International Symposium on Fault Tolerant Computing | 1997

Portable checkpointing for heterogeneous architectures

Balkrishna Ramkumar; Volker Strumpen

Current approaches for checkpointing assume system homogeneity, where checkpointing and recovery are both performed on the same processor architecture and operating system configuration. Sometimes it is desirable or necessary to recover a failed computation on a different processor architecture. For such situations checkpointing and recovery must be portable. In this paper, we argue that source-to-source compilation is an appropriate concept for this purpose. We describe the compilation techniques that we developed for the design of the c2ftc prototype. The c2ftc compiler enables machine-independent checkpoints by automatic generation of checkpointing and recovery code. Sequential C programs are compiled into fault-tolerant C programs, whose checkpoints can be migrated across heterogeneous networks and restarted on binary-incompatible architectures. Experimental results on several systems provide evidence that the performance penalty of portable checkpointing is negligible for realistic checkpointing frequencies.
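The idea of a machine-independent checkpoint can be illustrated with a minimal sketch. It is not the code that c2ftc generates; the state layout, the function names (`save_checkpoint`, `load_checkpoint`, `run`), and the toy computation are all assumptions made for illustration. The point it shows is that the checkpoint is written in a fixed byte order with fixed field sizes, so it can be read back on a host with a different native representation.

```python
import struct

# Hypothetical checkpoint layout: loop counter (int64), accumulator (float64),
# element count (uint32), followed by that many int32 values.
# "!" selects network byte order and standard sizes, independent of the host CPU.
HEADER = struct.Struct("!qdI")

def save_checkpoint(counter, acc, values):
    """Encode program state into a machine-independent byte string."""
    body = struct.pack(f"!{len(values)}i", *values)
    return HEADER.pack(counter, acc, len(values)) + body

def load_checkpoint(blob):
    """Decode a checkpoint produced by save_checkpoint on any host."""
    counter, acc, n = HEADER.unpack_from(blob, 0)
    values = list(struct.unpack_from(f"!{n}i", blob, HEADER.size))
    return counter, acc, values

def run(total_steps, state=None, stop_at=None):
    """Toy sum-of-squares loop that can stop early and resume from a checkpoint."""
    counter, acc, values = state if state else (0, 0.0, [])
    while counter < total_steps:
        if stop_at is not None and counter == stop_at:
            # Simulated failure point: hand back a checkpoint instead of a result.
            return save_checkpoint(counter, acc, values)
        acc += counter * counter
        values.append(counter)
        counter += 1
    return counter, acc, values
```

A run interrupted with `run(10, stop_at=4)` returns a byte string; passing `load_checkpoint(blob)` back into `run(10, ...)` finishes the computation with the same result as an uninterrupted run.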

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1997

An evaluation of parallel simulated annealing strategies with application to standard cell placement

John A. Chandy; Sungho Kim; Balkrishna Ramkumar; Steven Parkes; Prithviraj Banerjee

Simulated annealing, a methodology for solving combinatorial optimization problems, is a very computationally expensive algorithm and, as such, numerous researchers have undertaken efforts to parallelize it. In this paper, we investigate three of these parallel simulated annealing strategies when applied to standard cell placement, specifically the TimberWolfSC placement tool. We have examined a parallel moves strategy, as well as two new approaches to parallel cell placement: multiple Markov chains and speculative computation. These algorithms have been implemented in ProperPLACE, our parallel cell placement application, as part of the ProperCAD II project. We have constructed ProperPLACE so that it is portable across a wide range of parallel architectures. Our parallel moves algorithm uses novel approaches to dynamic message sizing, message prioritization, and error control. We show that parallel moves and multiple Markov chains are effective approaches to parallel simulated annealing when applied to TimberWolfSC, yet speculative computation is wholly inadequate.
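The multiple Markov chains strategy can be sketched roughly as follows. This is not ProperPLACE or TimberWolfSC code: the one-dimensional wire-length cost, the swap move, the parameter values, and all function names are assumptions chosen to keep the example small. Several chains anneal independently from a shared starting placement, then synchronize by adopting the best placement any chain reached. The chains run one after another here; in a parallel implementation each would run on its own processor.

```python
import math
import random

def wirelength(order, nets):
    """Total 1-D wire length: sum of distances between connected cells."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)

def anneal_segment(order, nets, temp, moves, rng):
    """Run `moves` Metropolis swap moves at a fixed temperature."""
    order = list(order)
    cost = wirelength(order, nets)
    for _ in range(moves):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        new_cost = wirelength(order, nets)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost                           # accept the move
        else:
            order[i], order[j] = order[j], order[i]   # reject: undo the swap
    return order, cost

def multiple_markov_chains(cells, nets, chains=4, rounds=20, moves=50,
                           temp=5.0, cooling=0.8, seed=0):
    """Anneal several chains, synchronizing on the best state after each round."""
    rngs = [random.Random(seed + k) for k in range(chains)]
    current = list(cells)
    best, best_cost = list(cells), wirelength(cells, nets)
    for _ in range(rounds):
        results = [anneal_segment(current, nets, temp, moves, rng) for rng in rngs]
        current, cost = min(results, key=lambda r: r[1])  # all chains adopt this
        if cost < best_cost:
            best, best_cost = list(current), cost
        temp *= cooling
    return best, best_cost
```

Calling `multiple_markov_chains(list(range(8)), nets)` with a list of two-cell nets returns the best placement seen and its cost; because the best-so-far placement is tracked separately, the returned cost never exceeds that of the initial ordering.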

International Conference on Computer Aided Design | 1992

Portable parallel test generation for sequential circuits

Balkrishna Ramkumar; Prithviraj Banerjee

A parallel test generation algorithm, ProperTEST, for sequential circuits that is portable across a range of MIMD parallel architectures is discussed. It uses prioritized execution to ensure consistent speedups as the number of processors is increased. This consistency is achieved without loss of fault coverage with increase in the number of processors. This also permits the use of parallel processing to improve the fault coverage when the execution time is bounded. Results on ISCAS 89 benchmark programs are provided on a shared memory machine, a message passing machine, and a network of workstations. ProperTEST was run unchanged on these different architectures.

Proceedings of the US/Japan Workshop on Parallel Symbolic Computing: Languages, Systems, and Applications | 1992

Prioritization in Parallel Symbolic Computing

Laxmikant V. Kalé; Balkrishna Ramkumar; Vikram A. Saletore; Amitabh Sinha

It is argued that scheduling is an important determinant of performance for many parallel symbolic computations, in addition to the issues of dynamic load balancing and grain size control. We propose associating unbounded levels of priorities with tasks and messages as the mechanism of choice for specifying scheduling strategies. We demonstrate how priorities can be used in parallelizing computations in different search domains, and show how priorities can be implemented effectively in parallel systems. Priorities have been implemented in the Charm portable parallel programming system. Performance results on shared-memory machines with tens of processors and nonshared-memory machines with hundreds of processors are given. Open problems for prioritization in specific domains are given, which will constitute fertile area for future research in this field.
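One way to picture priorities with an unbounded number of levels is to let each task's priority be a sequence that extends its parent's priority. The sketch below is an illustration only and does not use the Charm API; the function `prioritized_search`, the tuple encoding of priorities, and the toy search tree are assumptions. A single priority queue stands in for the scheduler: whichever task has the smallest priority tuple runs next, so the search explores the leftmost unexplored branch first regardless of the order in which tasks were created.

```python
import heapq

def prioritized_search(is_goal, children, root):
    """Run tasks in priority order; a child's priority extends its parent's."""
    queue = [((), root)]   # (priority, node); the empty tuple sorts first
    visited = []
    while queue:
        prio, node = heapq.heappop(queue)
        visited.append(node)
        if is_goal(node):
            return node, visited
        for k, child in enumerate(children(node)):
            # Appending the branch index gives priorities of unbounded depth.
            heapq.heappush(queue, (prio + (k,), child))
    return None, visited

def binary_children(node, depth=3):
    """Toy search tree: binary strings up to a fixed length."""
    return [node + "0", node + "1"] if len(node) < depth else []
```

For example, `prioritized_search(lambda n: n == "01", binary_children, "")` visits the nodes `""`, `"0"`, `"00"`, `"000"`, `"001"`, `"01"` in that order before returning the goal.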

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1994

ProperCAD: A portable object-oriented parallel environment for VLSI CAD

Balkrishna Ramkumar; Prithviraj Banerjee

Most parallel algorithms for VLSI CAD proposed to date work efficiently only on machines that they were designed for. As a result, these algorithms are dependent on the architecture for which they are developed and do not port easily to other parallel architectures. In an effort to address this problem, we are developing a portable object-oriented parallel environment for CAD algorithms (ProperCAD). The objectives of this research are two-fold: 1) To develop new parallel algorithms that run in a portable object-oriented environment. We accomplish this in two stages. First, we develop CAD algorithms using a general purpose platform for portable parallel programming called CHARM developed at the University of Illinois. Concurrently, we are developing a C++ environment that is truly object-oriented and specialized for CAD applications; and 2) To design the parallel algorithms around a good sequential algorithm with a well-defined parallel-sequential interface. This will permit the parallel algorithm to benefit from future developments in sequential algorithms. This approach is described using one CAD application that has been implemented as part of this project: ProperEXT, a flat extractor for VLSI circuits. The algorithm, its implementation, and the performance of ProperEXT on a range of parallel machines are presented. The implementation is portable across a variety of parallel platforms without change. It currently runs on an Encore Multimax, a Sequent Symmetry, Intel iPSC/2 and i860 hypercubes, an NCUBE 2 hypercube and a network of Sun Sparc workstations.

International Parallel Processing Symposium | 1994

ProperPLACE: a portable parallel algorithm for standard cell placement

S. Kim; John A. Chandy; Steven Parkes; Balkrishna Ramkumar; Prithviraj Banerjee

Parallel algorithms developed for CAD problems today suffer from three important drawbacks. First, they are machine specific and tend to perform poorly on architectures other than the one for which they were designed. Second, they cannot use the latest advances in improved versions of the sequential algorithms for solving the problem. Third, the quality of results degrades significantly during parallel execution. We address these three problems for an important CAD application: standard cell placement. We have developed a parallel placement algorithm that is portable across a range of MIMD parallel architectures. The algorithm is part of the ProperCAD project, which allows the development and implementation of a parallel algorithm such that it can be executed on a wide variety of parallel machines without any change to the source. The parallel placement algorithm is based on an existing implementation of the sequential simulated annealing algorithm, TimberWolfSC 6.0 (C. Sechen and A. Sangiovanni-Vincentelli, 1985).

International Conference on Computer Aided Design | 1992

ProperSYN: A portable parallel algorithm for logic synthesis

Kaushik De; Balkrishna Ramkumar; Prithviraj Banerjee

An algorithm based on the transduction method and implemented in the ProperCAD environment is described. The parallel ProperSYN algorithm attempts to make the execution time manageably small. The algorithm uses an asynchronous message driven computing model with no synchronizing barriers, and hence it is scalable to a larger number of processors. Also, the algorithm is portable across a wide variety of parallel machines. Experimental results on various parallel machines are presented. The algorithm is built around a well-defined sequential algorithm interface such that there can be benefits from future expansion of the sequential algorithm.

International Conference on Computer Design | 1992

ProperCAD: a portable object-oriented parallel environment for VLSI CAD

Balkrishna Ramkumar; Prithviraj Banerjee

A portable object-oriented parallel environment for CAD algorithms (ProperCAD) is described. The objectives of this research are twofold: to develop parallel algorithms that are portable and to design the parallel algorithms around a good sequential algorithm with a well-defined parallel-sequential interface, permitting the parallel algorithm to benefit from future developments in sequential algorithms. The first is achieved by writing the algorithms using the ProperCAD environment, a library of functions that permits portability of parallel CAD algorithms across MIMD machines. Programs written using this environment run unchanged on all parallel machines for which this environment is available.

IEEE Transactions on Parallel and Distributed Systems | 1994

Machine independent AND and OR parallel execution of logic programs. II. Compiled execution

Balkrishna Ramkumar; Laxmikant V. Kalé

For pt.I, see ibid., p. 170-80. In pt.I, we presented a binding environment for the AND and OR parallel execution of logic programs. This environment was instrumental in rendering a compiler for the AND and OR parallel execution of logic programs machine independent. In this paper, we describe a compiler based on the Reduce-OR process model (ROPM) for the parallel execution of Prolog programs, and report the performance of the compiler on five parallel machines: the Encore Multimax, the Sequent Symmetry, the NCUBE 2, the Intel i860 hypercube and a network of Sun workstations. The compiler is part of a machine independent parallel Prolog development system built on top of a run time environment for parallel programming called the Chare kernel, and runs unchanged on these multiprocessors. In keeping with the objectives behind the ROPM, the compiler supports both OR and independent AND parallelism in Prolog programs and is suitable for execution on both shared and nonshared memory machines. We discuss the performance of the Prolog compiler in some detail and describe how grain size can be used to deliver performance that is within 10% of the underlying sequential Prolog compiler on one processor, and scale linearly with increasing number of processors on problems exhibiting sufficient parallelism. The loose coupling between parallel and sequential components makes it possible to use the best available sequential compiler as the sequential component of our compiler.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1997

ProperTEST: a portable parallel test generator for sequential circuits

Balkrishna Ramkumar; Prithviraj Banerjee

Parallel algorithms developed for CAD problems today suffer from two important drawbacks. First, they are machine specific, and tend to perform poorly on architectures other than the one for which they were designed. Second, the quality of results degrades significantly during parallel execution. In this paper, we address these two problems for an important CAD application: test generation for sequential circuits. We have developed a new parallel test generator, ProperTEST, that is portable across a range of MIMD parallel architectures. This work is part of the ProperCAD project, which aims to develop CAD algorithms that run unchanged on shared and nonshared memory machines. We present performance data for ProperTEST on ISCAS 89 sequential circuits on a Sequent Symmetry, an Intel i860 hypercube, an NCUBE/2 hypercube, a network of Sun workstations, and an Encore Multimax. Parallel processing can also be used to improve on the fault coverage possible on one processor in a given amount of time. This was not possible in earlier approaches due to search anomalies. Using ProperTEST, we provide results on ISCAS 89 benchmark programs demonstrating the improvements in fault coverage as the number of processors is increased.
