Praveen K. Murthy | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Praveen K. Murthy is active.

Explore More

Publication

Featured researches published by Praveen K. Murthy.

Archive | 1996

Software Synthesis from Dataflow Graphs

Shuvra S. Battacharyya; Edward A. Lee; Praveen K. Murthy

From the Publisher: Software Synthesis from Dataflow Graphs addresses the problem of generating efficient software implementations from applications specified as synchronous dataflow graphs for programmable digital signal processors (DSPs) used in embedded real-time systems. Software Synthesis from Dataflow Graphs reviews the state-of-the-art in constructing static, memory-optimal schedules for programs expressed as SDF graphs. Code size reduction is obtained by the careful organization of loops in the target code. Data buffering is optimized by constructing the loop hierarchy in provably optimal ways for many classes of SDF graphs. The central result is a uniprocessor scheduling framework that provably synthesizes the most compact looping structures, called single appearance schedules, for a certain class of SDF graphs. In addition, algorithms and heuristics are presented that generate single appearance schedules optimized for data buffering usage. Numerous practical examples and extensive experimental data are provided to illustrate the efficacy of these techniques.

signal processing systems | 1999

Synthesis of Embedded Software from Synchronous Dataflow Specifications

Shuvra S. Bhattacharyya; Praveen K. Murthy; Edward A. Lee

The implementation of software for embedded digital signal processing (DSP) applications is an extremely complex process. The complexity arises from escalating functionality in the applications; intense time-to-market pressures; and stringent cost, power and speed constraints. To help cope with such complexity, DSP system designers have increasingly been employing high-level, graphical design environments in which system specification is based on hierarchical dataflow graphs. Consequently, a significant industry has emerged for the development of data-flow-based DSP design environments. Leading products in this industry include SPW from Cadence, COSSAP from Synopsys, ADS from Hewlett Packard, and DSP Station from Mentor Graphics. This paper reviews a set of algorithms for compiling dataflow programs for embedded DSP applications into efficient implementations on programmable digital signal processors. The algorithms focus primarily on the minimization of code size, and the minimization of the memory required for the buffers that implement the communication channels in the input dataflow graph. These are critical problems because programmable digital signal processors have very limited amounts of on-chip memory, and the speed, power, and cost penalties for using off-chip memory are often prohibitively high for embedded applications. Furthermore, memory demands of applications are increasing at a significantly higher rate than the rate of increase in on-chip memory capacity offered by improved integrated circuit technology.

IEEE Transactions on Signal Processing | 2002

Multidimensional synchronous dataflow

Praveen K. Murthy; Edward A. Lee

Signal flow graphs with dataflow semantics have been used in signal processing system simulation, algorithm development, and real-time system design. Dataflow semantics implicitly expose function parallelism by imposing only a partial ordering constraint on the execution of functions. One particular form of dataflow called synchronous dataflow (SDF) has been quite popular in programming environments for digital signal processing (DSP) since it has strong formal properties and is ideally suited for expressing multirate DSP algorithms. However, SDF and other dataflow models use first-in first-out (FIFO) queues on the communication channels and are thus ideally suited only for one-dimensional (1-D) signal processing algorithms. While multidimensional systems can also be expressed by collapsing arrays into 1-D streams, such modeling is often awkward and can obscure potential data parallelism that might be present. SDF can be generalized to multiple dimensions; this model is called multidimensional synchronous dataflow (MDSDF). This paper presents MDSDF and shows how MDSDF can be efficiently used to model a variety of multidimensional DSP systems, as well as other types of systems that are not modeled elegantly in SDF. However, MDSDF generalizes the FIFO queues used in SDF to arrays and, thus, is capable only of expressing systems sampled on rectangular lattices. This paper also presents a generalization of MDSDF that is capable of handling arbitrary sampling lattices and lattice-changing operations such as nonrectangular decimation and interpolation. An example of a practical system is given to show the usefulness of this model. The key challenge in generalizing the MDSDF model is preserving static schedulability, which eliminates the overhead associated with dynamic scheduling, and preserving a model where data parallelism, as well as functional parallelism, is fully explicit.

formal methods | 1997

Joint Minimization of Code and Data for Synchronous DataflowPrograms

Praveen K. Murthy; Shuvra S. Bhattacharyya; Edward A. Lee

In this paper, we formally develop techniques that minimize the memory requirements of a target program when synthesizing software from dataflow descriptions of multirate signal processing algorithms. The dataflow programming model that we consider is the synchronous dataflow (SDF) model [21], which has been used heavily in DSP design environments over the past several years. We first focus on the restricted class of well-ordered SDF graphs. We show that while extremely efficient techniques exist for constructing minimum code size schedules for well-ordered graphs, the number of distinct minimum code size schedules increases combinatorially with the number of vertices in the input SDF graph, and these different schedules can have vastly different data memory requirements. We develop a dynamic programming algorithm that computes the schedule that minimizes the data memory requirement from among the schedules that minimize code size, and we show that the time complexity of this algorithm is cubic in the number of vertices in the given well-ordered SDF graph. We present several extensions to this dynamic programming technique to more general scheduling problems, and we present a heuristic that often computes near-optimal schedules with quadratic time complexity. We then show that finding optimal solutions for arbitrary acyclic graphs is NP-complete, and present heuristic techniques that jointly minimize code and data size requirements. We present a practical example and simulation data that demonstrate the effectiveness of these techniques.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2001

Shared buffer implementations of signal processing systems using lifetime analysis techniques

Praveen K. Murthy; Shuvra S. Bhattacharyya

There has been a proliferation of block-diagram environments for specifying and prototyping digital signal processing (DSP) systems. These include tools from academia such as Ptolemy and commercial tools such as DSPCanvas from Angeles Design Systems, signal processing work system (SPW) from Cadence, and COSSAP from Synopsys. The block diagram languages used in these environments are usually based on dataflow semantics because various subsets of dataflow have proven to be good matches for expressing and modeling signal processing systems. In particular, synchronous dataflow (SDF) has been found to be a particularly good match for expressing multirate signal processing systems. One of the key problems that arises during synthesis from an SDF specification is scheduling. Past work on scheduling from SDF has focused on optimization of program memory and buffer memory under a model that did not exploit sharing opportunities. In this paper, we build on our previously developed analysis and optimization framework for looped schedules to formally tackle the problem of generating optimally compact schedules for SDF graphs. We develop techniques for computing these optimally compact schedules in a manner that also attempts to minimize buffering memory under the assumption that buffers will be shared. This results in schedules whose data memory usage is drastically lower than methods in the past have achieved. The method we use is that of lifetime analysis; we develop a model for buffer lifetimes in SDF graphs and develop scheduling algorithms that attempt to generate schedules that minimize the maximum number of live tokens under the particular buffer lifetime model. We develop several efficient algorithms for extracting the relevant lifetimes from the SDF schedule. We then use the well-known first-fit heuristic for packing arrays efficiently into memory. We report extensive experimental results on applying these techniques to several practical SDF systems and show improvements that average 50% over previous techniques, with some systems exhibiting up to an 83% improvement over previous techniques.

Design Automation for Embedded Systems | 1997

APGAN and RPMC: Complementary Heuristics for Translating DSP Block Diagrams into Efficient Software Implementations

Shuvra S. Bhattacharyya; Praveen K. Murthy; Edward A. Lee

Dataflow has proven to be an attractive computational model for graphical DSP design environments that support the automatic conversion of hierarchical signal flow diagrams into implementations on programmable processors. The synchronous dataflow (SDF) model is particularly well-suited to dataflow-based graphical programming because its restricted semantics offer strong formal properties and significant compile-time predictability, while capturing the behavior of a large class of important signal processing applications. When synthesizing software for embedded signal processing applications, critical constraints arise due to the limited amounts of memory. In this paper, we propose a solution to the problem of jointly optimizing the code and data size when converting SDF programs into software implementations.We consider two approaches. The first is a customization to acyclic graphs of a bottom-up technique, called pairwise grouping of adjacent nodes (PGAN), that was proposed earlier for general SDF graphs. We show that our customization to acyclic graphs significantly reduces the complexity of the general PGAN algorithm, and we present a formal study of our modified PGAN technique that rigorously establishes its optimality for a certain class of applications. The second approach that we consider is a top-down technique, based on a generalized minimum-cut operation, that was introduced recently in [14]. We present the results of an extensive experimental investigation on the performance of our modified PGAN technique and the top-down approach and on the trade-offs between them. Based on these results, we conclude that these two techniques complement each other, and thus, they should both be incorporated into SDF-based software implementation environments in which the minimization of memory requirements is important. We have implemented these algorithms in the Ptolemy software environment [5] at UC Berkeley.

ACM Transactions on Design Automation of Electronic Systems | 2004

Buffer merging—a powerful technique for reducing memory requirements of synchronous dataflow specifications

Praveen K. Murthy; Shuvra S. Bhattacharyya

We develop a new technique called buffer merging for reducing memory requirements of synchronous dataflow (SDF) specifications. SDF has proven to be an attractive model for specifying DSP systems, and is used in many commercial tools like System Canvas, SPW, and Cocentric. Good synthesis from an SDF specification depends crucially on scheduling, and memory is an important metric for generating efficient schedules. Previous techniques on memory minimization have either not considered buffer sharing at all, or have done so at a fairly coarse level (the meaning of this will be made more precise in the article). In this article, we develop a buffer overlaying strategy that works at the level of an input/output edge pair of an actor. It works by algebraically encapsulating the lifetimes of the tokens on the input/output edge pair, and determines the maximum amount of the input buffer space that can be reused by the output. We develop the mathematical basis for performing merging operations, and develop several algorithms and heuristics for using the merging technique for generating efficient implementations. We show improvements of up to 48% over previous techniques.

VLSI Signal Processing, VIII | 1995

Optimal parenthesization of lexical orderings for DSP block diagrams

Shuvra S. Bhattacharyya; Praveen K. Murthy; Edward A. Lee

Minimizing memory requirements for program and data are critical objectives when synthesizing software for embedded DSP applications. Previously, it has been demonstrated that for graphical programs based on the widely-used synchronous dataflow model an important class of minimum code size implementations can be viewed as parenthesizations of lexical orderings of the computational blocks. Such a parenthesization corresponds to the hierarchy of loops in the software implementation. In this paper, we present a dynamic programming technique for constructing a parenthesization that minimizes data memory cost from a given lexical ordering of a synchronous dataflow graph. For graphs that do not contain delays, this technique always constructs a parenthesization that has minimum data memory cost from among all parenthesizations for the given lexical ordering. When delays are present, the technique may make refinements to the lexical ordering while it is computing the parenthesization, and the data memory cost of the result is guaranteed to be less than or equal to the data memory cost of all valid parenthesizations for the initial lexical ordering.

software and compilers for embedded systems | 2004

Compact Procedural Implementation in DSP Software Synthesis Through Recursive Graph Decomposition

Ming-Yung Ko; Praveen K. Murthy; Shuvra S. Bhattacharyya

Synthesis of digital signal processing (DSP) software from dataflow-based formal models is an effective approach for tackling the complexity of modern DSP applications. In this paper, an efficient method is proposed for applying subroutine call instantiation of module functionality when synthesizing embedded software from a dataflow specification. The technique is based on a novel recursive decomposition of subgraphs in a cluster hierarchy that is optimized for low buffer size. Applying this technique, one can achieve significantly lower buffer sizes than what is available for minimum code size inlined schedules, which have been the emphasis of prior software synthesis work. Furthermore, it is guaranteed that the number of procedure calls in the synthesized program is polynomially bounded in the size of the input dataflow graph, even though the number of module invocations may increase exponentially. This recursive decomposition approach provides an efficient means for integrating subroutine-based module instantiation into the design space of DSP software synthesis. The experimental results demonstrate a significant improvement in buffer cost, especially for more irregular multi-rate DSP applications, with moderate code and execution time overhead.

design, automation, and test in europe | 2000

Shared memory implementations of synchronous dataflow specifications

Praveen K. Murthy; Shuvra S. Bhattacharyya

There has been a proliferation of block-diagram environments for specifying and prototyping DSP systems. These include tools from academia like Ptolemy and GRAPE, and commercial tools like SPW from Cadence Design Systems, Cossap from Synopsys, and the HP ADS tool from HP. The block diagram languages used in these environments are usually based on dataflow semantics because various subsets of dataflow have proven to be good matches for expressing and modeling signal processing systems. In particular synchronous dataflow (SDF) has been found to be a particularly good match for expressing multirate signal processing systems. One of the key problems that arises during synthesis from an SDF specification is scheduling. Past work on scheduling from SDF has focused on optimization of program memory and buffer memory. However, no attempt was made for overlaying or sharing buffers. In this paper we formally tackle the problem of generating optimally compact schedules for SDF graphs, that also attempt to minimize buffering memory under the assumption that buffers will be shared. This will result in schedules whose data memory usage is drastically lower (up to 83%) than methods in the past have achieved.

Explore More