Anup Hosangadi
University of California, Santa Barbara
Publication
Featured research published by Anup Hosangadi.
International Conference on Computer Design | 2006
Shahnam Mirzaei; Anup Hosangadi; Ryan Kastner
We present a method for implementing high speed finite impulse response (FIR) filters using just registered adders and hardwired shifts. We extensively use a modified common subexpression elimination algorithm to reduce the number of adders. We target our optimizations to Xilinx Virtex II devices, where we compare our implementations with those produced by Xilinx Coregen using Distributed Arithmetic. We observe up to 50% reduction in the number of slices and up to 75% reduction in the number of LUTs for fully parallel implementations. We also observe up to 50% reduction in the total dynamic power consumption of the filters. Our designs perform significantly faster than the MAC filters, which use embedded multipliers.
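As a rough illustration of the shift-and-add idea described in this abstract, the sketch below multiplies an input sample by two fixed coefficients using only shifts and additions, and reuses one partial result for both. The coefficients 5 and 21 and all function names are hypothetical choices for illustration; they are not taken from the paper, and the paper's modified elimination algorithm is not reproduced here.

```python
def times5(x: int) -> int:
    # 5 = 101b: one hardwired shift plus one adder.
    return (x << 2) + x


def times21_unshared(x: int) -> int:
    # 21 = 10101b: built on its own, this needs two adders.
    return (x << 4) + (x << 2) + x


def times21_shared(x: int, five_x: int) -> int:
    # 21x = 4 * (5x) + x: reuses the 5x result, so only one extra adder.
    return (five_x << 2) + x


def fir_taps(x: int) -> tuple:
    """Return (5*x, 21*x) with two adders in total instead of three."""
    five_x = times5(x)
    return five_x, times21_shared(x, five_x)
```

Computed separately, the two products need three adders; sharing the 5x term brings that down to two, which is the kind of saving that common subexpression elimination looks for across filter coefficients.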
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2006
Anup Hosangadi; Farzan Fallah; Ryan Kastner
Polynomial expressions are frequently encountered in many application domains, particularly in signal processing and computer graphics. Conventional compiler techniques for redundancy elimination such as common subexpression elimination (CSE) are not suited for manipulating polynomial expressions, and designers often resort to hand optimizing these expressions. This paper leverages the algebraic techniques originally developed for multilevel logic synthesis to optimize polynomial expressions by factoring and eliminating common subexpressions. The proposed algorithm was tested on a set of benchmark polynomial expressions where savings of 26.7% in latency and 26.4% in energy consumption were observed for computing these expressions on the StrongARM SA1100 processor core. When these expressions were synthesized in custom hardware, average energy savings of 63.4% for minimum hardware constraints and 24.6% for medium hardware constraints over CSE were observed.
International Conference on Computer Aided Design | 2004
Anup Hosangadi; Farzan Fallah; Ryan Kastner
Polynomial expressions are used to compute a wide variety of mathematical functions commonly found in signal processing and graphics applications, which provide good opportunities for optimization. However, existing compiler techniques for reducing code complexity such as common subexpression elimination and value numbering are targeted towards general purpose applications and are unable to fully optimize these expressions. This work presents algorithms to reduce the number of operations to compute a set of polynomial expressions by factoring and eliminating common subexpressions. These algorithms are based on the algebraic techniques for multi-level logic synthesis. Experimental results on a set of benchmark applications with polynomial expressions showed an average of 42.5% reduction in the number of multiplications and 39.6% reduction in the number of clock cycles for computation of these expressions on the ARM processor core, compared to common subexpression elimination.
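To make the effect of factoring and sharing concrete, here is a hand-worked toy example. The two polynomials and the function names are assumptions made for illustration; they are not among the benchmarks the abstract refers to, and the rewrite is done by hand rather than by the paper's algorithms.

```python
def polys_direct(a, b, c, d, e, f):
    # P1 = a*b*c + a*b*d  -> 4 multiplications
    # P2 = a*b*e + f      -> 2 multiplications
    p1 = a * b * c + a * b * d
    p2 = a * b * e + f
    return p1, p2


def polys_factored(a, b, c, d, e, f):
    t = a * b          # common subexpression, computed once (1 multiplication)
    p1 = t * (c + d)   # factored form of P1 (1 multiplication)
    p2 = t * e + f     # P2 reuses t (1 multiplication)
    return p1, p2
```

Both functions return the same values, but the factored version uses three multiplications where the direct version uses six.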
International Journal of Reconfigurable Computing | 2010
Shahnam Mirzaei; Ryan Kastner; Anup Hosangadi
We present a method for implementing high speed finite impulse response (FIR) filters on field programmable gate arrays (FPGAs). Our algorithm is a multiplierless technique where fixed coefficient multipliers are replaced with a series of add and shift operations. The first phase of our algorithm uses registered adders and hardwired shifts. Here, a modified common subexpression elimination (CSE) algorithm reduces the number of adders while maintaining performance. The second phase optimizes routing delay using prelayout wire length estimation techniques to improve the final placed and routed design. The optimization target platforms are Xilinx Virtex FPGA devices, where we compare the implementation results with those produced by Xilinx Coregen, which is based on distributed arithmetic (DA). We observed up to 50% reduction in the number of slices and up to 75% reduction in the number of look up tables (LUTs) for fully parallel implementations compared to the DA method. Also, there is a 50% reduction in the total dynamic power consumption of the filters. Our designs perform up to 27% faster than the multiply accumulate (MAC) filters implemented by the Xilinx Coregen tool using DSP blocks. For placement, there is a saving of up to 20% in the number of routing channels. This results in lower congestion and up to 8% reduction in average wirelength.
Asia and South Pacific Design Automation Conference | 2005
Anup Hosangadi; Farzan Fallah; Ryan Kastner
This paper presents a novel technique to reduce the number of operations in multiplierless implementations of linear DSP transforms by iteratively eliminating two-term common subexpressions. Our method uses a polynomial transformation of linear systems that enables us to eliminate common subexpressions consisting of multiple variables. Our algorithm is fast and produces the fewest additions/subtractions compared to all known techniques. The synthesized examples show significant reductions in area and power consumption.
Application-Specific Systems, Architectures, and Processors | 2004
Anup Hosangadi; Farzan Fallah; Ryan Kastner
Common subexpression elimination is commonly employed to reduce the number of operations in DSP algorithms after decomposing constant multiplications into shifts and additions. Conventional optimization techniques for finding common subexpressions can optimize constant multiplications with only a single variable at a time, and hence cannot fully optimize the computations with multiple variables found in the matrix form of linear systems such as the DCT and DFT. We transform these computations such that all common subexpressions involving any number of variables can be detected. We then present heuristic algorithms to select the best set of common subexpressions. Experimental results show the superiority of our technique over conventional techniques for common subexpression elimination.
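A small hand-worked case may help show what a multi-variable common subexpression looks like. The 2x3 constant matrix [[1, 1, 2], [4, 4, 1]] and the function names below are hypothetical, chosen only for illustration; the sketch applies one extraction by hand and does not implement the detection or selection heuristics described in the abstract.

```python
def transform_direct(x0, x1, x2):
    # y0 = 1*x0 + 1*x1 + 2*x2, y1 = 4*x0 + 4*x1 + 1*x2
    # Every constant is a single power of two, so there is nothing to
    # share within any one variable. Cost: 4 additions.
    y0 = x0 + x1 + (x2 << 1)
    y1 = (x0 << 2) + (x1 << 2) + x2
    return y0, y1


def transform_shared(x0, x1, x2):
    # d = x0 + x1 spans two variables and appears in both rows
    # (unshifted in y0, shifted left by 2 in y1). Cost: 3 additions.
    d = x0 + x1
    y0 = d + (x2 << 1)
    y1 = (d << 2) + x2
    return y0, y1
```

The saving here comes entirely from a subexpression that involves two different inputs, which a search restricted to one variable at a time would not see.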
International Conference on VLSI Design | 2005
Anup Hosangadi; Ryan Kastner; Farzan Fallah
Polynomial expressions are used to approximate a wide variety of functions commonly found in signal processing and computer graphics applications. Computing these polynomial expressions in hardware consumes a lot of energy, and therefore careful optimization of these expressions is important in order to achieve low energy consumption. Unfortunately, current techniques for reducing the complexity of expressions, such as common subexpression elimination (CSE), cannot optimize these expressions well. In this paper, we present an algebraic technique to reduce the energy consumption of custom datapath implementations of polynomials by reducing the number of energy intensive operations. Our techniques can handle polynomial expressions of any order and containing any number of variables. Synthesis of a set of benchmark polynomials verified the advantages of our technique in reducing energy consumption, where we observed up to 58% improvement over CSE.
Design, Automation, and Test in Europe | 2006
Anup Hosangadi; Farzan Fallah; Ryan Kastner
Carry save adder (CSA) trees are commonly used for high speed implementation of multi-operand additions. We present a method to reduce the number of adders in CSA trees by extracting common three-term subexpressions. Our method can optimize multiple CSA trees involving any number of variables. This optimization has a significant impact on the total area of the synthesized circuits, as we show in our experiments. To the best of our knowledge, this is the only known method for eliminating common subexpressions in CSA structures. Since extracting common subexpressions can potentially increase delay, we also present a delay aware extraction algorithm that takes into account the different arrival times of the signals.
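For readers unfamiliar with carry save addition, the following sketch models a 3:2 carry save adder on Python integers and shows one shared three-term subexpression feeding two sums. The operand names, the two example sums, and the function names are assumptions for illustration only; the extraction is done by hand and the delay aware variant is not modeled.

```python
def csa(a: int, b: int, c: int) -> tuple:
    """3:2 carry save adder: returns (sum, carry) with sum + carry == a + b + c."""
    s = a ^ b ^ c                                # bitwise sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1   # majority bits, shifted left
    return s, carry


def two_sums_shared(a: int, b: int, c: int, d: int, e: int) -> tuple:
    """Compute (a+b+c+d, a+b+c+e) with 3 CSAs instead of 4."""
    s, cy = csa(a, b, c)       # common three-term subexpression, built once
    s1, c1 = csa(s, cy, d)     # second level for the first sum
    s2, c2 = csa(s, cy, e)     # second level for the second sum
    # One final carry-propagate addition per output.
    return s1 + c1, s2 + c2
```

Built independently, each four-operand sum needs two CSAs before its final adder, so sharing the first-level CSA saves one of the four.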
Signal Processing Systems | 2007
Anup Hosangadi; Farzan Fallah; Ryan Kastner
Constant multiplications can be efficiently implemented in hardware by converting them into a sequence of nested additions and shift operations. They can be optimized further by finding common subexpressions among these operations. In this work, we present algebraic methods for eliminating common subexpressions. Algebraic techniques are established in multi-level logic synthesis for the minimization of the number of literals and hence gates to implement Boolean logic. In this work, we use the concepts of two of these methods, namely rectangle covering and fast extract (FX), and adapt them to the problem of optimizing linear arithmetic expressions. The main advantage of using such methods is that we can optimize systems consisting of multiple variables, which is not possible using the conventional optimization techniques. Our optimizations are aimed at reducing the area and power consumption of the hardware, and experimental results show up to 30.3% improvement in the number of operations over conventional techniques. Synthesis and simulation results show up to 30% area reduction and up to 27% power reduction. We also modified our algorithm to perform delay aware optimization, where we perform common subexpression elimination such that the delay does not exceed a specified value.
Archive | 2010
Ryan Kastner; Anup Hosangadi; Farzan Fallah
Obtain better system performance, lower energy consumption, and avoid hand-coding arithmetic functions with this concise guide to automated optimization techniques for hardware and software design. High-level compiler optimizations and high-speed architectures for implementing FIR filters are covered, which can improve performance in communications, signal processing, computer graphics, and cryptography. Clearly explained algorithms and illustrative examples throughout make it easy to understand the techniques and write software for their implementation. Background information on the synthesis of arithmetic expressions and computer arithmetic is also included, making the book ideal for newcomers to the subject. This is an invaluable resource for researchers, professionals, and graduate students working in system level design and automation, compilers, and VLSI CAD.