Hosahalli R. Srinivas

Publication

Featured research published by Hosahalli R. Srinivas.

IEEE Journal of Solid-state Circuits | 1992

A fast VLSI adder architecture

Hosahalli R. Srinivas; Keshab K. Parhi

An architecture for performing fixed-point, high-speed, two's-complement, bit-parallel addition by using the carry-free property of redundant arithmetic and a fast parallel redundant-to-binary conversion scheme is presented. The internal numbers are represented in radix-2 redundant digit form, and the inputs and the output of the adder are represented in two's-complement binary form. The adder operands are first added in a radix-2 redundant adder to produce the result in radix-2 digit (-1, 0, 1) form. This result is converted to two's-complement binary form using the parallel conversion scheme. The high-speed conversion for long words is achieved through the use of a novel sign-select operation. The proposed adder, referred to as the sign-select conversion adder, is faster than all previous high-speed two's-complement binary adders for large word lengths. The implementation is highly regular with repeated modules and is very well suited for VLSI implementation.
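
The two stages named in this abstract (carry-free redundant addition, then conversion back to binary) can be sketched in Python under some loud assumptions. The function names below are hypothetical, and the conversion step is not the paper's sign-select scheme: it is the conventional approach of subtracting the negative-digit vector from the positive-digit vector, shown only to make the data flow concrete.

```python
def to_bits(value, width):
    """Two's-complement bits of value, least-significant bit first."""
    unsigned = value % (1 << width)
    return [(unsigned >> i) & 1 for i in range(width)]


def from_twos(bits):
    """Interpret an LSB-first bit list as a two's-complement integer."""
    unsigned = sum(b << i for i, b in enumerate(bits))
    return unsigned - (1 << len(bits)) if bits[-1] else unsigned


def carry_free_add(x_bits, y_bits):
    """Add two bit lists into radix-2 signed digits in {-1, 0, 1}.

    Each output digit depends only on positions i and i-1, so no carry
    ripples along the word (the loop is sequential only for readability).
    """
    digits = []
    transfer_in = 0
    for xb, yb in zip(x_bits, y_bits):
        s = xb + yb                      # 0, 1 or 2
        transfer_out = 1 if s >= 1 else 0
        interim = s - 2 * transfer_out   # always -1 or 0
        digits.append(interim + transfer_in)
        transfer_in = transfer_out
    digits.append(transfer_in)           # most-significant digit
    return digits


def redundant_to_twos(digits, width):
    """Conventional conversion (assumption: NOT the sign-select scheme)."""
    pos = sum(1 << i for i, d in enumerate(digits) if d == 1)
    neg = sum(1 << i for i, d in enumerate(digits) if d == -1)
    return to_bits(pos - neg, width)     # this subtraction propagates borrows


def add_twos(a, b, width=8):
    """Add two integers modulo 2**width via the redundant intermediate form."""
    digits = carry_free_add(to_bits(a, width), to_bits(b, width))
    return from_twos(redundant_to_twos(digits, width))
```

For example, `add_twos(5, -3)` goes through the signed-digit form and returns the same result as ordinary wrap-around addition on 8 bits. The subtraction inside `redundant_to_twos` is exactly the carry-propagate step that a fast parallel conversion scheme is meant to avoid.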

International Conference on Computer Design | 1991

High-speed VLSI arithmetic processor architectures using hybrid number representation

Hosahalli R. Srinivas; Keshab K. Parhi

This paper addresses the design of high-speed architectures for fixed-point, two's-complement, bit-parallel division, square-root, and multiplication operations. These architectures make use of hybrid number representations (i.e., the input and output numbers are represented using two's-complement representation, and the internal numbers are represented using radix-2 redundant representation). We propose new shifted remainder conditioning and sign multiplexing techniques in combination with novel circuit architecture approaches to obtain efficient divider and square-root architectures. Our divider exploits the full dynamic range of the operands and eliminates the need for on-line or off-line conversion of the result to binary (this is because our nonrestoring division and square-root operators output a binary quotient). Furthermore, since the binary input set is a subset of the redundant digit set, no binary-to-redundant number conversion is necessary at the input of the divider and square-root operators. We also present a fast, new conversion scheme for converting radix-2 redundant numbers to two's-complement binary numbers, and use this to design a bit-parallel multiplier. This multiplier architecture requires fewer pipelining latches than conventional two's-complement multipliers, and reduces the latency of the multiplication operation from (2W–1) to about W (where W is the word-length), when pipelined at the bit level.

International Symposium on Circuits and Systems | 2000

Design and implementation of a 16 by 16 low-power two's complement multiplier

Alexander Goldovsky; Bimal Patel; Michael Schulte; Ravi Kolagotla; Hosahalli R. Srinivas; Geoffrey Francis Burns

This paper describes the design and implementation of a high-speed, low-power 16 by 16 two's-complement parallel multiplier. The multiplier uses optimized radix-4 Booth encoders to generate the partial products, and an array of strategically placed (3,2), (5,3), and (7,4) counters to reduce the partial products to sum and carry vectors. The more significant bits of the product are computed from left to right using a modified Ercegovac-Lang converter. An implementation of the multiplier in 0.25-μm static CMOS technology has an area of 0.126 mm², a measured delay of 4.39 ns, and an average power dissipation of 0.110 mW/MHz at 2.5 V and 100 °C.
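
As a rough illustration of the partial-product generation step, the following Python sketch applies the textbook radix-4 Booth recoding to a 16-bit two's-complement multiplier. It is an assumption-laden model, not the optimized encoder from the paper: the function names are hypothetical, and the counter array and final converter are replaced by an ordinary sum.

```python
def booth_radix4_digits(y, width=16):
    """Recode a two's-complement multiplier into width/2 digits in {-2..2}.

    Digit j is y[2j-1] + y[2j] - 2*y[2j+1], with y[-1] taken as 0.
    The width must be even.
    """
    unsigned = y % (1 << width)
    # bits[k + 1] holds y[k]; bits[0] is the implicit y[-1] = 0.
    bits = [0] + [(unsigned >> i) & 1 for i in range(width)]
    return [bits[2 * j] + bits[2 * j + 1] - 2 * bits[2 * j + 2]
            for j in range(width // 2)]


def booth_multiply(x, y, width=16):
    """Multiply x by y using one partial product per Booth digit."""
    digits = booth_radix4_digits(y, width)
    # Each partial product is 0, +/-x or +/-2x, weighted by 4**j.
    partial_products = [d * x * 4 ** j for j, d in enumerate(digits)]
    return sum(partial_products)
```

Recoding halves the number of partial products (8 instead of 16 for a 16-bit multiplier), and every digit only requires a shift and an optional negation of the multiplicand.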

IEEE Transactions on Computers | 1995

A fast radix-4 division algorithm and its architecture

Hosahalli R. Srinivas; Keshab K. Parhi

In this paper we present a fast radix-4 division algorithm for floating-point numbers. This method is based on Svoboda's division algorithm and the radix-4 redundant number system. The algorithm involves a simple recurrence with carry-free addition and employs prescaling of the operands. In the proposed divider implementation, each radix-4 digit (belonging to the set {-3,...,+3}) of the quotient and partial remainder is encoded using two radix-2 digits (belonging to the set {-1,0,+1}), and this leads to hardware simplicity. The quotient digits are determined by observing the three most-significant radix-2 digits of the partial remainder and are independent of the divisor. The architecture presented for the proposed algorithm is faster than previously proposed radix-4 dividers, which require at least four digits of the partial remainder to be observed to determine quotient digits.

International Symposium on Circuits and Systems | 1994

A fast radix 4 division algorithm

Hosahalli R. Srinivas; Keshab K. Parhi

In this paper we present a fast radix 4 division algorithm for floating-point numbers based on Svoboda's division algorithm. The algorithm involves a simple recurrence with carry-free addition and employs pre-scaling of the operands. The quotient digits are determined by observing the three most-significant radix 2 digits (msds) of the partial remainder and are independent of the divisor. The proposed algorithm is faster than previously proposed radix 4 and radix 2 division algorithms, which require at least four digits of the partial remainder to be observed to determine a quotient digit. The speedup is achieved at the cost of an increase in area.

IEEE Transactions on Computers | 1997

Radix 2 division with over-redundant quotient selection

Hosahalli R. Srinivas; Keshab K. Parhi; Luis A. Montalvo

In this paper we present a new radix 2 division algorithm that uses a recurrence employing simple 3-to-2 digit carry-free adders to perform carry-free addition/subtraction for computing the partial remainders in radix 2 signed-digit form. During any iteration of the division recursion, the quotient digit is generated from the two most-significant radix 2 digits of the partial remainder, independently of the divisor, in over-redundant radix 2 digit form (i.e., with digits which belong to the digit set {-2, -1, 0, +1, +2}). The over-redundant quotient digits are then converted to the conventional radix 2 digits (belonging to the set {-1, 0, +1}) by using a reduction technique. This division algorithm is well suited for IEEE 754 standard operands belonging to the range (1, 2) and is slightly faster than previously proposed radix 2 designs (such as the radix 2 SRT) that do not employ input scaling, since the quotient selection for such algorithms is a function of more than two most-significant radix 2 digits of the partial remainder. In comparison with the designs that employ input scaling, the proposed design, although slightly slower, saves the hardware required for scaling purposes.
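
For context, the classical radix 2 SRT recurrence that the abstract uses as a point of comparison can be sketched as follows. This is not the over-redundant algorithm of the paper: the sketch keeps the partial remainder as an exact fraction rather than in signed-digit form, uses the standard digit set {-1, 0, +1}, and assumes a divisor already normalized to [1/2, 1). The function name and the use of `Fraction` are choices made for illustration.

```python
from fractions import Fraction


def srt_radix2_divide(x, d, n):
    """Classical radix-2 SRT division producing n signed quotient digits.

    Assumes 1/2 <= d < 1 and |x| <= d. Returns (digits, quotient, remainder)
    such that x == d * quotient + remainder / 2**n exactly.
    """
    x, d = Fraction(x), Fraction(d)
    if not (Fraction(1, 2) <= d < 1 and abs(x) <= d):
        raise ValueError("expected 1/2 <= d < 1 and |x| <= d")
    w = x                      # partial remainder, |w| <= d throughout
    digits = []
    for _ in range(n):
        shifted = 2 * w
        # Quotient selection compares against fixed constants only,
        # so it does not depend on the divisor value.
        if shifted >= Fraction(1, 2):
            q = 1
        elif shifted < Fraction(-1, 2):
            q = -1
        else:
            q = 0
        digits.append(q)
        w = shifted - q * d    # recurrence: w[j+1] = 2*w[j] - q[j+1]*d
    quotient = sum(Fraction(q, 2 ** (j + 1)) for j, q in enumerate(digits))
    return digits, quotient, w
```

Calling `srt_radix2_divide(Fraction(1, 3), Fraction(3, 4), 16)` yields a quotient within 2^-16 of 4/9. In hardware the comparison would be made on a short estimate of a redundant remainder, which is where the number of inspected digits discussed in the abstract comes in.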

IEEE Transactions on Very Large Scale Integration Systems | 1994

A C-testable carry-free divider

Hosahalli R. Srinivas; Bapiraju Vinnakota; Keshab K. Parhi

In this paper, the design of a C-testable, high-performance carry-free array divider is presented. A radix-2 redundant number based carry-free divider is considered and is modified to make it C-testable, i.e., it can be exhaustively tested using a constant number of test vectors irrespective of its word-length. Previous C-testable designs considered dividers which used carry-propagate adders/subtractors. These dividers are slow because of their O(W²) computation time (where W is the word-length of the divider). High-performance carry-free dividers use carry-free redundant arithmetic adders/subtractors. Due to this feature, they have O(W) computation time. The on-the-fly converter used by carry-free dividers to convert the redundant quotient to two's-complement form is shown to be not C-testable. It is modified to be linear-testable (in word-length), in contrast to the exponential time required for exhaustive testing of all possible combinations at its inputs. We conclude that the number of test vectors needed is 99 for C-testing of the divider array and (3W+10) for linear testing of the converter. The hardware overhead required to make the divider C-testable and the on-the-fly converter linear-testable is also shown to be nominal.
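
The on-the-fly converter mentioned in this abstract can be illustrated with the standard radix 2 formulation of on-the-fly conversion, sketched here in Python. The function name is hypothetical, integers stand in for the two hardware registers, and nothing here reflects the testability modifications described in the paper.

```python
def on_the_fly_convert(digits):
    """Convert signed digits in {-1, 0, 1}, most-significant first, to an integer.

    Two registers are maintained: q (the value so far) and qm = q - 1.
    Every update is "shift left and append one bit" applied to one of the
    two registers, so no carry or borrow ever propagates.
    """
    q, qm = 0, -1
    for d in digits:
        if d == 1:
            q, qm = 2 * q + 1, 2 * q        # append 1 to q; append 0 to q
        elif d == 0:
            q, qm = 2 * q, 2 * qm + 1       # append 0 to q; append 1 to qm
        elif d == -1:
            q, qm = 2 * qm + 1, 2 * qm      # append 1 to qm; append 0 to qm
        else:
            raise ValueError("digits must be -1, 0 or 1")
    return q
```

For instance, the digit string `[1, -1, 0, -1]` represents 8 - 4 + 0 - 1, and `on_the_fly_convert` returns 3 without performing any subtraction across the word. In hardware, `2 * r + b` corresponds to shifting register `r` and loading bit `b` into the vacated position.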

Asilomar Conference on Signals, Systems and Computers | 1994

Computer arithmetic architectures with redundant number systems

Hosahalli R. Srinivas; Keshab K. Parhi

Redundant arithmetic number systems are gaining popularity in computationally intensive environments, particularly because of the carry-free addition/subtraction properties they possess. This property has enabled arithmetic operations such as addition, multiplication, division, and square root to be performed much faster than with conventional binary number systems. In this paper, some of the recent contributions to the design of redundant arithmetic based addition, multiplication, division, and square root algorithms and architectures are briefly discussed. Only bit/digit-parallel implementations of the architectures are discussed, so that the enhancement in speed through the use of redundant arithmetic becomes immediately apparent, as opposed to bit/digit-serial architectures, where the primary justification for their use is to conserve area. A new radix 2 division algorithm using over-redundant radix 2 quotient digits and requiring a 2-digit quotient selection function is also presented.

International Conference on Computer Design | 1995

A floating point radix 2 shared division/square root chip

Hosahalli R. Srinivas; Keshab K. Parhi

This paper presents the architecture and implementation of a full-custom 1.2 micron CMOS VLSI chip that executes a shared division/square root algorithm operating on mantissas (23-b in length) of single precision IEEE 754 std. floating point numbers. The division and square root algorithms used in this implementation are radix 2 signed-digit based digit-by-digit schemes. These two algorithms perform quotient/root digit selection using the two most-significant digits of the partial remainder and are hence faster than other similar previously proposed radix 2 shared division/square root schemes. This chip runs at a clock rate of about 66 MHz at 5.0 V (from simulations) and requires 29 cycles per divide/square root operation from the time the operands are provided at its pin inputs.

International Conference on VLSI Design | 1995

A 16-bit × 16-bit 1.2 μ CMOS multiplier with low latency vector merging

W. Amendola; Hosahalli R. Srinivas; Keshab K. Parhi

This paper presents the VLSI architecture and implementation of a 16 × 16-bit, bit-level pipelined, two's-complement binary array multiplier. This multiplier architecture employs signed-digit radix 2 carry-free adders to perform multiplication. A fast conversion scheme for converting the final product, available from the multiplier array in radix 2 signed-digit form, to two's-complement binary form is employed to reduce the latency of the multiplier. Furthermore, it results in savings in area in the form of a reduced number of pipelining registers and half adders required for conversion, also called vector merging. This pipelined multiplier uses positive edge triggered registers and employs a single phase clocking scheme. It has been fabricated and tested to perform correctly at a 50 MHz clock frequency for a supply voltage of 3.0 V. It may be noted that the speed of this multiplier is limited by 1 binary adder cell time and that our test equipment imposed a limit of 50 MHz.
