Sridhar Rajagopal | Researchain

Publication

Featured research published by Sridhar Rajagopal.

IEEE Transactions on Wireless Communications | 2002

Real-time algorithms and architectures for multiuser channel estimation and detection in wireless base-station receivers

Sridhar Rajagopal; Srikrishna Bhashyam; Joseph R. Cavallaro; Behnaam Aazhang

This paper presents algorithms and architecture designs that can meet real-time requirements of multiuser channel estimation and detection in future code-division multiple-access-based wireless base-station receivers. Sophisticated algorithms proposed to implement multiuser channel estimation and detection make their real-time implementation difficult on current digital signal processor-based receivers. A maximum-likelihood based multiuser channel estimation scheme requiring matrix inversions is redesigned from an implementation perspective for a reduced complexity, iterative scheme with a simple fixed-point very large scale integration (VLSI) architecture. A reduced-complexity, bit-streaming multiuser detection algorithm that avoids the need for multishot detection is also developed for a simple, pipelined VLSI architecture. Thus, we develop real-time solutions for multiuser channel estimation and detection for third-generation wireless systems by: (1) designing the algorithms from a fixed-point implementation perspective, without significant loss in error rate performance; (2) task partitioning; and (3) designing bit-streaming fixed-point VLSI architectures that explore pipelining, parallelism, and bit-level computations to achieve real-time with minimum area overhead.
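As a rough illustration of how an iterative update can stand in for a matrix inversion, the sketch below solves R a = b with a fixed-step gradient iteration. The update rule, the step size mu, and the names R, b, and iterative_solve are assumptions made for this example; the abstract does not specify the paper's actual iteration.

```python
def iterative_solve(R, b, mu, iterations):
    """Approximate the solution of R a = b without inverting R.

    Illustrative assumption: fixed-step gradient iteration
    a <- a - mu * (R a - b), which converges when R is symmetric
    positive definite and 0 < mu < 2 / (largest eigenvalue of R).
    """
    n = len(b)
    a = [0.0] * n
    for _ in range(iterations):
        # Residual of the current estimate: R a - b
        residual = [sum(R[i][j] * a[j] for j in range(n)) - b[i] for i in range(n)]
        # Move the estimate a small step against the residual
        a = [a[i] - mu * residual[i] for i in range(n)]
    return a


# Hypothetical 2x2 symmetric positive definite system, for illustration only.
R = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
estimate = iterative_solve(R, b, mu=0.2, iterations=60)
```

Each pass uses only multiplications and additions, which is the property that makes this style of update attractive for fixed-point hardware.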

Midwest Symposium on Circuits and Systems | 2002

A programmable baseband processor design for software defined radios

Sridhar Rajagopal; Scott Rixner; Joseph R. Cavallaro

Future wireless systems need extremely fast and flexible architectures to support varying standards, algorithms and protocols with data rates in the range of 10-100 Mbps. Software Defined Radios (SDRs) based on DSP-FPGAs are a widely proposed solution for these systems. However, these SDR solutions have not been able to meet real-time requirements. We propose a programmable architecture solution for SDRs using a stream-based architecture based on the Imagine media processor. The configurable Imagine simulator allows us to investigate issues such as memory bottlenecks, number and type of functional units needed, and the utilization of those functional units. To evaluate stream-based architectures for baseband processing, we parallelize and implement sophisticated baseband algorithms including multiuser estimation, multiuser detection and Viterbi decoding on this simulator. We present the bottlenecks in such a stream-based architecture for efficient communications processing. Comparisons with current generation DSP-based solutions show orders-of-magnitude performance improvements, both due to the stream-based nature of computations as well as the increase in the number of functional units having a high utilization factor. The result is a baseband processor designed with broad system functionality and flexibility that approaches real-time performance for future wireless systems.

Signal Processing Systems | 2002

VLSI Implementation of the Multistage Detector for Next Generation Wideband CDMA Receivers

Gang Xu; Sridhar Rajagopal; Joseph R. Cavallaro; Behnaam Aazhang

The multistage detection algorithm has been proposed as an effective interference cancellation scheme for next generation Wideband Code Division Multiple Access (W-CDMA) base stations. In this paper, we propose a real-time VLSI implementation of this detection algorithm in the uplink system, where we have achieved both high performance in interference cancellation and computational efficiency. When interference cancellation converges, the difference of the detection vectors between two consecutive stages is mostly zero. Under the assumption of BPSK modulation, the differences between the bit estimates from consecutive stages are 0 and ±2. Bypassing the zero terms saves computations. Multiplication by ±2 can be easily implemented in hardware as arithmetic shifts. However, the convergence of the algorithm is dependent on the number of users, the interference and the signal to noise ratio and hence, the detection has a variable execution time. By using just two stages of the differencing detector, we achieve predictable execution time with performance equivalent to at least eight stages of the regular multistage detector. A VLSI implementation of the differencing multistage detector is built to demonstrate the computational savings and the real-time performance potential. The detector, handling up to eight users with 12-bit fixed point precision, was fabricated using a 1.2 μm CMOS technology and can process 190 Kbps/user for 8 users.
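The differencing idea described above can be sketched as follows, assuming BPSK bits in {+1, -1}, integer (fixed-point) matched-filter outputs y, and an integer cross-correlation matrix R. The function names and the sample values are hypothetical; this is a minimal software model of the update, not the fabricated design.

```python
def sign(v):
    return 1 if v >= 0 else -1


def multistage(y, R, stages):
    """Conventional multistage detector: recompute all interference per stage."""
    K = len(y)
    d = [sign(v) for v in y]  # matched-filter start
    for _ in range(stages):
        soft = [y[i] - sum(R[i][j] * d[j] for j in range(K) if j != i)
                for i in range(K)]
        d = [sign(v) for v in soft]
    return d


def differencing_multistage(y, R, stages):
    """Same decisions, but later stages only process bits that changed."""
    K = len(y)
    d_prev = [sign(v) for v in y]
    # The first stage needs the full interference computation.
    soft = [y[i] - sum(R[i][j] * d_prev[j] for j in range(K) if j != i)
            for i in range(K)]
    d = [sign(v) for v in soft]
    for _ in range(stages - 1):
        # Differences are 0 or +/-2; zero terms are skipped entirely.
        changed = [(j, d[j] - d_prev[j]) for j in range(K) if d[j] != d_prev[j]]
        for i in range(K):
            for j, delta in changed:
                if j != i:
                    term = R[i][j] << 1  # multiply by 2 as a left shift
                    soft[i] -= term if delta > 0 else -term
        d_prev, d = d, [sign(v) for v in soft]
    return d


# Hypothetical 3-user example with integer-scaled values.
R = [[8, 3, -2], [3, 8, 4], [-2, 4, 8]]
y = [-1, 6, 5]
decisions = differencing_multistage(y, R, stages=3)
```

Because the soft outputs are updated incrementally from the previous stage, both functions produce identical decisions for integer inputs; the differencing version simply does less work once most bits stop changing.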

International Symposium on Circuits and Systems | 2001

A bit-streaming, pipelined multiuser detector for wireless communication receivers

Sridhar Rajagopal; Joseph R. Cavallaro

This paper presents a bit-streaming, pipelined and reduced complexity architecture to meet real-time requirements for asynchronous multiuser detection in wireless communication CDMA receivers. Typically, asynchronous multiuser detection involves multishot detection, which involves block-based computations and matrix inversions. Hence, iterative based suboptimal schemes have been studied to decrease the computational complexity and eliminate the need for matrix inversions. However, we show that such low-complexity schemes can have an added advantage of avoiding multishot detection if they start from a matched filter estimate. The stages of the iteration can be pipelined and bits processed in a streaming fashion. We show that such an implementation scheme reduces the latency of the bits by the detection window length D and eliminates the storage requirements for block computation, which helps in DSP implementations. We also avoid edge-bit computation effects, which reduces the computation by 2/D per detection stage. This scheme also results in a simple, bit-streaming and pipelined architecture. DSP simulations show that data rates of 800 Kbps for a single user to 50 Kbps for 32 users can be processed in real-time with additional FPGAs in a pipelined fashion for a spreading gain of 31, giving at least a 4× speedup over a single DSP implementation.

Symposium on Computer Arithmetic | 2001

On-line arithmetic for detection in digital communication receivers

Sridhar Rajagopal; Joseph R. Cavallaro

This paper demonstrates the advantages of using online arithmetic for traditional and advanced detection algorithms for communication systems. Detection is one of the core computationally-intensive physical layer operations in a communication receiver and determines the communication data rates. Detection algorithms typically involve hard decisions (sign based testing) to find the sign of the transmitted information bit. This results in extraneous computations in a conventional number system as the sign is obtained only at the end due to the least significant digit first (LSDF) nature of computations. Online arithmetic, based on a signed digit number representation, provides most significant digit first (MSDF) computation. Hence, the computations can stop after the first non-zero MSD (sign) is computed and additional computations for the successive digits can be avoided. Back-conversion to a conventional number system is not required as the sign of the digit represents the detected bit. A comparison of a radix-4 serial digit on-line multiuser detector with an 8-bit parallel conventional arithmetic multiuser detector shows a decrease in latency by 1.79X, a 3X increase in throughput, and possible savings in area.
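The early-termination property can be sketched in software, assuming a radix-4 signed-digit representation with digits in {-2, ..., 2} delivered most significant digit first. The helper names to_signed_digits and msdf_sign are hypothetical, and the conversion helper only stands in for a real online arithmetic unit, which would produce the digits serially.

```python
def to_signed_digits(n, width, radix=4):
    """Encode integer n as `width` radix-4 signed digits in {-2..2}, MSD first."""
    digits = []
    for _ in range(width):
        r = n % radix          # 0..3, also for negative n in Python
        if r > radix // 2:     # map 3 to -1 so every digit stays in {-2..2}
            r -= radix
        digits.append(r)
        n = (n - r) // radix
    if n != 0:
        raise ValueError("width too small for this value")
    return digits[::-1]


def msdf_sign(digit_stream):
    """Return (sign, digits_consumed), stopping at the first non-zero digit."""
    consumed = 0
    for digit in digit_stream:
        consumed += 1
        if digit != 0:
            # The remaining digits cannot outweigh this one, so the sign is final.
            return (1 if digit > 0 else -1), consumed
    return 0, consumed


# Hypothetical soft value -7 encoded in four digits: [0, -1, 2, 1]
detected_bit, digits_used = msdf_sign(iter(to_signed_digits(-7, width=4)))
```

In this example the hard decision is available after two of the four digits, and no conversion back to a conventional number is needed to read the bit.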

Signal Processing Systems | 2002

Efficient VLSI Architectures for Multiuser Channel Estimation in Wireless Base-Station Receivers

Sridhar Rajagopal; Srikrishna Bhashyam; Joseph R. Cavallaro; Behnaam Aazhang

This paper presents a reduced-complexity, fixed-point algorithm and efficient real-time VLSI architectures for multiuser channel estimation, one of the core baseband processing operations in wireless base-station receivers for CDMA. Future wireless base-station receivers will need to use sophisticated algorithms to support extremely high data rates and multimedia. Current DSP implementations of these algorithms are unable to meet real-time requirements. However, these algorithms contain massive parallelism and bit-level arithmetic that can be revealed and efficiently implemented in a VLSI architecture. We re-design an existing channel estimation algorithm from an implementation perspective for a reduced complexity, fixed-point hardware implementation. Fixed-point simulations are presented to evaluate the precision requirements of the algorithm. A dependence graph of the algorithm is presented and area-time trade-offs are developed. An area-constrained architecture achieves low data rates with minimum hardware, which may be used in pico-cell base-stations. A time-constrained solution exploits the entire available parallelism and determines the maximum theoretical data processing rates. An area-time efficient architecture meets real-time requirements with minimum area overhead.

Asilomar Conference on Signals, Systems and Computers | 1999

Arithmetic acceleration techniques for wireless communication receivers

Suman Das; Sridhar Rajagopal; Chaitali Sengupta; Joseph R. Cavallaro

We develop techniques to accelerate the implementation of the next generation wireless communication algorithms in hardware. We discuss an implementation of a key computationally intensive baseband algorithm for joint multiuser channel estimation and detection for this purpose and study its real-time requirements. An analysis of the bottlenecks present in the algorithm is made. We present an acceleration technique using task decomposition to take advantage of the existing pipelining and parallelism flow in the algorithm. We show that an application specific system design with multiple processing elements is more effective than the conventional single processor approach as it can satisfy the high data rate requirements of the next generation wireless communication systems. Our analysis is done independent of the final mapping of the processing elements in hardware.

IEEE Transactions on Computers | 2006

Truncated Online Arithmetic with Applications to Communication Systems

Sridhar Rajagopal; Joseph R. Cavallaro

Truncation in digit-precision is a very important and common operation in embedded system design for bounding the required finite precision and for area-time-power savings. In this paper, we present the use of online arithmetic to provide truncated computations with communication systems as one of the applications. In contrast to truncation in conventional arithmetic, online arithmetic can truncate dynamically and produce both area and time benefits due to the digit-serial nature of computations. This is of great advantage in communication systems where the precision requirements can change dynamically with the environment. While truncation in conventional arithmetic can have significant truncation errors, especially when the output precision is less than the input precision, the redundancy and most significant digit first nature of online arithmetic restrict the truncation error to only the least significant digit of the truncated result. As an application that uses significant truncation in precision, a code matched filter detector for wireless systems is designed using truncated online arithmetic. The detector can provide both hard decisions and soft(er) decisions dynamically as well as interface with other conventional arithmetic circuits or act as a DSP coprocessor. Thus, optimized communication receivers with coexisting conventional arithmetic for saturation and online arithmetic for truncation can now be built. The truncated online arithmetic detector was also verified with a VLSI implementation in an AMI 0.5 μm MOSIS tiny chip process.

Application-Specific Systems, Architectures, and Processors | 2000

Efficient VLSI architectures for baseband signal processing in wireless base-station receivers

Sridhar Rajagopal; Srikrishna Bhashyam; Joseph R. Cavallaro; Behnaam Aazhang

A real-time VLSI architecture is designed for multiuser channel estimation, one of the core baseband processing operations in wireless base-station receivers. Future wireless base-station receivers will need to use sophisticated algorithms to support extremely high data rates and multimedia. Current DSP architectures are unable to fully exploit the parallelism and bit level arithmetic present in these algorithms. These features can be revealed and efficiently implemented by task partitioning the algorithms for a VLSI solution. We modify the channel estimation algorithm for a reduced complexity fixed-point hardware implementation. We show the complexity and hardware required for three different area-time tradeoffs: an area-constrained, a time-constrained and an area-time efficient architecture. The area-constrained architecture achieves low data rates with minimum hardware, which may be used in pico-cell base-stations. The time-constrained solution exploits the entire available parallelism and determines the maximum theoretical data rates. The area-time efficient architecture meets real-time requirements with minimum area overhead. The orders-of-magnitude difference between area and time constrained solutions reveals significant inherent parallelism in the algorithm. All proposed VLSI solutions exhibit better time performance than a previous DSP implementation.

Signal Processing Systems | 2016

Full Dimension MIMO (FD-MIMO) - Reduced Complexity System Design and Real-Time Implementation

Gary Xu; Yang Li; Sridhar Rajagopal; Robert W. Monroe; Jin Yuan; Sudhir Ramakrishna; Young Han Nam; Jianzhong Zhang

Full-Dimension MIMO (FD-MIMO) technology has been shown to increase spectral efficiency 2-4X compared to current LTE systems by exploiting a large number of antennas to support high order multiuser MIMO. High order multiuser MIMO with a large number of antennas increases design and implementation complexity significantly. Furthermore, practical challenges such as antenna calibration for RF mismatches and failure need to be considered. In this paper, a reduced-complexity precoding algorithm is introduced and optimized for real-time FPGA-based implementation. A novel antenna calibration architecture is designed for large two-dimensional FD-MIMO arrays, taking practical RF failures into consideration. Field experimental results are also presented based on a proof of concept (PoC) FD-MIMO base-station (BS) with 32 antennas that supports up to 12 users and achieves spectral efficiency of ~21 bits/sec/Hz.
