Publication


Featured research published by G. Lakshminarayanan.


IEEE Transactions on Very Large Scale Integration Systems | 2005

Optimization techniques for FPGA-based wave-pipelined DSP blocks

G. Lakshminarayanan; B. Venkataramani

In this paper, techniques for efficient implementation of field-programmable gate-array (FPGA)-based wave-pipelined (WP) multipliers, accumulators, and filters are presented. A comparison of the performance of WP and pipelined systems has been made. The major contributions of this paper are the development of an on-chip clock generation scheme which permits finer tuning of the frequency, a synthesis technique that reduces the area and latency by 25%, a placement utility that results in a 10%-40% increase in speed, and the proposal of an interleaving scheme for filters that reduces the number of multipliers required by 50%. WP multipliers of size 2 × 6 and the filters using them are found to be 11% faster and require lower power than those using pipelined multipliers. Filters with higher order WP multipliers also operate with lower power at the cost of speed. The delay-register products of such filters are found to be about 60% lower than those using the pipelined multipliers. The paper also outlines applications of these techniques for the Spartan II FPGAs and a self-tuning scheme for optimizing the speed.


IEEE Region 10 Conference | 2003

Design and FPGA implementation of image block encoders with 2D-DWT

G. Lakshminarayanan; B. Venkataramani; J. Senthil Kumar; A.K. Yousuf; G. Sriram

In this paper, schemes for computation of the 2D DWT of 32 × 32 subimages using both the lifting and DAA techniques with a Baugh-Wooley multiplier (BWM) are proposed and implemented on Xilinx XC2S150PQ208-5 FPGAs. The implementation results show that the lifting scheme with BWM requires about 20% less area but is 1.55 times faster than that using a conventional 2s complement multiplier (C2M). For larger word sizes, the DAA with BWM is found to be 1.2 times faster than that using C2M. An overlap method for processing a 128 × 128 image using subimages of size 32 × 32 is proposed and implemented. The 2D DWT of the image is also computed using a C program. The LL1 components of the image obtained using all the above schemes are found to match the original image well. FPGA implementation of higher level 2D DWT is in progress.


International Conference on Computer Communications and Networks | 2008

Design and implementation of pipelined MB-OFDM UWB transmitter backend modules on FPGA

M. Santhi; Maski Shravan Kumar; G. Lakshminarayanan; T.N. Prabakar

In this paper, novel ideas are proposed for designing and implementing pipelined MB-OFDM UWB transmitter digital backend modules on an FPGA for a data rate of 200 Mbps. The digital backend modules are the scrambler, convolutional encoder, puncturer, interleaver, QPSK mapping, and OFDM. The most critical block is the OFDM block because it contains a 128-point IFFT that has to work at a speed of 528 MHz. This is achieved in the proposed OFDM module by using a modified radix-2^4 SDF algorithm with extensive pipelining of LPM, without using a parallel architecture. In this way, the speed of 528 MHz can be obtained with minimum hardware. The hardware complexity has also been significantly reduced by the use of constant-coefficient canonical signed digit (CSD) multipliers, and accuracy has been improved by maintaining the internal word length at 13 bits, which is 7 bits more than the input. In designing the interleaver, the initial problem was the number of registers required when the interleaver is built using bit mapping, which leads to thousands of registers in use. In the proposed interleaver, two different RAM banks working in tandem with different write and read addresses and clock rates are used to provide optimum results. The implementation has been performed on an ALTERA STRATIX III EP3SL50F484C2 FPGA, and the results obtained comply with the IEEE 802.15.3a standard.
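
The constant-coefficient CSD multipliers mentioned in this abstract can be illustrated with a small software sketch. It is not the authors' hardware design: the function names `to_csd` and `csd_multiply` are hypothetical, and the code only shows the general idea that a constant recoded into digits -1, 0, +1 with no two adjacent nonzero digits can be applied using shifts plus additions or subtractions.

```python
def to_csd(value: int) -> list[int]:
    """Recode a non-negative integer into canonical signed digits.

    Returns digits in {-1, 0, +1}, least significant first, with no two
    adjacent nonzero digits. (Hypothetical helper for illustration only.)
    """
    if value < 0:
        raise ValueError("this sketch handles non-negative constants only")
    digits = []
    while value != 0:
        if value & 1:
            # Low bits ...01 give +1, low bits ...11 give -1 (carry upward).
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(x: int, coefficient: int) -> int:
    """Multiply x by a constant using only shifts and adds/subtracts."""
    total = 0
    for shift, digit in enumerate(to_csd(coefficient)):
        if digit:
            total += digit * (x << shift)
    return total


if __name__ == "__main__":
    print(to_csd(7))            # digits of 8 - 1, least significant first
    print(csd_multiply(5, 7))
```

Fewer nonzero digits means fewer adders in a hardwired multiplier, which is the usual motivation for CSD recoding of fixed coefficients such as FFT twiddle factors.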


IEEE Region 10 Conference | 2009

A low power 700MSPS 4bit time interleaved SAR ADC in 0.18um CMOS

Sanjay G. Talekar; S. Ramasamy; G. Lakshminarayanan; B. Venkataramani

The design and implementation details of a 4-bit time-interleaved successive approximation register (SAR) analog-to-digital converter (ADC) for UWB applications are presented in this paper. A low-latency SAR ADC has been implemented by detecting two bits per clock cycle. The major contribution of this paper is that it uses only two capacitive DACs instead of three. This is achieved by using a Gilbert cell preamplifier in one of the comparator detectors. This reduces the power consumption by approximately 33%. The ADC is implemented in 0.18 μm CMOS technology and has a total power consumption of 23.3 mW at a sampling frequency of 700 MSPS for an input swing of 1 V peak to peak. The proposed SAR ADC gives an SNDR of 23.9 dB, an SFDR of 32.6 dB and a THD of −37.8 dB at the Nyquist rate. The proposed ADC enables a larger input swing with a Figure of Merit of 2.5, which is higher than that of SAR ADCs reported in the literature.
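
As a rough cross-check of the figures quoted above, the effective number of bits (ENOB) can be derived from the SNDR with the standard relation ENOB = (SNDR − 1.76 dB) / 6.02 dB. The abstract does not say how its Figure of Merit is defined, so the Walden-style energy per conversion step used below is an assumption, and the function names are hypothetical.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_pj(power_w: float, sample_rate_hz: float, enob: float) -> float:
    """Energy per conversion step in pJ.

    ASSUMPTION: Walden-style FoM = P / (2**ENOB * fs); the source does not
    state which definition it uses.
    """
    return power_w / (2 ** enob * sample_rate_hz) * 1e12


if __name__ == "__main__":
    enob = enob_from_sndr(23.9)                 # SNDR reported in the abstract
    fom = walden_fom_pj(23.3e-3, 700e6, enob)   # 23.3 mW at 700 MSPS
    print(f"ENOB = {enob:.2f} bits, FoM = {fom:.2f} pJ/conversion-step")
```

Under these assumptions the result comes out at roughly 2.6 pJ per conversion step, which is in the neighborhood of the reported 2.5 but should not be read as the authors' own calculation.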


IETE Journal of Research | 2006

Design and FPGA Implementation of Self Tuned Wavepipelined Filters

G. Seetharaman; B. Venkataramani; G. Lakshminarayanan

In this paper, a novel scheme is proposed for FPGA implementation of a wavepipelined filter using the Distributed Arithmetic Algorithm (DAA). To make the circuit independent of fabrication variations in the parameters, a sub-optimal wavepipelined scheme is proposed for the various combinational blocks of the DA filter. A self-tuning FSM is built in to choose the clock skew and clock period for the I/O registers between the wavepipelined blocks. To test the efficacy of the proposed scheme, three filters with 4, 8 and 10 taps respectively are implemented using the DAA approach on a Xilinx Spartan II XC2S100-5PQ208 device. The filters are implemented using three schemes: synchronous pipelining, sub-optimal wavepipelining and no pipelining (i.e., using neither synchronous pipelining nor wavepipelining). From the implementation results, it is observed that wavepipelined DA filters are faster by a factor of 1.31–1.61 compared to non-pipelined DA filters. The synchronous pipelined DA filters are in turn faster by a factor of 1.73–2.06 compared to the wavepipelined DA filters. The increased speeds are achieved by increasing the number of slices by 25%–33%, the number of registers by 350%–530% and power dissipation by 107%–167%. The delay-register product of the wavepipelined DA filters is reduced by a factor of 2.64–3.06 compared to the pipelined DA filters. The technique proposed in this paper is also applicable to ASICs and FPGAs from other vendors.


Canadian Conference on Electrical and Computer Engineering | 2013

Design of a low power network interface for Network on chip

K. Swaminathan; G. Lakshminarayanan; Frank Lang; Maher Nihad Fahmi; Seok-Bum Ko

In this paper, a low power flexible Network Interface (NI) architecture for Network on Chip (NoC) is proposed. The flexible run-time configuration controller in the proposed NI plays a vital role in reducing power by enabling and disabling the entire asynchronous First In First Out (FIFO) buffers based on the traffic conditions between the router and the processing elements (PEs). The NI has been implemented in a Xilinx Virtex-5 XC5VLX110T FPGA. Experimental results show that the proposed low power NI offers a power improvement of 37% when both FIFOs are inactive and 32% when only one FIFO is active.


International Symposium on Electronic System Design | 2012

High Speed Generic Network Interface for Network on Chip Using Ping Pong Buffers

K. Swaminathan; G. Lakshminarayanan; Seok-Bum Ko

Connecting different Intellectual Property (IP) cores to a Network on Chip (NoC) router using a Network Interface (NI) is a challenging task due to its asynchronous nature and data width. In this paper, a generic high-speed Network Interface for Network on Chip using ping-pong buffers is proposed in order to ensure seamless high throughput between the router and the processing core. The proposed scheme uses simple control logic to handle read and write operations simultaneously in the memory modules. The proposed method is compared with existing asynchronous First In First Out (FIFO) based NIs with different encoding schemes such as one-hot encoding and Johnson encoding. The optimal depth of the asynchronous FIFOs is calculated from the router frequency, processing element frequency, packet size and flit size at the router and processing element using Practical Extraction and Report Language (Perl), and the required Register Transfer Level (RTL) Verilog Hardware Description Language (HDL) code and timing constraints are generated by the Perl script itself. The NI is implemented using both the asynchronous FIFOs and the ping-pong (double buffering) scheme on an Altera Stratix III FPGA. The synthesis results show that the proposed architecture enhances the speed of the NI by 30% when the memory depth is 8 and by 11% when the memory depth is 256.
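
The ping-pong (double buffering) idea described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the authors' RTL: the class name `PingPongBuffer`, its methods, and the bank-level handshake are hypothetical. One bank is filled by the writer while the other, once full, is handed to the reader, and the roles then swap.

```python
class PingPongBuffer:
    """Behavioral model of a two-bank ping-pong buffer (illustrative only)."""

    def __init__(self, depth: int):
        self.depth = depth
        self.banks = [[], []]
        self.full = [False, False]  # bank is full and waiting for the reader
        self.write_idx = 0          # bank currently being filled
        self.read_idx = 0           # bank the reader will drain next

    def write(self, flit) -> bool:
        """Store one flit; return False when both banks are occupied."""
        bank = self.write_idx
        if self.full[bank]:
            return False            # back-pressure toward the writer
        self.banks[bank].append(flit)
        if len(self.banks[bank]) == self.depth:
            self.full[bank] = True
            self.write_idx ^= 1     # switch to the other bank
        return True

    def read_bank(self):
        """Drain the oldest full bank, or return None if none is ready."""
        bank = self.read_idx
        if not self.full[bank]:
            return None
        data, self.banks[bank] = self.banks[bank], []
        self.full[bank] = False
        self.read_idx ^= 1
        return data


if __name__ == "__main__":
    buf = PingPongBuffer(depth=2)
    for flit in "abc":
        buf.write(flit)
    print(buf.read_bank())  # first bank is drained while the second fills
```

The point of the structure is that the writer never waits on the reader as long as one bank is free, which is what allows read and write sides to proceed simultaneously.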


International Symposium on Circuits and Systems | 2012

Dynamic partial reconfigurable FFT/IFFT pruning for OFDM based Cognitive radio

C. Vennila; Kumar Palaniappan Ct; Kodati Vamsi Krishna; G. Lakshminarayanan; Seok-Bum Ko

Cognitive Radio is an application in which spectrum utilization can be improved by allowing secondary users to use the spectrum when it is not used by licensed primary users. An adaptive OFDM system for Cognitive Radio has the ability to nullify unnecessary individual carriers and avoid interference to licensed primary users. A Fast Fourier Transform (FFT) block forms the core of the OFDM design. However, the zero-valued inputs outnumber the non-zero-valued inputs in the FFT block, making standard FFT algorithms computationally inefficient due to wasted operations on zero values. To overcome this problem, several pruning algorithms have been developed, but many of them are architecturally inefficient for FPGA implementation due to the complexity of the overhead operations. Moreover, these algorithms are not suitable for applications like Cognitive Radio, which has zero inputs in arbitrary distributions, making hardware implementation complex. This paper presents a novel and efficient dynamically partial reconfigurable (DPR) Transform Decomposition (TD) FFT and Radix 2 based IFFT pruning for OFDM based Cognitive Radio on FPGA. Tested FPGA results on the XC2VP30 for the DPR method show improved configuration time and good area and power efficiency.
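
The motivation for pruning, namely that zero-valued inputs contribute nothing to the transform, can be shown with a direct-sum DFT that simply skips them. This sketch is deliberately simpler than the Transform Decomposition method named in the abstract and does not reproduce it; the function names are hypothetical.

```python
import cmath


def dft(samples):
    """Reference O(N^2) DFT over all inputs."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def input_pruned_dft(samples):
    """Same transform, but only nonzero inputs are multiplied and summed."""
    n = len(samples)
    active = [(i, x) for i, x in enumerate(samples) if x != 0]
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in active)
        for k in range(n)
    ]


if __name__ == "__main__":
    # Sparse input with zeros at arbitrary positions (made-up test data).
    data = [0, 1 + 1j, 0, 0, 0, -1 + 1j, 0, 0]
    full = dft(data)
    pruned = input_pruned_dft(data)
    print(max(abs(a - b) for a, b in zip(full, pruned)))
```

Here the pruned version performs 2 complex multiplications per output bin instead of 8. Practical pruning algorithms apply the same observation inside a fast transform, which is where the control overhead discussed in the abstract arises.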


International Conference on Communication Control and Computing Technologies | 2010

PAPR reduction for improving performance of OFDM system

C. Vennila Arasu; Puneet Hyanki; H. Lakshman Sharma; G. Lakshminarayanan; Moon Ho Lee; Seok-Bum Ko

Orthogonal frequency division multiplexing (OFDM) signals have a generic problem of high peak to average power ratio (PAPR), which is defined as the ratio of the peak power to the average power of the OFDM signal. The drawback of a high PAPR is that the dynamic range required of the power amplifiers (PA) and digital-to-analog (D/A) converters during the transmission and reception of the signal is higher. As a result, the total cost of the transceiver increases and its efficiency is reduced. This paper introduces the PAPR constraint, along with the implementation and analysis of some specific algorithms for PAPR reduction of OFDM signals. All the PAPR reduction algorithms are implemented and tested in an OFDM transceiver designed using Matlab. After analyzing some specific algorithms, we propose a novel algorithm that combines subcarrier phase adjustment and linear combination of the time domain OFDM signals. Simulation results show that an OFDM system employing Quadrature Amplitude Shift Keying (QASK) symbols and Quadrature Amplitude Modulation (QAM) can achieve a PAPR reduction of 5 dB.
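
The PAPR definition given in this abstract (peak power divided by average power) translates directly into a few lines of code. The sketch below is a minimal illustration in Python rather than the authors' Matlab transceiver; the helper names and the example subcarrier values are assumptions made for the demonstration.

```python
import cmath
import math


def idft(symbols):
    """Inverse DFT: map frequency-domain subcarrier symbols to time samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * cmath.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]


def papr_db(signal):
    """Peak to average power ratio of a sampled signal, in dB."""
    powers = [abs(sample) ** 2 for sample in signal]
    mean_power = sum(powers) / len(powers)
    return 10 * math.log10(max(powers) / mean_power)


if __name__ == "__main__":
    # Worst case: identical symbols on all subcarriers add up in phase.
    aligned = idft([1] * 8)
    # Same magnitudes with some phases flipped (arbitrary example values).
    adjusted = idft([1, 1, 1, -1, 1, 1, -1, 1])
    print(f"aligned:  {papr_db(aligned):.2f} dB")
    print(f"adjusted: {papr_db(adjusted):.2f} dB")
```

With N identical subcarrier symbols the time-domain signal collapses to a single peak, so the PAPR reaches its maximum of 10·log10(N) dB. Changing subcarrier phases spreads that energy over time, which is the intuition behind phase adjustment schemes.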


VLSI Design and Test | 2014

Pipelined FFT architectures for real-time signal processing and wireless communication applications

Antony Xavier Glittas; G. Lakshminarayanan

This paper proposes two-parallel pipelined fast Fourier transform (FFT) architectures for the discrete Fourier transform (DFT) computation of real-valued signals. The architectures are optimized to use fewer registers for signal processing and wireless communication applications. The clock to the registers is disabled to avoid storing redundant values, and hence the registers that would store those redundant values are eliminated. The proposed architectures require 22% fewer registers than prior architectures. The real-valued FFT (RFFT) processor is further optimized to process BPSK outputs, in which case the register count is reduced by 43%.
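
The redundancy that real-valued FFT architectures exploit is the conjugate symmetry of the spectrum: for a real input of length N, X[N−k] equals the complex conjugate of X[k]. The sketch below only demonstrates that property in software and is not a model of the proposed pipelined hardware; the function names are hypothetical.

```python
import cmath


def dft(samples):
    """Reference O(N^2) DFT."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def half_spectrum(real_samples):
    """Keep only bins 0..N/2; the remaining bins are redundant for real input.

    For simplicity this computes the full DFT and discards the upper bins.
    """
    n = len(real_samples)
    return dft(real_samples)[: n // 2 + 1]


def rebuild_full_spectrum(half, n):
    """Recover bins N/2+1..N-1 from conjugate symmetry X[N-k] = conj(X[k])."""
    full = list(half)
    for k in range(n // 2 + 1, n):
        full.append(half[n - k].conjugate())
    return full


if __name__ == "__main__":
    signal = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 1.0]  # made-up real data
    rebuilt = rebuild_full_spectrum(half_spectrum(signal), len(signal))
    print(max(abs(a - b) for a, b in zip(rebuilt, dft(signal))))
```

Because nearly half of the outputs never need to be stored, a hardware pipeline can drop or gate the registers that would otherwise hold them.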

Collaboration


Top Co-Authors

Seok-Bum Ko

University of Saskatchewan

B. Venkataramani

National Institute of Technology

M. Santhi

National Institute of Technology

G. Seetharaman

National Institute of Technology

C. Vennila

National Institute of Technology

Sowjanya Tungala

National Institute of Technology

K. Swaminathan

National Institute of Technology

Antony Xavier Glittas

National Institute of Technology

C. Balakrishna

National Institute of Technology
