Publication


Featured research published by Vinay Sriram.


field-programmable logic and applications | 2007

A high throughput area time efficient pseudo uniform random number generator based on the TT800 algorithm

Vinay Sriram; David Kearney

Many computer simulations require large quantities of uncorrelated random numbers to be generated quickly. Examples include all forms of Monte Carlo simulation, the generation of phase screens to simulate the effects of atmospheric turbulence, and the simulation of electrical noise in sensors. A flexible way to generate random numbers of arbitrary distribution is to modify the distribution of a source of uniform random numbers. Thus it is of interest to have a fast uniform random number generator implemented in reconfigurable hardware. In this paper we present multiple hardware implementations of the TT800 algorithm. The best implementation achieved a throughput of 4.6 × 10^9 uniform random numbers per second using 24 parallel generators, occupying 253 Xilinx Virtex XC2VP70 slices. It has an area time rating of 0.05 × 10^-6 Xilinx slices × seconds per 32 bit random number. It has the lowest area time metric and requires only half the area of the previously best published multi-port, single seed generator with a period of at least 2^800.
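The area time rating quoted in this abstract can be checked from its own figures. The Python sketch below simply divides the slice count by the throughput; the function name `area_time_rating` and the variable names are ours, not the paper's, and the result is only as precise as the rounded numbers in the abstract.

```python
def area_time_rating(slices: int, throughput_per_s: float) -> float:
    """Return slices x seconds consumed per generated random number."""
    return slices / throughput_per_s


# Figures taken from the abstract above.
SLICES = 253          # Xilinx Virtex XC2VP70 slices
THROUGHPUT = 4.6e9    # 32 bit random numbers per second
GENERATORS = 24       # parallel generators

rating = area_time_rating(SLICES, THROUGHPUT)
per_generator = THROUGHPUT / GENERATORS  # rate of one generator

print(f"area time rating: {rating:.3e} slices x s per number")
print(f"per-generator rate: {per_generator:.3e} numbers per second")
```

With these inputs the rating comes out at about 0.055 × 10^-6 slices × seconds per number, which agrees with the 0.05 × 10^-6 stated in the abstract once rounding is allowed for.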


field-programmable logic and applications | 2006

High Speed High Fidelity Infrared Scene Simulation Using Reconfigurable Computing

Vinay Sriram; David Kearney

The authors aim to accelerate infrared scene simulation algorithms in hardware and to develop a hardware acceleration platform, consisting of arrays of field programmable logic coupled with standard microprocessors, to provide high speed, high fidelity infrared scene simulations.


parallel and distributed computing: applications and technologies | 2007

Implementing a Phase Screen Generator in Hardware

Vinay Sriram; David Kearney

The computation time required for the modelling of wavefront distortions over a finite aperture has always been an important issue for applications such as predicting the performance of laser designators and simulating infrared scenes in the presence of atmospheric turbulence. In this paper, we show that the computation performance of the best previous algorithm that models this phenomenon can be substantially improved using high performance reconfigurable computing, by accelerating the key computationally intensive steps of the algorithm on a field programmable gate array (FPGA). Our best hardware implementation provides an overall speedup of more than 8 times over the original algorithm.


Eurasip Journal on Embedded Systems | 2009

An FPGA implementation of a parallelized MT19937 uniform random number generator

Vinay Sriram; David Kearney

Recent times have witnessed an increase in the use of high-performance reconfigurable computing for accelerating large-scale simulations. A characteristic of such simulations, like infrared (IR) scene simulation, is the use of large quantities of uncorrelated random numbers. It is therefore of interest to have a fast uniform random number generator implemented in reconfigurable hardware. While there have been previous attempts to accelerate the MT19937 pseudouniform random number generator using FPGAs, we believe that we can substantially improve on the previous implementations to develop a higher throughput and more area-time efficient design. Due to the potential for parallel implementation of random number generators, designs that have both a small area footprint and high throughput are to be preferred to ones that have high throughput but significant extra area requirements. In this paper, we first present a single port design and then present an enhanced 624 port hardware implementation of the MT19937 algorithm. The 624 port hardware implementation, when implemented on a Xilinx XC2VP70-6 FPGA chip, has a throughput of 119.6 × 10^9 32 bit random numbers per second, which is more than 17x that of the previously best published uniform random number generator. Furthermore, it has the lowest area time metric of all the currently published FPGA-based pseudouniform random number generators.
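For readers unfamiliar with MT19937, a plain software reference of the algorithm is sketched below in Python. It follows the standard published recurrence and constants (a state of 624 32 bit words with offset 397) and is not the authors' FPGA design; the class name `MT19937` and its method names are ours. The `twist` step regenerates all 624 state words as one block; how the 624 port hardware exposes those words in parallel is a matter for the paper and is not modelled here.

```python
class MT19937:
    """Software reference sketch of the 32 bit Mersenne Twister (MT19937)."""

    N, M = 624, 397              # state size and recurrence offset
    MATRIX_A = 0x9908B0DF        # twist matrix constant
    UPPER_MASK = 0x80000000      # most significant bit
    LOWER_MASK = 0x7FFFFFFF      # least significant 31 bits

    def __init__(self, seed: int = 5489) -> None:
        # Standard linear-recurrence initialisation of the state array.
        self.mt = [0] * self.N
        self.mt[0] = seed & 0xFFFFFFFF
        for i in range(1, self.N):
            prev = self.mt[i - 1]
            self.mt[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
        self.index = self.N      # forces a twist before the first output

    def twist(self) -> None:
        """Regenerate all 624 state words in place."""
        n, m, mt = self.N, self.M, self.mt
        for i in range(n):
            y = (mt[i] & self.UPPER_MASK) | (mt[(i + 1) % n] & self.LOWER_MASK)
            mt[i] = mt[(i + m) % n] ^ (y >> 1) ^ (self.MATRIX_A if y & 1 else 0)
        self.index = 0

    @staticmethod
    def temper(y: int) -> int:
        """Apply the output tempering transform to one state word."""
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & 0xFFFFFFFF

    def next_u32(self) -> int:
        """Return the next 32 bit output."""
        if self.index >= self.N:
            self.twist()
        y = self.temper(self.mt[self.index])
        self.index += 1
        return y


gen = MT19937(5489)
print(gen.next_u32())
```

With the conventional default seed 5489, the first output should be 3499211612, a value commonly used to validate MT19937 implementations.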


parallel and distributed computing: applications and technologies | 2007

A FPGA Implementation of Variable Kernel Convolution

Vinay Sriram; David Kearney

Convolution is a basic signal and image processing operation. In image processing, the kernel coefficients of a convolution commonly remain constant across the entire image. A less common situation is where the kernel coefficients change in value for each pixel in the image. We call this variable kernel convolution. In this paper we present what we believe are the first three FPGA implementations of variable kernel convolution. The first uses sequential streaming, the second uses pipelining, and the third uses what we call convolve and gather. The hardware implementation of convolve and gather has the highest area time rating (6.7x better than streaming and 3.4x better than pipelining). Both pipelining and convolve and gather have the same throughput (25x that of streaming), but convolve and gather has a 71% smaller area footprint than the pipeline.
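To make the idea of a per-pixel kernel concrete, the Python sketch below lets a caller-supplied function return a different kernel for every output pixel. It is a naive sequential software illustration only: the function names, the zero padding at the image borders, and the example kernel function are our assumptions, and it does not represent any of the three FPGA designs in the paper.

```python
from typing import Callable, List

Image = List[List[float]]
Kernel = List[List[float]]


def variable_kernel_convolve(
    image: Image, kernel_for: Callable[[int, int], Kernel]
) -> Image:
    """Convolve `image` with a kernel that may differ at every pixel.

    `kernel_for(r, c)` returns the (odd-sized) kernel used for output
    pixel (r, c). Pixels outside the image are treated as zero.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            k = kernel_for(r, c)
            kr, kc = len(k) // 2, len(k[0]) // 2   # kernel centre
            acc = 0.0
            for i, krow in enumerate(k):
                for j, w in enumerate(krow):
                    # True convolution: the kernel is flipped about its centre.
                    rr, cc = r + kr - i, c + kc - j
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += w * image[rr][cc]
            out[r][c] = acc
    return out


def position_scaled_identity(r: int, c: int) -> Kernel:
    """Hypothetical example kernel: identity scaled by the pixel position."""
    return [[0, 0, 0], [0, r + c, 0], [0, 0, 0]]


result = variable_kernel_convolve([[1, 2], [3, 4]], position_scaled_identity)
print(result)
```

Because the example kernel is a scaled identity, each output pixel is just the input pixel multiplied by `r + c`, which makes the per-pixel dependence easy to verify by hand.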


parallel and distributed computing: applications and technologies | 2007

High Throughput Multi-port MT19937 Uniform Random Number Generator

Vinay Sriram; David Kearney

There have been many previous attempts to accelerate MT19937 using FPGAs, but we believe that we can substantially improve on the previous implementations to develop a higher throughput and more area time efficient design. In this paper we first present a single port design and then present an enhanced 624 port hardware implementation of the MT19937 algorithm that has a throughput of 119.6 × 10^9 32 bit random numbers per second, which is more than 17 times that of the previously best published uniform random number generator. Furthermore, it has the lowest area time metric of all the currently published FPGA based pseudo uniform random number generators.
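Two simple consequences follow from the figures in this abstract: the average rate each of the 624 ports must sustain, and an upper bound on the throughput of the previous best generator implied by the "more than 17 times" claim. The Python sketch below works both out; the variable names are ours, and the assumption that throughput is spread evenly across ports is ours as well.

```python
# Figures taken from the abstract above.
THROUGHPUT = 119.6e9        # 32 bit random numbers per second
PORTS = 624                 # output ports of the enhanced design
SPEEDUP_LOWER_BOUND = 17    # "more than 17 times" the previous best

# Average rate per port, assuming throughput is spread evenly (our assumption).
per_port = THROUGHPUT / PORTS

# "More than 17x" implies the previous best was below this value.
previous_best_upper_bound = THROUGHPUT / SPEEDUP_LOWER_BOUND

print(f"per-port rate: {per_port / 1e6:.1f} million numbers per second")
print(f"previous best below: {previous_best_upper_bound / 1e9:.2f} x 10^9 per second")
```

Under the even-spread assumption, each port delivers roughly 191.7 million numbers per second, and the previous best generator must have produced fewer than about 7.04 × 10^9 numbers per second.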


The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology | 2007

Towards A Multi-FPGA Infrared Simulator

Vinay Sriram; David Kearney

High speed infrared (IR) scene simulation is used extensively in defense and homeland security to test sensitivity of IR cameras and accuracy of IR threat detection and tracking algorithms used commonly in IR missile approach warning systems (MAWS). A typical MAWS requires an input scene rate of over 100 scenes/second. Infrared scene simulations typically take 32 minutes to simulate a single IR scene that accounts for effects of atmospheric turbulence, refraction, optical blurring and charge-coupled device (CCD) camera electronic noise on a Pentium 4 (2.8GHz) dual core processor [7]. Thus, in IR scene simulation, the processing power of modern computers is a limiting factor. In this paper we report our research to accelerate IR scene simulation using high performance reconfigurable computing. We constructed a multi Field Programmable Gate Array (FPGA) hardware acceleration platform and accelerated a key computationally intensive IR algorithm over the hardware acceleration platform. We were successful in reducing the computation time of IR scene simulation by over 36%. This research acts as a unique case study for accelerating large scale defense simulations using a high performance multi-FPGA reconfigurable computer.
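The gap between the software baseline and the MAWS requirement can be put in numbers using only the figures in this abstract. The Python sketch below does that arithmetic; the variable names are ours, and applying the 36% reduction directly to the 32 minute baseline is our assumption, since the abstract does not say which baseline the reduction was measured against.

```python
# Figures taken from the abstract above.
SECONDS_PER_SCENE_CPU = 32 * 60        # 32 minutes per scene on the Pentium 4
REQUIRED_SCENES_PER_SECOND = 100       # typical MAWS input scene rate
REDUCTION = 0.36                       # "over 36%" reduction in computation time

# Speedup needed to go from one scene per 1920 s to 100 scenes per second.
required_speedup = SECONDS_PER_SCENE_CPU * REQUIRED_SCENES_PER_SECOND

# Time per scene if the 36% reduction applied to the 32 minute baseline
# (assumption: the abstract does not state this explicitly).
seconds_after_reduction = SECONDS_PER_SCENE_CPU * (1 - REDUCTION)

print(f"required speedup: {required_speedup:,}x")
print(f"time per scene after reduction: {seconds_after_reduction:.1f} s")
```

Under these assumptions, real-time operation would need a speedup of 192,000x, while a 36% reduction brings one scene down to about 1228.8 seconds, which illustrates why the abstract frames the work as a step towards a multi-FPGA simulator.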


Optics Express | 2007

An Ultra Fast Kolmogorov Phase Screen Generator Suitable For Parallel Implementation

Vinay Sriram; David Kearney

Modelling phase fluctuations due to Kolmogorov turbulence is important in many areas of applied optics such as simulating adaptive optics configurations, prediction of the performance of laser designators and simulation of infrared (IR) scenes in the presence of atmospheric turbulence. The computational performance of algorithms implementing this model is an important issue because in many situations a large number of phase screens is required. For example, in IR scene simulation a different phase screen is required for each pixel in the scene, and in other situations there exists a need for many thousands of phase screens to be calculated to obtain a statistical average. Whilst there have been previous attempts to increase the computational speed of these algorithms, the computation time required for a large number of phase screens still remains an issue. In this paper, we apply linear and statistical properties to improve the performance of the previous best published algorithm by 60 times when implemented on a sequential processor in software. Because the new algorithm is now trivially parallelizable, a further 20 times speedup can easily be achieved through a parallel software or hardware implementation.


Journal of Real-time Image Processing | 2008

Multiple parallel FPGA implementations of a Kolmogorov phase screen generator

Vinay Sriram; David Kearney

Modelling the effects of wavefront distortions over a finite aperture is an essential component in the simulation of adaptive optics configurations, the prediction of the performance of laser designators, and atmospheric imaging simulations such as the generation of infrared (IR) scenes in the presence of atmospheric turbulence. In all of these applications many thousands of phase screens need to be generated. The computation time required for a large number of iterations of the algorithms that model this effect is an important issue, and for this reason there have been many previous attempts to improve the computation speed of such algorithms. In this paper, the computation performance of the best previous algorithm that models this phenomenon is substantially improved using high performance reconfigurable computing, by accelerating the key computationally intensive steps of the algorithm on a field programmable gate array (FPGA). Our best hardware implementation can provide a speedup of more than 60 times over the original algorithm.


digital image computing: techniques and applications | 2007

A Parallel Area Efficient Kolmogorov Phase Screen Generator Suitable for FPGA Implementation

Vinay Sriram; David Kearney

In infrared (IR) scene simulation, modelling of wavefront distortions over a finite aperture is carried out to simulate the phenomenon of scintillating IR sources resulting from atmospheric turbulence. The computation time of the algorithms used to model this effect has always been an important issue, and for this reason there have been many previous attempts to accelerate these algorithms. In this paper, we expand on our previous work [12] to accelerate phase screen generation using field programmable gate arrays (FPGAs). We were able to develop a more area time efficient phase screen generator by two means. Firstly, the key computationally intensive steps of the algorithm were identified and optimized. Secondly, a parallel version of the algorithm for implementation on an FPGA was developed. Our best hardware implementation manages a throughput of 13,780 32-bit, 201x201 outputs (called phase screen points) per second, which is over 8x that of the original algorithm's implementation on a Pentium 4 (2.8GHz) dual core processor.

Collaboration


Dive into Vinay Sriram's collaborations.

Top Co-Authors

David Kearney

University of South Australia
