Shrutisagar Chandrasekaran
Brunel University London
Publication
Featured research published by Shrutisagar Chandrasekaran.
IEEE Transactions on Signal Processing | 2008
Pramod Kumar Meher; Shrutisagar Chandrasekaran; Abbes Amira
In this paper, we present the design optimization of one- and two-dimensional fully pipelined computing structures for area-delay-power-efficient implementation of finite-impulse-response (FIR) filters by systolic decomposition of distributed arithmetic (DA)-based inner-product computation. The systolic decomposition scheme is found to offer a flexible choice of the address length of the lookup tables (LUTs) for DA-based computation, which allows a suitable area-time tradeoff to be selected. It is observed that using smaller address lengths for DA-based computing units reduces the memory size but increases the adder complexity and the latency. For efficient DA-based realization of FIR filters of different orders, the flexible linear systolic design is implemented on a Xilinx Virtex-E XCV2000E FPGA using a hybrid combination of Handel-C and parameterizable VHDL cores. Various key performance metrics such as number of slices, maximum usable frequency, dynamic power consumption, energy density, and energy throughput are estimated for different filter orders and address lengths. Analysis of the results obtained indicates that the performance metrics of the proposed implementation are broadly in line with theoretical expectations. It is found that the choice of address length yields the best of area-delay-power-efficient realizations of the FIR filter for various filter orders. Moreover, the proposed FPGA implementation is found to involve significantly less area-delay complexity compared with the existing DA-based implementations of FIR filters.
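The address-length tradeoff described in this abstract can be illustrated with a small software model. The sketch below is not the paper's systolic architecture; it is a minimal bit-serial DA inner product, assuming integer coefficients and signed two's-complement samples, with hypothetical function names (`build_lut`, `da_inner_product`, `lut_words`). The taps are split into groups of `addr_len`, each group gets its own LUT, and the partial results are added together.

```python
def build_lut(coeffs):
    """LUT[a] holds the sum of coeffs[k] for every bit k that is set in address a."""
    m = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (a >> k) & 1)
            for a in range(1 << m)]


def da_inner_product(coeffs, samples, word_bits, addr_len):
    """Inner product of coeffs and samples using LUT lookups and shifts only.

    samples are signed integers that fit in word_bits bits (two's complement).
    addr_len is the number of taps served by one LUT (an illustrative parameter).
    """
    mask = (1 << word_bits) - 1
    total = 0
    for start in range(0, len(coeffs), addr_len):
        lut = build_lut(coeffs[start:start + addr_len])
        group = [x & mask for x in samples[start:start + addr_len]]
        acc = 0
        for b in range(word_bits):
            # Gather bit b of every sample in the group into one LUT address.
            addr = 0
            for k, x in enumerate(group):
                addr |= ((x >> b) & 1) << k
            term = lut[addr] << b
            # The most significant bit carries negative weight in two's complement.
            acc += -term if b == word_bits - 1 else term
        total += acc  # one extra addition per LUT group
    return total


def lut_words(n_taps, addr_len):
    """Total LUT entries needed when n_taps are split into groups of addr_len."""
    groups = -(-n_taps // addr_len)  # ceiling division
    return groups * (1 << addr_len)


coeffs = [3, -1, 4, 1, -5, 9, 2, 6]
samples = [7, -8, 0, 5, -3, 2, -1, 4]
direct = sum(c * x for c, x in zip(coeffs, samples))
for m in (1, 2, 4, 8):
    print(m, lut_words(len(coeffs), m), da_inner_product(coeffs, samples, 4, m) == direct)
```

In this model, a smaller `addr_len` shrinks the total LUT storage but increases the number of groups whose partial sums must be added, which mirrors the memory versus adder-complexity tradeoff the abstract reports.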
IEEE Transactions on Very Large Scale Integration Systems | 2007
Abbes Amira; Shrutisagar Chandrasekaran
The fast Hadamard transform (FHT) belongs to the family of discrete orthogonal transforms and is used widely in image and signal processing applications. In this paper, a parameterizable and scalable architecture for the FHT with time and area complexities of O(2(W+1)) and O(2N^2), respectively, is proposed, where W and N are the word and vector lengths. A novel algorithmic transformation for the FHT based on sparse matrix factorization and distributed arithmetic (DA) principles is presented. The architecture has been parallelized and pipelined in order to achieve high throughput rates. An efficient and optimized field-programmable gate array implementation of the proposed architecture that yields excellent performance metrics is analyzed in detail. Additionally, a functional level power analysis and modeling methodology is proposed to characterize the various power and energy metrics of the cores in terms of system parameters and design variables. The mathematical models that have been derived provide a quick presilicon estimate of power and energy measures, allowing intelligent tradeoffs when incorporating the developed cores as subblocks in hardware-based image and video processing systems.
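For readers unfamiliar with the transform itself, the following is a minimal software sketch of the textbook in-place FHT butterfly (natural Hadamard ordering, unnormalized). It is not the DA-based architecture proposed in the paper; the function name `fht` is chosen here for illustration. Each pass of the outer loop corresponds to one sparse factor of the Hadamard matrix.

```python
def fht(vec):
    """Unnormalized fast Hadamard transform of a sequence whose length is a power of two."""
    a = list(vec)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # One butterfly stage: combine elements that are h apart.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a


x = [1, 0, 1, 0, 0, 1, 1, 0]
print(fht(x))  # [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying `fht` twice returns the input scaled by its length, so the same routine serves as the inverse after dividing by N.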
symposium/workshop on electronic design, test and applications | 2008
Shrutisagar Chandrasekaran; Abbes Amira
Efficient generation of random and pseudorandom sequences is of great importance to a number of applications [4]. In this paper, an efficient implementation of the Mersenne Twister is presented. The proposed architecture has the smallest footprint of all published architectures to date and occupies only 330 FPGA slices. Partial pipelining and sub-expression simplification have been used to improve throughput per clock cycle. The proposed architecture is implemented on an RC1000 FPGA development platform equipped with a Xilinx XCV2000E FPGA, and can generate 20 million 32-bit random numbers per second at a clock rate of 24.234 MHz. A thorough performance analysis has been performed, and it is observed that the proposed architecture clearly outperforms other existing implementations in key comparable performance metrics.
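As background on the generator being implemented, the sketch below is a plain software version of the standard 32-bit Mersenne Twister (MT19937) recurrence and tempering steps. It says nothing about the paper's pipelined hardware design, and it assumes the abstract refers to the common MT19937 variant; the class and method names are illustrative.

```python
class MT19937:
    """Reference-style software model of the 32-bit Mersenne Twister."""

    N, M = 624, 397

    def __init__(self, seed=5489):
        # Standard linear-congruential style state initialization.
        self.mt = [0] * self.N
        self.mt[0] = seed & 0xFFFFFFFF
        for i in range(1, self.N):
            prev = self.mt[i - 1]
            self.mt[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
        self.index = self.N  # force a state refresh on first use

    def _twist(self):
        # Regenerate all 624 state words from the linear recurrence.
        for i in range(self.N):
            y = (self.mt[i] & 0x80000000) | (self.mt[(i + 1) % self.N] & 0x7FFFFFFF)
            self.mt[i] = self.mt[(i + self.M) % self.N] ^ (y >> 1)
            if y & 1:
                self.mt[i] ^= 0x9908B0DF
        self.index = 0

    def next_u32(self):
        if self.index >= self.N:
            self._twist()
        y = self.mt[self.index]
        self.index += 1
        # Tempering: a fixed sequence of shifts, masks and XORs.
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & 0xFFFFFFFF


rng = MT19937()        # default seed of the reference implementation
print(rng.next_u32())  # 3499211612
```

The tempering stage uses only shifts, masks and XORs, and the state update touches three words per output, so the whole generator needs no multipliers after seeding.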
Neurocomputing | 2008
Abbes Amira; Shrutisagar Chandrasekaran; David W. G. Montgomery; Isa Servan Uzun
Positron emission tomography (PET) imaging is an emerging medical imaging modality. Due to its high sensitivity and ability to model function, it is effective in identifying active regions that may be associated with various types of tumours. Increasing numbers of patient scans have led to an urgent need for efficient data archival and the development of new image analysis techniques to aid clinicians in the diagnosis of disease. Additionally, to handle the large volumes of data generated using complex processing algorithms, it is becoming evident that co-processing solutions are essential. In this paper, an automated system for the segmentation of oncological PET data is developed. Initially, the Bayesian information criterion (BIC) is utilised for optimal segmentation level selection. Expectation maximisation (EM) based mixture modelling is then performed, using a k-means clustering procedure which varies voxel order for initialisation. A multiscale Markov model is then used to refine this segmentation by modelling spatial correlations between neighbouring image voxels. A field programmable gate array (FPGA) based co-processing solution is also proposed to offload the most complex computations onto hardware, in order to achieve high performance.
international conference on electronics, circuits, and systems | 2006
Minghua Shi; Amine Bermak; Shrutisagar Chandrasekaran; Abbes Amira
Gaussian mixture model (GMM)-based classifiers have attracted increased attention in many pattern recognition applications. Improved performance has been demonstrated in many applications, but using such classifiers can require large storage and complex processing units due to the exponential calculations and the large number of coefficients involved. This poses a serious problem for portable real-time pattern recognition applications. In this paper, the performance of GMM and its hardware complexity are first analyzed and compared with a number of benchmark algorithms. Next, an efficient digital hardware implementation based on distributed arithmetic (DA) is proposed. A novel exponential calculation circuit based on linear piecewise approximation is also developed to reduce hardware complexity. Implementation is carried out on the Celoxica RC1000 board equipped with the Virtex-E FPGA. Maximum optimization has been achieved by means of manual placement and routing in order to achieve a compact core footprint. A detailed evaluation of the performance metrics of the GMM core is also presented.
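The idea of replacing the exponential with a piecewise linear approximation can be sketched in software as follows. The segment count, the input range and the function names are assumptions made for illustration only; the abstract does not state how the paper's circuit partitions the input range.

```python
import math


def build_segments(x_max=8.0, n_segments=16):
    """Precompute (slope, intercept) pairs that interpolate exp(-x) on [0, x_max)."""
    h = x_max / n_segments
    segments = []
    for i in range(n_segments):
        x0, x1 = i * h, (i + 1) * h
        y0, y1 = math.exp(-x0), math.exp(-x1)
        slope = (y1 - y0) / h
        segments.append((slope, y0 - slope * x0))
    return h, segments


def pwl_exp_neg(x, h, segments):
    """Approximate exp(-x) for x >= 0 with one table lookup, one multiply and one add."""
    if x < 0:
        raise ValueError("x must be non-negative")
    i = int(x / h)
    if i >= len(segments):
        return 0.0  # beyond the table range the true value is already tiny
    slope, intercept = segments[i]
    return slope * x + intercept


h, segs = build_segments()
worst = max(abs(pwl_exp_neg(k * 0.01, h, segs) - math.exp(-k * 0.01)) for k in range(800))
print(worst)
```

With uniform segments of width h, the interpolation error for exp(-x) is bounded by h^2/8, so halving the segment width cuts the worst-case error by roughly a factor of four at the cost of doubling the table.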
Journal of Real-time Image Processing | 2008
Shrutisagar Chandrasekaran; Abbes Amira; Shi Minghua; Amine Bermak
In this paper, an efficient architecture for the Finite Ridgelet Transform (FRIT) suitable for VLSI implementation is presented, based on a parallel, systolic Finite Radon Transform (FRAT) sub-block and a Haar Discrete Wavelet Transform (DWT) sub-block. The FRAT sub-block is a novel parametrisable, scalable and high performance core with a time complexity of O(p^2), where p is the block size. Field Programmable Gate Array (FPGA) and Application Specific Integrated Circuit (ASIC) implementations are carried out to analyse the performance of the FRIT core developed.
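To make the FRAT sub-block more concrete, the sketch below computes an unnormalized finite Radon transform of a p-by-p block by summing pixels along modular lines. It is a direct sequential model, not the parallel systolic core from the paper, and the indexing convention (lines j = k*i + l mod p, plus one extra direction for fixed i) and the name `frat` are assumptions chosen for illustration.

```python
def frat(image):
    """Unnormalized finite Radon transform of a square p x p block (p assumed prime).

    Returns p + 1 projections, each with p line sums.
    """
    p = len(image)
    if any(len(row) != p for row in image):
        raise ValueError("image must be square")
    out = [[0] * p for _ in range(p + 1)]
    for k in range(p):          # slope of the line
        for l in range(p):      # intercept of the line
            out[k][l] = sum(image[i][(k * i + l) % p] for i in range(p))
    for l in range(p):          # extra direction: lines of constant i
        out[p][l] = sum(image[l])
    return out


block = [[(3 * i + 5 * j) % 7 for j in range(5)] for i in range(5)]
projections = frat(block)
print(len(projections), len(projections[0]))
```

For each direction the p lines partition the block, so every projection sums to the same total as the block itself, which gives a simple sanity check on any implementation.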
2007 International Symposium on Integrated Circuits | 2007
Shrutisagar Chandrasekaran; Abbes Amira; Amine Bermak; Minghua Shi
As field programmable gate array (FPGA) based systems scale up in complexity, energy-aware design paradigms with strict power budgets require the designer to explore all viable options for minimising dynamic power consumption. The concepts of parallelism and pipelining have long been exploited in CMOS chips to reduce power and energy consumption. In this paper, a systematic empirical study of the tradeoffs between degree of parallelism, threshold voltage and power consumption under constant throughput conditions on commercially available FPGAs is presented. Results indicate that there is excellent scope for reduction in dynamic voltage by suitably applying the tradeoffs in FPGA based designs in order to achieve energy efficient implementations.
international conference on electronics, circuits, and systems | 2006
Faycal Bensaali; Abbes Amira; Shrutisagar Chandrasekaran
In this paper we present a design for an efficient FPGA implementation of a color space converter for video compression. The proposed architecture, which is based on distributed arithmetic principles, has been implemented on the Xilinx Virtex-2000E FPGA using a hybrid design approach combining Handel-C and VHDL. Maximum optimization of performance metrics including frequency and power has been achieved by careful manual floor planning of the design, with particular attention paid to the critical paths and pin assignment. Additionally, a novel functional level power analysis and modeling methodology using non-linear regression analysis has been developed using power and energy data obtained for different combinations of system parameters.
2007 International Symposium on Integrated Circuits | 2007
Minghua Shi; Shrutisagar Chandrasekaran; Amine Bermak; Abbes Amira
In this paper, a gas discrimination system based on five classification algorithms, namely K nearest neighbors (KNN), multi-layer perceptron (MLP), radial basis function (RBF), Gaussian mixture model (GMM) and probabilistic principal component analysis (PPCA), is presented. A committee machine (CM) is used in which the results from each classifier are first transformed to confidences and then a weighted combination rule is used to generate the final decision result. In order to overcome the very high computational complexity of the CM, which requires a large amount of hardware resources, we propose a novel time multiplexing hardware implementation using a dynamically reconfigurable field programmable gate array (FPGA) platform. The system is successfully tested for a combustible gas identification application using our in-house tin-oxide gas sensors.
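The committee machine step described in this abstract (classifier outputs turned into confidences, then combined with weights) can be sketched as below. The abstract does not specify how confidences are derived or how weights are chosen, so the sum-to-one normalization, the example weights and the function names here are assumptions for illustration only.

```python
def to_confidences(scores):
    """Normalize non-negative per-class scores so that they sum to one (an assumed rule)."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must be non-negative and not all zero")
    return [s / total for s in scores]


def committee_decision(all_scores, weights):
    """Weighted combination of per-classifier confidences.

    all_scores: one list of per-class scores for each classifier.
    weights: one weight per classifier (hypothetical values, e.g. from validation accuracy).
    Returns (winning class index, combined confidence per class).
    """
    if len(all_scores) != len(weights):
        raise ValueError("need one weight per classifier")
    n_classes = len(all_scores[0])
    combined = [0.0] * n_classes
    for scores, w in zip(all_scores, weights):
        for c, conf in enumerate(to_confidences(scores)):
            combined[c] += w * conf
    winner = max(range(n_classes), key=lambda c: combined[c])
    return winner, combined


scores = [[2, 1, 1], [0, 3, 1], [1, 1, 2]]   # three classifiers, three gas classes
weights = [0.5, 0.3, 0.2]
print(committee_decision(scores, weights))
```

Because each classifier is evaluated independently before the combination step, the members can run one after another on shared hardware, which is consistent with the time multiplexing approach the abstract mentions.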
field-programmable logic and applications | 2006
Shrutisagar Chandrasekaran; Abbes Amira
In this paper we present a novel design for an efficient FPGA architecture of the fast Walsh transform (FWT) for hardware implementation of pattern analysis techniques such as projection kernel calculation and feature extraction. The proposed architecture is based on distributed arithmetic (DA) principles using the ROM accumulate (RAC) technique and sparse matrix factorisation. The implementation has been carried out using a hybrid design approach based on Celoxica Handel-C, which is used as a wrapper for highly optimised VHDL cores. The algorithm has been implemented and verified on the Xilinx Virtex-2000E FPGA. An evaluation based on maximum system frequency and chip area for different system parameters is also reported, and the design is shown to outperform existing work in all key performance measures. Additionally, a novel functional level power analysis and modelling (FLPAM) methodology has been proposed to enable a high level estimation of power consumption.