Nandhini Chandramoorthy

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Nandhini Chandramoorthy is active.

Explore More

Publication

Featured researches published by Nandhini Chandramoorthy.

design automation conference | 2012

Accelerating neuromorphic vision algorithms for recognition

Ahmed Al Maashri; Michael DeBole; Matthew Cotter; Nandhini Chandramoorthy; Yang Xiao; Vijaykrishnan Narayanan; Chaitali Chakrabarti

Video analytics introduce new levels of intelligence to automated scene understanding. Neuromorphic algorithms, such as HMAX, are proposed as robust and accurate algorithms that mimic the processing in the visual cortex of the brain. HMAX, for instance, is a versatile algorithm that can be repurposed to target several visual recognition applications. This paper presents the design and evaluation of hardware accelerators for extracting visual features for universal recognition. The recognition applications include object recognition, face identification, facial expression recognition, and action recognition. These accelerators were validated on a multi-FPGA platform and significant performance enhancement and power efficiencies were demonstrated when compared to CMP and GPU platforms. Results demonstrate as much as 7.6X speedup and 12.8X more power-efficient performance when compared to those platforms.

international electron devices meeting | 2014

Pairwise coupled hybrid vanadium dioxide-MOSFET (HVFET) oscillators for non-boolean associative computing

Nikhil Shukla; Abhinav Parihar; Matthew Cotter; Michael Barth; Xueqing Li; Nandhini Chandramoorthy; Hanjong Paik; Darrell G. Schlom; Vijay Narayanan; Arijit Raychowdhury; Suman Datta

Information processing applications related to associative computing like image / pattern recognition consume excessive computational resources in the Boolean processing framework. This motivates the exploration of a non-Boolean computing approach for such applications. In this work, we demonstrate, (i) novel hybrid set of pair-wise coupled oscillators comprising of vanadium dioxide (VO2) metal-insulator-transition (MIT) system integrated with MOSFET; (ii) degree of synchronization between oscillators based on input analog voltage difference; (iii) implementation of hardware platform for fast and efficient evaluation of Lk fractional distance norm (k<;1); (iv) improved quality of image processing and ~20X lower power consumption of the coupled oscillators over a CMOS accelerator.

Ipsj Transactions on System Lsi Design Methodology | 2012

System-On-Chip for Biologically Inspired Vision Applications

Sungho Park; Ahmed Al Maashri; Kevin M. Irick; Aarti Chandrashekhar; Matthew Cotter; Nandhini Chandramoorthy; Michael DeBole; Vijaykrishnan Narayanan

Neuromorphic vision algorithms are biologically-inspired computational models of the primate visual pathway. They promise robustness, high accuracy, and high energy efficiency in advanced image processing applications. Despite these potential benefits, the realization of neuromorphic algorithms typically exhibit low performance even when executed on multi-core CPU and GPU platforms. This is due to the disparity in the computational modalities prominent in these algorithms and those modalities most exploited in contemporary computer architectures. In essence, acceleration of neuromorphic algorithms requires adherence to specific computational and communicational requirements. This paper discusses these requirements and proposes a framework for mapping neuromorphic vision applications on a System-on-Chip, SoC. A neuromorphic object detection and recognition on a multi-FPGA platform is presented with performance and power efficiency comparisons to CMP and GPU implementations.

IEEE Transactions on Multi-Scale Computing Systems | 2016

Enabling New Computation Paradigms with HyperFET - An Emerging Device

Wei-Yu Tsai; Xueqing Li; Matthew Jerry; Baihua Xie; Nikhil Shukla; Huichu Liu; Nandhini Chandramoorthy; Matthew Cotter; Arijit Raychowdhury; Donald M. Chiarulli; Steven P. Levitan; Suman Datta; Jack Sampson; Nagarajan Ranganathan; Vijaykrishnan Narayanan

High power consumption has significantly increased the cooling cost in high-performance computation stations and limited the operation time in portable systems powered by batteries. Traditional power reduction mechanisms have limited traction in the post-Dennard Scaling landscape. Emerging research on new computation devices and associated architectures has shown three trends with the potential to greatly mitigate current power limitations. The first is to employ steep-slope transistors to enable fundamentally more efficient operation at reduced supply voltage in conventional Boolean logic, reducing dynamic power. The second is to employ brain-inspired computation paradigms, directly embodying computation mechanisms inspired by the brains, which have shown potential in extremely efficient, if approximate, processing with silicon-neuron networks. The third is “let physics do the computation”, which focuses on using the intrinsic operation mechanism of devices (such as coupled oscillators) to do the approximate computation, instead of building complex circuits to carry out the same function. This paper first describes these three trends, and then proposes the use of the hybrid-phase-transition-FET (Hyper-FET), a device that could be configured as a steep-slope transistor, a spiking neuron cell, or an oscillator, as the device of choice for carrying these three trends forward. We discuss how a single class of device can be configured for these multiple use cases, and provide in-depth examination and analysis for a case study of building coupled-oscillator systems using Hyper-FETs for image processing. Performance benchmarking highlights the potential of significantly higher energy efficiency than dedicated CMOS accelerators at the same technology node.

high-performance computer architecture | 2015

Exploring architectural heterogeneity in intelligent vision systems

Nandhini Chandramoorthy; Giuseppe Tagliavini; Kevin M. Irick; Antonio Pullini; Siddharth Advani; Sulaiman Al Habsi; Matthew Cotter; Jack Sampson; Vijaykrishnan Narayanan; Luca Benini

Limited power budgets and the need for high performance computing have led to platform customization with a number of accelerators integrated with CMPs. In order to study customized architectures, we model four customization design points and compare their performance and energy across a number of computer vision workloads. We analyze the limitations of generic architectures and quantify the costs of increasing customization using these micro-architectural design points. This analysis leads us to develop a framework consisting of low-power multi-cores and an array of configurable micro-accelerator functional units. Using this platform, we illustrate dataflow and control processing optimizations that provide for performance gains similar to custom ASICs for a wide range of vision benchmarks.

international conference on computer design | 2014

Refresh Enabled Video Analytics (REVA): Implications on power and performance of DRAM supported embedded visual systems

Siddharth Advani; Nandhini Chandramoorthy; Karthik Swaminathan; Kevin M. Irick; Yong Cheol Peter Cho; Jack Sampson; Vijaykrishnan Narayanan

Video applications are becoming ubiquitous in mobile and embedded systems. Wearable video systems such as Google Glasses require capabilities for real-time video analytics and prolonged battery lifetimes. Further, the increasing resolution of image sensors in these mobile systems places an increasing demand on both the memory storage as well as the computational power. In this work, we present the Refresh Enabled Video Analytics (REVA) system, an embedded architecture for multi-object scene understanding and tackle the unique opportunities provided by real-time embedded video analytics applications for reducing the DRAM memory refresh energy. We compare our design with the existing design space and show savings of 88% in refresh power and 15% in total power, as compared to a standard DRAM refresh scheme.

signal processing systems | 2014

Understanding the landscape of accelerators for vision

Nandhini Chandramoorthy; Karthik Swaminathan; Matthew Cotter; Xueqing Li; Indranil Palit; Michael Niemier; Kevin M. Irick

Visual analytics applications are becoming ubiquitous and embedded in various systems that we interact with daily. Limited power budgets and the need for high performance for cognitive visual analytics have led to a three-pronged approach of integrating advances in algorithms, architectures and technology towards designing next generation vision accelerators. Vision applications benefit from increasing processor customization, emerging devices and technologies such as Tunnel-FETs and Resistive- RAMs, and trends in non-Boolean computing such as Cellular Neural Networks (CNNs) and neuromorphic architectures. This paper provides an overview of the evolving landscape of vision accelerators.

high-performance computer architecture | 2017

BRAVO: Balanced Reliability-Aware Voltage Optimization

Karthik Swaminathan; Nandhini Chandramoorthy; Chen-Yong Cher; Ramon Bertran; Alper Buyuktosunoglu; Pradip Bose

Defining a processor micro-architecture for a targeted productspace involves multi-dimensional optimization across performance, power and reliability axes. A key decision in sucha definition process is the circuit-and technology-driven parameterof the nominal (voltage, frequency) operating point. This is a challenging task, since optimizing individually orpair-wise amongst these metrics usually results in a designthat falls short of the specification in at least one of the threedimensions. Aided by academic research, industry has nowadopted early-stage definition methodologies that considerboth energy-and performance-related metrics. Reliabilityrelatedenhancements, on the other hand, tend to get factoredin via a separate thread of activity. This task is typically pursuedwithout thorough pre-silicon quantifications of the energyor even the performance cost. In the late-CMOS designera, reliability needs to move from a post-silicon afterthoughtor validation-only effort to a pre-silicon definitionprocess. In this paper, we present BRAVO, a methodologyfor such reliability-aware design space exploration. BRAVOis supported by a multi-core simulation framework that integratesperformance, power and reliability modeling capability. Errors induced by both soft and hard fault incidence arecaptured within the reliability models. We introduce the notionof the Balanced Reliability Metric (BRM), that we useto evaluate overall reliability of the processor across soft andhard error incidences. We demonstrate up to 79% improvementin reliability in terms of this metric, for only a 6% dropin overall energy efficiency over design points that maximizeenergy efficiency. We also demonstrate several real-life usecaseapplications of BRAVO in an industrial setting.

signal processing systems | 2013

Hardware Acceleration for Neuromorphic Vision Algorithms

Ahmed Al Maashri; Matthew Cotter; Nandhini Chandramoorthy; Michael DeBole; Chi Li Yu; Vijaykrishnan Narayanan; Chaitali Chakrabarti

Neuromorphic vision algorithms are biologically inspired models that follow the processing that takes place in the primate visual cortex. Despite their efficiency and robustness, the complexity of these algorithms results in reduced performance when executed on general purpose processors. This paper proposes an application-specific system for accelerating a neuromorphic vision system for object recognition. The system is based on HMAX, a biologically-inspired model of the visual cortex. The neuromorphic accelerators are validated on a multi-FPGA system. Results show that the neuromorphic accelerators are 13.8× (2.6×) more power efficient when compared to CPU (GPU) implementation.

signal processing systems | 2014

Accelerating Multiresolution Gabor Feature Extraction for Real Time Vision Applications

Yong Cheol Peter Cho; Nandhini Chandramoorthy; Kevin M. Irick; Vijaykrishnan Narayanan

Multiresolution Gabor filter banks are used for feature extraction in a variety of applications as Gabor filters have shown to be exceptional feature extractors with a close correspondence to the simple cells in the primary visual cortex (V1) of the brain. Yet applying the Gabor filter is a computationally intensive task. Most applications that utilize the Gabor feature space require real time results; however, the large quantity of computations involved has hindered systems from achieving real time performance. The natural solution for such compute intensive tasks is parallelization. FPGAs have emerged as attractive platforms for compute intensive signal processing applications due to their massively parallel computation resources as well as low power consumption and affordability. We present a configurable architecture for Gabor feature extraction on FPGA that enhances the resource utilization of the FPGA hardware fabric while maintaining a streaming data flow to yield exceptional performance. The increased resource utilization resulting from configurability, optimizations, and resource sharing allows for higher levels of parallelism to achieve real time feature extraction of high resolution images. Two architectures are introduced. The first is an architecture for multiresolution feature extraction with extensive resource sharing for enhanced resource utilization. The second is an architecture for many-orientation applications using a coarse to fine grain method to enhance resource utilization by reducing the number of filters applied at different orientations. Our results show that our multiresolution implementation achieves real-time performance on 2048 × 1526 images and exhibits 6X speed up over a GPU implementation while exhibiting energy efficiency with 0.4fps/W compared to the GPU that achieves 0.036fps/W.[1] The implementation for many-orientation applications using the coarse to fine grain method exhibits resource saving of at most 2O

Explore More