
Publication


Featured research published by Siddharth Advani.


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2013

A multi-resolution saliency framework to drive foveation

Siddharth Advani; John P. Sustersic; Kevin M. Irick; Vijaykrishnan Narayanan

The Human Visual System (HVS) exhibits multi-resolution characteristics: the fovea operates at the highest resolution, while resolution tapers off towards the periphery. Given enough activity at the periphery, the HVS is then capable of foveating to the next region of interest (ROI) and attending to it at full resolution. Saliency models in the past have focused on identifying features that can be used in a bottom-up manner to generate conspicuity maps, which are then combined to yield regions of fixated interest. However, these models neglect to take into consideration the foveal relation of an object of interest. The model proposed in this work computes saliency as a function of distance from a given fixation point, using a multi-resolution framework. Apart from its computational benefits, this work is relevant to areas such as visual search, robotics, and communications.
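The core idea, saliency that decays with distance from the current fixation, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the Gaussian falloff and its scale are assumptions standing in for the paper's multi-resolution framework.

```python
import numpy as np

def foveated_saliency(conspicuity, fixation, sigma=32.0):
    """Weight a bottom-up conspicuity map by distance from the current
    fixation, mimicking the fovea-to-periphery resolution falloff.

    conspicuity : 2-D array of bottom-up saliency values
    fixation    : (row, col) of the current fixation point
    sigma       : falloff scale in pixels (illustrative choice)
    """
    h, w = conspicuity.shape
    rows, cols = np.mgrid[0:h, 0:w]
    dist2 = (rows - fixation[0]) ** 2 + (cols - fixation[1]) ** 2
    falloff = np.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian acuity model
    return conspicuity * falloff

# The next region of interest is the most salient point after weighting:
# a stimulus near fixation can outrank a stronger one in the far periphery.
smap = np.zeros((100, 100))
smap[20, 20] = 1.0   # strong peripheral stimulus
smap[50, 52] = 0.9   # slightly weaker stimulus near fixation
weighted = foveated_saliency(smap, fixation=(50, 50))
next_roi = np.unravel_index(np.argmax(weighted), weighted.shape)
```

With a sufficiently strong peripheral stimulus (or a larger sigma), the periphery can still win the weighted competition, which is what drives the foveation to the next ROI.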


International Symposium on High-Performance Computer Architecture (HPCA) | 2015

Exploring architectural heterogeneity in intelligent vision systems

Nandhini Chandramoorthy; Giuseppe Tagliavini; Kevin M. Irick; Antonio Pullini; Siddharth Advani; Sulaiman Al Habsi; Matthew Cotter; Jack Sampson; Vijaykrishnan Narayanan; Luca Benini

Limited power budgets and the need for high-performance computing have led to platform customization, with a number of accelerators integrated with CMPs. In order to study customized architectures, we model four customization design points and compare their performance and energy across a number of computer vision workloads. We analyze the limitations of generic architectures and quantify the costs of increasing customization using these micro-architectural design points. This analysis leads us to develop a framework consisting of low-power multi-cores and an array of configurable micro-accelerator functional units. Using this platform, we illustrate dataflow and control processing optimizations that provide performance gains similar to those of custom ASICs for a wide range of vision benchmarks.


International Conference on Field Programmable Logic and Applications (FPL) | 2015

A scalable architecture for multi-class visual object detection

Siddharth Advani; Yasuki Tanabe; Kevin M. Irick; Jack Sampson; Vijaykrishnan Narayanan

As high-fidelity small form-factor cameras become increasingly available and affordable, there will be a subsequent growth and emergence of vision-based applications that take advantage of this increase in visual information. The key challenge is for the embedded systems, on which the bulk of these applications will be deployed, to maintain real-time performance in the midst of the exponential increase in spatial and temporal visual data. For example, a useful vision-based driver assistance system needs to locate and identify critical objects such as pedestrians, other vehicles, pot-holes, animals, and street signs with latency small enough to allow a human driver to react accordingly. In this work, we propose a digital accelerator architecture for a high-throughput, robust, scalable, and tunable visual object detection pipeline based on Histogram of Oriented Gradients (HOG) features. From a systems perspective, efficacy can be measured in terms of speed, accuracy, energy efficiency and scalability in performing such visual tasks. Since each application dictates the criticality of any one of these dimensions, our proposed architecture exposes design-time parameters that can take advantage of domain-specific knowledge while supporting tunability through run-time configurations. To evaluate the effectiveness of our vision accelerator we map the architecture to a modern FPGA and demonstrate full HD video processing at 30 fps (frames per second) operating at a conservative 100 MHz clock. Evaluations on a single object class show throughput improvements of 2× and 5× over GPU and multi-threaded CPU implementations, respectively. Furthermore, we provide a pathway for enhanced scalability for the many-class problem and achieve over 20× improvement over an equivalent CPU implementation for 5 object classes.
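The HOG cell-histogram stage that such an accelerator parallelizes can be sketched in a few lines. This is a generic illustration, not the paper's hardware datapath; the 8×8 cell size and 9 unsigned orientation bins follow the common HOG convention and are assumptions here.

```python
import numpy as np

def hog_cell_histograms(image, cell=8, bins=9):
    """Histogram-of-Oriented-Gradients cell stage: each pixel's gradient
    magnitude votes into an unsigned orientation bin, accumulated per
    cell. cell/bins use the common 8x8 / 9-bin convention (an assumption,
    not a parameter taken from the paper)."""
    img = image.astype(np.float64)
    gy, gx = np.gradient(img)                      # axis-0 then axis-1
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    h, w = img.shape
    ch, cw = h // cell, w // cell
    hist = np.zeros((ch, cw, bins))
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    for r in range(ch * cell):
        for c in range(cw * cell):
            hist[r // cell, c // cell, bin_idx[r, c]] += mag[r, c]
    return hist

# A vertical step edge concentrates all gradient energy in one bin.
img = np.zeros((16, 16))
img[:, 8:] = 1.0
h = hog_cell_histograms(img)
```

The per-cell independence of this stage is what makes it amenable to the kind of streaming, tiled hardware pipeline the paper describes.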


Embedded Systems for Real-Time Multimedia (ESTIMedia) | 2015

Visual co-occurrence network: using context for large-scale object recognition in retail

Siddharth Advani; Brigid Smith; Yasuki Tanabe; Kevin M. Irick; Matthew Cotter; Jack Sampson; Vijaykrishnan Narayanan

In any visual object recognition system, the classification accuracy will likely determine the usefulness of the system as a whole. In many real-world applications, it is also important to be able to recognize a large number of diverse objects for the system to be robust enough to handle the sort of tasks that the human visual system handles on an average day. These objectives are often at odds with performance, as running too many detectors on any one scene will be prohibitively slow for use in any real-time scenario. However, visual information has temporal and spatial context that can be exploited to reduce the number of detectors that need to be triggered at any given instance. In this paper, we propose a dynamic approach to encode such context, called Visual Co-occurrence Network (ViCoNet), that establishes relationships between objects observed in a visual scene. We investigate the utility of ViCoNet when integrated into a vision pipeline targeted for retail shopping. When evaluated on a large and deep dataset, we achieve a 50% improvement in performance and a 7% improvement in accuracy in the best case, and a 45% improvement in performance and a 3% improvement in accuracy in the average case over an established baseline. The memory overhead of ViCoNet is around 10KB, highlighting its effectiveness on temporal big data.
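The context-gating idea can be illustrated with a toy co-occurrence graph: edges count how often two classes appear in the same scene, and current detections prime only the detectors for likely co-occurring classes. This is a sketch in the spirit of ViCoNet, not its actual data structure; the class names, counts, and threshold are illustrative.

```python
from collections import defaultdict

class CoOccurrenceNetwork:
    """Toy context graph: observe() accumulates per-class and pairwise
    scene counts; prime() returns classes whose conditional
    co-occurrence with a detected class clears a threshold, so only
    those detectors need to run next (threshold is illustrative)."""

    def __init__(self):
        self.counts = defaultdict(int)   # per-class scene occurrences
        self.cooc = defaultdict(int)     # unordered-pair co-occurrences

    def observe(self, classes_in_scene):
        cs = sorted(set(classes_in_scene))
        for c in cs:
            self.counts[c] += 1
        for i, a in enumerate(cs):
            for b in cs[i + 1:]:
                self.cooc[(a, b)] += 1

    def prime(self, detected, threshold=0.5):
        primed = set()
        for (a, b), n in self.cooc.items():
            if a in detected and n / self.counts[a] >= threshold:
                primed.add(b)
            if b in detected and n / self.counts[b] >= threshold:
                primed.add(a)
        return primed - set(detected)

net = CoOccurrenceNetwork()
net.observe(["pasta", "sauce", "cart"])
net.observe(["pasta", "sauce"])
net.observe(["cart", "person"])
primed = net.prime({"pasta"})
```

A structure like this stays tiny (consistent with the ~10KB figure quoted above, since only class pairs and counts are stored) while cutting the detector set per frame.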


International Conference on Computer Design (ICCD) | 2014

Refresh Enabled Video Analytics (REVA): Implications on power and performance of DRAM supported embedded visual systems

Siddharth Advani; Nandhini Chandramoorthy; Karthik Swaminathan; Kevin M. Irick; Yong Cheol Peter Cho; Jack Sampson; Vijaykrishnan Narayanan

Video applications are becoming ubiquitous in mobile and embedded systems. Wearable video systems such as Google Glass require real-time video analytics and prolonged battery lifetimes. Further, the increasing resolution of image sensors in these mobile systems places growing demands on both memory storage and computational power. In this work, we present the Refresh Enabled Video Analytics (REVA) system, an embedded architecture for multi-object scene understanding, and tackle the unique opportunities that real-time embedded video analytics applications provide for reducing DRAM refresh energy. We compare our design with the existing design space and show savings of 88% in refresh power and 15% in total power, as compared to a standard DRAM refresh scheme.
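The two reported savings figures imply how large a share of total system power refresh accounts for. A back-of-envelope check, assuming the non-refresh power is unchanged by the scheme:

```python
# If cutting refresh power by 88% lowers total power by 15%, then
# refresh must account for roughly 0.15 / 0.88 ≈ 17% of total system
# power (under the assumption that non-refresh power is unaffected).
refresh_saving = 0.88   # fraction of refresh power eliminated
total_saving = 0.15     # resulting fraction of total power saved

refresh_share = total_saving / refresh_saving
print(f"implied refresh share of total power: {refresh_share:.1%}")
```

This is only a consistency check on the abstract's numbers, not a figure stated in the paper.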


International Conference on Computer-Aided Design (ICCAD) | 2014

A hardware accelerated multilevel visual classifier for embedded visual-assist systems

Matthew Cotter; Siddharth Advani; Jack Sampson; Kevin M. Irick; Vijaykrishnan Narayanan

Embedded visual assist systems are emerging as increasingly viable tools for aiding visually impaired persons in their day-to-day life activities. Novel wearable devices with imaging capabilities will be uniquely positioned to assist the visually impaired in activities such as grocery shopping. However, supporting such time-sensitive applications on embedded platforms requires an intelligent trade-off between accuracy and computational efficiency. In order to maximize their utility in real-world scenarios, visual classifiers often need to recognize objects within large sets of object classes that are both diverse and deep. In a grocery market, simultaneously recognizing the appearance of people, shopping carts, and pasta is an example of a common diverse object classification task. Moreover, a useful visual-aid system would need deep classification capability to distinguish among the many styles and brands of pasta to direct attention to a particular box. Exemplar Support Vector Machines (ESVMs) provide a means of achieving this specificity, but are resource-intensive, as computation increases rapidly with the number of classes to be recognized. To maintain scalability without sacrificing accuracy, we examine the use of a biologically-inspired classifier (HMAX) as a front-end filter that can narrow the set of ESVMs to be evaluated. We show that a hierarchical classifier combining HMAX and ESVM performs better than either of the two individually. We achieve 12% improvement in accuracy over HMAX and 4% improvement over ESVM while reducing the computational overhead of evaluating all possible exemplars.
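The two-stage structure can be sketched as a simple cascade: a cheap front-end ranks candidate classes, and only the exemplar SVMs of the top-k classes are evaluated. The scoring functions below are toy stand-ins for HMAX and the ESVMs, and k and the class names are illustrative assumptions.

```python
import numpy as np

def hierarchical_classify(x, coarse_scores, esvm_banks, k=3):
    """Cascade in the spirit of HMAX + Exemplar-SVMs: stage 1 (stand-in
    for HMAX) shortlists the k most plausible classes; stage 2 runs the
    expensive exemplar scoring only for those classes.

    coarse_scores : dict class -> cheap front-end score for input x
    esvm_banks    : dict class -> list of exemplar weight vectors
    """
    # Stage 1: shortlist the k highest-scoring classes.
    shortlist = sorted(coarse_scores, key=coarse_scores.get, reverse=True)[:k]
    # Stage 2: evaluate exemplars only within the shortlist.
    best_class, best_score, evaluated = None, -np.inf, 0
    for cls in shortlist:
        for w in esvm_banks[cls]:
            evaluated += 1
            s = float(np.dot(w, x))   # linear exemplar score (toy)
            if s > best_score:
                best_class, best_score = cls, s
    return best_class, evaluated

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0])
coarse = {"pasta": 0.9, "rice": 0.8, "soup": 0.7, "cart": 0.1, "person": 0.05}
banks = {c: [rng.normal(size=2) for _ in range(4)] for c in coarse}
label, n_eval = hierarchical_classify(x, coarse, banks, k=3)
```

With 5 classes of 4 exemplars each, the cascade evaluates 12 exemplars instead of all 20; the saving grows with the number of classes, which is the scalability argument above.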


IEEE Sensors Applications Symposium (SAS) | 2011

Beam divergence calculation of an electromagnetic acoustic transducer for the Non-Destructive Evaluation of plate-like structures

Siddharth Advani; Jason K. Van Velsor; Joseph L. Rose

Ultrasonic guided waves are now proving to be a viable method for real-world long-range Non-Destructive Evaluation (NDE) applications. In order to generate a specific guided wave mode optimally, knowledge of the sensor parameters becomes imperative. This paper attempts to experimentally measure the beam divergence in an Electro-Magnetic Acoustic Transducer (EMAT) that is used to generate Shear Horizontal (SH) guided waves in a mild steel plate. The commercial finite element package ABAQUS™ is then used to run 3D simulations to validate these experimental results. Based on these results, a planar defect study is also carried out. From all these investigations, the minimum degree of rotation of the EMAT can be set when used in a real-time ultrasonic guided wave omni-directional inspection system.


Review of Progress in Quantitative Nondestructive Evaluation, Volume 29 | 2010

Guided wave thickness measurement tool development for estimation of thinning in plate-like structures

Siddharth Advani; Luke J. Breon; Joseph L. Rose

The ability to model and simulate guided wave propagation provides insight into the development of robust guided wave inspection systems. This paper presents the finite-element method (FEM) as a powerful computational technique in the context of ultrasonic guided wave nondestructive evaluation of inaccessible plate-like structures. A commercially available FE package is used for this study. Corrosion-type surface thinning with depths from 20% to 50% is fabricated in a steel plate. A series of computer runs are made using "Lamb type" guided waves and the results of the parametric study are then analyzed. Based on the analysis, a quantitative method to estimate the average thickness of the plate is suggested. Experimental validation is also provided.


Robotics and Autonomous Systems | 2016

Towards a unified multiresolution vision model for autonomous ground robots

John P. Sustersic; Brad Wyble; Siddharth Advani; Vijaykrishnan Narayanan

While remotely operated unmanned vehicles are increasingly a part of everyday life, truly autonomous robots capable of independent operation in dynamic environments have yet to be realized, particularly in the case of ground robots required to interact with humans and their environment. We present a unified multiresolution vision model for this application, designed to provide the wide field of view required to maintain situational awareness and sufficient visual acuity to recognize elements of the environment, while permitting feasible implementations in real-time vision applications. The model features color-constant processing through single-opponent color channels and contrast-invariant oriented edge detection using a novel implementation of the Combination of Receptive Fields model. It provides color- and edge-based salience assessment, as well as a compressed color image representation suitable for subsequent object identification. We show that bottom-up visual saliency computed using this model is competitive with the current state of the art while allowing computation in a compressed domain and mimicking the human visual system, with nearly half (45%) of the computational effort focused within the fovea. This method reduces the storage requirement of the image pyramid to less than 5% of the full image, and computation in this domain reduces model complexity in terms of both computational cost and memory requirements accordingly. We also quantitatively evaluate the model in its application domain, using a camera/lens system with a 185° field of view capturing 3.5-megapixel color images and a tuned salience model to predict human fixations.

Highlights:
- Generalizes the CORF operator to color images for contrast-invariant edge detection.
- Unifies center-surround differencing with Serre's color image descriptor.
- Uses a Cropped Gaussian Pyramid as a piece-wise linear approximation of foveated vision.
- Shows competitive performance in visual salience at reduced computational cost.
- Enables more complex real-time image processing, well suited to FPGA implementation.
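The storage saving of a cropped pyramid is easy to estimate: each level is downsampled by 2, but only a fixed-size foveal crop is kept per level. The frame size, level count, and crop size below are illustrative assumptions, not the paper's parameters; whether the result lands under the quoted 5% depends on the crop chosen.

```python
def cropped_pyramid_storage(full_res, crop, levels):
    """Fraction of full-image pixels stored by a cropped Gaussian
    pyramid: each level halves the resolution, and only a fixed crop
    (the 'fovea' at that scale) is retained per level."""
    full_pixels = full_res[0] * full_res[1]
    stored = 0
    for lvl in range(levels):
        h = min(crop[0], full_res[0] >> lvl)   # level can't exceed its
        w = min(crop[1], full_res[1] >> lvl)   # own downsampled extent
        stored += h * w
    return stored / full_pixels

# e.g. a 1080p frame, 5 levels, 128x128 foveal crop per level
ratio = cropped_pyramid_storage((1080, 1920), (128, 128), 5)
print(f"stored fraction of full image: {ratio:.1%}")
```

With these illustrative numbers the pyramid stores roughly 3-4% of the full frame, consistent in spirit with the under-5% figure reported above.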


IEEE Transactions on Very Large Scale Integration (VLSI) Systems | 2016

A Saliency-Driven LCD Power Management System

Yang Xiao; Siddharth Advani; Donghwa Shin; Naehyuck Chang; Jack Sampson; Vijaykrishnan Narayanan

Large liquid crystal display (LCD) technology is widely used in every corner of modern life, ranging from personal laptops to flat-panel televisions. Among all the components of an LCD system, the backlight panel is the dominant power consumer, irrespective of lighting technology or class. In this paper, a saliency-based field-programmable gate array accelerator for LCD power management is proposed that allows dynamic modulation of the different zones of the backlight panel. This hardware-accelerated system is capable of processing a high-definition video stream in real time and uses less than 50% of the power that a conventional LCD system consumes, with minimal overhead. We also compare our proposed approach with other state-of-the-art power-aware methods and show numerous advantages of our data-driven strategy.
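The per-zone modulation can be sketched as mapping a frame's saliency map to one backlight level per zone, with a floor so non-salient regions remain legible. The 4×8 zone grid and 0.3 floor are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def zone_backlight(saliency, zones=(4, 8), floor=0.3):
    """Drive each backlight zone by the peak saliency it covers,
    clamped to a floor so dim regions stay visible. Zone grid and
    floor are illustrative choices."""
    zh, zw = zones
    h, w = saliency.shape
    levels = np.empty((zh, zw))
    for i in range(zh):
        for j in range(zw):
            block = saliency[i * h // zh:(i + 1) * h // zh,
                             j * w // zw:(j + 1) * w // zw]
            levels[i, j] = max(floor, float(block.max()))
    return levels

# One salient region in the top-left of a 1080p saliency map:
smap = np.zeros((1080, 1920))
smap[100:200, 100:200] = 1.0
levels = zone_backlight(smap)
power_fraction = levels.mean()   # relative to a fully-lit backlight
```

With a single salient zone at full brightness and the rest at the floor, the mean backlight drive falls to roughly a third of full power in this toy setup, which is the mechanism behind the sub-50% power figure quoted above.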

Collaboration

Top co-authors of Siddharth Advani:

- Kevin M. Irick (Pennsylvania State University)
- Jack Sampson (Pennsylvania State University)
- Matthew Cotter (Pennsylvania State University)
- John P. Sustersic (Pennsylvania State University)
- Joseph L. Rose (Pennsylvania State University)
- Brad Wyble (Pennsylvania State University)
- Brigid Smith (Pennsylvania State University)