Hrishikesh Jayakumar
Purdue University
Publication
Featured research published by Hrishikesh Jayakumar.
international symposium on low power electronics and design | 2014
Hrishikesh Jayakumar; Kangwoo Lee; Woo Suk Lee; Arnab Raha; Younghyun Kim; Vijay Raghunathan
Various industry forecasts project that, by 2020, there will be around 50 billion devices connected to the Internet of Things (IoT), helping to engineer new solutions to societal-scale problems such as healthcare, energy conservation, transportation, etc. Most of these devices will be wireless due to the expense, inconvenience, or in some cases, the sheer infeasibility of wiring them. Further, many of them will have stringent size constraints. With no cord for power and limited space for a battery, powering these devices (to achieve several months to possibly years of unattended operation) becomes a daunting challenge. This paper highlights some promising directions for addressing this challenge, focusing on three main building blocks: (a) the design of ultra-low power hardware platforms that integrate computing, sensing, storage, and wireless connectivity in a tiny form factor, (b) the development of intelligent system-level power management techniques, and (c) the use of environmental energy harvesting to make IoT devices self-powered, thus decreasing - in some cases, even eliminating - their dependence on batteries. We discuss these building blocks in detail and illustrate case-studies of systems that use them judiciously, including the QUBE wireless embedded platform, which exploits the characteristics of emerging non-volatile memory technologies to seamlessly and efficiently enable long-running computations in systems that experience frequent power loss (i.e., intermittently powered systems).
international conference on vlsi design | 2014
Hrishikesh Jayakumar; Arnab Raha; Vijay Raghunathan
Transiently Powered Computers (TPCs) are a new class of batteryless embedded systems that depend solely on energy harvested from external sources for performing computations. Enabling long-running computations on TPCs is a major challenge due to the highly intermittent nature of the power supply (often bursts of <100 ms), resulting in frequent system reboots. Prior work seeks to address this issue by frequently checkpointing system state in flash memory, preserving it across power cycles. However, this involves a substantial overhead due to the high erase/write times of flash memory. This paper proposes the use of FRAM, an emerging non-volatile memory technology that combines the benefits of SRAM and flash, to seamlessly enable long-running computations in TPCs. We propose a lightweight, in-situ checkpointing technique for TPCs using FRAM that decreases the time taken for saving and restoring a checkpoint to only 12.6 μs, which is over two orders of magnitude lower than the corresponding overhead using flash. We have implemented and evaluated our technique, QUICKRECALL, using the TI MSP430FR5739 FRAM-enabled microcontroller. Experimental results show that our highly-efficient checkpointing translates to a significant speedup (1.4x - 4.5x) in program execution time.
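The general idea of checkpointing across power cycles can be illustrated with a small Python simulation. This is only a sketch and not the QUICKRECALL implementation: the `NonVolatileStore` class, the `run_with_power_failures` function, and the burst schedule are hypothetical names and values chosen for illustration, and power bursts are measured in loop iterations rather than time.

```python
class NonVolatileStore:
    """Stand-in for non-volatile memory: contents survive a simulated power loss."""

    def __init__(self):
        self.checkpoint = None

    def save(self, state):
        # Copy the state so later changes to the volatile copy do not leak in.
        self.checkpoint = dict(state)

    def restore(self):
        return dict(self.checkpoint) if self.checkpoint is not None else None


def run_with_power_failures(n, burst_lengths, store):
    """Sum 0..n-1 while power arrives in bursts (lengths given in loop iterations).

    Returns (result, number_of_boots).
    """
    boots = 0
    for burst in burst_lengths:
        boots += 1
        # Every boot starts with empty volatile state; recover the checkpoint if any.
        state = store.restore() or {"i": 0, "total": 0}
        for _ in range(burst):
            if state["i"] < n:
                state["total"] += state["i"]
                state["i"] += 1
                store.save(state)  # checkpoint after every step (a simplification)
            if state["i"] >= n:
                return state["total"], boots
        # Power is lost here: the volatile `state` is discarded.
    raise RuntimeError("ran out of power bursts before finishing")
```

With bursts of three iterations each, a ten-step computation would never finish if every reboot restarted from zero; with the checkpoint it completes on the fourth boot.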
compilers, architecture, and synthesis for embedded systems | 2015
Arnab Raha; Hrishikesh Jayakumar; Soubhagya Sutar; Vijay Raghunathan
Approximate computing is an emerging design paradigm that leverages the inherent error tolerance present in many applications to optimize their power consumption and performance. Due to the forgiving nature of these error-resilient applications, highly precise input data is not always necessary for them to produce outputs of acceptable quality. This makes memory, the place where data is stored, a suitable component for introducing errors or approximations in return for considerable energy savings. Towards this end, this paper proposes, for the first time, a systematic way for constructing a quality-aware approximate DRAM system. Our design is based upon an extensive experimental characterization of memory errors as a function of the DRAM refresh rate. Leveraging the insights gathered from this characterization, we propose four novel strategies for partitioning the DRAM into a number of quality bins based on the frequency, location, and nature of bit errors in each of the physical pages. During allocation, critical data is placed in the highest quality bin containing only accurate pages and approximate data is allocated to bins sorted in descending order of quality. We validate our proposed scheme on several error-resilient applications implemented using an Altera Stratix IV GX FPGA based Terasic TR4-230 development board containing a 1GB DDR3 DRAM module. Experimental results demonstrate a significant improvement in the energy-quality trade-off compared to previous work and show a reduction in DRAM refresh power of up to 73% with minimal loss in output quality.
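The binning and allocation steps described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: it uses a single criterion (bit-error count per page) instead of the four strategies the paper proposes, and the function names, thresholds, and error counts are all hypothetical.

```python
def bin_pages(error_counts, thresholds):
    """Group page indices into quality bins by observed bit-error count.

    error_counts: dict mapping page index -> bit errors observed (made-up data).
    thresholds: ascending upper bounds. Bin 0 holds error-free pages; bin k
                (k >= 1) holds pages with at most thresholds[k - 1] errors.
    Pages that exceed the last threshold are left unused.
    """
    bins = [[] for _ in range(len(thresholds) + 1)]
    for page, errors in sorted(error_counts.items()):
        if errors == 0:
            bins[0].append(page)
            continue
        for k, limit in enumerate(thresholds, start=1):
            if errors <= limit:
                bins[k].append(page)
                break
    return bins


def allocate(bins, critical_needed, approx_needed):
    """Place critical data on error-free pages only; fill approximate data
    from the remaining bins in descending order of quality.

    Assumption: approximate data never uses bin 0 (a simplification).
    """
    if critical_needed > len(bins[0]):
        raise MemoryError("not enough error-free pages for critical data")
    critical = bins[0][:critical_needed]
    candidates = [page for quality_bin in bins[1:] for page in quality_bin]
    if approx_needed > len(candidates):
        raise MemoryError("not enough approximate pages")
    return critical, candidates[:approx_needed]
```

For example, `bin_pages({0: 0, 1: 5, 2: 0, 3: 1, 4: 12, 5: 100}, [2, 20])` places pages 0 and 2 in the accurate bin, page 3 in the next bin, pages 1 and 4 in the lowest bin, and leaves page 5 unused.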
ACM Journal on Emerging Technologies in Computing Systems | 2015
Hrishikesh Jayakumar; Arnab Raha; Woo Suk Lee; Vijay Raghunathan
Transiently Powered Computers (TPCs) are a new class of batteryless embedded systems that depend solely on energy harvested from external sources for performing computations. Enabling long-running computations on TPCs is a major challenge due to the highly intermittent nature of the power supply (often bursts of [...] QuickRecall, using the TI MSP430FR5739 FRAM-enabled microcontroller. Experimental results show that our highly-efficient checkpointing translates to a significant speedup (1.25x - 8.4x) in program execution time and a reduction (∼3x) in application-level energy consumption.
asia and south pacific design automation conference | 2016
Hrishikesh Jayakumar; Arnab Raha; Younghyun Kim; Soubhagya Sutar; Woo Suk Lee; Vijay Raghunathan
It is projected that, within the coming decade, there will be more than 50 billion smart objects connected to the Internet of Things (IoT). These smart objects, which connect the physical world with the world of computing infrastructure, are expected to pervade all aspects of our daily lives and revolutionize a number of application domains such as healthcare, energy conservation, transportation, etc. In this paper, we present an overview of the challenges involved in designing energy-efficient IoT edge devices and describe recent research that has proposed promising solutions to address these challenges. First, we outline the challenges involved in efficiently supplying power to an IoT device. Next, we discuss the role of emerging memory technologies in making IoT devices energy-efficient. Finally, we discuss the potential impact that approximate computing can have in increasing the energy-efficiency of wearables and other compute-intensive IoT devices.
international conference on vlsi design | 2014
Arnab Raha; Hrishikesh Jayakumar; Vijay Raghunathan
The field of approximate computing has received significant attention from the research community in the past few years, especially in the context of various signal processing applications. Image and video compression algorithms such as JPEG, MPEG, etc., are particularly attractive candidates for approximate computing since they are tolerant of computing imprecision due to human imperceptibility, which can be exploited to realize highly power-efficient implementations of these algorithms. However, existing approximate architectures typically fix the level of hardware approximation statically and are not adaptive to input data. For example, if a fixed approximate hardware configuration is used for an MPEG encoder (i.e., a fixed level of approximation), the output quality varies greatly for different input videos. This paper addresses this issue by proposing a reconfigurable approximate architecture for MPEG encoders that optimizes power consumption while maintaining a particular PSNR threshold for any video. Experimental results show that our approach of dynamically adjusting the degree of hardware approximation based on the input video respects the given quality bound (PSNR degradation of 5-20%) across different videos while achieving a power savings of 13-18% over a conventional non-approximated MPEG encoder architecture. Although the proposed reconfigurable approximate architecture is presented for the specific case of an MPEG encoder, it can be easily extended to other DSP applications.
international conference on hardware/software codesign and system synthesis | 2014
Hrishikesh Jayakumar; Arnab Raha; Vijay Raghunathan
In heavily duty-cycled embedded systems, the energy consumed by the microcontroller in idle mode is often the bottleneck for battery lifetime. Existing solutions address this problem by placing the microcontroller in a low power (sleep) state when idle, and preserving application state either by retaining the data in-situ in SRAM, or by checkpointing it to FLASH. However, both these approaches have notable drawbacks. In-situ data retention requires the SRAM to remain powered in sleep mode, while checkpointing to FLASH involves significant energy and time overheads. This paper proposes a new ultra-low power sleep mode for microcontrollers that overcomes the limitations of both these approaches. Our technique, HYPNOS, is based on the key observation that the on-chip SRAM in a microcontroller exhibits 100% data retention even at a much lower supply voltage (as much as 10x lower) than the typical operating voltage of the microcontroller. HYPNOS exploits this observation by performing extreme voltage scaling when the microcontroller is in sleep mode. We implement and evaluate HYPNOS for the TI MSP430G2452 microcontroller and show that the MCU draws only 26nA in the proposed sleep mode, which is 4× lower than any existing sleep mode that preserves SRAM contents. Further, we show that a complete wireless sensing system using HYPNOS only depletes battery capacity by 42.6nAh in an hour. By decreasing the average power consumption to such minuscule levels, HYPNOS takes a significant step forward in making perpetual systems a reality through the use of energy harvesting.
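To put the reported drain of 42.6 nAh per hour in perspective, the back-of-the-envelope calculation below converts it into an idealized battery lifetime. The 225 mAh capacity is an assumed value (roughly that of a small coin cell) and does not come from the paper; the calculation also ignores battery self-discharge and aging, which in practice would likely limit lifetime long before the load does.

```python
HOURS_PER_YEAR = 24 * 365


def lifetime_years(capacity_mah, drain_nah_per_hour):
    """Idealized lifetime: battery capacity divided by hourly charge drain.

    Ignores self-discharge, temperature effects, and capacity fade.
    """
    capacity_nah = capacity_mah * 1e6  # 1 mAh = 1,000,000 nAh
    hours = capacity_nah / drain_nah_per_hour
    return hours / HOURS_PER_YEAR


# Assumed 225 mAh cell (hypothetical) and the 42.6 nAh/hour figure from the abstract.
years = lifetime_years(225, 42.6)
print(f"Idealized lifetime: about {years:.0f} years")
```

Under these assumptions the result is on the order of 600 years, which is why a load this small makes energy harvesting, rather than battery capacity, the natural way to think about system lifetime.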
IEEE Transactions on Very Large Scale Integration Systems | 2016
Arnab Raha; Hrishikesh Jayakumar; Vijay Raghunathan
The field of approximate computing has received significant attention from the research community in the past few years, especially in the context of various signal processing applications. Image and video compression algorithms, such as JPEG, MPEG, and so on, are particularly attractive candidates for approximate computing, since they are tolerant of computing imprecision due to human imperceptibility, which can be exploited to realize highly power-efficient implementations of these algorithms. However, existing approximate architectures typically fix the level of hardware approximation statically and are not adaptive to input data. For example, if a fixed approximate hardware configuration is used for an MPEG encoder (i.e., a fixed level of approximation), the output quality varies greatly for different input videos. This paper addresses this issue by proposing a reconfigurable approximate architecture for MPEG encoders that optimizes power consumption with the goal of maintaining a particular Peak Signal-to-Noise Ratio (PSNR) threshold for any video. Toward this end, we design reconfigurable adder/subtractor blocks (RABs), which have the ability to modulate their degree of approximation, and subsequently integrate these blocks in the motion estimation and discrete cosine transform modules of the MPEG encoder. We propose two heuristics for automatically tuning the approximation degree of the RABs in these two modules during runtime based on the characteristics of each individual video. Experimental results show that our approach of dynamically adjusting the degree of hardware approximation based on the input video respects the given quality bound (PSNR degradation of 1%-10%) across different videos while achieving power savings of up to 38% over a conventional nonapproximated MPEG encoder architecture. Note that although the proposed reconfigurable approximate architecture is presented for the specific case of an MPEG encoder, it can be easily extended to other DSP applications.
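The idea of choosing an approximation degree per input so that a PSNR bound still holds can be sketched in a few lines of Python. This is an illustration only, not the paper's heuristics or hardware: zeroing low-order bits of pixel values stands in for the reconfigurable adder/subtractor blocks, and the names `truncate_lsbs` and `pick_degree` are hypothetical. The PSNR formula itself is the standard one, 10·log10(MAX² / MSE).

```python
import math


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)


def truncate_lsbs(pixels, degree):
    """Stand-in for hardware approximation: zero the `degree` low-order bits."""
    mask = ~((1 << degree) - 1)
    return [p & mask for p in pixels]


def pick_degree(pixels, min_psnr_db, max_degree=7):
    """Return the largest approximation degree that still meets the PSNR bound."""
    best = 0
    for degree in range(max_degree + 1):
        if psnr(pixels, truncate_lsbs(pixels, degree)) >= min_psnr_db:
            best = degree
    return best
```

Because the chosen degree depends on the pixel data, two inputs with the same quality bound can end up with different approximation levels, which is the input-adaptive behaviour the abstract contrasts with a statically fixed configuration.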
custom integrated circuits conference | 2013
Vinay K. Chippa; Hrishikesh Jayakumar; Debabrata Mohapatra; Kaushik Roy; Anand Raghunathan
A domain-specific processor for energy-efficient execution of Recognition and Data Mining (RM) workloads is presented. The processor consists of a 2-D array of processing elements and a streaming memory hierarchy and interconnect network that are customized to efficiently execute dominant computational kernels (matrix-vector multiplication, vector dot product, L1 norm, and L2 norm) from a wide range of RM algorithms. To achieve further energy efficiency, the RM processor utilizes scalable effort design, a technique that exploits the inherent resilience of algorithms to inexactness in their constituent computations. The scalable effort RM processor adopts a cross-layer approach by combining scaling mechanisms at the algorithm, architecture, and circuit levels, to create a desirable trade-off between energy consumption and output quality. Measurements from the implemented chip in 65nm CMOS indicate processing efficiencies ranging from 569 GOPS/W to 4.68 TOPS/W. The use of scalable effort design achieves energy savings of 1.2X-2.3X with no loss in output quality, and 2X-20X with modest reduction in quality.
IEEE Transactions on Computers | 2017
Arnab Raha; Soubhagya Sutar; Hrishikesh Jayakumar; Vijay Raghunathan
Approximate computing is an emerging design paradigm that leverages the inherent error tolerance present in many applications to improve their power consumption and performance. Due to the forgiving nature of these error-resilient applications, precise input data is not always necessary for them to produce outputs of acceptable quality. This makes the memory subsystem (i.e., the place where data is stored) a suitable component for introducing approximations in return for substantial energy savings. Towards this end, this paper proposes a systematic methodology for constructing a quality configurable approximate DRAM system. Our design is based upon an extensive experimental characterization of memory errors as a function of the DRAM refresh rate. Leveraging the insights gathered from this characterization, we propose four novel strategies for partitioning the DRAM in a system into a number of quality bins based on the frequency, location, and nature of bit errors in each of the physical pages, while also taking into account the property of variable retention time exhibited by DRAM cells. During data allocation, critical data is placed in the highest quality bin (that contains only accurate pages) and approximate data is allocated to bins sorted in descending order of quality, with the refresh rate serving as the quality control knob. We validate our proposed scheme on several error-resilient applications implemented using an Altera Stratix IV GX FPGA based Terasic TR4-230 development board containing a 1GB DDR3 DRAM module. Experimental results demonstrate a significant improvement in the energy-quality trade-off compared to previous work and show a reduction in DRAM refresh power of up to 73 percent on average with minimal loss in output quality.