Karthik Swaminathan | Researchain

Publication

Featured research published by Karthik Swaminathan.

high-performance computer architecture | 2015

Architecture exploration for ambient energy harvesting nonvolatile processors

Kaisheng Ma; Yang Zheng; Shuangchen Li; Karthik Swaminathan; Xueqing Li; Yongpan Liu; Jack Sampson; Yuan Xie; Vijaykrishnan Narayanan

Energy harvesting has been widely investigated as a promising method of providing power for ultra-low-power applications. Such energy sources include solar energy, radio-frequency (RF) radiation, piezoelectricity, thermal gradients, etc. However, the power supplied by these sources is highly unreliable and dependent upon ambient environment factors. Hence, it is necessary to develop specialized systems that are tolerant to this power variation, and also capable of making forward progress on the computation tasks. The simulation platform in this paper is calibrated using measured results from a fabricated nonvolatile processor and used to explore the design space for a nonvolatile processor with different architectures, different input power sources, and policies for maximizing forward progress.

IEEE Micro | 2013

Steep-Slope Devices: From Dark to Dim Silicon

Karthik Swaminathan; Emre Kultursay; Vinay Saripalli; Vijaykrishnan Narayanan; Mahmut T. Kandemir; Suman Datta

Although the superior subthreshold characteristics of steep-slope devices can help power up more cores, researchers still need CMOS technology to accelerate sequential applications, because it can reach higher frequencies. Device-level heterogeneous multicores can give the best of both worlds, but they need smart resource management to realize this promise. In this article, the authors discuss device-level heterogeneous multicores and various resource-management schemes for reaching higher energy efficiency.

international conference on hardware/software codesign and system synthesis | 2012

Performance enhancement under power constraints using heterogeneous CMOS-TFET multicores

Emre Kultursay; Karthik Swaminathan; Vinay Saripalli; Vijaykrishnan Narayanan; Mahmut T. Kandemir; Suman Datta

Device-level heterogeneity promises high energy efficiency over a larger range of voltages than a single device technology alone can provide. In this paper, starting from device models, we first present ground-up modeling of CMOS and TFET cores, and verify this model against existing processors. Using our core models, we construct a 32-core TFET-CMOS heterogeneous multicore. We then show that it is a very challenging task to identify the ideal runtime configuration to use in such a heterogeneous multicore, which includes finding the best number/type of cores to activate and the corresponding voltages/frequencies to select for these cores. In order to effectively utilize this heterogeneous processor, we propose a novel automated runtime scheme. Our scheme is designed to automatically improve the performance of applications running on heterogeneous CMOS-TFET multicores operating under a fixed power budget, without requiring any effort from the application programmer or the user. Our scheme combines heterogeneous thread-to-core mapping, dynamic work partitioning, and dynamic power partitioning to identify energy-efficient operating points. Using simulations, we show that our runtime scheme can enable a CMOS-TFET multicore to serve a diversity of workloads with high energy efficiency and achieve 21% average speedup over the best-performing equivalent homogeneous multicore.

international symposium on low power electronics and design | 2011

Improving energy efficiency of multi-threaded applications using heterogeneous CMOS-TFET multicores

Karthik Swaminathan; Emre Kultursay; Vinay Saripalli; Vijaykrishnan Narayanan; Mahmut T. Kandemir; Suman Datta

Energy-Delay-Product-aware DVFS is a widely used technique that improves energy efficiency by dynamically adjusting the frequencies of cores. Further, for multithreaded applications, barrier-aware DVFS is a method that can dynamically tune the frequencies of cores to reduce barrier stall times and achieve higher energy efficiency. In both forms of DVFS, frequencies of cores are reduced from the maximum value to achieve better energy efficiency. TFET devices operate at energy efficiencies that cannot be achieved by CMOS devices. This advantage of TFET devices can be exploited in the context of multicore processors by replacing some of the CMOS cores with energy-efficient TFET alternatives. However, the energy benefits of TFET devices are observed at relatively lower voltages, which results in a degradation in performance due to executing at lower frequencies. Although applications cannot be restricted to always running at such lower frequencies, it can be significantly beneficial from an energy-efficiency perspective to use energy-efficient TFET cores during the time applications spend at these frequencies. In this paper, we show that due to EDP-aware DVFS and barrier-aware DVFS, multithreaded applications run for a significant portion of their execution time at frequencies at which TFET cores are more energy efficient. We further show that, at those frequencies, dynamically migrating threads to TFET cores can achieve average leakage and dynamic energy savings of 30% and 17%, respectively, with a performance degradation of less than 1%.

IEEE Micro | 2016

Nonvolatile Processor Architectures: Efficient, Reliable Progress with Unstable Power

Kaisheng Ma; Xueqing Li; Karthik Swaminathan; Yang Zheng; Shuangchen Li; Yongpan Liu; Yuan Xie; Jack Sampson; Vijaykrishnan Narayanan

Nonvolatile processors (NVPs) have integrated nonvolatile memory to preserve task-intermediate on-chip state during power emergencies. NVPs hide data backup and restoration from the executing software to provide an execution mode that will always eventually complete the current task. NVPs are emerging as a promising solution for energy-harvesting scenarios, in which the available power supply is unstable and intermittent, because of their ability to ensure that even short periods of sufficient power, on the order of tens of instructions, will result in net forward progress. This article explores the design space for an NVP across different architectures, input power sources, and policies for maximizing forward progress in a framework calibrated using measured results from a fabricated NVP. The authors propose a heterogeneous microarchitecture solution that more efficiently capitalizes on ephemeral power surpluses.

design automation conference | 2014

Steep Slope Devices: Enabling New Architectural Paradigms

Karthik Swaminathan; Huichu Liu; Xueqing Li; Moon Seok Kim; Jack Sampson; Vijaykrishnan Narayanan

The existence of domains where traditional CMOS processors are inefficient has been well documented in the current literature. In particular, the inefficiency of general-purpose CMOS designs operating at very low supply voltages is well known, and steep sub-threshold slope technologies, such as Tunneling Field Effect Transistors (TFETs), have been demonstrated as a viable alternative for the low-voltage operation domain. However, restricting the design space of steep slope technology-based processors to near-threshold or sub-threshold general-purpose processors does the technology a disservice. Steep slope (SS) architectures can simultaneously expand the frontiers of viable computers at both ends of the energy scale: on the one hand, SS architectures enable ultra-low-power sensor nodes and wearable technology, while on the other, they are applicable to high-powered servers and high-performance computing engines. We demonstrate the benefits of adapting this technology in such non-conventional domains, while attempting to address the major challenges encountered. We explore the effect of noise and variations at various levels of abstraction, ranging from the device to the architecture, and examine various techniques to overcome them.

international symposium on computer architecture | 2014

An examination of the architecture and system-level tradeoffs of employing steep slope devices in 3D CMPs

Karthik Swaminathan; Huichu Liu; Jack Sampson; Vijaykrishnan Narayanan

For any given application, there is an optimal throughput point in the space of per-processor performance and the number of such processors given to that application. However, due to thermal, yield, and other constraints, not all of these optimal points can plausibly be constructed with a given technology. In this paper, we look at how emerging steep slope devices, 3D circuit integration, and trends in process technology scaling will combine to shift the boundaries of both attainable performance and the optimal set of technologies to employ to achieve it. We propose a heterogeneous-technology 3D architecture capable of operating efficiently at an expanded number of points in this larger design space and devise a heterogeneity- and thermal-aware scheduling algorithm to exploit its potential. Our heterogeneous mapping techniques are capable of producing speedups ranging from 17% for high-end server workloads running at around 90°C to over 160% for embedded systems running below 60°C.

compound semiconductor integrated circuit symposium | 2014

Enabling Power-Efficient Designs with III-V Tunnel FETs

Moon Seok Kim; Huichu Liu; Karthik Swaminathan; Xueqing Li; Suman Datta; Vijaykrishnan Narayanan

III-V Tunnel FETs (TFETs) possess unique characteristics such as steep slope switching, high gm/IDS, uni-directional conduction, and low voltage operating capability. These characteristics have the potential to result in energy savings in both digital and analog applications. In this paper, we provide an overview of the power-efficient properties of III-V TFETs and designs at the device, circuit, and architectural levels.

asia and south pacific design automation conference | 2012

When to forget: A system-level perspective on STT-RAMs

Karthik Swaminathan; Raghav Pisolkar; Cong Xu; Vijaykrishnan Narayanan

The benefits of using STT-RAMs as an alternative to SRAMs are being examined in great detail. However, their comparatively higher write latencies and energies continue to be roadblocks for migrating to MRAM-based technology in memory hierarchies. In this paper, we present a novel method by which we demonstrate significant energy reduction in writing to the STT-RAM cell by relaxing its non-volatility property. We exploit this characteristic for optimizing system-level properties such as garbage collection. By categorizing the objects based on their lifetimes, it is possible to tune the data retention time of the STT-RAM to minimize the write energy. Our scheme yielded 37% reduction in dynamic energy, 88% reduction in leakage and 85% improvement in the Energy-Delay Product over a corresponding SRAM-based memory structure.

international conference on computer design | 2015

Resilient mobile cognition: Algorithms, innovations, and architectures

Raphael Viguier; Chung-Ching Lin; Karthik Swaminathan; Augusto Vega; Alper Buyuktosunoglu; Sharathchandra U. Pankanti; Pradip Bose; H. Akbarpour; Filiz Bunyak; Kannappan Palaniappan

The importance of the internet-of-things (IoT) is now an established reality. With that backdrop, the phenomenal emergence of cameras/sensors mounted on unmanned aerial, ground and marine vehicles (UAVs, UGVs, UMVs) and body-worn cameras is a notable new development. Swarms of cameras and the associated real-time computing are at the heart of new technologies such as connected cars, drone-based city-wide surveillance and precision agriculture. Smart computer vision algorithms (with or without dynamic learning) that enable object recognition and tracking, supported by baseline video content summarization or 2D/3D image reconstruction of the scanned environment, are at the heart of such new applications. In this article, we summarize our recent innovations in this space. We focus primarily on algorithms and architectural design considerations for video summarization systems.
