Publication


Featured research published by Supreet Jeloka.


IEEE Journal of Solid-State Circuits | 2016

A 28 nm Configurable Memory (TCAM/BCAM/SRAM) Using Push-Rule 6T Bit Cell Enabling Logic-in-Memory

Supreet Jeloka; Naveen Bharathwaj Akesh; Dennis Sylvester; David T. Blaauw

Conventional content addressable memory (BCAM and TCAM) uses specialized 10T/16T bit cells that are significantly larger than 6T SRAM cells. A new BCAM/TCAM is proposed that can operate with standard push-rule 6T SRAM cells, reducing array area by 2-5× and allowing reconfiguration of the SRAM as a CAM. In this way, chip area and overall capacitance can be reduced, leading to higher energy efficiency for search operations. In addition, the configurable memory can perform bit-wise logical operations: “AND” and “NOR” on two or more words stored within the array. Thus, the configurable memory with CAM and logical function capability can be used to off-load specific computational operations to the memory, improving system performance and efficiency. Using a 6T 28 nm FDSOI SRAM bit cell, the 64×64 (4 kb) BCAM achieves 370 MHz at 1 V and consumes 0.6 fJ/search/bit. A logical operation between two 64 bit words achieves 787 MHz at 1 V.
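The search and logic functions described in this abstract can be sketched as a purely behavioral model. The sketch below is not the authors' circuit: it only mirrors the functions named above (BCAM exact match, TCAM match, and bit-wise AND/NOR across stored words). The function names and the per-entry "care mask" representation of ternary data are assumptions made for illustration.

```python
WORD_BITS = 64                      # word width quoted in the abstract
FULL_MASK = (1 << WORD_BITS) - 1    # all-ones word

def bcam_search(rows, key):
    """Binary CAM: return indices of stored words equal to the key."""
    return [i for i, word in enumerate(rows) if word == key]

def tcam_search(entries, key):
    """Ternary CAM: each entry is (value, care_mask); bits whose
    care_mask bit is 0 are treated as don't-care (assumed encoding)."""
    return [i for i, (value, care) in enumerate(entries)
            if (value ^ key) & care == 0]

def in_memory_and(rows, selected):
    """Bit-wise AND across two or more selected words."""
    result = FULL_MASK
    for i in selected:
        result &= rows[i]
    return result

def in_memory_nor(rows, selected):
    """Bit-wise NOR across two or more selected words."""
    acc = 0
    for i in selected:
        acc |= rows[i]
    return ~acc & FULL_MASK

rows = [0b1100, 0b1010, 0b1100]
print(bcam_search(rows, 0b1100))            # indices of matching rows
print(bin(in_memory_and(rows, [0, 1])))     # AND of rows 0 and 1
```

Modeling each word as a Python integer keeps the sketch short; it says nothing about timing, energy, or the bit-cell design.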


Design Automation Conference | 2014

VIX: Virtual Input Crossbar for Efficient Switch Allocation

Supriya Rao; Supreet Jeloka; Reetuparna Das; David T. Blaauw; Ronald G. Dreslinski; Trevor N. Mudge

Separable allocators in on-chip routers perform switch allocation in two stages that often make uncoordinated decisions resulting in sub-optimal switch allocation. We propose Virtual Input Crossbars (VIX), where more than one virtual channel (VC) of an input port is connected to the crossbar. VIX improves switch allocation by allowing more than one input VC of an input port to transmit flits in the same cycle. Also, more input VCs can participate in the output arbitration, reducing the chances of uncoordinated decisions. VIX improves network throughput by more than 15% for the topologies studied without affecting the router critical path.
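The uncoordinated-decision problem and the VIX idea can be illustrated with a small sketch of an input-first separable allocator. This is a simplified model under stated assumptions, not the paper's implementation: arbiters use fixed priority (lowest index wins) rather than a rotating policy, and the VCs of a port are split into contiguous groups that each share one crossbar input. The function name `allocate` and its parameters are hypothetical.

```python
def allocate(requests, num_vcs, virtual_inputs=1):
    """Input-first separable switch allocation with fixed-priority arbiters.

    requests: dict mapping (port, vc) -> requested output port.
    num_vcs: VCs per input port (assumed divisible by virtual_inputs).
    virtual_inputs: crossbar inputs per input port (1 = baseline router).
    Returns a dict mapping output port -> granted (port, vc).
    """
    group_size = num_vcs // virtual_inputs
    # Stage 1: each crossbar input picks one requesting VC from its group.
    stage1 = {}
    for (port, vc) in sorted(requests):
        xbar_input = (port, vc // group_size)
        stage1.setdefault(xbar_input, (port, vc))
    # Stage 2: each output port picks one stage-1 winner that requests it.
    grants = {}
    for xbar_input in sorted(stage1):
        winner = stage1[xbar_input]
        grants.setdefault(requests[winner], winner)
    return grants

# Port 0 has flits for outputs 1 and 2; port 1 has a flit for output 1.
reqs = {(0, 0): 1, (0, 2): 2, (1, 0): 1}
print(allocate(reqs, num_vcs=4, virtual_inputs=1))  # baseline
print(allocate(reqs, num_vcs=4, virtual_inputs=2))  # two crossbar inputs per port
```

In the baseline run, both ports nominate a VC bound for output 1, so output 2 stays idle even though port 0 has a flit for it. With two crossbar inputs per port, the second VC group of port 0 also takes part in output arbitration and output 2 is used in the same cycle.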


Design Automation Conference | 2016

Near-threshold computing in FinFET technologies: opportunities for improved voltage scalability

Nathaniel Ross Pinckney; Lucian Shifren; Brian Cline; Saurabh Sinha; Supreet Jeloka; Ronald G. Dreslinski; Trevor N. Mudge; Dennis Sylvester; David T. Blaauw

In recent years, operating at near-threshold supply voltages has been proposed to improve energy efficiency in circuits, yet decreased efficacy of dynamic voltage scaling has been observed in recent planar technologies. However, foundries have introduced a shift from planar to FinFET fabrication processes. In this paper, we study the ability of 7nm FinFETs to voltage scale and compare it to planar technologies across three dynamic voltage scaling scenarios. The switch to FinFET allows for a return to strong voltage scalability. We find up to 8.6× higher energy efficiency at near-threshold (NT) compared to nominal supply voltage (vs. a 4.8× gain in 20nm planar).


Symposium on VLSI Circuits | 2015

A configurable TCAM/BCAM/SRAM using 28nm push-rule 6T bit cell

Supreet Jeloka; Naveen Bharathwaj Akesh; Dennis Sylvester; David T. Blaauw

Conventional Content Addressable Memory (BCAM and TCAM) uses specialized 10T / 16T bit cells that are significantly larger than 6T SRAM cells. We propose a new BCAM/TCAM that can operate with standard push-rule 6T SRAM cells, reducing array area by 2-5× and allowing reconfiguration of the CAM as an SRAM. Using a 6T 28nm FDSOI SRAM bit cell, the 64×64 (4kb) BCAM achieves 370 MHz at 1V and consumes 0.6fJ/search/bit.


IEEE Transactions on Computers | 2016

Using Low Cost Erasure and Error Correction Schemes to Improve Reliability of Commodity DRAM Systems

Hsing Min Chen; Supreet Jeloka; Akhil Arunkumar; David T. Blaauw; Carole Jean Wu; Trevor N. Mudge; Chaitali Chakrabarti

Most server-grade systems provide Chipkill-Correct error protection at the expense of power and performance. In this paper we present a low-overhead solution for improving the reliability of commodity DRAM systems with no change in the existing memory architecture. Specifically, we propose five erasure and error correction (E-ECC) schemes that provide at least Chipkill-Correct protection for x4 (Schemes 1, 2 and 3), x8 (Scheme 4) and x16 (Scheme 5) DRAM systems. All schemes have superior error correction performance due to the use of strong symbol-based codes. Synthesis results at the 28 nm node show that the decoding latency of these codes is negligible compared to the DRAM access latency. In addition, we make use of erasure codes to extend the lifetime of the DRAM systems. Specifically, once a chip is marked faulty due to persistent errors, all E-ECC schemes correct erasures due to that faulty chip and also correct an additional random error in a second chip. Evaluation with SPEC2006 workloads shows that, compared to x4 Chipkill-Correct schemes, Scheme 5 has the highest IPC improvement (mean of 7 percent) and Scheme 4 has the largest power reduction (mean of 18 percent) and the largest increase in energy efficiency (mean of 25 percent).


Symposium on VLSI Circuits | 2017

Recryptor: A reconfigurable in-memory cryptographic Cortex-M0 processor for IoT

Yiqun Zhang; Li Xu; Kaiyuan Yang; Qing Dong; Supreet Jeloka; David T. Blaauw; Dennis Sylvester

This paper proposes Recryptor, an energy-efficient and compact ARM Cortex-M0 based reconfigurable cryptographic processor using in-memory computing. Recryptor is capable of accelerating a wide range of cryptography algorithms and standards, including public/private key cryptography and hash functions, by augmenting the memory of a commercial general-purpose IoT processor, resulting in a highly compact implementation. The wide bit-width of memory is ideally suited for high bit-width (64–512b) arithmetic operations common in cryptographic functions. Recryptor (28.8 MHz at 0.7 V) achieves 6.8× average speedup and 12.8× average energy improvements over state-of-the-art software and hardware-accelerated implementations with only 0.128 mm² area overhead in 40nm CMOS.


Symposium on VLSI Circuits | 2017

A sequence dependent challenge-response PUF using 28nm SRAM 6T bit cell

Supreet Jeloka; Kaiyuan Yang; Michael Orshansky; Dennis Sylvester; David T. Blaauw

Conventionally, SRAM PUFs are only used for chip ID. The proposed sequence-dependent PUF expands the challenge-response space of an SRAM PUF by an order of rows^(sequence length − 1), making it suitable for authentication. In addition, it has a sequence-dependent non-linear behavior, making it more immune to machine learning attacks. In 28nm, the 64×64 SRAM-based PUF has a bit area of 388F² with energy ranging from 30fJ/bit to 88fJ/bit at 0.6V. It also provides high throughput, from 2.2Gbps to 6.8Gbps at 0.9V.
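To get a feel for the size of the expanded challenge-response space, the short calculation below reads the expansion factor as the number of rows raised to the power (sequence length − 1). That reading of the expression is an interpretation of the abstract, and the sequence length used in the example is a hypothetical value, not one reported in the paper.

```python
def challenge_space_factor(rows, sequence_length):
    """Expansion factor rows ** (sequence_length - 1), as read from the abstract."""
    return rows ** (sequence_length - 1)

# 64 rows matches the 64x64 array; a sequence length of 4 is an assumed example.
print(challenge_space_factor(64, 4))  # 64 ** 3 = 262144
```

The exponential growth with sequence length is what moves the PUF from a fixed chip ID toward a challenge space large enough for authentication.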


Symposium on VLSI Circuits | 2017

A 0.3V VDDmin 4+2T SRAM for searching and in-memory computing using 55nm DDC technology

Qing Dong; Supreet Jeloka; Mehdi Saligane; Yejoong Kim; Masaru Kawaminami; Akihiko Harada; Satoru Miyoshi; David T. Blaauw; Dennis Sylvester

A 4+2T SRAM is proposed that offers searching and logic functions. The cell uses the N-well as the write wordline (WL) and eliminates the access transistors. Decoupled read paths enable reliable multi-word activation for in-memory Boolean logic functions. The SRAM can reconfigure to BCAM/TCAM for searching operations, with 0.13fJ/search/bit at 0.35V. Forty test chips in 55nm deeply depleted channel (DDC) technology achieve worst-case 0.3 V VDDmin.


Symposium on VLSI Circuits | 2016

A 66pW discontinuous switch-capacitor energy harvester for self-sustaining sensor applications

Xiao Wu; Yao Shi; Supreet Jeloka; Kaiyuan Yang; Inhee Lee; Dennis Sylvester; David T. Blaauw

We present a discontinuous harvesting approach for switch-capacitor DC-DC converters that enables ultra-low power energy harvesting. By slowly accumulating charge on an input capacitor and then transferring it to a battery in burst mode, switching and leakage losses in the DC-DC converter can be optimally traded off against the loss due to non-ideal MPPT operation. The harvester uses a 15pW mode controller, an automatic conversion ratio modulator, and a moving sum charge pump for low startup energy upon a mode switch. In 180nm CMOS, the harvester achieves >40% end-to-end efficiency from 113pW to 1.5μW with 66pW minimum input power, marking a >10× improvement over prior ultra-low power harvesters.


Symposium on VLSI Circuits | 2017

An ultra-wide program, 122pJ/bit flash memory using charge recycling

Supreet Jeloka; Jeongsup Lee; Ziyun Li; Jinal Shah; Qing Dong; Kaiyuan Yang; Dennis Sylvester; David T. Blaauw

Embedded flash for low power sensing systems requires very low write energy and peak power. This work proposes a 130nm, 1024×260 SONOS flash with an ultra-wide 1Kb program cycle, using efficient FN tunneling based programming and a dedicated, multi-output transition pump with charge sharing and charge recycling. Combined with energy efficient charge pumps, the proposed flash program energy is 122pJ/bit with a 1Mbps throughput.
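The headline figures in this abstract can be combined into a few derived quantities. The arithmetic below is a back-of-the-envelope sketch, not a result from the paper: it assumes that "1Kb" means 1024 bits per program cycle and that the 1Mbps throughput is sustained, so the derived power is an average only.

```python
ENERGY_PER_BIT_J = 122e-12   # 122 pJ/bit, from the abstract
BITS_PER_CYCLE = 1024        # assumption: "1Kb" program width = 1024 bits
THROUGHPUT_BPS = 1e6         # 1 Mbps, from the abstract

# Energy spent to program one full 1Kb-wide cycle.
energy_per_cycle_j = ENERGY_PER_BIT_J * BITS_PER_CYCLE
# Average program power if the quoted throughput is sustained.
avg_power_w = ENERGY_PER_BIT_J * THROUGHPUT_BPS
# Time per program cycle implied by width and throughput.
cycle_time_s = BITS_PER_CYCLE / THROUGHPUT_BPS

print(f"energy per program cycle: {energy_per_cycle_j * 1e9:.1f} nJ")
print(f"average program power:    {avg_power_w * 1e6:.0f} uW")
print(f"program cycle time:       {cycle_time_s * 1e3:.3f} ms")
```

Under these assumptions, a wide program cycle spreads roughly 125 nJ over about a millisecond, which is consistent with the abstract's emphasis on low peak power.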

Collaboration


Dive into Supreet Jeloka's collaborations.

Top Co-Authors

Yejoong Kim

University of Michigan
