Anuj Grover
STMicroelectronics
Publication
Featured research published by Anuj Grover.
IEEE Journal of Solid-state Circuits | 2015
Edith Beigne; Alexandre Valentian; Ivan Miro-Panades; Robin Wilson; Philippe Flatresse; Thomas Benoist; Christian Bernard; Sébastien Bernard; Olivier Billoint; Sylvain Clerc; Bastien Giraud; Anuj Grover; Julien Le Coz; Jean-Philippe Noel; O. Thomas; Yvain Thonnart
Wide voltage range operation for DSPs brings more versatility to achieve high energy efficiency in mobile applications. This paper describes a 32-bit DSP fabricated in 28 nm Ultra Thin Body and Box FDSOI technology. Body Biasing Voltage (VBB) scaling from 0 V up to ±2 V decreases the core VDDMIN to 397 mV and increases clock frequency by +400%@500 mV and +114%@1.3 V. The DSP frequency measurements show 2.6 GHz@1.3 V(VDD)@2 V(VBB) and 460 MHz@397 mV(VDD)@2 V(VBB). The lowest peak energy efficiency is measured at 62 pJ/op at 0.53 V. In addition to technological gains, maximum frequency tracking design techniques are proposed for wide voltage range operation. On silicon, at 0.6 V, those techniques allow high energy gain of 40.6% w.r.t. a worst case corner approach.
International Solid-State Circuits Conference | 2014
Robin Wilson; Edith Beigne; Philippe Flatresse; Alexandre Valentian; Thomas Benoist; Christian Bernard; Sébastien Bernard; Olivier Billoint; Sylvain Clerc; Bastien Giraud; Anuj Grover; Julien Le Coz; Ivan Miro Panades; Jean-Philippe Noel; Bertrand Pelloux-Prayer; Philippe Roche; O. Thomas; Yvain Thonnart; David Turgis; Fabien Clermidy; Philippe Magarshack
Wide-voltage-range-operation DSPs bring more versatility to achieve high energy efficiency in mobile applications to increase signal processing complexity and handle a large range of performance specifications. This paper describes a 32b DSP fabricated in 28nm UTBB FDSOI technology [1]. Body-bias-voltage (VBB) scaling from 0V up to ±2V (Pwell/Nwell) decreases the DSP core VDDMIN to 397mV and increases clock frequency by +400% at 500mV and +114% at 1.3V. In addition to technology gains, dedicated design features are included to increase frequency over the full VDD range, considering parameter variations. As depicted in Fig. 27.1.1, the 32b datapath VLIW DSP is organized around a MAC dedicated to complex arithmetic and two dedicated operators: a cordic/divider and a compare/select. Data enters the circuit through a serial interface and code is run from a 64×32b register file. It has been shown in [1] that a given operating frequency can be achieved at a lower VDD in UTBB FDSOI compared to bulk by applying a forward-body bias. An additional design step is achieved in this work by (1) increasing the frequency at low VDD thanks to a specific selection and design of standard cells with respect to power vs. performance and (2) dynamically tracking the maximum frequency to cope with variations.
VLSI Design and Test | 2015
Jitendra Yadav; Pallavi Das; Abhinav Jain; Anuj Grover
High-performance SoCs contain a considerable amount of SRAM, which occupies more than 60% of the total SoC area. In CMOS process scaling, a smaller feature size enables higher density and lower cost, but a high-density array has a significant impact on the manufacturing yield and performance parameters of the conventional 6T SRAM cell. In this paper, we present an alternative area-compact 5-transistor portless SRAM cell in 65nm CMOS technology. Various performance and reliability issues of the 5T cell are addressed. The 5T cell is shown to achieve a 20-30% area reduction without significant performance degradation compared to the conventional 6T SRAM cell.
International Conference on IC Design and Technology | 2013
Guillaume Moritz; Bastien Giraud; Jean-Philippe Noel; David Turgis; Anuj Grover
Advanced SoC designs regularly use Dynamic Voltage and Frequency Scaling (DVFS) to achieve high performance and low power targets of portable systems. In this paper, we focus on optimization of a Voltage Sense Amplifier (VSA) in 28nm Ultra-Thin Body and BOX Fully Depleted SOI (UTBB FD-SOI) technology to achieve high performance operations over the Ultra Wide Voltage Range (UWVR) from 1.3V to 0.4V. We use Flip-Well design methodology along with forward body bias modulation to extend operation range of the VSA and also reduce sense amplifier read time by 28%, while saving power consumption by up to 59% compared to Bulk technology.
VLSI Design and Test | 2016
Nidhi Batra; Anil Kumar Gundu; Mohammad S. Hashmi; G. S. Visweswaran; Anuj Grover
In advanced technology nodes, device variations limit SRAM performance and yield. Cell stability, defined by the Static Noise Margin (SNM) of the SRAM cell, primarily governs performance with respect to yield in SRAMs. Variations in scaled SRAMs increase the probability of cells becoming weak. To ensure the reliability of SRAMs, it is important to identify such cells post-silicon. In this work, we propose a correlation-based test methodology to detect weak bits in SRAMs with respect to SNM. We present a case study for a 64×64 SRAM in 28nm FDSOI technology. The proposed methodology targets high-speed testing and lower test costs. It allows the test to be performed at nominal operating voltage and room temperature. Suitable read stress is induced by boosting the Word Line (WL) voltage of the 6T SRAM cell. To validate the effectiveness of the test and find the appropriate test stress, we propose a correlation methodology. With this test, we could detect weak cells possessing SNM up to 60mV across various process corners for stress voltages ranging from 1.14V to 1.16V. Moreover, it requires minimal area penalty and test time compared to standard tests.
Great Lakes Symposium on VLSI | 2016
Nidhi Batra; Pawan Sehgal; Shashwat Kaushik; Mohammad S. Hashmi; Sudesh Bhalla; Anuj Grover
In advanced technology nodes, process variations deteriorate SRAM performance and greatly affect yield. It is necessary to formulate yield estimation models to optimize SRAMs and effectively trade off area, performance and robustness. We propose models that, in addition to enabling yield estimates, also enable evaluation of lowering the minimum operational voltage (VDDMIN). We present a quantitative analysis for SNM-limited SRAM yield using the Design of Experiments (DOE) method. The proposed framework for yield-based design can also utilize recovery techniques like Error Correcting Codes (ECC) and redundancy, and quantifies yield, area, and VDDMIN improvements. We also present a case study that trades off ECC recovery budget, VDDMIN and area gain. We show a 25% improvement in area and VDDMIN lowering by 300mV at constant yield levels by using 50% of the ECC recovery budget.
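As a rough illustration of why ECC can be traded against yield, the sketch below computes array yield under a simple binomial model in which bit failures are independent and each word tolerates a fixed number of failing bits. This is not the DOE-based model from the paper; the function names, the independence assumption, and the example numbers are all hypothetical.

```python
from math import comb


def word_yield(p_bit_fail: float, word_bits: int, correctable: int = 1) -> float:
    """Probability that a word has at most `correctable` failing bits.

    Assumes independent bit failures (an illustrative simplification).
    """
    return sum(
        comb(word_bits, k) * p_bit_fail**k * (1.0 - p_bit_fail) ** (word_bits - k)
        for k in range(correctable + 1)
    )


def array_yield(p_bit_fail: float, word_bits: int, n_words: int,
                correctable: int = 1) -> float:
    """Yield of an array in which every word must be usable."""
    return word_yield(p_bit_fail, word_bits, correctable) ** n_words


if __name__ == "__main__":
    # Hypothetical numbers: 72-bit words (64 data + 8 check bits), 4096 words.
    p = 1e-6
    print("no ECC :", array_yield(p, 72, 4096, correctable=0))
    print("1-bit  :", array_yield(p, 72, 4096, correctable=1))
```

Under such a model, a higher tolerated bit failure probability at constant yield is what would allow a lower VDDMIN or a smaller cell, which is the kind of trade-off the abstract quantifies.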
System on Chip Conference | 2015
Gaurav Narang; Alexander Fell; Prakhar Raj Gupta; Anuj Grover
Embedded memories are a key contributor to chip area and dynamic power dissipation, and also form a significant part of the critical path in high-performance advanced SoCs. Therefore, optimal selection of memory instances becomes imperative for SoC designers. While EDA tools have evolved over the past years to optimally select standard logic cells depending on timing and power constraints, optimal memory selection is largely a manual process. We propose a framework to optimize power, performance, and area (PPA) of a memory subsystem (MSS) by including floorplan-dependent delays and power consumption in the interconnects and glue logic of the MSS at the pre-RTL stage. Through this framework, we demonstrate that for a 4 Mb assembly of SRAM instances, dynamic power is reduced by 44%, area by 49%, and leakage by 71% with floorplan-aware selection. The framework can use different estimates when routing congestion is important (for example, in low-cost processes with fewer metal layers). We also show that interconnect delays are reduced by about 68% and dynamic power by 58% if additional metal layers are available for routing, compared to a low-cost 6-metal process.
International Conference on IC Design and Technology | 2015
Anuj Grover; Promod Kumar; Mohammad Daud; G.S. Visweswaran; C. Parthasarathy; Jean-Philippe Noel; David Turgis; Bastien Giraud; Guillaume Moritz
Dual Rail SRAMs are widely used to enable Dynamic Voltage and Frequency Scaling (DVFS) in SRAMs where the array voltage cannot be scaled down. DVFS operating points are limited by the maximum differential supported between the two supplies of the SRAM. To extend the gains of DVFS, we propose a Low Standby Power - Capacitively Coupled Sense Amplifier (LSTP-C2SA) that enables further lowering of the periphery supply in Dual Rail SRAMs without leading to SRAM cell instability. We present a design method to optimally size the coupling capacitance in LSTP-C2SAs. Designs with LSTP-C2SA are shown to consume 43% less read power in DVFS operation at 0.4V in 28nm UTBB FD-SOI when compared to an implementation with a standard latch sense amplifier. Silicon measurements confirm LSTP-C2SA functionality at 0.35V.
VLSI Design and Test | 2017
Anand Ilakal; Anuj Grover
The impact of high-energy particles on digital memory elements becomes important as technology scales down. Memory elements hold high-density latches to store data, and these latches are susceptible to disturbances from particle strikes. Alpha particles and cosmic-ray neutrons may cause Single Event Upsets (SEUs) in memory cells. In this paper, we propose a method to estimate and compare the Soft Error Rate (SER) robustness of different layout topologies of the SRAM cell. We demonstrate that radiation-hardened layout topologies offer much better SER robustness than the conventional layout of the 6T SRAM cell in 28FDSOI and 40 nm technologies. The analysis is done using the ELDO simulator for a wide range of Linear Energy Transfer (LET) profiles of particle strikes.
IEEE Transactions on Very Large Scale Integration Systems | 2017
Shalini Pathak; Anuj Grover; Mausumi Pohit; Nitin Bansal
Power dissipation during scan testing of a system-on-chip can be significantly higher than that during functional mode, causing reliability and yield concerns. This paper proposes a logic cluster controllability (LoCCo)-based scan chain stitching methodology to achieve low-power testing. The scan chain stitching is made power aware by placing flip-flops with higher test combination requirements at the beginning of scan chains, while flip-flops with lower test combination requirements are put toward the end of scan chains. The test combination requirements are estimated through a simple logic cluster and flip-flop controllability identification algorithm. This method helps in consolidating care bits toward the beginning of scan chains. Hence, a significantly lower shift-in transition is achieved in the test patterns. The results from ITC’99 and industrial designs in 28FDSOI and 40-nm CMOS technologies show a total shift-in transition reduction of up to 23.1% and average shift power reduction of up to 21.6% using the proposed method. The use of LoCCo methodology posed a negligible routing congestion overhead in the layout compared to the conventional method. LoCCo is also used as a base to apply other vector reordering low-power methods and gain
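The ordering rule described in the LoCCo abstract above (flip-flops with higher test combination requirements first, lower ones last) can be sketched as a simple sort. This is only an illustration under stated assumptions: the requirement scores are taken as given inputs rather than computed by the paper's controllability algorithm, and the function name, the round-robin split across chains, and the example flip-flop names are hypothetical.

```python
from typing import Dict, List


def stitch_scan_chains(requirements: Dict[str, int], n_chains: int) -> List[List[str]]:
    """Order flip-flops so that higher-requirement ones sit at the chain start.

    `requirements` maps a flip-flop name to an assumed "test combination
    requirement" score. Index 0 of each returned list is the chain beginning.
    """
    if n_chains < 1:
        raise ValueError("n_chains must be at least 1")
    # Sort by descending score; break ties by name so the result is deterministic.
    ordered = sorted(requirements, key=lambda ff: (-requirements[ff], ff))
    chains: List[List[str]] = [[] for _ in range(n_chains)]
    # Round-robin split (an assumption): every chain starts with a high-score flip-flop.
    for i, ff in enumerate(ordered):
        chains[i % n_chains].append(ff)
    return chains


if __name__ == "__main__":
    scores = {"ff_a": 5, "ff_b": 1, "ff_c": 3, "ff_d": 4}
    print(stitch_scan_chains(scores, 2))
```

A real flow would also have to respect placement and routing constraints, which this sketch ignores.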
Collaboration
Anuj Grover's collaborations.
French Alternative Energies and Atomic Energy Commission