Stephen T. Kim
Intel
Publication
Featured research published by Stephen T. Kim.
IEEE Journal of Solid-state Circuits | 2014
Rinkle Jain; Bibiche M. Geuskens; Stephen T. Kim; Muhammad M. Khellah; Jaydeep P. Kulkarni; James W. Tschanz; Vivek De
A fully integrated switched-capacitor voltage regulator (SCVR) with on-die high-density MIM capacitor, distributed across a 14 KB register file (RF) load, is demonstrated in 22 nm tri-gate CMOS. The multi-conversion-ratio SCVR provides a wide output voltage range of 0.45-1 V from a fixed input voltage of 1.225 V. It achieves 63-84% conversion efficiency and supports a maximum load current density of 0.88 A/mm². The area overhead of the dedicated SCVR on the load is 3.6%. Measured data is presented on various performance indices in detail. Subsequent learnings on tradeoffs between factors such as capacitance characteristics, conversion efficiency, and current density are delineated and correlated with theoretical estimates. The RF array performs comparably when powered by the SCVR and by the external rail. The all-digital, modular design allows efficient spatial distribution across the load and hence robust power delivery. The extremely fast response time, on the order of a few nanoseconds, is targeted to benefit agile power management. This work evinces voltage regulator technology as a standard homogeneous CMOS component, which can proliferate DVFS domains for maximum energy and area benefits.
International Solid-State Circuits Conference | 2014
Joseph F. Ryan; Charles Augustine; Jaydeep P. Kulkarni; Yi-Chun Shih; Stephen T. Kim; Rinkle Jain; Keith A. Bowman; Arijit Raychowdhury; Muhammad M. Khellah; James W. Tschanz; Vivek De
In this paper, we present a low-power graphics processing core that achieves a 40% improvement in peak energy efficiency using dual-VCC arrays, adaptive clocking for voltage droop mitigation, and state retention capability with an integrated retention clamping circuit for low-power sleep mode. The 22nm testchip includes a graphics execution core connected to an SRAM array and test controller used for storage and delivery of at-speed test vectors. Correct execution of the tests is validated through a multiple-input signature register (MISR), which accumulates key signals in the core and generates a 32b signature at test completion.
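The signature check described in this abstract can be illustrated with a minimal Python sketch of a 32-bit MISR. The abstract gives only the signature width, so the feedback polynomial (the CRC-32 polynomial is used here), the zero seed, and the function names are all assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF
# Assumed feedback polynomial (CRC-32); the abstract does not specify one.
POLY = 0x04C11DB7


def misr_step(state: int, inputs: int) -> int:
    """Advance the MISR by one clock.

    The register shifts left with LFSR feedback, then the parallel
    input word is XORed into the new state.
    """
    msb = (state >> 31) & 1
    state = (state << 1) & MASK32
    if msb:
        state ^= POLY
    return state ^ (inputs & MASK32)


def misr_signature(samples, seed: int = 0) -> int:
    """Compress a sequence of 32-bit observation words into one signature."""
    state = seed & MASK32
    for word in samples:
        state = misr_step(state, word)
    return state
```

A test passes when the signature accumulated from the observed signals equals the signature precomputed from a known-good run. Because the register compresses many cycles into 32 bits, distinct sequences can occasionally alias to the same signature.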
International Solid-State Circuits Conference | 2015
Stephen T. Kim; Yi-Chun Shih; Kaushik Mazumdar; Rinkle Jain; Joseph F. Ryan; Charles Augustine; Jaydeep P. Kulkarni; Krishnan Ravichandran; James W. Tschanz; Muhammad M. Khellah; Vivek De
A graphics execution core in 22nm improves energy efficiency across a wide DVFS range, from the near-threshold voltage (NTV) region, where circuit assist lowers intrinsic VMIN, to the turbo region, where adaptive clocking reduces the voltage-droop guard-band [1]. When powered with a shared rail, however, energy is wasted in the core if other blocks demand higher voltage and performance. Alternatively, a per-core fully-integrated voltage regulator (VR) provides a cost-effective means to realize autonomous DVFS [2-4]. In this paper, we present a graphics core that is supplied by a fully integrated and digitally controlled hybrid low-drop-out (LDO)/switched-capacitor voltage regulator (SCVR) with fast droop response (Fig. 8.6.1). While the LDO VR enables high power density and is area efficient, as it can use existing power headers originally employed for bypass/sleep modes, it suffers from efficiency loss at low VOUT. An SCVR, on the other hand, has improved conversion efficiency across a wide VOUT range. In an area-constrained design, however, the limited size of the SCVR's fly capacitors and associated configurable power stages sets an upper bound on the SCVR's maximum power density, restricting its use to lower VOUT. This LDO/SCVR combination delivers the power required by the core at a high VOUT of 0.92V with 84% LDO efficiency, while extending to a low VOUT of 0.38V with 52% SCVR efficiency from a 1.05V VIN. Compared to a shared-rail scheme, the hybrid VR enables 26% to 82% reduction in core energy versus 26% to 67% if solely the LDO is used.
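The LDO's efficiency loss at low VOUT follows from a textbook bound: a linear regulator passes the load current straight through, so its efficiency cannot exceed VOUT/VIN. The Python sketch below applies that bound to the two operating points quoted in the abstract. It ignores controller and quiescent current, and the function name is hypothetical.

```python
def ldo_ideal_efficiency(v_out: float, v_in: float) -> float:
    """Upper bound on linear-regulator efficiency, ignoring quiescent current."""
    if not 0 < v_out <= v_in:
        raise ValueError("expected 0 < v_out <= v_in")
    return v_out / v_in


V_IN = 1.05  # input voltage quoted in the abstract

for v_out in (0.92, 0.38):
    bound = ldo_ideal_efficiency(v_out, V_IN)
    print(f"VOUT = {v_out:.2f} V: ideal LDO efficiency <= {bound:.1%}")
```

At 0.92 V the bound is about 88%, consistent with the reported 84% LDO efficiency. At 0.38 V the bound falls to about 36%, below the 52% reported for the SCVR, which is why the hybrid scheme switches to the SCVR at low VOUT.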
IEEE Journal of Solid-state Circuits | 2016
Stephen T. Kim; Yi-Chun Shih; Kaushik Mazumdar; Rinkle Jain; Joseph F. Ryan; Charles Augustine; Jaydeep P. Kulkarni; Krishnan Ravichandran; James W. Tschanz; Muhammad M. Khellah; Vivek De
A digitally-controlled fully integrated voltage regulator (IVR) enables wide autonomous DVFS in a 22 nm graphics execution core. Part of the original power header is converted into a hybrid power stage to support digital low-dropout (DLDO) and switched-capacitor voltage regulator (SCVR) modes, in addition to the original bypass and sleep modes. Using voltage sensing, a tunable replica circuit, or a core warning signal, the IVR detects and quickly responds to fast voltage droops to support fast dynamic workload changes without performance degradation. In a prototype, a 3D graphics execution core is powered by the proposed hybrid IVR, demonstrating measured 26% and 82% reductions in core energy in the turbo and the near-threshold voltage (NTV) modes, respectively. The total area overhead of the proposed hybrid IVR is 4% of the core, compared to 2% for the original power header. Our digitally assisted control for the droop response shows ~75% core frequency improvement at 0.84 V.
IEEE Journal of Solid-state Circuits | 2015
Rinkle Jain; Stephen T. Kim; Vaibhav Vaidya; Krishnan Ravichandran; James W. Tschanz; Vivek De
Active conduction modulation techniques are demonstrated in a fully integrated multi-ratio switched-capacitor voltage regulator with hysteretic control, implemented in 22nm tri-gate CMOS with high-density MIM capacitor. We present (i) an adaptive switching frequency and switch-size scaling scheme for maximum efficiency tracking across a wide range of voltages and currents, governed by a frequency-based control law that is experimentally validated across multiple dies and temperatures, and (ii) a simple active ripple mitigation technique that modulates the gate drive of select MOSFET switches effectively in all conversion modes. Efficiency boosts of up to 15% are measured under light-load conditions. Load-independent output ripple of <50mV is achieved, enabling reduced interleaving. Testchip implementations and measurements demonstrate ease of integration in SoC designs, power efficiency benefits and EMI/RFI improvements.
Custom Integrated Circuits Conference | 2014
Rinkle Jain; Stephen T. Kim; Vaibhav Vaidya; James W. Tschanz; Krishnan Ravichandran; Vivek De
Switch conductance modulation techniques are demonstrated in a fully integrated multi-ratio switched-capacitor voltage regulator with hysteretic control, in 22 nm tri-gate CMOS with high-density MIM capacitor. We present (i) an adaptive switch-size scaling scheme for maximum efficiency tracking across a wide range of voltages and currents, governed by a frequency-based control law that is experimentally validated across multiple dies and temperatures, and (ii) a simple active ripple mitigation technique that modulates the gate drive of select MOSFET switches effectively in all conversion modes. Efficiency improvements up to 15% are measured under low output voltage and load conditions. Load-independent output ripple of <50 mV is achieved, enabling reduced interleaving. Test chip implementations and measurements demonstrate ease of integration in SoC designs, power efficiency benefits and EMI/RFI improvements.
International Solid-State Circuits Conference | 2016
Minki Cho; Stephen T. Kim; Charles Augustine; Jaydeep P. Kulkarni; Krishnan Ravichandran; James W. Tschanz; Muhammad M. Khellah; Vivek De
A graphics execution core in 22nm combines SRAM array-assist circuits to lower intrinsic VMIN, retention flops to reduce leakage power during stall periods, and a fully integrated hybrid digital LDO/SCVR regulator to provide a cost-effective means to realize autonomous DVFS under a shared-rail scenario [1-2]. In a conventional design, a conservative voltage guard band (VGB) is added to the nominal supply of each die to guarantee its correct operation at target frequency in the presence of worst-case delay degradation induced by inverse-temperature dependency (ITD) [3], device aging, and voltage droop. This VGB is determined from post-silicon characterization as the voltage shift needed by the worst-case die, assuming extreme aging usage conditions, while running a power virus load. In this paper, we present a graphics execution core that uses an in-situ tunable replica circuit (TRC) [4-5] to monitor critical timing margin and trigger adaptive voltage scaling (AVS) as needed to dynamically adjust VCC during run time (Fig. 8.4.1). The TRC monitors slow variations in temperature and aging and provides a time-to-digital converter (TDC) code, representing the timing margin measurement, to the AVS controller. Based on the TDC code, the AVS controller communicates a new voltage ID (VID) to the external voltage regulator module (VRM) to maintain minimum VCC necessary to meet a given performance level.
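The TRC-to-VRM loop described in this abstract can be sketched as a simple step controller in Python. The abstract does not give the control law, so everything here is a hypothetical illustration: the assumption that a larger TDC code means more timing margin, the single-step VID updates, the hysteresis band, and the VID limits.

```python
def next_vid(vid: int, tdc_code: int, target_code: int,
             hysteresis: int = 1, vid_min: int = 0, vid_max: int = 255) -> int:
    """Return the VID to request from the external VRM.

    Assumes a larger TDC code means more timing margin and a larger
    VID means a higher VCC (both are illustrative conventions).
    """
    if tdc_code < target_code:
        vid += 1  # margin too small: raise VCC one step
    elif tdc_code > target_code + hysteresis:
        vid -= 1  # margin larger than needed: lower VCC one step
    # Inside the hysteresis band the VID is held, which avoids dithering.
    return max(vid_min, min(vid_max, vid))
```

Called once per monitoring interval, a loop of this kind tracks slow temperature and aging drift while holding VCC near the minimum that still meets the target margin.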
IEEE Journal of Solid-state Circuits | 2017
Minki Cho; Stephen T. Kim; Charles Augustine; Jaydeep P. Kulkarni; Krishnan Ravichandran; James W. Tschanz; Muhammad M. Khellah; Vivek De
In high-volume manufacturing, the conventional approach to dealing with inverse-temperature dependence (ITD) and aging is to add a flat post-silicon voltage guard band to all dies, based on testing a small random sample of dies. Although this scheme guarantees error-free operation, it significantly degrades energy efficiency, as it penalizes all dies for the maximum delay degradation due to ITD and aging as seen by the worst-case die, while also assuming the maximum aging condition. In this paper, a graphics execution core implemented in a 22 nm tri-gate process uses a per-die tunable replica circuit (TRC) to monitor delay degradation due to ITD and actual aging conditions. The TRC triggers adaptive voltage scaling to dynamically adjust VCC as needed during run time to maintain correct operation at minimum additional voltage. Measured data show up to 33% (14%) energy savings at 0.4 V (0.8 V) compared with the baseline scheme. The TRC is also utilized in a dynamic power gating (DPG) scheme to lower the energy overhead due to the fast droop guard band. DPG introduces a load-line effect during normal operation, thus saving energy, while deactivating this load line upon droop detection by the TRC to maintain iso-performance with the baseline. Silicon data show that DPG can improve energy efficiency by 14.5% (7%) at 0.8 V (0.6 V).
Symposium on VLSI Circuits | 2016
Minki Cho; Stephen T. Kim; James W. Tschanz; Muhammad M. Khellah; Vivek De
Combining adaptive clocking with dynamic power gating in an optimal manner mitigates energy efficiency and performance impacts of fast supply voltage droop in a 22nm graphics execution core more effectively than adaptive clocking alone. Measurements show that there is an optimal VMIN where the combination provides the best improvement - 14% lower energy at 890MHz vs. 4% with adaptive clocking.
Custom Integrated Circuits Conference | 2015
Pavan Kumar; Vaibhav Vaidya; Harish K. Krishnamurthy; Stephen T. Kim; George E. Matthew; Sheldon Weng; Bharani Thiruvengadam; Wayne Proefrock; Krishnan Ravichandran; Vivek De
Monolithic integration of Voltage Regulators (VR) is challenging given the inherent lack of scalability of inductors. Circuit techniques to reduce inductor size are attractive to increase power density and scalability. This paper presents a 70-72% efficient, 500MHz digitally controlled 3-level Buck VR with a fully on-die spiral inductor implemented on 22nm Tri-Gate CMOS with MIM capacitors. The advantages of the 3-level converter for wide-range Dynamic Voltage & Frequency Scaling (DVFS) over traditional solutions like linear regulators & Buck VRs are demonstrated.