Mohit Kapur
IBM
Publication
Featured research published by Mohit Kapur.
field programmable gate arrays | 2012
Sameh W. Asaad; Ralph Bellofatto; Bernard Brezzo; Chuck Haymes; Mohit Kapur; Benjamin D. Parker; Thomas Roewer; Proshanta Saha; Todd E. Takken; Jose A. Tierno
Software-based simulation tools are not keeping up with the demands of increasing chip and system design complexity. In this paper, we describe a cycle-accurate and cycle-reproducible large-scale FPGA platform that is designed from the ground up to accelerate logic verification of the Bluegene/Q compute node ASIC, a multi-processor SOC implemented in IBM's 45 nm SOI CMOS technology. This paper discusses the challenges of constructing such large-scale FPGA platforms, including design partitioning, clocking and synchronization, and debugging support, as well as our approach for addressing these challenges without sacrificing cycle accuracy and cycle reproducibility. The resulting full-chip simulation of the Bluegene/Q compute node ASIC runs at a simulated processor clock speed of 4 MHz, over 100,000 times faster than the logic-level software simulation of the same design. The vast increase in simulation speed provides a new capability in the design cycle that proved to be instrumental in logic verification as well as early software development and performance validation for Bluegene/Q.
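As a rough illustration of what the reported speed-up implies, the sketch below turns the 4 MHz simulated clock and the 100,000x factor from the abstract into an upper bound on the software simulation rate, then compares wall-clock times. The function names and the 10-billion-cycle workload are hypothetical, chosen only for the arithmetic; the abstract does not state a workload size.

```python
# Back-of-the-envelope arithmetic for the speed-up quoted in the abstract.
# FPGA_CLOCK_HZ and SPEEDUP come from the abstract; everything else is assumed.

FPGA_CLOCK_HZ = 4e6   # simulated processor clock on the FPGA platform
SPEEDUP = 1e5         # "over 100,000 times faster" (treated as a lower bound)


def software_rate_hz(fpga_rate_hz, speedup):
    """Upper bound on the software simulation rate implied by the speed-up."""
    return fpga_rate_hz / speedup


def wall_clock_seconds(cycles, rate_hz):
    """Time needed to simulate `cycles` processor cycles at `rate_hz`."""
    return cycles / rate_hz


# Hypothetical workload size, not taken from the paper.
ASSUMED_CYCLES = 1e10

sw_rate = software_rate_hz(FPGA_CLOCK_HZ, SPEEDUP)
fpga_time = wall_clock_seconds(ASSUMED_CYCLES, FPGA_CLOCK_HZ)
sw_time = wall_clock_seconds(ASSUMED_CYCLES, sw_rate)

print(f"software rate: at most {sw_rate:.0f} simulated cycles per second")
print(f"FPGA platform: {fpga_time / 60:.1f} minutes")
print(f"software:      at least {sw_time / (365 * 24 * 3600):.1f} years")
```

Under these assumptions the software simulator advances at no more than 40 simulated cycles per second, so a workload that takes under an hour on the FPGA platform would take years in software.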
IEEE Journal of Solid-state Circuits | 2005
Sergey V. Rylov; Scott K. Reynolds; Daniel W. Storaska; Brian A. Floyd; Mohit Kapur; Thomas Zwick; Sudhir Gowda; Michael A. Sorna
We report a 10+ Gb/s serial link demo chip with NRZ signaling in 90-nm CMOS. It consists of a full-rate 4:1 MUX with an 8-tap feed-forward equalizer, a half-rate 1:4 DEMUX with a programmable peaking pre-amplifier, and a parallel port interface. All coefficients of the 8-tap FIR filter have programmable polarity and magnitude. The chip is housed in a CBGA package and has ESD protection devices on all pins. All clock signals are supplied externally. The measured maximum speeds of the stand-alone transmitter and receiver are 11.7 Gb/s and 13.3 Gb/s, respectively, and the maximum back-to-back operation speed (transmitter + receiver) is 11.4 Gb/s. The chip operates at 10 Gb/s over 20 ft of lossy cable with 20 dB attenuation at 5 GHz. All circuits in the chip use a single 1.0 V power supply, except the TX output driver and RX input termination network, which use a 1.4 V supply. Total power consumption of TX and RX from the two supplies is 280 mW.
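The feed-forward equalizer described above is an FIR filter applied to the transmitted symbol stream. The following is a minimal behavioral sketch of an 8-tap FIR with per-tap sign and magnitude, not a model of the chip's circuit. The names `make_taps`, `nrz`, and `ffe`, and all coefficient values, are hypothetical; the abstract does not give the tap settings.

```python
# Behavioral sketch of an 8-tap feed-forward equalizer (FIR filter).
# Tap values below are illustrative assumptions, not the chip's settings.

def make_taps(polarities, magnitudes):
    """Combine per-tap polarity (+1 or -1) and magnitude into signed coefficients."""
    if len(polarities) != len(magnitudes):
        raise ValueError("need one polarity per magnitude")
    return [p * m for p, m in zip(polarities, magnitudes)]


def nrz(bits):
    """Map bits {0, 1} to NRZ levels {-1.0, +1.0}."""
    return [1.0 if b else -1.0 for b in bits]


def ffe(symbols, taps):
    """y[n] = sum over k of taps[k] * symbols[n - k]; samples before n = 0 are zero."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, coeff in enumerate(taps):
            if n - k >= 0:
                acc += coeff * symbols[n - k]
        out.append(acc)
    return out


# Hypothetical setting: main cursor plus one negative post-cursor tap,
# which boosts transitions relative to repeated bits.
taps = make_taps(
    polarities=[+1, -1, +1, +1, +1, +1, +1, +1],
    magnitudes=[1.0, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
)
equalized = ffe(nrz([1, 1, 0, 0, 1]), taps)
print(equalized)
```

With a negative post-cursor tap, the output amplitude is larger immediately after a bit transition than during a run of identical bits, which is the usual way an FFE compensates for high-frequency loss in a cable.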
custom integrated circuits conference | 2004
Sergey V. Rylov; Scott K. Reynolds; Daniel W. Storaska; Brian A. Floyd; Mohit Kapur; Thomas Zwick; Sudhir Gowda; Michael A. Sorna
We report a 10+ Gb/s serial link demo chip in 90-nm CMOS. It consists of a full-rate 4:1 MUX with an 8-tap feed-forward equalizer, a half-rate 1:4 DEMUX with a programmable peaking pre-amplifier, and a parallel port interface. The chip is housed in a CBGA package and uses ESD devices on all pins. The measured maximum speeds of the stand-alone transmitter and receiver were 11.7 Gb/s and 13.3 Gb/s, respectively, and the maximum back-to-back operation speed (transmitter + receiver) was 11.4 Gb/s.
custom integrated circuits conference | 2003
Seongwon Kim; Mohit Kapur; Mounir Meghelli; Alexander V. Rylyakov; Young H. Kwark; Daniel J. Friedman
A 2^7-1 pseudorandom bit sequence (PRBS) generator chip set that operates up to 45 Gb/s was fabricated and tested. The circuits are implemented using bipolar transistors in a SiGe BiCMOS technology and operate from a single -3.3 V power supply. The PRBS generator, which consumes 1.32 W, uses a high-speed 4:1 multiplexer to produce the final output from four quarter-rate streams. The automatic synchronizing PRBS checker consumes 1.2 W and uses a half-rate architecture, demultiplexing the full-rate data stream to lower-rate streams that are checked in parallel.
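A 2^7-1 PRBS can be sketched in software as a 7-bit linear feedback shift register. The example below assumes the commonly used polynomial x^7 + x^6 + 1, which the abstract does not specify, and models the 4:1 multiplexing step as simple bit interleaving of four quarter-rate streams. The function names `prbs7`, `demux4`, and `mux4` are hypothetical, and the code says nothing about how the chip generates its quarter-rate streams in hardware.

```python
# Software sketch of a 2^7-1 PRBS and a 4:1 interleaving multiplexer.
# Assumption: feedback polynomial x^7 + x^6 + 1 (not stated in the abstract).

def prbs7(nbits, seed=0x7F):
    """Return `nbits` bits from a 7-bit Fibonacci LFSR with taps at bits 7 and 6."""
    if not 0 < seed < 128:
        raise ValueError("seed must be a nonzero 7-bit value")
    state = seed
    out = []
    for _ in range(nbits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F  # shift in the feedback bit
    return out


def demux4(bits):
    """Split a full-rate stream into four quarter-rate streams."""
    return [bits[i::4] for i in range(4)]


def mux4(streams):
    """Interleave four equal-length quarter-rate streams into one full-rate stream."""
    out = []
    for group in zip(*streams):
        out.extend(group)
    return out


full_rate = prbs7(4 * 127)        # four periods, so the length divides by 4
quarter_rate = demux4(full_rate)  # each stream runs at one quarter of the rate
assert mux4(quarter_rate) == full_rate
print("period repeats:", full_rate[:127] == full_rate[127:254])
```

A checker can exploit the same recurrence: once seven received bits are loaded into the register, every later bit is predictable, so any mismatch counts as an error.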
field programmable gate arrays | 2012
Proshanta Saha; Chuck Haymes; Ralph Bellofatto; Bernard Brezzo; Mohit Kapur; Sameh W. Asaad
FPGAs have become indispensable in processor design, bring-up and debug. Traditionally, FPGAs have been used in prototyping, allowing end-users to emulate functionality of a specific component of a processor. However, as the complexity of processors grows, another aspect of processor design, RTL verification, has become a prime target for acceleration using FPGAs. Software-only RTL simulation and verification tools are no longer sufficient for many verification tasks as they often incur long execution time penalties. Software simulation time for a basic Linux kernel bring-up on a BlueGene/Q [1] processor, with 16 user PowerPC A2 cores, for example, could easily exceed several years. An important feature of RTL verification acceleration using FPGAs is its fast debugging capabilities. The ability to quickly and accurately pinpoint the location of an anomaly in an RTL source is highly desirable. This paper proposes efficient in-system debugging techniques on FPGAs for RTL verification. We show how a network of over 45 Virtex 5 LX330 FPGAs can be efficiently used to read out state information of the BlueGene/Q processor. We also demonstrate how the new in-system debugging technique is 250x faster than comparable methods.
Archive | 2003
Seongwon Kim; Mohit Kapur; Mounir Meghelli; Alexander V. Rylyakov; Young H. Kwark; Daniel J. Friedman
Archive | 2004
Mohit Kapur
Archive | 2012
Sameh W. Asaad; Mohit Kapur; Benjamin D. Parker
Archive | 2008
Mohit Kapur; Seongwon Kim
Archive | 2005
Mohit Kapur; Jose A. Tierno