Publication


Featured research published by Russell P. Kraft.


IEEE Design & Test of Computers | 2005

Predicting the performance of a 3D processor-memory chip stack

Philip Jacob; Okan Erdogan; Aamir Zia; Paul M. Belemjian; Russell P. Kraft; John F. McDonald

We are exploring a 3D processor-memory stack for use with the message passing interface (MPI). Communication among processors in huge servers wastes several thousand cycles. Most of these wasted cycles come not from the communication links among the processors across the system, but from handling the message packets. A processor that could handle this message packing and communication at a much faster rate could significantly increase this task's efficiency and thus increase the utilization of such supercomputers, currently a very low 1%. However, at such high clock rates, the memory wall would become a significant problem. Tackling this problem requires innovative technologies, such as 3D memories, which alleviate some problems with long on-chip interconnects. Interconnection wires are important to circuit performance on a chip. The need for shorter interconnection delays suggests shorter interconnection wires. Shorter interconnections are more likely in 3D architectures than in equivalent 2D systems. This article explores the advantages of 3D in a processor-memory stack system. We conducted simulations using simple tools like Dinero IV and the cache access and cycle time information (Cacti) to evaluate the performance of various memory architectures.


Proceedings of the IEEE | 2009

Mitigating Memory Wall Effects in High-Clock-Rate and Multicore CMOS 3-D Processor Memory Stacks

Philip Jacob; Aamir Zia; Okan Erdogan; Paul M. Belemjian; Jin Woo Kim; Michael Chu; Russell P. Kraft; John F. McDonald; Kerry Bernstein

Three-dimensional (3-D) chip stacking technology provides a new approach to address the so-called memory wall problem. Memory-processor chip stacking reduces this memory wall problem, permitting faster clock rates (with suitable processor logic) or permitting multicore access to shared memory using a large number of vertical vias between tiers in the stack, for ultrawide bit path transfer of data and address information to and from various levels of cache. Although a limited amount of parallel access is possible using conventional two-dimensional (2-D) chip memory-processor approaches, 3-D memory-processor stacking greatly extends this to much larger capacity memories. We evaluate high-clock-rate processors as well as shared memory processors with a large number of cores. Various architectural design options to reduce the impact of the memory wall on the processor performance are explored and validated through simulations. Certain architectural features can be implemented in a 3-D chip, such as an ultrawide, ultrashort vertical bus with low parasitic resistance and the elimination of conventional electrostatic discharge and packaging parasitics required in multiple-package 2-D solutions. The objective is to reduce the clocks per instruction figure of merit for high clock speeds in order to deliver significant performance levels. High-clock-rate processors can be designed with SiGe heterostructure bipolar transistors to obtain processors operating on the order of 16 or 32 GHz.
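The memory wall effect on the clocks per instruction (CPI) figure of merit can be illustrated with a minimal sketch. It uses the standard textbook stall model (effective CPI = base CPI + misses per instruction × miss penalty in cycles), not the simulation method of the paper. The function names and every numeric value below (base CPI, miss rate, memory latencies) are hypothetical, chosen only to show the trend.

```python
def effective_cpi(base_cpi, misses_per_instr, mem_latency_ns, clock_ghz):
    """CPI including memory stall cycles (textbook model, not the paper's simulator).

    A fixed memory latency in nanoseconds costs more cycles as the clock rises,
    which is the memory wall in its simplest form.
    """
    miss_penalty_cycles = mem_latency_ns * clock_ghz  # ns * (cycles per ns)
    return base_cpi + misses_per_instr * miss_penalty_cycles


def instructions_per_ns(base_cpi, misses_per_instr, mem_latency_ns, clock_ghz):
    """Delivered throughput: clock rate divided by effective CPI."""
    return clock_ghz / effective_cpi(base_cpi, misses_per_instr,
                                     mem_latency_ns, clock_ghz)


if __name__ == "__main__":
    BASE_CPI = 1.0          # hypothetical
    MISSES_PER_INSTR = 0.01  # hypothetical
    # Hypothetical latencies: a slow off-chip path versus a shorter stacked path.
    for latency_ns in (50.0, 5.0):
        for clock_ghz in (2.0, 16.0, 32.0):
            cpi = effective_cpi(BASE_CPI, MISSES_PER_INSTR, latency_ns, clock_ghz)
            ipns = instructions_per_ns(BASE_CPI, MISSES_PER_INSTR,
                                       latency_ns, clock_ghz)
            print(f"latency={latency_ns} ns clock={clock_ghz} GHz "
                  f"CPI={cpi:.2f} instr/ns={ipns:.2f}")
```

Under these assumed numbers, raising the clock alone yields diminishing returns because the stall term grows with frequency, while a lower memory latency lets the higher clock translate into throughput.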


IEEE Journal of Solid-State Circuits | 2010

A 40 Gs/s Time Interleaved ADC Using SiGe BiCMOS Technology

Michael Chu; Philip Jacob; Jin Woo Kim; Mitchell R. LeRoy; Russell P. Kraft; John F. McDonald

The search for high-speed, high-bandwidth A/D converters is ongoing, and techniques to push the envelope are constantly being developed. In this paper an open-loop, scalable, time-interleaved ADC architecture is presented, as well as a 60 GHz Colpitts oscillator. With the use of double-sampling, the timing skew requirements between channels are greatly relaxed, allowing sampling rates of up to 40 Gs/s at 4 bits of accuracy. This circuit is implemented using the IBM 8HP SiGe technology, with fT of 210 GHz. The performance of the 8HP ADC is validated by measurement. In addition, simulations with an experimental 8XP transistor model provided by IBM with a 350 GHz fT suggest that 30% more circuit speed is possible by just swapping the transistors.
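The idea of time interleaving can be sketched in a few lines: several slower sub-ADCs sample the same input at staggered instants, and their outputs are merged in round-robin order to reach the aggregate rate. This is an idealized illustration only. It does not model the double-sampling scheme, timing skew, or the circuit described in the paper, and the channel count, test signal, and function names are assumptions.

```python
import math


def quantize(x, bits, full_scale=1.0):
    """Ideal uniform quantizer: map x in [-full_scale, full_scale) to an integer code."""
    levels = 2 ** bits
    step = 2.0 * full_scale / levels
    code = int(math.floor((x + full_scale) / step))
    return max(0, min(levels - 1, code))  # clip out-of-range inputs


def time_interleaved_sample(signal, aggregate_rate_hz, n_channels, n_samples, bits=4):
    """Sample `signal` with n_channels ideal sub-ADCs and interleave their outputs.

    Channel ch takes samples ch, ch + n_channels, ch + 2*n_channels, ...,
    so each channel runs at aggregate_rate_hz / n_channels.
    """
    channels = []
    for ch in range(n_channels):
        indices = range(ch, n_samples, n_channels)
        channels.append([quantize(signal(i / aggregate_rate_hz), bits)
                         for i in indices])
    # Round-robin merge back into a single stream at the aggregate rate.
    return [channels[i % n_channels][i // n_channels] for i in range(n_samples)]


def ideal_sqnr_db(bits):
    """Ideal quantization-limited SNR for a full-scale sine: 6.02*N + 1.76 dB."""
    return 6.02 * bits + 1.76


if __name__ == "__main__":
    rate = 40e9                                            # aggregate rate from the abstract
    tone = lambda t: 0.9 * math.sin(2 * math.pi * 1e9 * t)  # hypothetical 1 GHz test tone
    codes = time_interleaved_sample(tone, rate, n_channels=4, n_samples=16)  # 4 channels assumed
    print(codes)
    print(f"ideal 4-bit SQNR: {ideal_sqnr_db(4):.2f} dB")
```

With perfectly aligned channels the interleaved stream equals what a single ideal converter at the full rate would produce; in a real design, skew between channels breaks that equivalence, which is why relaxing the skew requirement matters.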


International Interconnect Technology Conference | 2001

Stacked chip-to-chip interconnections using wafer bonding technology with dielectric bonding glues

Jian-Qiang Lu; Y. Kwon; Russell P. Kraft; R.J. Gutmann; John F. McDonald; T.S. Cale

Three-dimensional (3D) interconnects offer the potential of reducing fabrication and performance limitations of future generations of planar ICs. This paper describes a specific approach, incorporating wafer alignment and wafer bonding of two 200-mm silicon wafers, along with subsequent processing steps. Our approach using dielectrics as the bonding glue layer provides a monolithic 3D interconnect process, which is fully compatible with back-end-of-the-line processing. This 3D technology enables heterogeneous systems, such as future electronic and photonic systems using a mix-and-match hard IP core design approach, and provides a high-density pin-out alternative to stacked chip-scale packages today.


IEEE Transactions on Components, Packaging, and Manufacturing Technology: Part C | 1998

Improvements to X-ray laminography for automated inspection of solder joints

Vijay Sankaran; Andrew R. Kalukin; Russell P. Kraft

With the increased usage of fine-pitch assemblies and ball grid array (BGA) packages, there is a dramatic increase in demand for automated defect detection techniques such as X-ray laminography. However, the limitations of this imaging medium are not well understood by the industry. This article addresses the need for improving the imaging resolution of X-ray laminography, particularly for accurate three-dimensional (3-D) measurement of solder joint structures. The authors have developed a new method for reconstruction of the laminographs which improves the signal-to-noise ratio (SNR) of the laminographs significantly and enables better 3-D visualization of solder shape. Application of automated solder joint defect classification using neural networks has also been studied. Components with BGA, gull-wing and J-lead joints were imaged and several neural network methods were used to identify different classes of defects particularly significant to each type of joint. A novel probabilistic neural network approach for two-dimensional (2-D) image classification has been developed which performs as well as or better than a conventional backpropagation network.
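For readers unfamiliar with probabilistic neural networks, the sketch below shows the generic textbook form (a Parzen-window classifier with Gaussian kernels). It is not the novel approach developed by the authors, whose details are not given in the abstract. The feature vectors, class labels, and smoothing width `sigma` are hypothetical.

```python
import math


def pnn_classify(x, training, sigma=0.5):
    """Classify feature vector x with a textbook probabilistic neural network.

    training: dict mapping class label -> list of feature vectors.
    Each class score is the average Gaussian kernel response over that
    class's training samples; the label with the highest score wins.
    """
    scores = {}
    for label, samples in training.items():
        total = 0.0
        for s in samples:
            dist_sq = sum((a - b) ** 2 for a, b in zip(x, s))
            total += math.exp(-dist_sq / (2.0 * sigma ** 2))
        scores[label] = total / len(samples)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical 2-D features extracted from solder joint images.
    training = {
        "good": [[0.0, 0.0], [0.1, 0.1], [-0.1, 0.05]],
        "defect": [[1.0, 1.0], [0.9, 1.1], [1.1, 0.95]],
    }
    label, scores = pnn_classify([0.05, 0.0], training)
    print(label, scores)
```

Unlike a backpropagation network, this classifier needs no iterative training: the stored samples are the model, and `sigma` is the only parameter to tune.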


Great Lakes Symposium on VLSI | 2003

3D direct vertical interconnect microprocessors test vehicle

John Mayega; Okan Erdogan; Paul M. Belemjian; Kuan Zhou; John F. McDonald; Russell P. Kraft

The current trends in high-performance integrated circuits are towards faster and more powerful circuits in the gigahertz range and beyond. As more complex integrated circuits (ICs) such as microprocessors have entered the gigahertz operating frequency range, various speed-related roadblocks have become increasingly difficult to overcome. The migration to smaller devices has raised serious challenges. The major impediment to fulfilling Moore's Law effectively in the years to come is increasingly the interconnect. ICs are using a greater fraction of their clock cycles charging interconnect wires. Interconnect-related speed degradation has stimulated much research effort in the area of low dielectric constant materials. A relatively novel approach, wafer-scale 3-dimensional (3D) integration, attempts to bypass the large wire parasitics by shortening wires. This paper elaborates on a 3D microprocessor test vehicle. We intend to demonstrate the speed advantages that may be derived from 3D integration through a combination of fabrication, testing and simulation.
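Why shortening wires helps can be shown with a rough sketch based on the common distributed-RC approximation, in which the 50% delay of an unbuffered wire is about 0.38 times the product of its total resistance and total capacitance. Since both scale with length, the delay scales with length squared. This is a generic model, not the analysis from the paper, and the per-millimetre resistance and capacitance values are hypothetical.

```python
def distributed_rc_delay(r_per_mm, c_per_mm, length_mm):
    """Approximate 50% delay (seconds) of an unbuffered distributed RC wire.

    Uses the common approximation t ~= 0.38 * R_total * C_total, so the
    delay grows with the square of the wire length.
    """
    total_r = r_per_mm * length_mm  # ohms
    total_c = c_per_mm * length_mm  # farads
    return 0.38 * total_r * total_c


if __name__ == "__main__":
    R_PER_MM = 100.0     # ohm/mm, hypothetical
    C_PER_MM = 0.2e-12   # F/mm, hypothetical
    long_wire = distributed_rc_delay(R_PER_MM, C_PER_MM, 10.0)
    short_wire = distributed_rc_delay(R_PER_MM, C_PER_MM, 5.0)
    print(f"10 mm wire: {long_wire * 1e12:.1f} ps")
    print(f" 5 mm wire: {short_wire * 1e12:.1f} ps")
    print(f"ratio: {short_wire / long_wire:.2f}")
```

Under this model, halving a wire's length cuts its delay to roughly a quarter, which is the motivation for folding a planar layout into stacked tiers.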


International Interconnect Technology Conference | 2002

A wafer-scale 3D IC technology platform using dielectric bonding glues and copper damascene patterned inter-wafer interconnects

Jian-Qiang Lu; Y. Kwon; G. Rajagopalan; M. Gupta; J.J. McMahon; K.-W. Lee; Russell P. Kraft; John F. McDonald; T.S. Cale; R.J. Gutmann; B. Xu; E. Eisenbraun; J. Castracane; A. Kaloyeros

A viable approach for a monolithic wafer-scale three-dimensional (3D) IC technology platform is presented, focusing on wafer bonding, wafer thinning and inter-wafer damascene-patterned interconnects. Principal results include successful wafer alignment, wafer bonding with both BCB and Flare, post-bonding wafer thinning using grinding and polishing to 35-50 μm, and via etch through the required material stack.


IEEE Transactions on Applied Superconductivity | 2003

Integration of cryocooled superconducting analog-to-digital converter and SiGe output amplifier

Deepnarayan Gupta; Alan M. Kadin; Robert J. Webber; Irwin Rochwarger; Daniel Bryce; William J. Hollander; Young Uk Yim; Channakeshav; Russell P. Kraft; Jin Woo Kim; John F. McDonald

HYPRES is developing a prototype digital system comprising a Nb RSFQ analog-to-digital converter (ADC) and SiGe amplifiers on a commercial two-stage cryocooler. This involves the detailed thermal, electrical, and mechanical design of the ADC chip mount, input/output (I/O) cables, and electromagnetic shielding. Our objective is to minimize the heat load on the second (4 K) stage of the cryocooler, in order to ensure stable ADC operation. The design incorporates thermal radiation shields and magnetic shielding for the RSFQ circuit. For the I/O cables, the thermal design must be balanced against the acceptable attenuation of RF lines and resistance of DC bias lines. SiGe heterojunction bipolar transistor (HBT) signal conditioning circuits, placed on the first (60 K) stage of the cryocooler, will amplify the mV-level ADC outputs to V-level (e.g., ECL) outputs for seamless transition to room-temperature electronics. Cooling these HBT circuits lowers noise and improves their high-frequency performance. Demonstration of this prototype should lead the way to commercialization of high-speed digital superconducting systems, for such applications as wireless communication, radars, and switching networks.


IEEE Transactions on Components, Packaging, and Manufacturing Technology: Part C | 1998

Designed experiments to investigate the solder joint quality output of a prototype automated surface mount replacement system

Ismail Fidan; Russell P. Kraft; Lawrence E. Ruff; Stephen Derby

A robotic remanufacturing system has been developed at Rensselaer to replace fine-pitch surface-mounted components on a populated printed circuit board (PCB). The performance goal is to maximize the quality of the solder joint output to optimize the system's throughput. The purpose of this paper is to present the process parameters determined for obtaining a good solder joint with the rework cell developed here and to show an analysis of the solder joint quality obtained from the developed system. Maximizing the rate of correctly reworked components is not chosen as a goal for measuring the system's performance, because rework is a low-volume process and the cell being used is an experimental prototype system.


IEEE Transactions on Very Large Scale Integration Systems | 2010

A 3-D Cache With Ultra-Wide Data Bus for 3-D Processor-Memory Integration

Aamir Zia; Philip Jacob; Jin Woo Kim; Michael Chu; Russell P. Kraft; John F. McDonald

Slow cache memory systems and low memory bandwidth present a major bottleneck in the performance of modern microprocessors. 3-D integration of processor and memory subsystems provides a means to realize a wide data bus that could provide a high-bandwidth, low-latency on-chip cache. This paper presents a three-tier, 3-D 192-kB cache for a 3-D processor-memory stack. The chip is designed and fabricated in a 0.18 μm fully depleted SOI CMOS process. An ultra-wide data bus for connecting the 3-D cache with the microprocessor is implemented using dense vertical vias between the stacked wafers. The fabricated cache operates at 500 MHz and achieves up to 96 GB/s aggregate bandwidth at the output.
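The two figures quoted in the abstract imply how much data must move per clock cycle. The short calculation below works this out under two assumptions that the abstract does not state: that 1 GB means 10^9 bytes, and that data is transferred once per clock cycle. The actual bus organization (for example, how the aggregate is split across ports or tiers) is not given in the abstract.

```python
def bytes_per_cycle(bandwidth_bytes_per_s, clock_hz):
    """Data moved per clock cycle, assuming one transfer per cycle."""
    return bandwidth_bytes_per_s / clock_hz


if __name__ == "__main__":
    BANDWIDTH = 96e9   # 96 GB/s from the abstract, assuming 1 GB = 10**9 bytes
    CLOCK = 500e6      # 500 MHz from the abstract
    width_bytes = bytes_per_cycle(BANDWIDTH, CLOCK)
    width_bits = width_bytes * 8
    print(width_bytes)  # 192.0 bytes per cycle
    print(width_bits)   # 1536.0 bits per cycle
```

A data path on the order of 1536 bits per cycle is far wider than a conventional packaged interface could supply, which is consistent with the abstract's emphasis on dense vertical vias.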

Collaboration

Top Co-Authors

John F. McDonald, Rensselaer Polytechnic Institute
Michael Chu, Rensselaer Polytechnic Institute
Chao You, North Dakota State University
Jong-Ru Guo, Rensselaer Polytechnic Institute
Kuan Zhou, Rensselaer Polytechnic Institute
Bryan S. Goda, United States Military Academy
Okan Erdogan, Rensselaer Polytechnic Institute
Aamir Zia, Rensselaer Polytechnic Institute
Jin Woo Kim, Rensselaer Polytechnic Institute
Philip Jacob, Rensselaer Polytechnic Institute