Charles Luther Johnson

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Charles Luther Johnson is active.

Explore More

Publication

Featured researches published by Charles Luther Johnson.

custom integrated circuits conference | 1989

VLSI performance compensation for off-chip drivers and clock generation

Dennis Thomas Cox; David Leroy Guertin; Charles Luther Johnson; Bruce George Rudolph; Robert Russell Williams; Ronald A. Piro; Douglas W. Stout

A major problem in VLSI system design is controlling off-chip driver characteristics and skew in clock generation as process parameters, temperature, and supply voltage vary. A control circuit methodology has been developed that senses the relative performance of a CMOS chip and transmits a digitally encoded state to off-chip driver and clock generation circuits to control their operating characteristics

international symposium on microarchitecture | 2011

IBM Power Edge of Network Processor: A Wire-Speed System on a Chip

Jeffrey Douglas Brown; Sandra S. Woodward; Brian Mitchell Bass; Charles Luther Johnson

The IBM Power Edge of Network processor combines the attributes of a general-purpose processing subsystem with function accelerators and networking interfaces to create a system on a chip thats targeted for applications at the edge of network. This article discusses in detail the processing, accelerator, and network interface subsystems and explores applications well suited to the PowerEN processor.

international symposium on low power electronics and design | 2011

Analysis and mitigation of lateral thermal blockage effect of through-silicon-via in 3D IC designs

Yibo Chen; Eren Kursun; Dave Motschman; Charles Luther Johnson; Yuan Xie

The three-dimensional integrated circuits (3D ICs) offer performance advantages thanks to the increased bandwidth and reduced wire-length enabled by through-silicon-via structures (TSVs). Traditionally TSVs have been considered to improve the thermal conductivity in the vertical direction. However, the lateral thermal blockage effect becomes increasingly important for TSV via farms (a cluster of TSV vias used for signal bus connections between layers). TSV farms can cause different thermal effects on different layers due to the unequal x, y, z thermal conductivities. This can exhibit itself as thermal improvement in the vertical heat flow, at the same time lateral heat blockage effects in thinned pass-through layers. In this paper, we propose a thermal-aware via farm placement technique for 3D ICs to minimize lateral heat blockages caused by dense signal bus TSV structures. By incorporating thermal conductivity profile of via farm blocks in the design flow and enabling placement/aspect ratio optimization, the corresponding hotspots can be minimized within the wire-length and area constraints.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2013

Through Silicon Via Aware Design Planning for Thermally Efficient 3-D Integrated Circuits

Yibo Chen; Eren Kursun; Dave Motschman; Charles Luther Johnson; Yuan Xie

3-D integrated circuits (3-D ICs) offer performance advantages due to their increased bandwidth and reduced wire-length enabled by through-silicon-via structures (TSVs). Traditionally TSVs have been considered to improve the thermal conductivity in the vertical direction. However, the lateral thermal blockage effect becomes increasingly important for TSV via farms (a cluster of TSV vias used for signal bus connections between layers) because the TSV size and pitch continue to scale in μm range and the metal to insulator ratio becomes smaller. Consequently, dense TSV farms can create lateral thermal blockages in thinned silicon substrate and exacerbate the local hotspots. In this paper, we propose a thermal-aware via farm placement technique for 3-D ICs to minimize lateral heat blockages caused by dense signal bus TSV structures. By incorporating thermal conductivity profile of via farm blocks in the design flow and enabling placement/aspect ratio optimization, the corresponding hotspots can be minimized within the wire-length and area constraints.

design, automation, and test in europe | 2011

Early chip planning cockpit

Jeonghee Shin; John A. Darringer; Guojie Luo; Alan J. Weger; Charles Luther Johnson

The design of high-performance servers has always been a challenging art. Now, server designers are being asked to explore a much larger design space as they consider multicore heterogeneous architecture and the limits of advancing silicon technology. Bringing automation to the early stages of design can enable more rapid and accurate trade-off analysis. In this paper, we introduce an Early Chip Planner which allows designers to rapidly analyze microarchitecture, physical and package design trade-offs for 2D and 3D VLSI chips and generates an attributed netlist to be carried on to the implementation stage. We also describe its use in planning a 3D special-purpose server processor.

high performance computer architecture | 2012

Architectural perspectives of future wireless base stations based on the IBM PowerEN™ processor

Augusto Vega; Pradip Bose; Alper Buyuktosunoglu; Jeff H. Derby; Michele M. Franceschini; Charles Luther Johnson; Robert K. Montoye

In wireless networks, base stations are responsible for operating on large amounts of traffic at high speed rates. With the advent of new standards, as 4G, further pressure is put in the hardware requirements to satisfy speeds of up to 1 Gbps. In this work, we study the applicability and potential benefits of the IBM PowerEN processor (a multi-core, massively multithreaded platform) in the realm of base stations for the 3G and 4G standards. The approach involves exploiting the throughput computation capabilities of the PowerEN processor, replacing the bus-attached special-function accelerators with a layer of in-line universal acceleration support, incorporated within the cores. A key feature of this in-line accelerator is a bank-based very-large register file, with embedded SIMD support. This processor-in-regfile (PIR) strategy is implemented as local computation elements (LCEs) attached to each bank, overcoming the limited number of register file ports. Because each LCE is a SIMD computation element, and all of them can proceed concurrently, the PIR approach constitutes a highly-parallel super-wide-SIMD device. To target a broad spectrum of applications for base stations, we also consider a PIR-based architecture built upon reconfigurable LCEs. In this paper, we evaluate the in-line universal accelerator and the PIR strategy focusing on two specific applications for base stations: FFT and Turbo Decoding.

international conference on acoustics, speech, and signal processing | 2013

Processor architecture for software implementation of multi-sector G-RAKE receivers for HSUPA wireless infrastructure

Dheeraj Sreedhar; Jeff H. Derby; Augusto Vega; B. Rogers; Charles Luther Johnson; Robert K. Montoye

The high speed uplink packet access (HSUPA) wireless standard requires extremely high-performance signal processing in the baseband receiver, the most challenging being the chip rate rake receiver. In this paper we describe the architectural enhancements on the IBMs PowerEN processor, to enable it to support the computational requirements of the rake receiver in a fully programmable and scalable fashion. A key feature of these enhancements is a bank-based very-large register file, with embedded single instruction multiple data (SIMD) support. This processor-in-regfile (PIR) strategy is implemented as local computation elements (LCEs) attached to each bank. This overcomes the limitation on the number of register file ports and at the same time enables high degree of parallelism. We show that these enhancements enable the integration of multi-sector HSUPA G-RAKE receivers on a single processor.

ieee international conference on high performance computing, data, and analytics | 2014

Matrix-matrix multiplication on a large register file architecture with indirection

Dheeraj Sreedhar; Jeff H. Derby; Robert K. Montoye; Charles Luther Johnson

Dense matrix-matrix multiply is an important kernel in many high performance computing applications including the emerging deep neural network based cognitive computing applications. Graphical processing units (GPU) have been very successful in handling dense matrix-matrix multiply in a variety of applications. However, recent research has shown that GPUs are very inefficient in using the available compute resources on the silicon for matrix multiply in terms of utilization of peak floating point operations per second (FLOPS). In this paper, we show that an architecture with a large register file supported by “indirection ” can utilize the floating point computing resources on the processor much more efficiently. A key feature of our proposed in-line accelerator is a bank-based very-large register file, with embedded SIMD support. This processor-in-regfile (PIR) strategy is implemented as local computation elements (LCEs) attached to each bank, overcoming the limited number of register file ports. Because each LCE is a SIMD computation element, and all of them can proceed concurrently, the PIR approach constitutes a highly-parallel super-wide-SIMD device. We show that we can achieve more than 25% better performance than the best known results for matrix multiply using GPUs. This is achieved using far lesser floating point computing units and hence lesser silicon area and power. We also show that architecture blends well with the Strassen and Winograd matrix multiply algorithms. We optimize the selective data parallelism that the LCEs enable for these algorithms and study the area-performance trade-offs.

Archive | 1991