Publication


Featured research published by Steven R. Carlough.


Asilomar Conference on Signals, Systems and Computers | 2001

The IBM z900 decimal arithmetic unit

Fadi Y. Busaba; Christopher A. Krygowski; Wen He Li; Eric M. Schwarz; Steven R. Carlough

As the cost of adding functions to a processor continues to decline, processor designs are including many additional features. An example of this trend is the appearance of graphics engines and compression engines on midrange and even low-end microprocessors. One area that has the potential to capture chip real estate is the decimal arithmetic engine because of its importance in financial and business applications. Studies show that 55% of the numeric data stored in commercial databases is in decimal format. Although decimal arithmetic is supported in many software languages, it is not yet available on many microprocessors. This paper details the decimal arithmetic engine in the recently announced z900 microprocessor.


Symposium on Computer Arithmetic | 2011

The IBM zEnterprise-196 Decimal Floating-Point Accelerator

Steven R. Carlough; Adam B. Collura; Silvia Melitta Mueller; Michael Kroener

Decimal floating-point arithmetic is widely used in commercial computing applications, such as financial transactions, where rounding errors prevent the use of binary floating-point operations. The revised IEEE Standard for Floating-Point Arithmetic (IEEE-754-2008) defined standardized decimal floating-point (DFP) formats. As more software applications adopt the IEEE decimal floating-point standard, hardware accelerators that support it are becoming more prevalent. This paper describes the second-generation decimal floating-point accelerator implemented on the IBM zEnterprise-196 processor. The 4-cycle deep pipeline was designed to optimize the latency of fixed-point decimal operations while significantly improving the bandwidth of DFP operations. A detailed description of the unit and a comparison to previous implementations found in the literature are provided.
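The rounding problem this abstract refers to can be illustrated with a minimal sketch. It uses Python's standard `decimal` module as a software stand-in for decimal arithmetic; it is not the hardware DFP formats or the accelerator described in the paper, and the values 0.1, 0.2 and 0.3 are chosen purely for illustration.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 have no exact base-2 representation,
# so their sum differs slightly from the binary approximation of 0.3.
binary_equal = (0.1 + 0.2 == 0.3)

# Decimal arithmetic: the same values are represented exactly in base 10,
# so the comparison holds. Decimals are built from strings to avoid
# inheriting the binary approximation of the float literals.
decimal_equal = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(binary_equal)   # False
print(decimal_equal)  # True
```

In a financial setting, discrepancies of this kind can accumulate across many transactions, which is the motivation the abstract gives for decimal formats.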


Field Programmable Logic and Applications | 2001

Gigahertz Reconfigurable Computing Using SiGe HBT BiCMOS FPGAs

Bryan S. Goda; Russell P. Kraft; Steven R. Carlough; Thomas W. Krawczyk; John F. McDonald

Field programmable gate arrays (FPGAs) are flexible programmable devices that are used in a wide variety of applications such as network routing, signal processing, pattern recognition and rapid prototyping. Unfortunately, the flexibility of the FPGA hinders its performance due to the additional logic resources required for the programmable hardware. Today's fastest FPGAs run in the 250 MHz range. This paper proposes a new family of FPGAs utilizing a high-speed SiGe Heterojunction Bipolar Transistor (HBT) design, co-integrated with CMOS in an IBM BiCMOS process. This device is bit-wise compatible with the Xilinx 6200, with operating frequencies in the 1 to 20 GHz range. All logic and routing in this new design is multiplexer-based, eliminating the need for pass transistors, the main roadblock to high speed in today's FPGAs.


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1999

Accurate high-speed performance prediction for full differential current-mode logic: the effect of dielectric anisotropy

Atul Garg; Y. L. Le Coz; Hans J. Greub; R.B. Iverson; R. Philhower; Pete M. Campbell; Cliff A. Maier; Sam A. Steidl; Matthew W. Ernest; Russell P. Kraft; Steven R. Carlough; Janet Perry; Thomas W. Krawczyk; John F. McDonald

Integrated-circuit interconnect characterization is growing in importance as devices become faster and smaller. Along with this trend, interconnect geometry is becoming more complex, consisting of an increasing number of wiring levels. Accurate numerical extraction of three-dimensional (3-D) interconnect capacitance is essential for achieving design targets in the multigigahertz digital regime. Interconnect-capacitance extraction is complicated by the presence of inhomogeneous layers with differing dielectric constants. Dielectric anisotropy is also common in many low-κ polymeric dielectrics used in high-performance ICs. A CAD procedure using the novel floating random-walk extractor QuickCAP is presented. Our procedure is efficient enough to extract a substantial amount of a chip's 3-D wiring. We also include dielectric anisotropy and inhomogeneity. The procedure is not based on effective conductor geometry or on a finite-sized conductor library but rather on the entire 3-D layout, accounting for actual local variations in conductor separations and shapes. We then apply our procedure to an experimental circuit vehicle implemented in AlGaAs-GaAs heterojunction bipolar transistor current-mode logic. This vehicle is used to validate the accuracy of our CAD procedure in predicting circuit speed. Measured and predicted test-capacitor values and ring-oscillator propagation times agreed generally to within 2-4%. To verify results on a larger digital circuit, we analyzed all interconnects in an adder carry-chain oscillator using our procedure. Predicted propagation delays were generally within 3% of measurement.


Symposium on Computer Arithmetic | 2016

Quad Precision Floating Point on the IBM z13

Cedric Lichtenau; Steven R. Carlough; Silvia Melitta Mueller

When operating on a rapidly increasing amount of data, business analytics applications become sensitive to rounding errors, and profit from the higher stability and faster convergence of quad precision floating-point (FP-QP) arithmetic. The IBM z13™ supports this emerging trend around Big Data with an outstanding FP-QP performance. The paper details the vector and floating-point unit of IBM z13™, with special focus on binary FP-QP. Except for divide and square root, these instructions are executed in the decimal engine. To operate such an 8-cycle decimal and quad precision pipeline at 5 GHz required innovation around exponent handling, normalization, and rounding.


Archive | 2001

Process of operations with an interchangeable transmission device and apparatus for use therein for a common interface for use with digital cameras

Craig R. Walters; Scott M. Blackledge; Steven R. Carlough; Nathan Junsup Lee; Amy S. Purdy; Adrian O. Robinson


Archive | 2005

System and method for converting from scaled binary coded decimal into decimal floating point

Steven R. Carlough; Eric M. Schwarz; Sheryll H. Veneracion


Archive | 2007

Modular binary multiplier for signed and unsigned operands of variable widths

Fadi Y. Busaba; Steven R. Carlough; David S. Hutton; Christopher A. Krygowski; John G. Rell; Sheryll H. Veneracion


Archive | 2005

System and method for providing a decimal multiply algorithm using a double adder

Steven R. Carlough; Wen H. Li; Eric M. Schwarz


Archive | 2003

Multi-pipe dispatch and execution of complex instructions in a superscalar processor

Fadi Y. Busaba; Steven R. Carlough; Christopher A. Krygowski; John G. Rell; Timothy J. Slegel
