Brian J. Hickmann
Intel
Publications
Featured research published by Brian J. Hickmann.
Symposium on Computer Arithmetic | 2007
Mark A. Erle; Michael J. Schulte; Brian J. Hickmann
Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of a decimal floating-point multiplier that complies with specifications for decimal multiplication given in the draft revision of the IEEE 754 standard for floating-point arithmetic (IEEE 754R). This multiplier extends a previously published decimal fixed-point multiplier design by adding several features including exponent generation, sticky bit generation, shifting of the intermediate product, rounding, and exception detection and handling. The core of the decimal multiplication algorithm is an iterative scheme of partial product accumulation employing decimal carry-save addition to reduce the critical path delay. Novel features of the proposed multiplier include support for decimal floating-point numbers, on-the-fly generation of the sticky bit, early estimation of the shift amount, and efficient decimal rounding. Area and delay estimates are provided for a verified Verilog register-transfer-level model of the multiplier.
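The rounding path described above (shifting of the intermediate product, sticky-bit generation, decimal rounding) can be sketched behaviorally in a few lines of Python. This is an illustration of the arithmetic, not the paper's Verilog design; the function name round_decimal_product and the digit-list representation are assumptions chosen for clarity.

```python
# Behavioral sketch: keep p = 16 digits of an exact decimal product, fold the
# discarded digits into a sticky flag, and round to nearest, ties to even.
def round_decimal_product(digits, exponent, p=16):
    """digits: most-significant-first list of decimal digits of the exact product."""
    if len(digits) <= p:                          # product already fits in p digits
        return digits, exponent
    shift = len(digits) - p                       # digits discarded on the right
    kept, discarded = digits[:p], digits[p:]
    round_digit = discarded[0]
    sticky = any(d != 0 for d in discarded[1:])   # sticky = any nonzero below the round digit
    exponent += shift                             # discarding k digits scales by 10**k

    # IEEE 754 round-to-nearest, ties-to-even on the decimal significand.
    round_up = round_digit > 5 or (round_digit == 5 and (sticky or kept[-1] % 2 == 1))
    if round_up:
        i = p - 1
        while i >= 0 and kept[i] == 9:            # propagate the decimal carry
            kept[i] = 0
            i -= 1
        if i >= 0:
            kept[i] += 1
        else:                                     # carry out: 99...9 + 1 -> 100...0
            kept = [1] + kept[:-1]
            exponent += 1
    return kept, exponent

# Example: 9999999999999999 * 2 = 19999999999999998 (17 digits) rounds to the
# 16-digit significand 2000000000000000 with the exponent increased by 1.
```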
International Conference on Computer Design | 2007
Brian J. Hickmann; Andrew Krioukov; Michael J. Schulte; Mark A. Erle
Decimal floating-point multiplication is important in many commercial applications including banking, tax calculation, currency conversion, and other financial areas. This paper presents a fully parallel decimal floating-point multiplier compliant with the recent draft of the IEEE P754 Standard for Floating-point Arithmetic (IEEE P754). The novelty of the design is that it is the first parallel decimal floating-point multiplier offering low latency and high throughput. This design is based on a previously published parallel fixed-point decimal multiplier which uses alternate decimal digit encodings to reduce area and delay. The fixed-point design is extended to support floating-point multiplication by adding several components including exponent generation, rounding, shifting, and exception handling. Area and delay estimates are presented that show a significant latency and throughput improvement with a substantial increase in area as compared to the only published IEEE P754 compliant sequential floating-point multiplier. To the best of our knowledge, this is the first publication to present a fully parallel decimal floating-point multiplier that complies with IEEE P754.
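As a functional reference for what an IEEE-compliant decimal64 multiplier must return (16 digits of precision, round-to-nearest-even), Python's decimal module can stand in for the hardware. The context parameters below are the decimal64 values from the standard; the helper name is illustrative and says nothing about the parallel hardware organization described in the paper.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

# decimal64 parameters: 16-digit precision, adjusted exponent range -383..384.
DEC64_CTX = Context(prec=16, rounding=ROUND_HALF_EVEN, Emin=-383, Emax=384)

def multiply_decimal64(a: str, b: str) -> Decimal:
    """Round-to-nearest-even decimal64 product of two decimal strings."""
    return DEC64_CTX.multiply(Decimal(a), Decimal(b))

# The exact product below has 32 digits; the decimal64 result keeps only 16.
print(multiply_decimal64("9999999999999999", "9999999999999999"))
# -> 9.999999999999998E+31
```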
IEEE Transactions on Computers | 2009
Mark A. Erle; Brian J. Hickmann; Michael J. Schulte
Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of two decimal floating-point multipliers: one whose partial product accumulation strategy employs decimal carry-save addition and one that employs binary carry-save addition. The multiplier based on decimal carry-save addition favors a nonpipelined iterative implementation. The multiplier utilizing binary carry-save addition allows for an efficient pipelined implementation when latency and throughput are considered more important than area. Both designs comply with specifications for decimal multiplication given in the IEEE 754 standard for floating-point arithmetic (IEEE 754-2008). The multipliers extend previously published decimal fixed-point multipliers by adding several features, including exponent generation, sticky bit generation, shifting of the intermediate product, rounding, and exception detection and handling. Novel features of the multipliers include support for decimal floating-point numbers, on-the-fly generation of the sticky bit in the iterative design, early estimation of the shift amount, and efficient decimal rounding. Iterative and parallel decimal fixed-point and floating-point multipliers are compared in terms of their area, delay, latency, and throughput based on verified Verilog register-transfer-level models.
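The contrast drawn above between decimal and binary carry-save accumulation rests on a simple correction step: when decimal digits are summed with plain binary adders, any digit sum above 9 is fixed by adding 6 so that the carry propagates decimally. A minimal sketch of that correction for ordinary BCD addition (not the paper's carry-save trees; the function name is illustrative):

```python
def bcd_add(a_digits, b_digits):
    """Add two equal-length BCD digit vectors (least-significant digit first)."""
    result, carry = [], 0
    for a, b in zip(a_digits, b_digits):
        s = a + b + carry              # plain binary addition of one digit pair
        if s > 9:                      # decimal correction: skip the six unused
            s += 6                     # codes 10..15 of a 4-bit binary digit
            carry = 1
            s &= 0xF                   # keep the low nibble as the corrected digit
        else:
            carry = 0
        result.append(s)
    if carry:
        result.append(1)
    return result

# 58 + 67 = 125, digits given least-significant first:
assert bcd_add([8, 5], [7, 6]) == [5, 2, 1]
```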
Application-Specific Systems, Architectures and Processors | 2009
Charles Tsen; Sonia Gonzalez-Navarro; Michael J. Schulte; Brian J. Hickmann; Katherine Compton
In this paper, we describe the first hardware design of a combined binary and decimal floating-point multiplier, based on specifications in the IEEE 754-2008 Floating-point Standard. The multiplier design operates on either (1) 64-bit binary encoded decimal floating-point (DFP) numbers or (2) 64-bit binary floating-point (BFP) numbers. It returns properly rounded results for the rounding modes specified in IEEE 754-2008. The design shares the following hardware resources between the two floating-point datatypes: a 54-bit by 54-bit binary multiplier, portions of the operand encoding/decoding, a 54-bit right shifter, exponent calculation logic, and rounding logic. Our synthesis results show that hardware sharing is feasible and has a reasonable impact on area, latency, and delay. The combined BFP and DFP multiplier occupies only 58% of the total area that would be required by separate BFP and DFP units. Furthermore, the critical path delay of a combined multiplier has a negligible increase over a standalone DFP multiplier, without increasing the number of cycles to perform either BFP or DFP multiplication.
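The feasibility of sharing one 54-bit by 54-bit binary multiplier between the two formats follows from a sizing argument: a binary64 significand is 53 bits wide, and a 16-digit decimal64 significand is at most 10^16 - 1, which also fits in 54 bits. A quick check of that bound:

```python
import math

binary64_significand_bits = 53                  # 1 hidden bit + 52 fraction bits
decimal64_max_significand = 10**16 - 1          # 16 decimal digits
print(math.ceil(math.log2(decimal64_max_significand + 1)))  # -> 54
assert decimal64_max_significand < 2**54        # so a 54x54 binary multiplier covers both
```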
International Conference on Computer Design | 2008
Brian J. Hickmann; Michael J. Schulte; Mark A. Erle
Decimal multiplication is important in many commercial applications including banking, tax calculation, currency conversion, and other financial areas. This paper presents several combined binary/decimal fixed-point multipliers that use the BCD-4221 recoding for the decimal digits. This allows the use of binary carry-save hardware to perform decimal addition with a small correction. Our proposed designs contain several novel improvements over previously published designs. These include an improved reduction tree organization to reduce the area and delay of the multiplier and improved reduction tree components that leverage the redundant decimal encodings to help reduce delay. A novel split reduction tree architecture is also introduced that reduces the delay of the binary product with only a small increase in total area. Area and delay estimates are presented that show that the proposed designs have significant area improvements over separate binary and decimal multipliers while still maintaining similar latencies for both decimal and binary operations.
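The BCD-4221 recoding referred to above assigns the bit weights 4, 2, 2, 1 to each digit, so every 4-bit pattern encodes a valid decimal value between 0 and 9; that closure is what lets binary carry-save hardware reduce decimal digits with only a small correction. A small sketch of one standard recoding table follows; the paper's exact digit codes may differ, since several assignments are valid.

```python
# One common BCD-8421 -> BCD-4221 recoding table (other valid choices exist,
# because digits 2..7 have more than one BCD-4221 representation).
BCD8421_TO_4221 = {
    0: 0b0000, 1: 0b0001, 2: 0b0010, 3: 0b0011, 4: 0b1000,
    5: 0b0111, 6: 0b1100, 7: 0b1101, 8: 0b1110, 9: 0b1111,
}

def value_4221(code):
    """Decimal value of a 4-bit BCD-4221 code (bit weights 4, 2, 2, 1)."""
    return 4 * ((code >> 3) & 1) + 2 * ((code >> 2) & 1) + 2 * ((code >> 1) & 1) + (code & 1)

# Every digit round-trips, and all 16 possible codes map to a digit in 0..9.
assert all(value_4221(c) == d for d, c in BCD8421_TO_4221.items())
assert all(0 <= value_4221(c) <= 9 for c in range(16))
```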
Archive | 2013
Brian J. Hickmann; Dennis R. Bradford; Thomas D. Fletcher
Archive | 2015
Andrew T. Forsyth; Brian J. Hickmann; Jonathan C. Hall; Christopher J. Hughes
Archive | 2014
Alex Pineiro; Thomas D. Fletcher; Brian J. Hickmann
Archive | 2012
Jesus Corbal; Dennis R. Bradford; Jonathan C. Hall; Thomas D. Fletcher; Brian J. Hickmann; Dror Markovich; Amit Gradstein
Archive | 2017
Wei Wu; Brian J. Hickmann; Dennis R. Bradford