Mark A. Erle | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Mark A. Erle is active.

Explore More

Publication

Featured researches published by Mark A. Erle.

symposium on computer arithmetic | 2005

Decimal multiplication with efficient partial product generation

Mark A. Erle; Eric M. Schwarz; Michael J. Schulte

Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents a novel design for fixed-point decimal multiplication that utilizes a simple recoding scheme to produce signed-magnitude representations of the operands thereby greatly simplifying the process of generating partial products for each multiplier digit. The partial products are generated using a digit-by-digit multiplier on a word-by-digit basis, first in a signed-digit form with two digits per position, and then combined via a combinational circuit. As the signed-digit partial products are developed one at a time while traversing the recoded multiplier operand from the least significant digit to the most significant digit, each partial product is added along with the accumulated sum of previous partial products via a signed-digit adder. This work is significantly different from other work employing digit-by-digit multipliers due to the efficiency gained by restricting the range of digits throughout the multiplication process.

international conference on computer design | 2004

A high-frequency decimal multiplier

Robert D. Kenney; Michael J. Schulte; Mark A. Erle

Decimal arithmetic is regaining popularity in the computing community due to the growing importance of commercial, financial, and Internet-based applications, which process decimal data. This paper presents an iterative decimal multiplier, which operates at high clock frequencies and scales well to large operand sizes. The multiplier uses a new decimal representation for intermediate products, which allows for a very fast two-stage iterative multiplier design. Decimal multipliers, which are synthesized using a 0.11 micron CMOS standard cell library, operate at clock frequencies close to 2 GHz. The latency of the proposed design to multiply two n-digit BCD operands is (n+8) cycles with a new multiplication able to begin every (n+1) cycles.

symposium on computer arithmetic | 2007

Decimal Floating-Point Multiplication Via Carry-Save Addition

Mark A. Erle; Michael J. Schulte; Brian J. Hickmann

Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of a decimal floating-point multiplier that complies with specifications for decimal multiplication given in the draft revision of the IEEE 754 standard for floating-point arithmetic (IEEE 754R). This multiplier extends a previously published decimal fixed- point multiplier design by adding several features including exponent generation, sticky bit generation, shifting of the intermediate product, rounding, and exception detection and handling. The core of the decimal multiplication algorithm is an iterative scheme of partial product accumulation employing decimal carry-save addition to reduce the critical path delay. Novel features of the proposed multiplier include support for decimal floating-point numbers, on-the- fly generation of the sticky bit, early estimation of the shift amount, and efficient decimal rounding. Area and delay estimates are provided for a verified Verilog register transfer level model of the multiplier.

international conference on computer design | 2007

A parallel IEEE P754 decimal floating-point multiplier

Brian J. Hickmann; Andrew Krioukov; Michael J. Schulte; Mark A. Erle

Decimal floating-point multiplication is important in many commercial applications including banking, tax calculation, currency conversion, and other financial areas. This paper presents a fully parallel decimal floating-point multiplier compliant with the recent draft of the IEEE P754 Standard for Floating-point Arithmetic (IEEE P754). The novelty of the design is that it is the first parallel decimal floating-point multiplier offering low latency and high throughput. This design is based on a previously published parallel fixed-point decimal multiplier which uses alternate decimal digit encodings to reduce area and delay. The fixed-point design is extended to support floating-point multiplication by adding several components including exponent generation, rounding, shifting, and exception handling. Area and delay estimates are presented that show a significant latency and throughput improvement with a substantial increase in area as compared to the only published IEEE P754 compliant sequential floating-point multiplier. To the best of our knowledge, this is the first publication to present a fully parallel decimal floating-point multiplier that complies with IEEE P754.

IEEE Transactions on Computers | 2009

Decimal Floating-Point Multiplication

Mark A. Erle; Brian J. Hickmann; Michael J. Schulte

Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of two decimal floating-point multipliers: one whose partial product accumulation strategy employs decimal carry-save addition and one that employs binary carry-save addition. The multiplier based on decimal carry-save addition favors a nonpipelined iterative implementation. The multiplier utilizing binary carry-save addition allows for an efficient pipelined implementation when latency and throughput are considered more important than area. Both designs comply with specifications for decimal multiplication given in the IEEE 754 standard for floating-point arithmetic (IEEE 754-2008). The multipliers extend previously published decimal fixed-point multipliers by adding several features, including exponent generation, sticky bit generation, shifting of the intermediate product, rounding, and exception detection and handling. Novel features of the multipliers include support for decimal floating-point numbers, on-the-fly generation of the sticky bit in the iterative design, early estimation of the shift amount, and efficient decimal rounding. Iterative and parallel decimal fixed-point and floating-point multipliers are compared in terms of their area, delay, latency, and throughput based on verified Verilog register-transfer-level models.

Ibm Journal of Research and Development | 2010

A survey of hardware designs for decimal arithmetic

Liang-Kai Wang; Mark A. Erle; Charles Tsen; Eric M. Schwarz; Michael J. Schulte

Decimal data and decimal arithmetic operations are ubiquitous in daily life. Although microprocessors normally use binary arithmetic for computations, decimal arithmetic is often required in financial and commercial applications. Due to the increasing importance of and demand for decimal arithmetic, decimal floating-point (DFP) formats and operations are specified in the revised IEEE Standard for Floating-Point Arithmetic (IEEE 754-2008). This paper provides a survey of hardware designs for decimal arithmetic. It gives an overview of DFP arithmetic in IEEE 754-2008, describes processors that provide hardware and instruction set support for decimal arithmetic, and provides a survey of hardware designs for decimal addition, subtraction, multiplication, and division. Finally, it describes potential areas for future research.

international conference on computer design | 2008

Improved combined binary/decimal fixed-point multipliers

Brian J. Hickmann; Michael J. Schulte; Mark A. Erle

Decimal multiplication is important in many commercial applications including banking, tax calculation, currency conversion, and other financial areas. This paper presents several combined binary/decimal fixed-point multipliers that use the BCD-4221 recoding for the decimal digits. This allows the use of binary carry-save hardware to perform decimal addition with a small correction. Our proposed designs contain several novel improvements over previously published designs. These include an improved reduction tree organization to reduce the area and delay of the multiplier and improved reduction tree components that leverage the redundant decimal encodings to help reduce delay. A novel split reduction tree architecture is also introduced that reduces the delay of the binary product with only a small increase in total area. Area and delay estimates are presented that show that the proposed designs have significant area improvements over separate binary and decimal multipliers while still maintaining similar latencies for both decimal and binary operations.

Archive | 1997