Publication


Featured research published by David Money Harris.


International Solid-State Circuits Conference | 1997

Skew-tolerant domino circuits

David Money Harris; Mark Horowitz

As cycle time of chips shrinks and die size grows, clock skew measured as a fraction of the cycle time is increasing. Traditional domino circuits are especially sensitive because skew must be budgeted in both half-cycles. The problem with such domino pipelines is that evaluation starts when the clock connected to the first gate in the half-cycle rises but the output needs to be valid before the clock on the output latch falls. In the worst case, the evaluate clock is late and the latch clock is early, decreasing time for logic. Many designers realize that some of the overhead can be reduced by using differential domino (also called dual rail) designs. An SR latch or pipeline latch at the end of dual-rail circuits lessens sensitivity to the falling edge. Self-timed techniques eliminate clocks and clock skew, but raise new issues of control overhead, timing assumption verification, and testability. The methodology reported here boosts operating frequency by tolerating clock skew, eliminating latches from the critical path, and better balancing logic between phases of the pipeline.


PCRCW '94 Proceedings of the First International Workshop on Parallel Computer Routing and Communication | 1994

The Reliable Router: A Reliable and High-Performance Communication Substrate for Parallel Computers

William J. Dally; Larry R. Dennison; David Money Harris; Kinhong Kan; Thucydides Xanthopoulos

The Reliable Router (RR) is a network switching element targeted to two-dimensional mesh interconnection network topologies. It is designed to run at 100 MHz and reach a useful link bandwidth of 3.2 Gbit/sec. The Reliable Router uses adaptive routing coupled with link-level retransmission and a unique-token protocol to increase both performance and reliability. The RR can handle a single node or link failure anywhere in the network without interruption of service. Other unique features include a queueless low-latency plesiochronous channel interface, and simultaneous bidirectional signalling.


Asilomar Conference on Signals, Systems and Computers | 2003

A taxonomy of parallel prefix networks

David Money Harris

Parallel prefix networks are widely used in high-performance adders. Networks in the literature represent tradeoffs between number of logic levels, fanout, and wiring tracks. This paper presents a three-dimensional taxonomy that not only describes the tradeoffs in existing parallel prefix networks but also points to a family of new networks. Adders using these networks are compared using the method of logical effort. The new architecture is competitive in latency and area for some technologies.
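
As an illustration of what a parallel prefix network computes, the following is a minimal Python sketch of one well-known network, the Kogge-Stone adder. It is a bit-level software model only, not the taxonomy or the new networks from the paper; the function name `kogge_stone_add` and its return format are assumptions made for this example.

```python
def kogge_stone_add(a, b, width=16):
    """Add two unsigned integers using a Kogge-Stone prefix network model.

    Returns (sum modulo 2**width, carry_out, number_of_prefix_levels).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    # Bitwise generate (g) and propagate (p) signals, least significant bit first.
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]

    # G[i], P[i] describe a group of bits ending at position i.
    # Each prefix level doubles the span covered by every group.
    G, P = g[:], p[:]
    levels = 0
    distance = 1
    while distance < width:
        next_G, next_P = G[:], P[:]
        for i in range(distance, width):
            next_G[i] = G[i] | (P[i] & G[i - distance])
            next_P[i] = P[i] & P[i - distance]
        G, P = next_G, next_P
        distance *= 2
        levels += 1

    # The carry into bit i is the group generate of bits i-1 down to 0.
    carries = [0] + G[:-1]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total, G[width - 1], levels
```

For a 16-bit input this model uses four prefix levels. Kogge-Stone sits at one corner of the tradeoff space: few logic levels and low fanout, paid for with many long wires between levels.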


Symposium on Computer Arithmetic | 2005

An improved unified scalable radix-2 Montgomery multiplier

David Money Harris; Ram K. Krishnamurthy; Mark A. Anders; Sanu K. Mathew; Steven K. Hsu

This paper describes an improved version of the Tenca-Koc unified scalable radix-2 Montgomery multiplier with half the latency for small and moderate precision operands and half the queue memory requirement. Like the Tenca-Koc multiplier, this design is reconfigurable to accept any input precision in either GF(p) or GF(2^n) up to the size of the on-chip memory. An FPGA implementation can perform 1024-bit modular exponentiation in 16 ms using 5598 4-input lookup tables, making it the fastest unified scalable design yet reported.
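
For readers unfamiliar with the underlying operation, the sketch below shows the textbook bit-serial radix-2 Montgomery multiplication over GF(p) in Python. It is not the Tenca-Koc word-serial architecture or the improved design described in the paper, and it does not cover GF(2^n); the function names are assumptions chosen for this example.

```python
def montgomery_multiply_radix2(x, y, m, n):
    """Return x * y * 2**(-n) mod m for odd m, with x, y < m < 2**n."""
    if m % 2 == 0:
        raise ValueError("Montgomery multiplication requires an odd modulus")
    z = 0
    for i in range(n):
        # Add y when bit i of x is set.
        if (x >> i) & 1:
            z += y
        # Make z even by adding the odd modulus, then halve exactly.
        if z & 1:
            z += m
        z >>= 1
    # z stays below 2*m, so one conditional subtraction completes the reduction.
    if z >= m:
        z -= m
    return z


def modular_multiply(x, y, m, n):
    """Ordinary x * y mod m built from two Montgomery multiplications."""
    r_squared = pow(2, 2 * n, m)                              # R^2 mod m, R = 2**n
    x_mont = montgomery_multiply_radix2(x, r_squared, m, n)   # x * R mod m
    return montgomery_multiply_radix2(x_mont, y, m, n)        # x * y mod m
```

Each loop iteration needs only an addition, a parity check, and a shift, which is why the radix-2 form maps naturally onto simple hardware.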


IEEE Journal of Solid-state Circuits | 2013

Bubble Razor: Eliminating Timing Margins in an ARM Cortex-M3 Processor in 45 nm CMOS Using Architecturally Independent Error Detection and Correction

Matthew Fojtik; David Fick; Yejoong Kim; Nathaniel Ross Pinckney; David Money Harris; David T. Blaauw; Dennis Sylvester

We propose Bubble Razor, an architecturally independent approach to timing error detection and correction that avoids hold-time issues and enables large timing speculation windows. A local stalling technique that can be automatically inserted into any design allows the system to scale to larger processors. We implemented Bubble Razor on an ARM Cortex-M3 microprocessor in 45 nm CMOS without detailed knowledge of its internal architecture to demonstrate the technique's automated capability. The flip-flop based design was converted to two-phase latch timing using commercial retiming tools; Bubble Razor was then inserted using automatic scripts. This system marks the first published implementation of a Razor-style scheme on a complete, commercial processor. It provides an energy efficiency improvement of 60% or a throughput gain of up to 100% compared to operating with worst case timing margins.


International Solid-State Circuits Conference | 2012

Bubble Razor: An architecture-independent approach to timing-error detection and correction

Matthew Fojtik; David Fick; Yejoong Kim; Nathaniel Ross Pinckney; David Money Harris; David T. Blaauw; Dennis Sylvester

Several methods that eliminate timing margins by detecting and correcting transient delay errors have been proposed. These Razor-style systems replace critical flip-flops with ones that detect late arriving signals, and use architectural replay to correct errors. However, none of these methods have been applied to a complete commercial processor due to their architectural invasiveness. In addition, these Razor techniques introduce significant hold time constraints that are difficult to meet given worsening timing variability. To address these two issues we propose Bubble Razor (B-Razor), which uses a novel error-detection technique based on two-phase latch timing and a local replay mechanism that can be inserted automatically in any design. The error detection technique breaks the dependency between minimum delay and speculation window, restoring hold-time constraints to conventional values and allowing timing speculation of up to 100% of nominal delay. The large timing speculation makes Bubble Razor especially applicable to low-voltage designs where timing variation grows exponentially.


IEEE Transactions on Very Large Scale Integration Systems | 2001

Statistical clock skew modeling with data delay variations

David Money Harris; Sam Naffziger

Accurate clock skew budgets are important for microprocessor designers to avoid hold-time failures and to properly allocate resources when optimizing global and local paths. Many published clock skew budgets neglect voltage jitter and process variation, which are becoming dominant factors in otherwise balanced H-trees. However, worst-case process variation assumptions are severely pessimistic. This paper describes the major sources of clock skew in a microprocessor using a modified H-tree and applies the model to a second-generation Itanium processor family microprocessor currently under design. Monte Carlo simulation is used to develop statistical clock skew budgets for setup and hold time constraints in a four-level skew hierarchy. Voltage jitter through the phase locked loop (PLL) and clock buffers accounts for the majority of skew budgets. We show that taking into account the number of nearly critical paths between clocked elements at each level of the skew hierarchy and variations in the data delays of these paths reduces the difference between global and local skew budgets by more than a factor of two. Another insight is that data path delay variability limits the potential cycle-time benefits of active deskew circuits because the paths with the worst skew are unlikely to also be the paths with the longest data delays.
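
The last observation can be illustrated with a toy Monte Carlo experiment in Python. This is a hedged sketch, not the paper's model: the Gaussian distributions, the independence of skew and data delay, every numeric value, and the function names are assumptions made for illustration only.

```python
import random


def sample_cycle_requirements(n_paths, nominal_delay, delay_sigma, skew_sigma, rng):
    """Draw one chip instance and return (statistical, worst_case) requirements.

    statistical: largest delay + skew over paths, pairing each path with its own skew.
    worst_case:  longest data delay plus largest skew, as if both hit the same path.
    """
    delays = [rng.gauss(nominal_delay, delay_sigma) for _ in range(n_paths)]
    skews = [abs(rng.gauss(0.0, skew_sigma)) for _ in range(n_paths)]
    statistical = max(d + s for d, s in zip(delays, skews))
    worst_case = max(delays) + max(skews)
    return statistical, worst_case


def mean_budgets(trials=2000, n_paths=100, seed=1):
    """Average both requirements over many trials (all parameters hypothetical)."""
    rng = random.Random(seed)
    stat_total = 0.0
    worst_total = 0.0
    for _ in range(trials):
        # Hypothetical values in picoseconds: 1000 nominal, 30 delay sigma, 20 skew sigma.
        stat, worst = sample_cycle_requirements(n_paths, 1000.0, 30.0, 20.0, rng)
        stat_total += stat
        worst_total += worst
    return stat_total / trials, worst_total / trials
```

The statistical requirement can never exceed the worst-case one, and the gap between the two averages indicates how pessimistic it is to assume that the slowest path also sees the largest skew.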


High Performance Interconnects | 1994

Architecture and implementation of the reliable router

William J. Dally; Larry R. Dennison; David Money Harris; Kinhong Kan; Thucydides Xanthopoulos

The Reliable Router (RR) is a network switching element targeted to two-dimensional mesh interconnection network topologies. It is designed to run at 100 MHz and reach a useful link bandwidth of 3.2 Gbit/sec. The Reliable Router uses adaptive routing coupled with link-level retransmission and a unique-token protocol to increase both performance and reliability. The RR can handle a single node or link failure anywhere in the network without interruption of service. Other unique features include a queueless low-latency plesiochronous channel interface and simultaneous bidirectional signalling.


Asilomar Conference on Signals, Systems and Computers | 2003

Logical effort of carry propagate adders

David Money Harris; I. Sutherland

A wide assortment of carry propagate adders offer varying area-delay tradeoffs. Wiring and choice of circuit family also affect the size and performance. This paper uses the method of logical effort to characterize the effects of architecture, circuit family, and wire capacitance on adder delay. Domino logic offers about a 30% speedup on most valency-2 adders. Although Kogge-Stone adders are fastest in the absence of wire, other architectures such as variants on the Sklansky adder offer regular layouts and better delay in the presence of wiring capacitance.
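
The method of logical effort estimates the minimum delay of an N-stage path as D = N * F^(1/N) + P, where F is the product of the path's logical effort, branching effort, and electrical effort, and P is the sum of the parasitic delays. The Python sketch below evaluates that expression; the function name is an assumption for this example, and wire capacitance, which the paper treats explicitly, is not modeled here.

```python
import math


def path_delay(logical_efforts, parasitic_delays, electrical_effort, branching_effort=1.0):
    """Minimum path delay in units of tau, assuming equal effort in every stage."""
    if len(logical_efforts) != len(parasitic_delays):
        raise ValueError("need one parasitic delay per stage")
    n_stages = len(logical_efforts)
    path_logical_effort = math.prod(logical_efforts)
    # Path effort F = G * B * H.
    path_effort = path_logical_effort * branching_effort * electrical_effort
    # Delay is minimized when each stage bears the same effort F**(1/N).
    stage_effort = path_effort ** (1.0 / n_stages)
    return n_stages * stage_effort + sum(parasitic_delays)
```

For example, `path_delay([1, 1, 1], [1, 1, 1], 64)` models three inverters (logical effort 1, parasitic delay 1) driving a load 64 times the input capacitance; each stage bears an effort of about 4, for a total of roughly 15 tau.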


International Conference on Computer Design | 2005

Surfliner: a distortionless electrical signaling scheme for speed of light on-chip communications

Hongyu Chen; Rui Shi; Chung-Kuan Cheng; David Money Harris

We present a novel scheme to implement distortionless transmission lines for on-chip electrical signaling. By introducing intentional leakage conductance between the wires of a differential pair, the distortionless transmission line eliminates dispersion caused by the resistive nature of on-chip wires and achieves speed-of-light transmission. We show that it is feasible to construct a distortionless transmission line with a conventional silicon process. Simulation results show that, using 65 nm technology, the proposed scheme can achieve 15 Gbit/s bandwidth over a 20 mm on-chip serial link without any equalization. This approach offers a sixfold improvement in delay and an 85% reduction in power consumption over a conventional RC wire with repeated buffers.
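
The classic condition for a distortionless line is the Heaviside condition R/L = G/C, which is presumably what the intentional leakage conductance is chosen to satisfy. The Python sketch below computes that conductance and the propagation constant of a uniform line; the function names and any per-unit-length values passed in are hypothetical and are not taken from the paper.

```python
import cmath
import math


def distortionless_shunt_conductance(r, l, c):
    """Shunt conductance per unit length that satisfies R/L = G/C."""
    return r * c / l


def propagation_constant(r, l, g, c, freq_hz):
    """Complex propagation constant gamma = alpha + j*beta of a uniform line."""
    omega = 2.0 * math.pi * freq_hz
    return cmath.sqrt((r + 1j * omega * l) * (g + 1j * omega * c))


def phase_velocity(l, c):
    """Frequency-independent velocity of a line that meets the Heaviside condition."""
    return 1.0 / math.sqrt(l * c)
```

When the condition holds, the attenuation (the real part of gamma) equals sqrt(R * G) at every frequency and the phase constant is proportional to frequency, so all spectral components are attenuated equally and travel at the same speed. With G = 0, as in an ordinary RC-dominated wire, both quantities vary with frequency and the signal disperses.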
