Roger A. Golliver
Intel
Publications
Featured research published by Roger A. Golliver.
High-Performance Computer Architecture | 2013
Nicholas P. Carter; Aditya Agrawal; Shekhar Borkar; Romain Cledat; Howard S. David; Dave Dunning; Joshua B. Fryman; Ivan Ganev; Roger A. Golliver; Rob C. Knauerhase; Richard Lethin; Benoît Meister; Asit K. Mishra; Wilfred R. Pinfold; Justin Teller; Josep Torrellas; Nicolas Vasilache; Ganesh Venkatesh; Jianping Xu
DARPA's Ubiquitous High-Performance Computing (UHPC) program asked researchers to develop computing systems capable of achieving energy efficiencies of 50 GOPS/Watt, assuming 2018-era fabrication technologies. This paper describes Runnemede, the research architecture developed by the Intel-led UHPC team. Runnemede is being developed through a co-design process that considers the hardware, the runtime/OS, and applications simultaneously. Near-threshold voltage operation, fine-grained power and clock management, and separate execution units for runtime and application code are used to reduce energy consumption. Memory energy is minimized through application-managed on-chip memory and direct physical addressing. A hierarchical on-chip network reduces communication energy, and a codelet-based execution model supports extreme parallelism and fine-grained tasks. We present an initial evaluation of Runnemede that shows the design process for our on-chip network, demonstrates 2-4x improvements in memory energy from explicit control of on-chip memory, and illustrates the impact of hardware-software co-design on the energy consumption of a synthetic aperture radar algorithm on our architecture.
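The codelet execution model mentioned in the abstract schedules small, run-to-completion tasks that fire only once all their data dependencies are satisfied. A minimal Python sketch of that dataflow-style scheduling follows; the class names and structure are illustrative, not Runnemede's actual runtime API:

```python
from collections import deque

class Codelet:
    """A small, non-preemptive task that fires when all inputs have arrived."""
    def __init__(self, name, dep_count, fn):
        self.name = name
        self.deps_remaining = dep_count  # unsatisfied input dependencies
        self.fn = fn

def run_dataflow(codelets, edges):
    """Fire ready codelets; each completion satisfies its successors' deps."""
    ready = deque(c for c in codelets if c.deps_remaining == 0)
    order = []
    while ready:
        c = ready.popleft()
        c.fn()                           # codelets run to completion, unscheduled
        order.append(c.name)
        for succ in edges.get(c.name, []):
            succ.deps_remaining -= 1
            if succ.deps_remaining == 0:
                ready.append(succ)       # all inputs arrived: now runnable
    return order

# Example DAG: a and b both feed c; c feeds d.
results = []
a = Codelet("a", 0, lambda: results.append("a"))
b = Codelet("b", 0, lambda: results.append("b"))
c = Codelet("c", 2, lambda: results.append("c"))
d = Codelet("d", 1, lambda: results.append("d"))
order = run_dataflow([a, b, c, d], {"a": [c], "b": [c], "c": [d]})
```

Because each codelet is tiny and stateless between firings, a runtime can pack many of them onto simple execution units and power-gate idle ones, which is the fine-grained parallelism the abstract describes.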
Symposium on Computer Arithmetic | 1999
Marius A. Cornea-Hasegan; Roger A. Golliver; Peter Markstein
This paper describes a study of a class of algorithms for the floating-point divide and square root operations, based on the Newton-Raphson iterative method. The two main goals were: (1) proving the IEEE correctness of these iterative floating-point algorithms, i.e. compliance with the IEEE-754 standard for binary floating-point operations, with the focus on software-driven iterative algorithms instead of the hardware-based implementations that have dominated until now; and (2) identifying the special cases of operands that require Software Assistance due to possible overflow, or loss of precision of intermediate results. This study was initiated in an attempt to prove the IEEE correctness of a class of divide and square root algorithms based on the Newton-Raphson iterative method. As more insight into the inner workings of these algorithms was gained, it became obvious that a formal study and proof were necessary in order to achieve the desired objectives. The result is a complete and rigorous proof of IEEE correctness for floating-point divide and square root algorithms based on the Newton-Raphson iterative method. Moreover, the method used in proving the IEEE correctness of the square root algorithm is applicable in principle to any iterative algorithm, not only those based on the Newton-Raphson method. Conditions requiring Software Assistance (SWA) were also determined, and were used to identify cases where alternate algorithms are needed to generate correct results. Overall, this is an important step toward flawless software implementations of these floating-point operations.
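The Newton-Raphson iterations behind such algorithms refine an initial estimate with quadratic convergence, roughly doubling the number of correct bits per step. A simplified Python sketch of the core iterations is below; it omits the extended-precision steps and final rounding corrections that the paper's IEEE-correctness proofs actually concern, so it is an illustration of the method, not of the proven algorithms:

```python
import math

def nr_reciprocal(a, iters=6):
    """Approximate 1/a via Newton-Raphson: x_{n+1} = x_n * (2 - a*x_n).
    The seed is a crude linear estimate of 1/m for a = m * 2**e."""
    m, e = math.frexp(a)                 # a = m * 2**e with 0.5 <= m < 1
    x = math.ldexp(2.9 - 2.0 * m, -e)    # rough seed, ~1 decimal digit correct
    for _ in range(iters):
        x = x * (2.0 - a * x)            # each step squares the relative error
    return x

def nr_divide(b, a):
    """b / a computed as b * (1/a); real implementations add a final
    correction step so the quotient is correctly rounded."""
    return b * nr_reciprocal(a)

def nr_sqrt(a, iters=8):
    """sqrt(a) via the Newton-Raphson iteration y_{n+1} = (y_n + a/y_n) / 2
    (Heron's method), seeded by halving the binary exponent of a."""
    m, e = math.frexp(a)
    y = math.ldexp(1.0, e // 2)          # seed within a small factor of sqrt(a)
    for _ in range(iters):
        y = 0.5 * (y + a / y)
    return y
```

The "Software Assistance" cases the paper identifies arise exactly where sketches like this break down: operands near overflow or underflow can lose precision in the intermediate products, so alternate code paths must take over.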
Algorithmic Number Theory Symposium | 1994
Roger A. Golliver; Arjen K. Lenstra; Kevin S. McCurley
This is a report on work in progress on our new implementation of the relation collection stage of the general number field sieve integer factoring algorithm. Our experiments indicate that we have achieved a substantial speed-up compared to other implementations reported in the literature. The main improvements are a new lattice sieving technique and a trial division method that is based on lattice sieving in a hash table. This also allows us to collect triple and quadruple large prime relations in an efficient manner. Furthermore, we show how the computation can be shared efficiently among multiple processors in a high-bandwidth environment.
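Relation collection looks for polynomial values that factor completely over a fixed factor base ("full" relations), while also keeping "partial" relations whose leftover cofactor is a single large prime, since pairs of partials sharing a large prime can later be combined. A toy Python illustration of that classification step is below; the paper's lattice sieving and hash-table trial division are far more efficient than this naive trial division, and the parameters here are hypothetical:

```python
def classify(value, factor_base, large_prime_bound):
    """Trial-divide out factor-base primes and classify the candidate relation.
    Assumes large_prime_bound is below the square of the factor-base limit,
    so a surviving cofactor under the bound must be a single prime."""
    exponents, n = {}, abs(value)
    for p in factor_base:
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
    if n == 1:
        return "full", exponents      # smooth over the factor base
    if n <= large_prime_bound:
        return "partial", exponents   # one large prime left over
    return "useless", exponents       # cofactor too big to be worth keeping

# Hypothetical small parameters for illustration only.
factor_base = [2, 3, 5, 7, 11, 13]
kind, exps = classify(2 * 2 * 3 * 17, factor_base, 100)
# 2**2 * 3 divides out; the cofactor 17 is a single large prime -> "partial"
```

Allowing triple and quadruple large primes, as the paper does, extends the "partial" case to cofactors that split into up to four primes between the factor-base limit and the large-prime bound.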
Symposium on Computer Arithmetic | 2005
Roger A. Golliver; Silvia Melitta Mueller; S. Oberman; M. Schmookler; D. DasSarma; A. Beaumont-Smith
In 1990 there was a dramatic change in the overall design of floating-point units (FPUs) with the introduction of the fused multiply-add dataflow. This design is common today due to its performance advantage over separate units. Recently the constraining parameters have been changing for sub-10-micron technologies, and the resulting designs focus on increasing frequency at the cost of pipeline depth. Wire lengths are a crucial design parameter, and a great deal of effort is spent floorplanning the execution elements to be very close together. It is now typical that a signal sent across an FPU may take one or more clock cycles. Thus, the physical design is very important and requires global optimization of macro placement as well as complex power reduction. Additionally, technology scaling continues to decrease feature sizes, and more execution units, or even processor cores, can be placed on a chip. Execution units such as decimal FPUs are in product plans. There are single-chip designs with 8 vector processing units which are used to accelerate the video games we play. The processing power in these single-chip game processors is the equivalent of supercomputers. What is the next trendsetting design or key problem in computer arithmetic? We have asked a panel of expert arithmetic unit hardware designers to discuss the current pain-versus-gain tradeoffs and to speculate on the future of arithmetic design. Panelists:
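The fused multiply-add's accuracy advantage comes from rounding a*b + c once, instead of rounding the product and then the sum. A small Python demonstration follows, emulating the fused result with exact rational arithmetic (Python 3.13 also exposes hardware FMA as math.fma; this emulation works on any version and is for illustration only):

```python
from fractions import Fraction

def fused_mul_add(a, b, c):
    """Emulate fused multiply-add: compute a*b + c exactly, then round once.
    Fraction(float) is exact, and float(Fraction) is correctly rounded."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# The 2**-52 contribution to the product is below one ulp of 1e16, so the
# separate computation loses it when the product is rounded; the single
# rounding of the fused computation preserves it.
a, b, c = 1e16, 1.0 + 2.0**-52, -1e16
separate = a * b + c          # two roundings: product first, then sum
fused = fused_mul_add(a, b, c)  # one rounding of the exact a*b + c
```

The same single-rounding property is what makes FMA the workhorse of the Newton-Raphson divide and square root algorithms discussed above: the correction term a - y*y can be computed without cancellation error.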
Archive | 2000
Carl M. Ellison; Roger A. Golliver; Howard C. Herbert; Derrick C. Lin; Francis X. McKeen; Gilbert Neiger; Ken Reneris; James A. Sutton; Shreekant S. Thakkar; Millind Mittal
Archive | 2003
Carl M. Ellison; Roger A. Golliver; Howard C. Herbert; Derrick C. Lin; Francis X. McKeen; Gilbert Neiger; Ken Reneris; James A. Sutton; Shreekant S. Thakkar; Millind Mittal
Archive | 2000
Carl M. Ellison; Roger A. Golliver; Howard C. Herbert; Derrick C. Lin; Francis X. McKeen; Gilbert Neiger; Ken Reneris; James A. Sutton; Shreekant S. Thakkar; Millind Mittal
Archive | 2005
Howard C. Herbert; David W. Grawrock; Carl M. Ellison; Roger A. Golliver; Derrick C. Lin; Francis X. McKeen; Gilbert Neiger; Ken Reneris; James A. Sutton; Shreekant S. Thakkar; Millind Mittal
Archive | 2000
Carl M. Ellison; Roger A. Golliver; Howard C. Herbert; Derrick C. Lin; Francis X. McKeen; Gilbert Neiger; Ken Reneris; James A. Sutton; Shreekant S. Thakkar; Millind Mittal
Archive | 2001
Carl M. Ellison; Roger A. Golliver; Howard C. Herbert; Derrick C. Lin; Francis X. McKeen; Gilbert Neiger; Ken Reneris; James A. Sutton; Shreekant S. Thakkar; Millind Mittal