Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Andrew Waterman is active.

Publication


Featured research published by Andrew Waterman.


Communications of the ACM | 2009

Roofline: an insightful visual performance model for multicore architectures

Samuel Williams; Andrew Waterman; David A. Patterson

The Roofline model offers insight on how to improve the performance of software and hardware.
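
As a rough illustration of the model's central bound (attainable throughput is the minimum of peak compute and peak memory bandwidth times arithmetic intensity), here is a minimal sketch in Scala; the machine numbers are illustrative assumptions, not figures from the paper.

```scala
// A minimal sketch of the Roofline bound: attainable throughput is the minimum of
// peak compute and peak memory bandwidth times arithmetic intensity.
// The machine numbers below are illustrative assumptions, not figures from the paper.
object RooflineSketch extends App {
  def roofline(peakGflops: Double, peakGBps: Double, flopsPerByte: Double): Double =
    math.min(peakGflops, peakGBps * flopsPerByte)

  val peakGflops = 100.0 // assumed peak double-precision throughput
  val peakGBps   = 25.0  // assumed peak DRAM bandwidth

  println(roofline(peakGflops, peakGBps, 0.5)) // 12.5 GFLOPS: memory-bound kernel
  println(roofline(peakGflops, peakGBps, 8.0)) // 100.0 GFLOPS: compute-bound kernel
}
```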


Nature | 2015

Single-chip microprocessor that communicates directly using light

Chen Sun; Mark T. Wade; Yunsup Lee; Jason S. Orcutt; Luca Alloatti; Michael Georgas; Andrew Waterman; Jeffrey M. Shainline; Rimas Avizienis; Sen Lin; Benjamin R. Moss; Rajesh Kumar; Fabio Pavanello; Amir H. Atabaki; Henry Cook; Albert J. Ou; Jonathan Leu; Yu-Hsin Chen; Krste Asanovic; Rajeev J. Ram; Miloš A. Popović; Vladimir Stojanovic

Data transport across short electrical wires is limited by both bandwidth and power density, which creates a performance bottleneck for semiconductor microchips in modern computer systems—from mobile phones to large-scale data centres. These limitations can be overcome by using optical communications based on chip-scale electronic–photonic systems enabled by silicon-based nanophotonic devices. However, combining electronics and photonics on the same chip has proved challenging, owing to microchip manufacturing conflicts between electronics and photonics. Consequently, current electronic–photonic chips are limited to niche manufacturing processes and include only a few optical devices alongside simple circuits. Here we report an electronic–photonic system on a single chip integrating over 70 million transistors and 850 photonic components that work together to provide logic, memory, and interconnect functions. This system is a realization of a microprocessor that uses on-chip photonic devices to directly communicate with other chips using light. To integrate electronics and photonics at the scale of a microprocessor chip, we adopt a ‘zero-change’ approach to the integration of photonics. Instead of developing a custom process to enable the fabrication of photonics, which would complicate or eliminate the possibility of integration with state-of-the-art transistors at large scale and at high yield, we design optical devices using a standard microelectronics foundry process that is used for modern microprocessors. This demonstration could represent the beginning of an era of chip-scale electronic–photonic systems with the potential to transform computing system architectures, enabling more powerful computers, from network infrastructure to data centres and supercomputers.


Design Automation Conference | 2012

Chisel: constructing hardware in a Scala embedded language

Jonathan Bachrach; Huy Vo; Brian C. Richards; Yunsup Lee; Andrew Waterman; Rimas Avizienis; John Wawrzynek; Krste Asanovic

In this paper we introduce Chisel, a new hardware construction language that supports advanced hardware design using highly parameterized generators and layered domain-specific hardware languages. By embedding Chisel in the Scala programming language, we raise the level of hardware design abstraction by providing concepts including object orientation, functional programming, parameterized types, and type inference. Chisel can generate a high-speed C++-based cycle-accurate software simulator, or low-level Verilog designed to map to either FPGAs or to a standard ASIC flow for synthesis. This paper presents Chisel, its embedding in Scala, hardware examples, and results for C++ simulation, Verilog emulation and ASIC synthesis.
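
To give a flavor of what a parameterized generator looks like in practice, here is a minimal sketch of a width-parameterized adder written in modern Chisel 3 syntax (the paper describes an earlier version of the language, so surface details differ).

```scala
// A minimal parameterized hardware generator in Chisel 3 syntax.
import chisel3._

class ParamAdder(width: Int) extends Module {
  val io = IO(new Bundle {
    val a   = Input(UInt(width.W))
    val b   = Input(UInt(width.W))
    val sum = Output(UInt((width + 1).W))
  })
  // +& is the width-expanding add, so the carry bit is not dropped.
  io.sum := io.a +& io.b
}

// Because the generator is ordinary Scala, a family of adders is one map away:
// val adders = Seq(8, 16, 32).map(w => () => new ParamAdder(w))
```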


Design Automation Conference | 2010

RAMP gold: an FPGA-based architecture simulator for multiprocessors

Zhangxi Tan; Andrew Waterman; Rimas Avizienis; Yunsup Lee; Henry Cook; David A. Patterson; Krste Asanovic

We present RAMP Gold, an economical FPGA-based architecture simulator that allows rapid early design-space exploration of manycore systems. The RAMP Gold prototype is a high-throughput, cycle-accurate full-system simulator that runs on a single Xilinx Virtex-5 FPGA board, and which simulates a 64-core shared-memory target machine capable of booting real operating systems. To improve FPGA implementation efficiency, functionality and timing are modeled separately and host multithreading is used in both models. We evaluate the prototype's performance using a modern parallel benchmark suite running on our manycore research operating system, achieving two orders of magnitude speedup compared to a widely-used software-based architecture simulator.
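
The decoupling of functional and timing models together with host multithreading can be pictured with a greatly simplified software sketch (not the RAMP Gold RTL): one host pipeline is time-multiplexed round-robin across many target cores, each carrying its own architectural state.

```scala
// A greatly simplified software sketch (not the RAMP Gold RTL) of host multithreading
// with decoupled functional and timing models.
object HostMultithreadingSketch extends App {
  final case class TargetCore(id: Int, var pc: Long = 0L, var cycles: Long = 0L)

  class Simulator(numTargets: Int) {
    private val targets = Array.tabulate(numTargets)(TargetCore(_))

    // Functional model: what the instruction does (stubbed out here).
    private def functionalStep(t: TargetCore): Unit = t.pc += 4

    // Timing model: how many target cycles the instruction costs (stubbed to 1).
    private def timingStep(t: TargetCore): Unit = t.cycles += 1

    // Each host cycle services one target core, round-robin.
    def runHostCycles(hostCycles: Long): Unit =
      (0L until hostCycles).foreach { c =>
        val t = targets((c % numTargets).toInt)
        functionalStep(t)
        timingStep(t)
      }

    def totalTargetCycles: Long = targets.map(_.cycles).sum
  }

  val sim = new Simulator(numTargets = 64)
  sim.runHostCycles(1000000L)
  println(sim.totalTargetCycles) // one million host cycles spread across 64 target cores
}
```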


International Symposium on Computer Architecture | 2010

A case for FAME: FPGA architecture model execution

Zhangxi Tan; Andrew Waterman; Henry Cook; Sarah Bird; Krste Asanovic; David A. Patterson

Given the multicore microprocessor revolution, we argue that the architecture research community needs a dramatic increase in simulation capacity. We believe FPGA Architecture Model Execution (FAME) simulators can increase the number of useful architecture research experiments per day by two orders of magnitude over Software Architecture Model Execution (SAME) simulators. To clear up misconceptions about FPGA-based simulation methodologies, we propose a FAME taxonomy to distinguish the cost-performance of variations on these ideas. We demonstrate our simulation speedup claim with a case study wherein we employ a prototype FAME simulator, RAMP Gold, to research the interaction between hardware partitioning mechanisms and operating system scheduling policy. The study demonstrates FAME's capabilities: we run a modern parallel benchmark suite on a research operating system, simulate 64-core target architectures with multi-level memory hierarchy timing models, and add experimental hardware mechanisms to the target machine. The simulation speedup achieved by our adoption of FAME (250×) enables experiments with more realistic time scales and data set sizes than are possible with SAME.
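
To see why a 250× speedup changes what experiments are feasible, a back-of-the-envelope calculation helps; the software-simulator slowdown below is an assumed figure for illustration, not a number reported in the paper.

```scala
// Back-of-the-envelope sketch of how a 250x simulator speedup changes experiment
// throughput. The SAME slowdown is an assumed figure, not one from the paper.
object FameThroughputSketch extends App {
  def experimentsPerDay(targetSeconds: Double, slowdownVsRealTime: Double): Double =
    86400.0 / (targetSeconds * slowdownVsRealTime)

  val sameSlowdown = 50000.0            // assumed slowdown of a software (SAME) simulator
  val fameSlowdown = sameSlowdown / 250 // FAME speedup of 250x, per the paper

  println(experimentsPerDay(10.0, sameSlowdown)) // ~0.17 runs/day of a 10 s target workload
  println(experimentsPerDay(10.0, fameSlowdown)) // ~43 runs/day of the same workload
}
```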


European Solid-State Circuits Conference | 2014

A 45nm 1.3GHz 16.7 double-precision GFLOPS/W RISC-V processor with vector accelerators

Yunsup Lee; Andrew Waterman; Rimas Avizienis; Henry Cook; Chen Sun; Vladimir Stojanovic; Krste Asanovic

A 64-bit dual-core RISC-V processor with vector accelerators has been fabricated in a 45nm SOI process. This is the first dual-core processor to implement the open-source RISC-V ISA designed at the University of California, Berkeley. In a standard 40nm process, the RISC-V scalar core scores 10% higher in DMIPS/MHz than the Cortex-A5, ARM's comparable single-issue in-order scalar core, and is 49% more area-efficient. To demonstrate the extensibility of the RISC-V ISA, we integrate a custom vector accelerator alongside each single-issue in-order scalar core. The vector accelerator is 1.8× more energy-efficient than the IBM Blue Gene/Q processor, and 2.6× more than the IBM Cell processor, both fabricated in the same process. The dual-core RISC-V processor achieves a maximum clock frequency of 1.3GHz at 1.2V and peak energy efficiency of 16.7 double-precision GFLOPS/W at 0.65V with an area of 3mm².
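
As a quick unit-conversion check (not part of the paper's methodology), 16.7 double-precision GFLOPS/W corresponds to roughly 60 pJ of energy per double-precision operation.

```scala
// Quick unit-conversion check: 16.7 DP GFLOPS/W is about 60 pJ per DP flop.
object EnergyPerFlopSketch extends App {
  val gflopsPerWatt = 16.7
  val joulesPerFlop = 1.0 / (gflopsPerWatt * 1e9) // W per (flop/s) equals J per flop
  println(joulesPerFlop * 1e12)                   // ≈ 59.9 pJ per double-precision flop
}
```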


Symposium on VLSI Circuits | 2015

A RISC-V vector processor with tightly-integrated switched-capacitor DC-DC converters in 28nm FDSOI

Brian Zimmer; Yunsup Lee; Alberto Puggelli; Jaehwa Kwak; Ruzica Jevtic; Ben Keller; Stevo Bailey; Milovan Blagojevic; Pi-Feng Chiu; Hanh-Phuc Le; Po-Hung Chen; Nicholas Sutardja; Rimas Avizienis; Andrew Waterman; Brian C. Richards; Philippe Flatresse; Elad Alon; Krste Asanovic; Borivoje Nikolic

This work demonstrates a RISC-V vector microprocessor implemented in 28nm FDSOI with fully-integrated non-interleaved switched-capacitor DC-DC (SC-DCDC) converters and adaptive clocking that generates four on-chip voltages between 0.5V and 1V using only 1.0V core and 1.8V IO voltage inputs. The design pushes the capabilities of dynamic voltage scaling by enabling fast transitions (20ns), simple packaging (no off-chip passives), low area overhead (16%), high conversion efficiency (80-86%), and high energy efficiency (26.2 DP GFLOPS/W) for mobile devices.


IEEE Journal of Solid-State Circuits | 2016

A RISC-V Vector Processor With Simultaneous-Switching Switched-Capacitor DC–DC Converters in 28 nm FDSOI

Brian Zimmer; Yunsup Lee; Alberto Puggelli; Jaehwa Kwak; Ruzica Jevtic; Ben Keller; Steven Bailey; Milovan Blagojevic; Pi-Feng Chiu; Hanh-Phuc Le; Po-Hung Chen; Nicholas Sutardja; Rimas Avizienis; Andrew Waterman; Brian C. Richards; Philippe Flatresse; Elad Alon; Krste Asanovic; Borivoje Nikolic

This work demonstrates a RISC-V vector microprocessor implemented in 28 nm FDSOI with fully integrated simultaneous-switching switched-capacitor DC-DC (SC DC-DC) converters and adaptive clocking that generates four on-chip voltages between 0.45 and 1 V using only 1.0 V core and 1.8 V IO voltage inputs. The converters achieve high efficiency at the system level by switching simultaneously to avoid charge-sharing losses and by using an adaptive clock to maximize performance for the resulting voltage ripple. Details about the implementation of the DC-DC switches, DC-DC controller, and adaptive clock are provided, and the sources of conversion loss are analyzed based on measured results. This system pushes the capabilities of dynamic voltage scaling by enabling fast transitions (20 ns), simple packaging (no off-chip passives), low area overhead (16%), high conversion efficiency (80%-86%), and high energy efficiency (26.2 DP GFLOPS/W) for mobile devices.
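
The reported 80%-86% conversion efficiency can be related to a standard first-order model of a switched-capacitor converter, in which the ideal ratioed output droops with load current through an effective output impedance; the sketch below uses illustrative numbers, not measured values from the paper, and ignores switching and parasitic losses.

```scala
// First-order sketch of a 2:1 switched-capacitor converter: the ideal output is Vin/2,
// and load current drops it across an effective output impedance Rout. Only that
// conduction loss is counted; numbers are illustrative assumptions, not measured values.
object ScConverterSketch extends App {
  def scEfficiency(vin: Double, iLoad: Double, rOut: Double): Double = {
    val vIdeal = vin / 2.0            // ideal 2:1 conversion ratio
    val vOut   = vIdeal - iLoad * rOut
    vOut / vIdeal                     // efficiency relative to the ideal ratioed output
  }

  println(scEfficiency(vin = 1.0, iLoad = 0.5, rOut = 0.05)) // ~0.95 for the assumed load
}
```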


IEEE Micro | 2016

An Agile Approach to Building RISC-V Microprocessors

Yunsup Lee; Andrew Waterman; Henry Cook; Brian Zimmer; Ben Keller; Alberto Puggelli; Jaehwa Kwak; Ruzica Jevtic; Stevo Bailey; Milovan Blagojevic; Pi-Feng Chiu; Rimas Avizienis; Brian C. Richards; Jonathan Bachrach; David A. Patterson; Elad Alon; Bora Nikolic; Krste Asanovic

The final phase of CMOS technology scaling provides continued increases in already vast transistor counts, but only minimal improvements in energy efficiency, thus requiring innovation in circuits and architectures. However, even huge teams are struggling to complete large, complex designs on schedule using traditional rigid development flows. This article presents an agile hardware development methodology, which the authors adopted for 11 RISC-V microprocessor tape-outs on modern 28-nm and 45-nm CMOS processes in the past five years. The authors discuss how this approach enabled small teams to build energy-efficient, cost-effective, and industry-competitive high-performance microprocessors in a matter of months. Their agile methodology relies on rapid iterative improvement of fabricatable prototypes using hardware generators written in Chisel, a new hardware description language embedded in a modern programming language. The parameterized generators construct highly customized systems based on the free, open, and extensible RISC-V platform. The authors present a case study of one such prototype featuring a RISC-V vector microprocessor integrated with a switched-capacitor DC-DC converter alongside an adaptive clock generator in a 28-nm, fully depleted silicon-on-insulator process.


The British Journal for the Philosophy of Science | 2005

Bayesian confirmation and auxiliary hypotheses revisited: A reply to Strevens

Branden Fitelson; Andrew Waterman

Michael Strevens ([2001]) has proposed an interesting and novel Bayesian analysis of the Quine–Duhem (Q–D) problem (i.e., the problem of auxiliary hypotheses). Strevens's analysis involves the use of a simplifying idealization concerning the original Q–D problem. We will show that this idealization is far stronger than it might appear. Indeed, we argue that Strevens's idealization oversimplifies the Q–D problem, and we propose a diagnosis of the source(s) of the oversimplification. Outline: 1. Some background on Quine–Duhem; 2. Strevens's simplifying idealization; 3. Indications that (I) oversimplifies Q–D; 4. Strevens's argument for the legitimacy of (I).

Collaboration


Dive into Andrew Waterman's collaborations.

Top Co-Authors

Yunsup Lee, University of California
Krste Asanovic, University of California
Henry Cook, University of California
Ben Keller, University of California
Brian Zimmer, University of California
Elad Alon, University of California