Publication


Featured research published by Ian Buck.


international conference on computer graphics and interactive techniques | 2008

Scalable Parallel Programming with CUDA

John R. Nickolls; Ian Buck; Michael Garland; Kevin Skadron

Presents a collection of slides covering the following topics: CUDA parallel programming model; CUDA toolkit and libraries; performance optimization; and application development.
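As a rough illustration of the CUDA parallel programming model the slides cover (this is not code from the paper itself), a minimal SAXPY kernel and launch might look like the sketch below; the kernel name, problem size, and launch configuration are arbitrary choices for the example.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// SAXPY: y = a*x + y, one thread per element.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) y[i] = a * x[i] + y[i];              // guard against overrun
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));   // unified memory, for brevity
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;  // enough blocks to cover n
    saxpy<<<blocks, threads>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);                // expect 4.0
    cudaFree(x); cudaFree(y);
    return 0;
}
```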


international conference on computer graphics and interactive techniques | 2004

Brook for GPUs: stream computing on graphics hardware

Ian Buck; Tim Foley; Daniel Reiter Horn; Jeremy Sugerman; Kayvon Fatahalian; Mike Houston; Pat Hanrahan

In this paper, we present Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming co-processor. We present a compiler and runtime system that abstracts and virtualizes many aspects of graphics hardware. In addition, we present an analysis of the effectiveness of the GPU as a compute engine compared to the CPU, to determine when the GPU can outperform the CPU for a particular algorithm. We evaluate our system with five applications: the SAXPY and SGEMV BLAS operators, image segmentation, FFT, and ray tracing. For these applications, we demonstrate that our Brook implementations perform comparably to hand-written GPU code and up to seven times faster than their CPU counterparts.
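Brook source is not reproduced here; as a hedged sketch in CUDA rather than Brook, the SGEMV operator the paper benchmarks (y = alpha*A*x + beta*y) can be written with one thread per output row. The row-major layout and parameter names are assumptions made for the illustration.

```cuda
// Hypothetical CUDA sketch of SGEMV (y = alpha*A*x + beta*y), one thread per row.
// A is m x n in row-major order; this is an illustration, not Brook code.
__global__ void sgemv(int m, int n, float alpha, const float *A,
                      const float *x, float beta, float *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < m) {
        float acc = 0.0f;
        for (int j = 0; j < n; ++j)
            acc += A[row * n + j] * x[j];   // dot product of one matrix row with x
        y[row] = alpha * acc + beta * y[row];
    }
}
```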


international conference on computer graphics and interactive techniques | 2002

Ray tracing on programmable graphics hardware

Timothy John Purcell; Ian Buck; William R. Mark; Pat Hanrahan

Recently a breakthrough has occurred in graphics hardware: fixed function pipelines have been replaced with programmable vertex and fragment processors. In the near future, the graphics pipeline is likely to evolve into a general programmable stream processor capable of more than simply feed-forward triangle rendering. In this paper, we evaluate these trends in programmability of the graphics pipeline and explain how ray tracing can be mapped to graphics hardware. Using our simulator, we analyze the performance of a ray casting implementation on next generation programmable graphics hardware. In addition, we compare the performance difference between non-branching programmable hardware using a multipass implementation and an architecture that supports branching. We also show how this approach is applicable to other ray tracing algorithms such as Whitted ray tracing, path tracing, and hybrid rendering algorithms. Finally, we demonstrate that ray tracing on graphics hardware could prove to be faster than CPU based implementations as well as competitive with traditional hardware accelerated feed-forward triangle rendering.
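The paper targets fragment programs and a hardware simulator, not CUDA, but its core idea of casting rays in parallel can be hedged into a small modern sketch: one thread per primary ray, each testing a single sphere. The scene, camera setup, and names below are invented for illustration.

```cuda
// Illustrative CUDA sketch (not the paper's fragment-program implementation):
// one thread casts one primary ray against a single sphere and writes a hit mask.
struct Ray { float3 o, d; };   // origin and (normalized) direction

__device__ bool hitSphere(const Ray &r, float3 c, float radius, float &t) {
    float3 oc = make_float3(r.o.x - c.x, r.o.y - c.y, r.o.z - c.z);
    float b = oc.x * r.d.x + oc.y * r.d.y + oc.z * r.d.z;
    float cc = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - radius * radius;
    float disc = b * b - cc;                 // quadratic discriminant
    if (disc < 0.0f) return false;           // ray misses the sphere
    t = -b - sqrtf(disc);                    // nearest intersection distance
    return t > 0.0f;
}

__global__ void rayCast(int width, int height, float3 sphereCenter,
                        float radius, unsigned char *image) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Simple orthographic camera looking down -z; one ray per pixel.
    Ray r;
    r.o = make_float3(x - width * 0.5f, y - height * 0.5f, 0.0f);
    r.d = make_float3(0.0f, 0.0f, -1.0f);

    float t;
    image[y * width + x] = hitSphere(r, sphereCenter, radius, t) ? 255 : 0;
}
```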


international conference on computer graphics and interactive techniques | 2001

WireGL: a scalable graphics system for clusters

Greg Humphreys; Matthew Eldridge; Ian Buck; Gordon Stoll; Matthew Everett; Pat Hanrahan

We describe WireGL, a system for scalable interactive rendering on a cluster of workstations. WireGL provides the familiar OpenGL API to each node in a cluster, virtualizing multiple graphics accelerators into a sort-first parallel renderer with a parallel interface. We also describe techniques for reassembling an output image from a set of tiles distributed over a cluster. Using flexible display management, WireGL can drive a variety of output devices, from standalone displays to tiled display walls. By combining the power of virtual graphics, the familiarity and ordered semantics of OpenGL, and the scalability of clusters, we are able to create time-varying visualizations that sustain rendering performance over 70,000,000 triangles per second at interactive refresh rates using 16 compute nodes and 16 rendering nodes.
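WireGL's actual image-reassembly path uses its own protocols and display hardware; purely as a hedged illustration of the idea of reassembling an output image from tiles distributed over a cluster, the host-side loop below copies per-node tiles back into a single full-resolution framebuffer. The RGBA8, row-major layout and the struct names are assumptions for the sketch.

```cuda
// Host-side sketch: reassemble a final image from tiles rendered by cluster nodes.
// Assumes row-major RGBA8 buffers and that each tile's placement is known.
#include <cstring>
#include <vector>

struct Tile {
    int x0, y0, width, height;          // placement in the final image
    std::vector<unsigned char> pixels;  // width * height * 4 bytes (RGBA)
};

void reassemble(const std::vector<Tile> &tiles, unsigned char *finalImage,
                int finalWidth) {
    for (const Tile &t : tiles)
        for (int row = 0; row < t.height; ++row)
            std::memcpy(finalImage + ((t.y0 + row) * finalWidth + t.x0) * 4,
                        t.pixels.data() + row * t.width * 4,
                        t.width * 4);   // copy one scanline of the tile
}
```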


conference on high performance computing (supercomputing) | 2003

Merrimac: Supercomputing with Streams

William J. Dally; Francois Labonte; Abhishek Das; Pat Hanrahan; Jung Ho Ahn; Jayanth Gummaraju; Mattan Erez; Nuwan Jayasena; Ian Buck; Timothy J. Knight; Ujval J. Kapasi

Merrimac uses stream architecture and advanced interconnection networks to give an order of magnitude more performance per unit cost than cluster-based scientific computers built from the same technology. Organizing the computation into streams and exploiting the resulting locality using a register hierarchy enables a stream architecture to reduce the memory bandwidth required by representative applications by an order of magnitude or more. Hence a processing node with a fixed bandwidth (expensive) can support an order of magnitude more arithmetic units (inexpensive). This in turn allows a given level of performance to be achieved with fewer nodes (a 1-PFLOPS machine, for example, with just 8,192 nodes), resulting in greater reliability and simpler system management. We sketch the design of Merrimac, a streaming scientific computer that can be scaled from a $20K 2 TFLOPS workstation to a $20M 2 PFLOPS supercomputer, and present the results of some initial application experiments on this architecture.
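Merrimac's stream register hierarchy is its own hardware design, but the bandwidth argument can be hedged into a familiar CUDA analogy: stage a block's working set in fast on-chip memory so each value fetched from DRAM is reused by several arithmetic operations. The stencil kernel below is an invented example of that pattern, not Merrimac code.

```cuda
// Illustrative CUDA analogy (not Merrimac's architecture): stage data in on-chip
// shared memory so each DRAM word is reused many times, cutting off-chip bandwidth.
#define RADIUS 3
#define BLOCK 256   // launch with blockDim.x == BLOCK

__global__ void stencil1d(int n, const float *in, float *out) {
    __shared__ float tile[BLOCK + 2 * RADIUS];   // block's working set on chip

    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    int lid = threadIdx.x + RADIUS;

    tile[lid] = (gid < n) ? in[gid] : 0.0f;      // each thread loads one element
    if (threadIdx.x < RADIUS) {                  // plus the halo at the block edges
        tile[lid - RADIUS] = (gid >= RADIUS) ? in[gid - RADIUS] : 0.0f;
        tile[lid + BLOCK]  = (gid + BLOCK < n) ? in[gid + BLOCK] : 0.0f;
    }
    __syncthreads();

    if (gid < n) {
        float acc = 0.0f;
        for (int k = -RADIUS; k <= RADIUS; ++k)  // 2*RADIUS+1 reads, all on chip
            acc += tile[lid + k];
        out[gid] = acc / (2 * RADIUS + 1);
    }
}
```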


international conference on computer graphics and interactive techniques | 2004

GPGPU: general purpose computation on graphics hardware

David Luebke; Mark J. Harris; Jens H. Krüger; Timothy John Purcell; Naga K. Govindaraju; Ian Buck; Cliff Woolley; Aaron E. Lefohn

The graphics processor (GPU) on today's commodity video cards has evolved into an extremely powerful and flexible processor. The latest graphics architectures provide tremendous memory bandwidth and computational horsepower, with fully programmable vertex and pixel processing units that support vector operations up to full IEEE floating point precision. High-level languages have emerged for graphics hardware, making this computational power accessible. Architecturally, GPUs are highly parallel streaming processors optimized for vector operations, with both MIMD (vertex) and SIMD (pixel) pipelines. Not surprisingly, these processors are capable of general-purpose computation beyond the graphics applications for which they were designed. Researchers have found that exploiting the GPU can accelerate some problems by over an order of magnitude over the CPU. However, significant barriers still exist for the developer who wishes to use the inexpensive power of commodity graphics hardware, whether for in-game simulation of physics or for conventional computational science. These chips are designed for and driven by video game development; the programming model is unusual, the programming environment is tightly constrained, and the underlying architectures are largely secret. The GPU developer must be an expert in computer graphics and its computational idioms to make effective use of the hardware, and still pitfalls abound. This course provides a detailed introduction to general purpose computation on graphics hardware (GPGPU). We emphasize core computational building blocks, ranging from linear algebra to database queries, and review the tools, perils, and tricks of the trade in GPU programming. Finally, we present some interesting and important case studies on general-purpose applications of graphics hardware. The course presenters are experts on general-purpose GPU computation from academia and industry, and have presented papers and tutorials on the topic at SIGGRAPH, Graphics Hardware, Game Developers Conference, and elsewhere.
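As one example of the core computational building blocks such a course covers, here is a hedged CUDA sketch of a block-level parallel reduction (sum); the original course material used fragment programs rather than CUDA, and the names here are illustrative.

```cuda
// Illustrative GPGPU building block (in CUDA rather than fragment programs):
// each block reduces its slice of the input to one partial sum in shared memory.
// Assumes blockDim.x is a power of two.
__global__ void blockSum(int n, const float *in, float *partialSums) {
    extern __shared__ float scratch[];            // one float per thread
    int tid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + tid;

    scratch[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) scratch[tid] += scratch[tid + stride];
        __syncthreads();
    }
    if (tid == 0) partialSums[blockIdx.x] = scratch[0];  // one value per block
}
// Launch as: blockSum<<<blocks, threads, threads * sizeof(float)>>>(n, in, partial);
// The per-block partial sums can then be reduced on the host or with a second pass.
```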


conference on high performance computing (supercomputing) | 2000

Distributed Rendering for Scalable Displays

Greg Humphreys; Ian Buck; Matthew Eldridge; Pat Hanrahan

We describe a novel distributed graphics system that allows an application to render to a large tiled display. Our system, called WireGL, uses a cluster of off-the-shelf PCs connected with a high-speed network. WireGL allows an unmodified existing application to achieve scalable output resolution on such a display. This paper presents an efficient sorting algorithm that minimizes the network traffic for a scalable display. We demonstrate that for most applications, our system provides scalable output resolution with minimal performance impact.
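The paper's sorting algorithm itself is not reproduced here; as a hedged sketch of the sort-first idea it builds on, the host-side function below computes which tile servers a primitive's screen-space bounding box overlaps, so geometry is sent only to those servers and network traffic stays low. The tile layout and names are assumptions.

```cuda
// Host-side sketch of sort-first classification (illustration, not the paper's code):
// a primitive is forwarded only to servers whose screen tiles its bounding box touches.
#include <vector>
#include <algorithm>

struct BBox { float xmin, ymin, xmax, ymax; };   // screen-space bounds of a primitive

// Tiles form a grid of tileCols x tileRows, each tileW x tileH pixels,
// with one server per tile, numbered in row-major order.
std::vector<int> overlappedServers(const BBox &b, int tileW, int tileH,
                                   int tileCols, int tileRows) {
    int c0 = std::max(0, (int)(b.xmin / tileW));
    int c1 = std::min(tileCols - 1, (int)(b.xmax / tileW));
    int r0 = std::max(0, (int)(b.ymin / tileH));
    int r1 = std::min(tileRows - 1, (int)(b.ymax / tileH));

    std::vector<int> servers;
    for (int r = r0; r <= r1; ++r)
        for (int c = c0; c <= c1; ++c)
            servers.push_back(r * tileCols + c);   // only these receive the geometry
    return servers;
}
```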


international conference on computer graphics and interactive techniques | 2000

Tracking graphics state for networked rendering

Ian Buck; Greg Humphreys; Pat Hanrahan

As networks get faster, it becomes more feasible to render large data sets remotely. For example, it is useful to run large scientific simulations on remote compute servers but visualize the results of those simulations on one or more local displays. The WireGL project at Stanford is researching new techniques for rendering over a network. For many applications, we can render remotely over a gigabit network to a tiled display with little or no performance loss over running locally. One of the elements of WireGL that makes this performance possible is our ability to track the graphics state of a running application. In this paper, we describe our techniques for tracking state, as well as efficient algorithms for computing the difference between two graphics contexts. This fast differencing operation allows WireGL to transmit less state data over the network by updating server state lazily. It also allows our system to context switch between multiple graphics applications several million times per second without flushing the hardware accelerator. This results in substantial performance gains when sharing a remote display between multiple clients.
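WireGL's real state tracker covers the full OpenGL state vector with hierarchical dirty bits; as a hedged miniature of the differencing idea, the sketch below compares a desired context against the state the server last saw and transmits only the entries that differ. The two-element state struct and the printf stand-in for a network send are invented for illustration.

```cuda
// Host-side miniature of lazy state differencing (illustrative, not WireGL's code):
// only state elements whose values differ between contexts are (re)transmitted.
#include <cstdio>

struct GfxState {
    float lineWidth;
    bool  depthTest;
};

// Send only the elements of 'wanted' that differ from what the server last saw,
// then record the server's new view of the state.
void switchContext(const GfxState &serverState, const GfxState &wanted,
                   GfxState &newServerState) {
    newServerState = serverState;
    if (wanted.lineWidth != serverState.lineWidth) {
        printf("send LineWidth(%f)\n", wanted.lineWidth);   // stand-in for network send
        newServerState.lineWidth = wanted.lineWidth;
    }
    if (wanted.depthTest != serverState.depthTest) {
        printf("send DepthTest(%d)\n", wanted.depthTest ? 1 : 0);
        newServerState.depthTest = wanted.depthTest;
    }
}
```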


international conference on parallel architectures and compilation techniques | 2004

The Stream Virtual Machine

Francois Labonte; Peter Mattson; William Thies; Ian Buck; Christos Kozyrakis; Mark Horowitz



high performance computing and communications | 2009

Fast Parallel Expectation Maximization for Gaussian Mixture Models on GPUs Using CUDA

N. S. L. Phani Kumar; Sanjiv Satoor; Ian Buck

