Doug Carmean
Intel
Publication
Featured research published by Doug Carmean.
international symposium on computer architecture | 2002
Eric Sprangle; Doug Carmean
One architectural method for increasing processor performance involves increasing the frequency by implementing deeper pipelines. This paper will explore the relationship between performance and pipeline depth using a Pentium® 4 processor-like architecture as a baseline and will show that deeper pipelines can continue to increase performance. This paper will show that the branch misprediction latency is the single largest contributor to performance degradation as pipelines are stretched, and therefore branch prediction and fast branch recovery will continue to increase in importance. We will also show that higher-performance cores, implemented with longer pipelines for example, will put more pressure on the memory system, and therefore require larger on-chip caches. Finally, we will show that in the same process technology, designing deeper pipelines can increase the processor frequency by 100%, which, when combined with larger on-chip caches, can yield performance improvements of 35% to 90% over a Pentium® 4-like processor.
international symposium on microarchitecture | 2009
Larry Seiler; Doug Carmean; Eric Sprangle; Tom Forsyth; Pradeep Dubey; Stephen Junkins; Adam T. Lake; Robert D. Cavin; Roger Espasa; Ed Grochowski; Toni Juan; Michael Abrash; Jeremy Sugerman; Pat Hanrahan
The Larrabee many-core visual computing architecture uses multiple in-order x86 cores augmented by wide vector processor units, together with some fixed-function logic. This increases the architecture's programmability as compared to standard GPUs. The article describes the Larrabee architecture, a software renderer optimized for it, and other highly parallel applications. The article analyzes performance through scalability studies based on real-world workloads.
IEEE Transactions on Visualization and Computer Graphics | 2009
Mikhail Smelyanskiy; David R. Holmes; Jatin Chhugani; Alan Larson; Doug Carmean; Dennis P. Hanson; Pradeep Dubey; Kurt E. Augustine; Daehyun Kim; Alan B. Kyker; Victor W. Lee; Anthony D. Nguyen; Larry Seiler; Richard A. Robb
Medical volumetric imaging requires high-fidelity, high-performance rendering algorithms. We motivate and analyze new volumetric rendering algorithms that are suited to modern parallel processing architectures. First, we describe the three major categories of volume rendering algorithms and confirm through an imaging scientist-guided evaluation that ray-casting is the most acceptable. We describe a thread- and data-parallel implementation of ray-casting that makes it amenable to key architectural trends of three modern commodity parallel architectures: multi-core, GPU, and an upcoming many-core Intel® architecture code-named Larrabee. We achieve more than an order of magnitude performance improvement on a number of large 3D medical datasets. We further describe a data compression scheme that significantly reduces data-transfer overhead. This allows our approach to scale well to large numbers of Larrabee cores.
automation, robotics and control systems | 2013
Mageda Sharafeddine; Haitham Akkary; Doug Carmean
This paper presents a novel high-performance substrate for building energy-efficient out-of-order superscalar cores. The architecture does not require a reorder buffer or physical registers for register renaming and instruction retirement. Instead, it uses a large number of virtual register IDs for register renaming, a physical register file of the same size as the logical register file, and checkpoints to bulk retire instructions and to recover from exceptions and branch mispredictions. By eliminating physical register renaming and the reorder buffer, the architecture not only eliminates complex, power-hungry hardware structures, but also reduces reorder buffer capacity stalls when execution encounters long delays from data cache misses, thus improving performance. The paper presents a performance and power evaluation of this new architecture using SPEC 2006 benchmarks. The performance data was collected using an x86 ASIM-based performance simulator from Intel Labs. The data shows that the new architecture improves performance of a 2-wide out-of-order x86 processor core by an average of 4.2%, while saving 43% of the energy consumption of the reorder buffer and retirement register file functional block.
international conference on computer aided design | 2012
Shih-Lien Lu; Tanay Karnik; Ganapati Srinivasa; Kai-Yuan Chao; Doug Carmean; Jim Held
DRAM has been the technology for computer main memory since Intel released the first commercial DRAM chip (i1103) in 1970. As technology scales and demand for memory performance grows, DRAM appears to be facing several challenges. Many other memory technologies are anticipated to replace it, but none has emerged as a clear winner thus far. In this paper we pose the question: is it possible to re-examine the design of DRAM to continue its life for at least another decade?
international conference on computer graphics and interactive techniques | 2008
Larry Seiler; Doug Carmean; Eric Sprangle; Tom Forsyth; Michael Abrash; Pradeep Dubey; Stephen Junkins; Adam T. Lake; Jeremy Sugerman; Robert D. Cavin; Roger Espasa; Ed Grochowski; Toni Juan; Pat Hanrahan
Archive | 2014
Eric Sprangle; Doug Carmean; Rajesh Kumar
european conference on computer systems | 2007
Bratin Saha; Ali-Reza Adl-Tabatabai; Anwar M. Ghuloum; Mohan Rajagopalan; Richard L. Hudson; Leaf Petersen; Vijay Menon; Brian R. Murphy; Tatiana Shpeisman; Eric Sprangle; Anwar Rohillah; Doug Carmean; Jesse Fang
Archive | 2002
Herbert H. J. Hum; Doug Carmean
Archive | 2008
Herbert H. J. Hum; Eric Sprangle; Doug Carmean; Rajesh Kumar