Publication


Featured research published by Charles R. Moore.


international symposium on computer architecture | 2003

Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture

Karthikeyan Sankaralingam; Ramadass Nagarajan; Haiming Liu; Changkyu Kim; Jaehyuk Huh; Doug Burger; Stephen W. Keckler; Charles R. Moore

This paper describes the polymorphous TRIPS architecture, which can be configured for different granularities and types of parallelism. TRIPS contains mechanisms that enable the processing cores and the on-chip memory system to be configured and combined in different modes for instruction, data, or thread-level parallelism. To adapt to small and large-grain concurrency, the TRIPS architecture contains four out-of-order, 16-wide-issue Grid Processor cores, which can be partitioned when easily extractable fine-grained parallelism exists. This approach to polymorphism provides better performance across a wide range of application types than an approach in which many small processors are aggregated to run workloads with irregular parallelism. Our results show that high performance can be obtained in each of the three modes (ILP, TLP, and DLP), demonstrating the viability of the polymorphous coarse-grained approach for future microprocessors.


IEEE Computer | 2004

Scaling to the end of silicon with EDGE architectures

Doug Burger; Stephen W. Keckler; Kathryn S. McKinley; Michael Dahlin; Lizy Kurian John; Calvin Lin; Charles R. Moore; James H. Burrill; Robert McDonald; William Yoder

Microprocessor designs are on the verge of a post-RISC era in which companies must introduce new ISAs to address the challenges that modern CMOS technologies pose while also exploiting the massive levels of integration now possible. To meet these challenges, we have developed a new class of ISAs, called explicit data graph execution (EDGE), that will match the characteristics of semiconductor technology over the next decade. The TRIPS architecture is the first instantiation of an EDGE instruction set, a new, post-RISC class of instruction set architectures intended to match semiconductor technology evolution over the next decade, scaling to new levels of power efficiency and high performance.


international conference on computer design | 2003

Exploiting microarchitectural redundancy for defect tolerance

Premkishore Shivakumar; Stephen W. Keckler; Charles R. Moore; Doug Burger

The continued increase in microprocessor clock frequency that has come from advancements in fabrication technology and reductions in feature size creates challenges in maintaining both manufacturing yield rates and long-term reliability of devices. Methods based on defect detection and reduction may not offer a scalable solution due to the cost of eliminating contaminants in the manufacturing process and increasing chip complexity. This paper proposes to use the inherent redundancy available in existing and future chip microarchitectures to improve yield and enable graceful performance degradation in fail-in-place systems. We introduce a new yield metric called performance averaged yield (Ypav), which accounts both for fully functional chips and for those that exhibit some performance degradation. Our results indicate that at 250nm we are able to increase the Ypav of a uniprocessor with only redundant rows in its caches from a base value of 85% to 98% using microarchitectural redundancy. Given constant chip area, shrinking feature sizes increases fault susceptibility and reduces the base Ypav to 60% at 50nm, which exploiting microarchitectural redundancy then increases to 99.6%.
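The abstract does not give a formula for Ypav. One plausible reading, offered here only as an assumption, is a weighted sum in which each chip configuration contributes its manufacturing probability scaled by its performance relative to a fully functional chip. The function name and the example numbers below are hypothetical and are not taken from the paper.

```python
def performance_averaged_yield(configurations):
    """Weighted-sum sketch of a performance-averaged yield (assumed form).

    `configurations` is a list of (probability, relative_performance) pairs:
    the probability that a manufactured chip ends up in that configuration,
    and its performance relative to a fully functional chip (1.0 = full).
    Unusable chips are simply left out, so they contribute zero.
    """
    total_probability = sum(p for p, _ in configurations)
    if total_probability > 1.0 + 1e-9:
        raise ValueError("configuration probabilities exceed 1")
    return sum(p * rel for p, rel in configurations)


# Hypothetical numbers: 85% fully functional, 10% running at 90%
# performance, 3% running at 80% performance, 2% unusable.
example = [(0.85, 1.0), (0.10, 0.9), (0.03, 0.8)]
ypav = performance_averaged_yield(example)
```

Under this reading, a degraded but usable chip raises the metric above the conventional yield (0.85 in the example) without being counted as a full chip.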


international symposium on microarchitecture | 2003

Scalable hardware memory disambiguation for high ILP processors

Simha Sethumadhavan; Rajagopalan Desikan; Doug Burger; Charles R. Moore; Stephen W. Keckler

This paper describes several methods for improving the scalability of memory disambiguation hardware for future high ILP processors. As the number of in-flight instructions grows with issue width and pipeline depth, the load/store queues (LSQ) threaten to become a bottleneck in both power and latency. By employing lightweight approximate hashing in hardware with structures called Bloom filters, many improvements to the LSQ are possible. We propose two types of filtering schemes using Bloom filters: search filtering, which uses hashing to reduce both the number of lookups to the LSQ and the number of entries that must be searched, and state filtering, in which the number of entries kept in the LSQs is reduced by coupling address predictors and Bloom filters, permitting smaller queues. We evaluate these techniques for LSQs indexed by both instruction age and the instruction's effective address, and for both centralized and physically partitioned LSQs. We show that search filtering avoids up to 98% of the associative LSQ searches, providing significant power savings and keeping LSQ searches to under one high-frequency clock cycle. We also show that with state filtering, the load queue can be eliminated altogether with only minor reductions in performance for small instruction window machines.
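The idea of search filtering can be illustrated with a small software model. This is a sketch under stated assumptions, not the hardware design from the paper: the class names, the counter count, and the hash functions are all hypothetical, and a counting Bloom filter is used so that entries can be removed when a store commits.

```python
class CountingBloomFilter:
    """Counting Bloom filter over addresses; counters allow removal."""

    def __init__(self, num_counters=64, num_hashes=2):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, address):
        # Illustrative hashes only: scale the word address by distinct
        # odd constants and reduce modulo the number of counters.
        n = len(self.counters)
        word = address >> 2
        return [(word * (2 * i + 3) + i) % n for i in range(self.num_hashes)]

    def insert(self, address):
        for i in self._indexes(address):
            self.counters[i] += 1

    def remove(self, address):
        for i in self._indexes(address):
            self.counters[i] -= 1

    def may_contain(self, address):
        # False means "definitely absent"; True means "possibly present".
        return all(self.counters[i] > 0 for i in self._indexes(address))


class FilteredStoreQueue:
    """Toy store queue that consults the filter before a full search."""

    def __init__(self):
        self.filter = CountingBloomFilter()
        self.stores = []   # addresses of in-flight stores, oldest first
        self.searches = 0  # number of full (associative) searches performed

    def issue_store(self, address):
        self.stores.append(address)
        self.filter.insert(address)

    def commit_oldest_store(self):
        self.filter.remove(self.stores.pop(0))

    def load_conflicts(self, address):
        if not self.filter.may_contain(address):
            return False   # filter rules out a match: skip the search
        self.searches += 1
        return address in self.stores
```

Because a Bloom filter never reports a present address as absent, skipping the search on a negative answer is safe; a false positive only costs one unnecessary search.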


ACM Transactions on Architecture and Code Optimization | 2004

TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP

Karthikeyan Sankaralingam; Ramadass Nagarajan; Haiming Liu; Changkyu Kim; Jaehyuk Huh; Nitya Ranganathan; Doug Burger; Stephen W. Keckler; Robert McDonald; Charles R. Moore

This paper describes the polymorphous TRIPS architecture that can be configured for different granularities and types of parallelism. The TRIPS architecture is the first in a class of post-RISC, dataflow-like instruction sets called explicit data-graph execution (EDGE). This EDGE ISA is coupled with hardware mechanisms that enable the processing cores and the on-chip memory system to be configured and combined in different modes for instruction, data, or thread-level parallelism. To adapt to small and large-grain concurrency, the TRIPS architecture prototype contains two out-of-order, 16-wide-issue grid processor cores, which can be partitioned when easily extractable fine-grained parallelism exists. This approach to polymorphism provides better performance across a wide range of application types than an approach in which many small processors are aggregated to run workloads with irregular parallelism. Our results show that high performance can be obtained in each of the three modes (ILP, TLP, and DLP), demonstrating the viability of the polymorphous coarse-grained approach for future microprocessors.


international symposium on microarchitecture | 1993

The PowerPC 601 microprocessor

Michael K. Becker; Michael S. Allen; Charles R. Moore; John Stephen Muhich; David P. Tuttle

The PowerPC 601 microprocessor, the first of a family of processors based on the PowerPC architecture, is described. The general-purpose processor contains a 32-Kb cache and a superscalar machine organization that allows dispatch and execution of up to three instructions each clock cycle. The bus interface and storage control mechanisms can be configured for a wide range of system designs, from low-cost desktop personal computers to high-performance multi-processor systems. The PowerPC architecture, machine organization, chip packaging technology, and performance are discussed.


international symposium on microarchitecture | 2003

Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture

Karthikeyan Sankaralingam; Ramadass Nagarajan; Haiming Liu; Changkyu Kim; Jaehyuk Huh; Doug Burger; Stephen W. Keckler; Charles R. Moore

The Tera-op reliable intelligently adaptive processing system (TRIPS) architecture seeks to deliver system-level configurability to applications and runtime systems. It does so by employing the concept of polymorphism, which permits the runtime system to configure the hardware execution resources to match the mode of execution and demands of the compiler and application.


international solid-state circuits conference | 2003

A wire-delay scalable microprocessor architecture for high performance systems

Stephen W. Keckler; Doug Burger; Charles R. Moore; Ramadass Nagarajan; Karthikeyan Sankaralingam; Vikas Agarwal; M. S. Hrishikesh; Nitya Ranganathan; Premkishore Shivakumar

This scalable processor architecture consists of chained ALUs to minimize the physical distance between dependent instructions, thus mitigating the effect of long on-chip wire delays. Simulation studies demonstrate 1.3-15× more instructions per clock than conventional superscalar architectures.


international symposium on microarchitecture | 2004

Parting Thoughts - Managing the transition from complexity to elegance: design convergence

Charles R. Moore

Basically, this means that you are very explicit about what goes into the design, and that you are committed to a rigorous refinement process. You carefully choose what design features to invest in and then become ruthless in collapsing other potential features in ways that make use of these investments. Rather than indulging in many small, low-cost, somewhat-useful features, you remain focused and say, “No, we’re not going to do that.” Instead, these features must fit within one of the existing mechanisms, even if it means a somewhat less-than-optimal implementation of that feature. (Remember, the fundamental mechanisms that you’ve already chosen are based on the most important requirements for the design.) It is interesting to note that to the extent the design can support these new features, folding them into the existing framework actually furthers the value of the original design investments. And so, the question remains, How do you do this? How do you choose what is most important? And, what’s the process you use to collapse unnecessary complexity? Obviously, there is no set formula, but the following development philosophies have proven helpful in projects that I have led.


international symposium on microarchitecture | 2003

Micro's top picks from microarchitecture conferences

Charles R. Moore; Kevin W. Rudd; Ruby B. Lee; Pradip Bose

IEEE Micro focuses on contemporary design issues facing chip and hardware system designers. We publish in-depth descriptions of new or soon-to-be-announced designs from the industry to highlight both the design and the design issues addressed. As a result, IEEE Micro is a very important source of information on a wide range of processor techniques used in the industry today. As many readers are certainly aware, there are several important computer architecture conferences each year where researchers from academia and industry present papers that describe emerging issues and propose new ideas for addressing them. We thought that it might be interesting to identify representative “Top Picks” from these conferences over the past year, based on their industry relevance and interest to the largest possible segment of IEEE Micro’s readership. This would give more visibility to these ideas, facilitate potential industry adoption, and provide wider recognition to the authors. We invited the selected papers’ authors to submit essentially the same paper on the same ideas, allowing updates on only minor new work done since the conference publication, along with rewriting and editing to tailor it to IEEE Micro audiences. Because of the abundance of excellent submissions and shortage of space, we invited some as short articles, where the authors had to significantly condense their original work. As you might expect, it is difficult, even risky, for us to try to identify which papers represent the “top” ones and which might have the biggest impact on future industrial designs. Essentially, every submission we received represented excellent ideas, since each submission had already been selected and carefully peer-reviewed for a highly selective architecture conference. Each submission also represented an immense amount of thought and effort by an author or team of authors. 
Although quantitative approaches have been advocated and widely used in research as a means for identifying promising ideas, there is still an art to computer architecture that you simply cannot quantify. In recognition of this fact, we allowed our review process to include some amount of qualitative assessment. As a result, these selections reflect our impressions of the work, and are only a very small set of representative “top” ideas. There were many excellent submissions that we did not have the space to include. In addition, we only considered work that authors had described in a submitted abstract in response to our call for submissions, so it is also highly likely that there are some great ideas out there that this issue does not cover at all. Charles Moore

Top Co-Authors

Ramadass Nagarajan

University of Texas at Austin

Haiming Liu

University of Texas at Austin

Heather M. Hanson

University of Texas at Austin
