Dileep Bhandarkar
Intel
Publication
Featured research published by Dileep Bhandarkar.
international symposium on computer architecture | 1994
Zarka Cvetanovic; Dileep Bhandarkar
The characteristics of several commercial and technical workloads on the DEC 7000 AXP system are compared using built-in hardware monitors. The data analyzed include total instructions, cycles, multiple-issued instructions, stall components, cache misses, and instruction types. The data indicates that the two classes of workloads have vastly different characteristics and impose different requirements on the system design. Compared to VAX, Alpha AXP takes advantage of lower cycles per instruction and cycle time to achieve a significant performance advantage. The cache and memory interconnect subsystems are expected to play a crucial role in the performance of future systems. A simple model for evaluating the effects of various design tradeoffs based on the data collected by using hardware monitors is proposed.
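The abstract's comparison rests on cycles per instruction (CPI) and cycle time. As a rough illustration only, the standard "iron law" of processor performance (execution time = instructions × CPI × cycle time) can be sketched as below. This is not the model proposed in the paper, whose details are not given here; the function names, the stall categories, and every number are assumptions made up for the example.

```python
def total_cpi(base_cpi, stall_cpi):
    """Add per-source stall cycles per instruction to a base (no-stall) CPI."""
    return base_cpi + sum(stall_cpi.values())


def execution_time_ns(instructions, cpi, cycle_time_ns):
    """Iron law: time = instruction count x cycles per instruction x cycle time."""
    return instructions * cpi * cycle_time_ns


# Hypothetical stall breakdown (cycles per instruction); not data from the paper.
stalls = {"icache_miss": 0.25, "dcache_miss": 0.5, "branch": 0.25}
cpi = total_cpi(1.0, stalls)

# Two hypothetical machines running the same instruction count:
# machine A has lower CPI and a shorter cycle time than machine B.
time_a = execution_time_ns(1_000_000, cpi, 5.0)
time_b = execution_time_ns(1_000_000, 4.0, 10.0)
speedup = time_b / time_a
print(f"CPI = {cpi}, speedup of A over B = {speedup}")
```

Splitting CPI into a base term plus stall terms mirrors the kind of stall-component data the abstract says the hardware monitors collect, which is why the sketch takes that shape.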
high-performance computer architecture | 1996
Zarka Cvetanovic; Dileep Bhandarkar
This paper compares the performance characteristics of the Alpha 21164 to the previous-generation 21064 microprocessor. Measurements on the 21164-based AlphaServer 8200 system are compared to the 21064-based DEC 7000 server using several commercial and technical workloads. The data analyzed includes cycles per instruction, multiple-issued instructions, branch predictions, stall components, cache misses, and instruction frequencies. The AlphaServer 8200 provides 2 to 3 times the performance of the DEC 7000 server based on the faster clock, larger on-chip cache, expanded multiple-issuing, and lower cache/memory latencies and higher bandwidth.
international symposium on computer architecture | 1990
Dileep Bhandarkar; Richard Brunner
The VAX Architecture has been extended to include an integrated, register-based vector processor. This extension allows both high-end and low-end implementations and can be supported with only small changes by VAX/VMS and VAX/ULTRIX operating systems. The extension is effectively exploited by the new vectorizing capabilities of VAX FORTRAN. Features of the VAX Vector Architecture and the design decisions which make it a consistent extension of the VAX Architecture are discussed.
ieee computer society international conference | 1990
Dileep Bhandarkar; Richard Brunner
The extension of the VAX architecture to include integrated vector processing is discussed. The design goals and constraints and an overview of the resulting architecture are presented. The architecture maximizes the asynchronism between the scalar and vector processors and the parallelism within the vector processor. However, the design is consistent enough with the overall philosophy of the VAX architecture that only minimal changes to existing operating systems are required to support it.
ACM Sigarch Computer Architecture News | 1997
Dileep Bhandarkar
This paper compares aggressive RISC and CISC implementations built with comparable technology. The two chips are the Alpha 21164 and the Intel Pentium® Pro processor. The paper presents performance comparisons for industry-standard benchmarks and uses performance counter statistics to compare various aspects of both designs.
high-performance computer architecture | 2003
Dileep Bhandarkar
Today’s leading-edge microprocessors, like Intel’s Itanium® 2 processor, feature over 220 million transistors in a 0.18 µm semiconductor process technology. Nanotechnology that continues to drive Moore’s Law provides a doubling of the transistor density every two years. This indicates that a billion-transistor chip is possible in 65 nm technology within the next 3 to 4 years. Such chips can be used in mainstream enterprise server platforms. This talk will review the progress in semiconductor technology over the last 3 decades since the introduction of the first microprocessor in 1971. A short video tape will provide a historical perspective on Moore’s Law in the form of an interview with co-founder Gordon Moore, and his thoughts for the future of semiconductor technology. Key trends in high-end microprocessor design, including multi-threading and multi-core, will be covered. We have started to see “SMP-on-a-chip” designs for high-end enterprise servers, where two processors with Level 2 (L2) cache are incorporated on a single chip. Future microprocessors will offer higher levels of multiprocessor capability on chip as the transistor density increases.
international conference on parallel architectures and compilation techniques | 2002
Dileep Bhandarkar
Today’s leading-edge microprocessors feature over 200 million transistors. Moore’s Law continues to provide a doubling of the transistor density every two years. This has spawned several new changes in the features designed into commercially available microprocessors. Intel’s Itanium Processor Family features massive on-chip execution resources, as evidenced by the 6 integer units, 3 branch units, 2 floating-point multiply-add units, and 2 load and 2 store units in the recently released Itanium 2 processor. Intel’s Xeon processor introduced simultaneous multi-threading (SMT) in a high-volume microprocessor. IBM’s Power4 microprocessor represents the first “SMP-on-a-chip” design for high-end enterprise servers: two processors with Level 2 (L2) cache are incorporated on each chip. Future microprocessors will offer higher levels of multiprocessor capability on chip as the transistor density increases. Computer manufacturers are incorporating these high-end microprocessors into large symmetric multiprocessing systems with 8, 16, 32 or even 64 processors. Another trend is the emergence of clustered commercial off-the-shelf (COTS) servers as credible supercomputing platforms. These trends provide compiler writers and application developers a wide range of platforms for developing parallel applications, ranging from instruction-level parallelism (ILP) in high-end microprocessors, to tightly coupled thread-level parallelism (TLP) and massive message-oriented parallelism in large-scale clusters. This talk will cover anticipated advances in semiconductor technology and relate those to trends in microprocessor design that will drive higher levels of parallelism in mainstream server platforms.
ieee computer society international conference | 1990
Dileep Bhandarkar; David A. Orbits; Richard T. Witek; Wayne Cardoza; David N. Cutler
An issue-oriented architecture designed for high performance is described. It uses features, such as simple instruction formats, a large number of registers, and a load/store architecture, found in some reduced-instruction-set-computer architectures. It also includes features, such as out-of-order completion, imprecise exceptions, and vector processing, found in supercomputers such as the CRAY-1. Furthermore, it provides a full set of system support features, such as multiprocessor synchronization, vectored exceptions, stacks, asynchronous system traps, and extensive memory management, found in complex architectures such as the VAX. The reduced instruction parallel/pipelined (RIP) architecture is described. The RIP architecture was designed as a robust architecture to meet a wide range of system requirements across a family of implementations. The processor model that guided the architecture definition consists of multiple pipelined function units, each of which executes a class of instructions.
Archive | 1988
Dileep Bhandarkar; Dave Cutler; Wayne Cardoza; Rich Witek; Dave Orbits
Archive | 1988
David N. Cutler; David A. Orbits; Dileep Bhandarkar; Wayne Cardoza; Richard T. Witek