Adam Kaplan

University of California, Los Angeles

Publication

Featured research published by Adam Kaplan.

high-performance computer architecture | 2008

CMP network-on-chip overlaid with multi-band RF-interconnect

Mau-Chung Frank Chang; Jason Cong; Adam Kaplan; Mishali Naik; Glenn Reinman; Eran Socher; Sai-Wang Tam

In this paper, we explore the use of multi-band radio frequency interconnect (or RF-I) with signal propagation at the speed of light to provide shortcuts in a many-core network-on-chip (NoC) mesh topology. We investigate the costs associated with this technology, and examine the latency and bandwidth benefits that it can provide. Assuming a 400 mm² die, we demonstrate that in exchange for 0.13% of area overhead on the active layer, RF-I can provide an average 13% (max 18%) boost in application performance, corresponding to an average 22% (max 24%) reduction in packet latency. We observe that RF access points may become traffic bottlenecks when many packets try to use the RF at once, and conclude by proposing strategies that adapt RF-I utilization at runtime to actively combat this congestion.
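The effect of a shortcut on a mesh can be illustrated with a small hop-count model. The sketch below is not the paper's simulator: it assumes a 4×4 mesh, treats a shortcut as one bidirectional single-hop link, and measures hop count only, ignoring bandwidth, congestion at access points, and router delay. The function names (`build_mesh`, `add_shortcut`, `hops_from`, `average_hops`) are hypothetical.

```python
from collections import deque
from itertools import product


def build_mesh(k):
    """Return an adjacency map for a k x k mesh; nodes are (x, y) tuples."""
    adj = {node: set() for node in product(range(k), repeat=2)}
    for (x, y) in list(adj):
        for dx, dy in ((1, 0), (0, 1)):
            neighbor = (x + dx, y + dy)
            if neighbor in adj:
                adj[(x, y)].add(neighbor)
                adj[neighbor].add((x, y))
    return adj


def add_shortcut(adj, a, b):
    """Assumption: a shortcut is modeled as one bidirectional single-hop link."""
    adj[a].add(b)
    adj[b].add(a)


def hops_from(adj, src):
    """Breadth-first search: minimum hop count from src to every node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_hops(adj):
    """Mean shortest-path hop count over all ordered pairs of distinct nodes."""
    total = 0
    pairs = 0
    for src in adj:
        dist = hops_from(adj, src)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


mesh = build_mesh(4)
baseline = average_hops(mesh)
add_shortcut(mesh, (0, 0), (3, 3))  # corner-to-corner shortcut
with_shortcut = average_hops(mesh)
print(baseline, with_shortcut)
```

Adding a link can only shorten paths, so the average hop count falls. Where to place a limited number of shortcuts, and how to keep them from becoming bottlenecks, is the harder question the abstract raises.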

ACM Transactions on Design Automation of Electronic Systems | 2002

Instruction generation for hybrid reconfigurable systems

Ryan Kastner; Adam Kaplan; S. Ogrenci Memik; Elaheh Bozorgzadeh

We present an algorithm for simultaneous template generation and matching. The algorithm profiles the graph and iteratively contracts edges to create the templates. The algorithm is general and can be applied to any type of graph, including directed graphs and hypergraphs. We discuss how to target the algorithm towards the novel problem of instruction generation and selection for hybrid (re)configurable systems. In particular, we target the strategically programmable system, which embeds complex computational units like ALUs, IP blocks, etc. into a configurable fabric. We argue that an essential compilation step for these systems is instruction generation, as it is needed to specify the functionality of the embedded computational units. Additionally, instruction generation can be used to create soft macros: tightly sequenced, pre-specified operations placed in the configurable fabric.
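The profiling step can be pictured as counting how often each pair of connected operation types occurs in a dataflow graph. The sketch below shows only that one step, under simplifying assumptions: the graph is a plain edge list, a candidate template is a single (producer, consumer) pair of operation types, and the contraction and iteration described in the abstract are omitted. The names `profile_edges` and `most_frequent_template` are hypothetical.

```python
from collections import Counter


def profile_edges(ops, edges):
    """Count occurrences of each (producer op, consumer op) edge type.

    ops   -- dict mapping node id to operation name (assumed representation)
    edges -- iterable of (source node id, destination node id) pairs
    """
    return Counter((ops[src], ops[dst]) for src, dst in edges)


def most_frequent_template(ops, edges):
    """Return the most common edge type and its count, or None if no edges."""
    counts = profile_edges(ops, edges)
    if not counts:
        return None
    return counts.most_common(1)[0]


# Toy dataflow graph: two multiply-then-add chains, one feeding a subtract.
ops = {0: "mul", 1: "add", 2: "mul", 3: "add", 4: "sub"}
edges = [(0, 1), (2, 3), (1, 4)]
print(most_frequent_template(ops, edges))
```

In this toy graph the multiply-then-add pair occurs twice, so it would be the first candidate to contract into a single template node.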

compilers, architecture, and synthesis for embedded systems | 2002

Instruction generation and regularity extraction for reconfigurable processors

Philip Brisk; Adam Kaplan; Ryan Kastner; Majid Sarrafzadeh

The increasing demand for complex and specialized embedded hardware must be met by processors which are optimized for performance, yet are also extremely flexible. In our work, we explore the tradeoff between flexibility and performance in the domain of reconfigurable processor design. Specifically, we seek to identify regularly occurring, computation-heavy patterns in an application or set of applications. These patterns become candidates for hard-logic implementation, potentially embedded in the flexible reconfigurable fabric as special optimized instructions. In this work we present an extension to previous work in instruction generation: an algorithm that identifies parallel templates. We discuss the advantages of parallel templates, and prove the correctness of our algorithm. We introduce an All-Pairs Common Slack Graph (APCSG) as an effective tool for parallel template generation. Finally, we demonstrate the effectiveness of our algorithm on several applications' dataflow graphs, reducing latency on average by 51.98%, without unreasonably increasing chip area.

design automation conference | 2004

Area-efficient instruction set synthesis for reconfigurable system-on-chip designs

Philip Brisk; Adam Kaplan; Majid Sarrafzadeh

Silicon compilers are often used in conjunction with Field Programmable Gate Arrays (FPGAs) to deliver flexibility, fast prototyping, and accelerated time-to-market. Many of these compilers produce hardware that is larger than necessary, as they do not allow instructions to share hardware resources. This study presents an efficient heuristic which transforms a set of custom instructions into a single hardware datapath on which they can execute. Our approach is based on the classic problems of finding the longest common subsequence and substring of two (or more) sequences. This heuristic produces circuits which are as much as 85.33% smaller than those synthesized by integer linear programming (ILP) approaches which do not explore resource sharing. On average, we obtained 55.41% area reduction for pipelined datapaths, and 66.92% area reduction for VLIW datapaths. Our solution is simple and effective, and can easily be integrated into an existing silicon compiler.
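The classic longest-common-subsequence problem mentioned above can be sketched with the standard dynamic program. This is not the paper's heuristic: it assumes each custom instruction is flattened to a linear sequence of operation names, and that every operation in the common subsequence maps to one shared hardware unit. The name `merged_unit_count` and the toy instructions are illustrative.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back through the table to recover one optimal subsequence.
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


def merged_unit_count(a, b):
    """Units needed if every operation in the LCS is shared (assumed model)."""
    return len(a) + len(b) - len(longest_common_subsequence(a, b))


# Two hypothetical custom instructions as operation sequences.
instr_a = ["mul", "add", "shl", "add"]
instr_b = ["add", "mul", "add", "sub"]
print(longest_common_subsequence(instr_a, instr_b))
print(merged_unit_count(instr_a, instr_b))
```

Under this simplified model, the longer the common subsequence, the fewer functional units the merged datapath needs compared with implementing each instruction separately.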

international symposium on microarchitecture | 2008

Power reduction of CMP communication networks via RF-interconnects

M-C. Frank Chang; Jason Cong; Adam Kaplan; Chunyue Liu; Mishali Naik; Jagannath Premkumar; Glenn Reinman; Eran Socher; Sai-Wang Tam

As chip multiprocessors scale to a greater number of processing cores, on-chip interconnection networks will experience dramatic increases in both bandwidth demand and power dissipation. Fortunately, promising gains can be realized via integration of radio frequency interconnect (RF-I) through on-chip transmission lines with traditional interconnects implemented with RC wires. While prior work has considered the latency advantage of RF-I, we demonstrate three further advantages of RF-I: (1) RF-I bandwidth can be flexibly allocated to provide an adaptive NoC, (2) RF-I can enable a dramatic power and area reduction by simplification of NoC topology, and (3) RF-I provides natural and efficient support for multicast. In this paper, we propose a novel interconnect design, exploiting dynamic RF-I bandwidth allocation to realize a reconfigurable network-on-chip architecture. We find that our adaptive RF-I architecture on top of a mesh with 4B links can even outperform the baseline with 16B mesh links by about 1%, and reduces NoC power by approximately 65% including the overhead incurred for supporting RF-I.

international conference on computer aided design | 2008

MC-Sim: an efficient simulation tool for MPSoC designs

Jason Cong; Karthik Gururaj; Guoling Han; Adam Kaplan; Mishali Naik; Glenn Reinman

The ability to integrate diverse components such as processor cores, memories, custom hardware blocks and complex network-on-chip (NoC) communication frameworks onto a single chip has greatly increased the design space available for system-on-chip (SoC) designers. Efficient and accurate performance estimation tools are needed to assist the designer in making design decisions. In this paper, we present MC-Sim, a heterogeneous multi-core simulator framework which is capable of accurately simulating a variety of processor, memory, NoC configurations and application specific coprocessors. We also describe a methodology to automatically generate fast, cycle-true behavioral, C-based simulators for coprocessors using a high-level synthesis tool and integrate them with MC-Sim, thus augmenting it with the capacity to simulate coprocessors. Our C-based simulators provide on average a 45× improvement in simulation speed over that of RTL descriptions. We have used this framework to simulate a number of real-life applications such as the MPEG4 decoder and litho-simulation, and experimented with a number of design choices. Our simulator framework is able to accurately model the performance of these applications (only 7% off the actual implementation) and allows us to explore the design space rapidly and achieve interesting design implementations.

design automation conference | 2003

Data communication estimation and reduction for reconfigurable systems

Adam Kaplan; Philip Brisk; Ryan Kastner

Widespread adoption of reconfigurable devices requires system level synthesis techniques to take an application written in a high level language and map it to the reconfigurable device. This paper describes methods for synthesizing the internal representation of a compiler into a hardware description language in order to program reconfigurable hardware devices. We demonstrate the usefulness of static single assignment (SSA) in reducing the amount of data communication in the hardware. However, the placement of Φ-nodes by current SSA algorithms is not optimal in terms of minimizing data communication. We propose a new algorithm which optimally places Φ-nodes, further decreasing area and communication latency. Our algorithm reduces the data communication (measured as total edge weight in a control data flow graph) by as much as 20% for some applications as compared to the best-known SSA algorithm, the pruned algorithm. We also describe future modifications to our model that should increase the effectiveness of our methods.

design, automation, and test in europe | 2006

Layout Driven Data Communication Optimization for High Level Synthesis

Ryan Kastner; Wenrui Gong; Xin Hao; Forrest Brewer; Adam Kaplan; P. Brisbane; Majid Sarrafzadeh

High level synthesis transformations play a major part in shaping the properties of the final circuit. However, most optimizations are performed without much knowledge of the final circuit layout. In this paper, we present a physically aware design flow for mapping high level application specifications to a synthesizable register transfer level hardware description. We study the problem of optimizing the data communication of the variables in the application specification. Our algorithm uses floorplan information that guides the optimization. We develop a simple, yet effective, incremental floorplanner to handle the perturbations caused by the data communication optimization. We show that the proposed techniques can reduce the wirelength of the final design, while maintaining a legal floorplan with the same area as the initial floorplan.

Archive | 2003

Optimization for Reconfigurable Systems Using Hierarchical Abstraction

Elaheh Bozorgzadeh; Adam Kaplan; Ryan Kastner; Seda Ogrenci Memik; Majid Sarrafzadeh

In the previous chapters, we have seen various multilevel optimization techniques used to solve a variety of complex problems. In this chapter we discuss techniques to synthesize VLSI systems from a high level description. From system genesis as a high-level description to its final physical layout, the problem of hardware synthesis is too nebulous to be handled in a single stage optimization process. As a result, CAD flow is divided into three major stages: Compiler Optimization, High-level Synthesis, and Physical Design. The traditional flow of the stages departs from the exact concept of multilevel optimization as addressed in this book. Therefore, in this chapter we focus on another way to abstract complexity in VLSI systems: hierarchical abstraction.

Archive | 2009