Eren Kursun

University of California, Los Angeles

Publication

Featured research published by Eren Kursun.

design automation conference | 2003

Global resource sharing for synthesis of control data flow graphs on FPGAs

Seda Ogrenci Memik; Gokhan Memik; Roozbeh Jafari; Eren Kursun

In this paper we discuss the global resource sharing problem during synthesis of control data flow graphs (CDFGs) for FPGAs. We first define the Global Resource Sharing (GRS) problem. Then, we introduce the Global Inter Basic Block Resource Sharing (GIBBS) technique to solve the GRS problem. The approach relies on five heuristics: the first tries to minimize the number of connections between modules, the second considers the area gain, the third uses the criticality of operations assigned to resources as a measure for deciding whether to merge any given pair of resources, the fourth tries to capture common resource chains and overlap them to minimize both area and delay, and the fifth is a combination of these heuristics. While applying resource sharing, we also consider the execution frequency of the basic blocks. Using our techniques we synthesized several CDFGs representing applications from the MediaBench suite. Our results show that we can reduce the total area requirement by 44% on average (up to 59%) while increasing the execution time by 6% on average.

ACM Journal on Emerging Technologies in Computing Systems | 2008

Investigating the effects of fine-grain three-dimensional integration on microarchitecture design

Yuchun Ma; Yongxiang Liu; Eren Kursun; Glenn Reinman; Jason Cong

In this article we propose techniques that enable efficient exploration of the 3D design space, where each logical block can span more than one silicon layer. Fine-grain 3D integration provides reduced intrablock wire delay as well as improved power consumption. However, the corresponding power and performance advantage is usually underutilized, since various implementations of multilayer blocks require novel physical design and microarchitecture infrastructure to explore 3D microarchitecture design space. We develop a cubic packing engine which can simultaneously optimize physical and architectural design for efficient vertical integration. This technique selects the individual unit designs from a set of single-layer or multilayer implementations to get the best microarchitectural design in terms of performance, temperature, or both. Our experimental results using a design driver of a high-performance superscalar processor show a 36% performance improvement over traditional 2D for 2 to 4 layers and 14% over 3D with single-layer unit implementations. Since thermal characteristics of 3D integrated circuits are among the main challenges, thermal-aware floorplanning and thermal via insertion techniques are employed to keep the peak temperatures below threshold.

international conference on computer design | 2005

Reducing the latency and area cost of core swapping through shared helper engines

Anahita Shayesteh; Eren Kursun; Timothy Sherwood; Suleyman Sair; Glenn Reinman

Technology scaling trends and the limitations of packaging and cooling have intensified the need for thermally efficient architectures and architecture-level temperature management techniques. To combat these trends, we explore the use of core swapping on the microcore architecture, a deeply decoupled processor core with larger structures factored out as helper engines. The microcore architecture presents an ideal platform for core swapping thanks to helper engines that maintain the state of each process in a shared fabric surrounding the cores, reducing the impact of core swapping by 43% on average while showing promising thermal reduction. It also has favorable performance when compared to other thermal management techniques. Furthermore, we evaluate alternative approaches to spending the area overhead of the additional microcore, including larger microcores, CMP cores, and SMT cores with different thermal management techniques.

international symposium on low power electronics and design | 2002

Early evaluation techniques for low power binding

Eren Kursun; Ankur Srivastava; Seda Ogrenci Memik; Majid Sarrafzadeh

This paper presents effective metrics to evaluate the power dissipation of scheduled data flow graphs (DFGs). This enables early evaluation of schedules without performing the computationally expensive resource-binding step. Our metrics correlate heavily (as high as 0.95 and > 0.75 for most test cases) with power dissipation values obtained after resource binding and rescheduling for power optimization steps. An experimental flow that integrates path-based scheduling, power optimal binding and power driven iterative rescheduling stages is constructed. The flow integrates commercial tools like Synopsys, VSS and academic compilers like SUIF in a common optimization framework. Experimental results on DFGs from the MediaBench suite also demonstrate that metric evaluation is on average 42.6 times faster than performing optimal binding and iterative power improvement. Hence, metric based evaluation enables fast design exploration at early stages.

PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems | 2004

Low-overhead core swapping for thermal management

Eren Kursun; Glenn Reinman; Suleyman Sair; Anahita Shayesteh; Timothy Sherwood

Technology scaling trends and the limitations of packaging and cooling have intensified the need for thermally efficient architectures and architecture-level temperature management techniques. To combat these trends, we evaluate the thermal efficiency of the microcore architecture, a deeply decoupled processor core with larger structures factored out as helper engines. We further investigate activity migration (core swapping) as a means of controlling the thermal profile of the chip in this study. Specifically, the microcore architecture presents an ideal platform for core swapping thanks to helper engines that maintain the state of each process in a shared fabric surrounding the cores. This results in significantly reduced migration overhead, enabling seamless swapping of cores. Our results show that our thermal mechanisms outperform traditional Dynamic Thermal Management (DTM) techniques by reducing the performance hit caused by slowing/swapping of cores. Our experimental results show that the microcore architecture has 86% fewer thermally critical cycles compared to a conventional monolithic core.

international symposium on signals circuits and systems | 2004

Transistor level budgeting for power optimization

Eren Kursun; Soheil Ghiasi; Majid Sarrafzadeh

We present an optimal budget distribution method for low power circuit design using transistor sizing. The algorithm distributes the available budget inside the functional unit by efficient traversal of the Series Parallel Graph representation. The technique can be efficiently applied at different abstraction levels of the design as well as toward other optimization goals (such as area optimization). The complexity is O(n) in terms of the number of transistors in the circuit. Incorporating our method in the design flow yields significant improvements in power consumption. Experiments on circuits extracted from MCNC91 benchmark suite have revealed improvements up to 59% in average power and 65% in maximum power dissipation compared to an alternative budget distribution algorithm.
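As a rough illustration of how a budget can be pushed down a series-parallel structure in linear time, here is a minimal sketch. It is not the algorithm from the paper: the `SPNode` type, the nominal `weight` per leaf, and the proportional split across series children are all assumptions made for this example. Series children divide the parent budget in proportion to their weights, while parallel children each receive the full parent budget. Each node is visited once per pass, so the work is linear in the number of nodes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SPNode:
    """Hypothetical node of a series-parallel decomposition tree."""
    kind: str                       # "leaf", "series", or "parallel"
    weight: float = 0.0             # leaf: assumed positive nominal delay
    children: List["SPNode"] = field(default_factory=list)
    budget: float = 0.0             # filled in by distribute()


def annotate(node: SPNode) -> float:
    """Post-order pass: series weight is the sum, parallel weight is the max."""
    if node.kind == "leaf":
        return node.weight
    child_weights = [annotate(child) for child in node.children]
    if node.kind == "series":
        node.weight = sum(child_weights)
    else:
        node.weight = max(child_weights)
    return node.weight


def distribute(node: SPNode, budget: float) -> None:
    """Pre-order pass: split the budget in series, copy it in parallel."""
    node.budget = budget
    if node.kind == "series":
        for child in node.children:
            # Assumed rule: share proportional to the child's weight.
            distribute(child, budget * child.weight / node.weight)
    elif node.kind == "parallel":
        for child in node.children:
            distribute(child, budget)


# Example: one transistor in series with two parallel transistors.
t1 = SPNode("leaf", weight=1.0)
t2 = SPNode("leaf", weight=2.0)
t3 = SPNode("leaf", weight=3.0)
root = SPNode("series", children=[t1, SPNode("parallel", children=[t2, t3])])
annotate(root)
distribute(root, 8.0)
```

In this example the series node splits a budget of 8 into 2 for `t1` and 6 for the parallel pair, and both parallel leaves inherit the full 6.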

international symposium on circuits and systems | 2002

Algorithmic aspects of uncertainty driven scheduling

Seda Ogrenci Memik; Ankur Srivastava; Eren Kursun; Majid Sarrafzadeh

In this paper we discuss the algorithmic aspects of uncertainty driven scheduling which is a new design paradigm. Slack oriented design flow could be used to address the uncertainty problem in high level synthesis. We formalize the concept of slack and discuss different variations of the slack driven scheduling problem. The complexity issues are studied in detail and algorithms are proposed to solve the problem. These algorithms and proofs heavily exploit the concepts and techniques of graph theory and combinatorial optimization problems.
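One common way to make slack concrete is the gap between the latest and earliest start times of an operation in a data flow graph. The sketch below computes that textbook quantity with ASAP and ALAP passes; it is an assumed illustration, not the formulation or the algorithms from the paper, and the function name `compute_slack` and its input format are hypothetical.

```python
from collections import defaultdict, deque


def compute_slack(ops, edges, deadline=None):
    """Return ALAP start minus ASAP start for each operation.

    ops:      dict mapping operation name -> latency in time steps
    edges:    list of (predecessor, successor) dependencies
    deadline: latest allowed finish time; defaults to the critical path length
    """
    succs, preds = defaultdict(list), defaultdict(list)
    indegree = {op: 0 for op in ops}
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)
        indegree[v] += 1

    # Topological order via Kahn's algorithm.
    order = []
    ready = deque(op for op in ops if indegree[op] == 0)
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in succs[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if len(order) != len(ops):
        raise ValueError("dependency graph contains a cycle")

    # ASAP: start as soon as every predecessor has finished.
    asap = {}
    for u in order:
        asap[u] = max((asap[p] + ops[p] for p in preds[u]), default=0)

    if deadline is None:
        deadline = max(asap[u] + ops[u] for u in ops)

    # ALAP: finish no later than the earliest ALAP start of any successor.
    alap = {}
    for u in reversed(order):
        alap[u] = min((alap[s] for s in succs[u]), default=deadline) - ops[u]

    return {u: alap[u] - asap[u] for u in ops}


ops = {"a": 1, "b": 2, "c": 1, "d": 1}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
slack = compute_slack(ops, edges)
```

Operations on the critical path (`a`, `b`, `d`) get zero slack, while `c` can move by one step without lengthening the schedule. Relaxing the deadline adds the same amount of slack to every operation.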

Journal of Circuits, Systems, and Computers | 2002

PREDICTABILITY IN RT-LEVEL DESIGNS

Ankur Srivastava; Eren Kursun; Majid Sarrafzadeh

The primary objective of this paper is to provide an initial impetus to predictability driven design flow. Predictability is the quantified form of accuracy. The novelty lies in defining and using the idea of predictability. In order to illustrate the basic concepts we focus on the power estimation problem in RT-Level designs. Our experiments showed that predictability at RT-Level could be improved by making the resource delay constraints more stringent. This procedure may come with increased power dissipation. We present an optimal pseudo-polynomial time algorithm to optimize predictability while keeping the increase in power dissipation within a budget. We further extend this algorithm to generate an ∊-approximate solution in polynomial time, where ∊ is a user defined parameter. The algorithm provably generates solutions that differ by at most ∊Cmax from the optimal. Future work would include extending the concept of predictability to other levels of the design flow and other cost functions. We envision a design automation system which makes effective tradeoffs between predictability and cost, hence enabling efficient design exploration.
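To show what "pseudo-polynomial under a budget" typically looks like, the sketch below uses the standard 0/1 knapsack dynamic program. This is a swapped-in textbook technique, not the algorithm from the paper: the model (each resource offers one optional tightening with a predictability gain and an integer power cost) and the name `best_gain` are assumptions for illustration. The running time is proportional to the number of options times the budget value, which is why it is pseudo-polynomial rather than polynomial.

```python
def best_gain(options, budget):
    """Maximum total gain with total integer cost at most `budget`.

    options: list of (gain, cost) pairs; cost is a non-negative integer
             (assumed power increase), gain is the assumed predictability gain
    budget:  non-negative integer power budget
    """
    # best[b] holds the best gain achievable with total cost <= b.
    best = [0.0] * (budget + 1)
    for gain, cost in options:
        # Iterate budgets downward so each option is used at most once.
        for b in range(budget, cost - 1, -1):
            best[b] = max(best[b], best[b - cost] + gain)
    return best[budget]


options = [(3, 2), (4, 3), (5, 4)]
result = best_gain(options, 5)
```

With a budget of 5 the first two options fit together for a total gain of 7. Scaling and rounding the gains before running such a table is the usual route to an approximation scheme, though the details in the paper may differ.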

Journal of Low Power Electronics | 2005

Early Quality Assessment for Low Power Behavioral Synthesis

Eren Kursun; Rajarshi Mukherjee; Seda Ogrenci Memik

Fast and effective exploration at the early stages of the design flow can yield significant improvement in the quality of the design and substantial reduction in design time. In this paper, we present an efficient technique to evaluate the power dissipation of scheduled Data Flow Graphs (DFGs). Scheduling dictates the compatibility of operations with respect to their assignments to functional units. Generally for scheduled DFGs, this relation is captured in the form of a comparability graph. As a consequence, the topology of the comparability graph determines the solution space available to the subsequent binding stage. In this work, our main contribution is a technique to assess the inherent flexibility of the schedules we start with. We developed early evaluation metrics in order to assess the degree of flexibility inherent in an initial schedule that will eventually affect the quality of the binding solution. Every schedule is associated with a compatibility graph that represents the conflicts and compatibilities among operations with respect to possible binding decisions. Our metric based evaluation technique is based on several properties (such as edge connectivity, edge weight distribution, etc.) of these compatibility graphs. These metrics essentially reflect the amount of freedom that is provided to the binding stage, which enables early assessment and relative comparison of different possible schedules without actually performing the resource-binding step. Our experimental framework integrates scheduling, early metric-based power evaluation, low power binding and power driven iterative rescheduling stages. The correlation between early evaluation and the power measurements after binding is as high as 0.95 and greater than 0.75 for the majority of test cases. Experimental results on DFGs from the MediaBench suite demonstrate that metric evaluation is on average 42.6 times faster than performing optimal binding and iterative power improvement.
Our results show that low power schedule selection is fast and effective. On average, the schedules selected by metric evaluation have 43% less power dissipation than schedules with iterative power improvement, based on a study set of 320 schedules. We also examined the thermal profile of the corresponding solutions. We observed that schedules selected with our metric evaluation technique have on average 12 °C lower temperature, and the maximum on-chip temperatures are lower by 18 °C compared to the overall average of all schedules. These thermal profiles are obtained using a functional unit-level thermal simulator after block-level floorplanning.
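The idea of reading binding freedom off a schedule's compatibility graph can be sketched as follows. This is a simplified illustration under stated assumptions, not the metrics from the paper: two operations are treated as compatible when they have the same type and non-overlapping execution intervals, and plain edge density stands in for the richer properties the abstract mentions. The function names and the schedule format are hypothetical.

```python
from itertools import combinations


def compatibility_graph(schedule):
    """Return the set of compatible operation pairs.

    schedule: dict mapping op name -> (op_type, start, end), where the
              operation occupies the half-open interval [start, end)
    """
    edges = set()
    for a, b in combinations(sorted(schedule), 2):
        type_a, start_a, end_a = schedule[a]
        type_b, start_b, end_b = schedule[b]
        same_type = type_a == type_b
        disjoint = end_a <= start_b or end_b <= start_a
        if same_type and disjoint:
            # Both operations could share one functional unit.
            edges.add((a, b))
    return edges


def edge_density(schedule, edges):
    """Fraction of operation pairs that are compatible (0.0 to 1.0)."""
    n = len(schedule)
    possible = n * (n - 1) // 2
    return len(edges) / possible if possible else 0.0


schedule = {
    "m1": ("mul", 0, 2),
    "m2": ("mul", 2, 4),
    "m3": ("mul", 1, 3),
    "a1": ("add", 0, 1),
}
edges = compatibility_graph(schedule)
density = edge_density(schedule, edges)
```

Here only `m1` and `m2` can share a multiplier, so one of six possible pairs is compatible. A schedule with a denser graph leaves the binding stage more choices, which is the kind of freedom an early metric would try to capture.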

international conference on computer design | 2007