Bryan Donyanavard
University of California, Irvine
Publications
Featured research published by Bryan Donyanavard.
International Conference on Hardware/Software Codesign and System Synthesis | 2016
Bryan Donyanavard; Tiago Mück; Santanu Sarma; Nikil D. Dutt
To meet the performance and energy-efficiency demands of emerging complex and variable workloads, heterogeneous manycore architectures are increasingly being deployed. Operating systems must therefore support adaptive task allocation to efficiently exploit this heterogeneity in the face of unpredictable workloads. We present SPARTA, a throughput-aware runtime task allocation approach for heterogeneous manycore platforms (HMPs) that achieves energy efficiency. SPARTA collects sensor data to characterize tasks at runtime and uses this information to prioritize tasks during allocation, maximizing energy efficiency (instructions-per-Joule) without sacrificing performance. Our experimental results on heterogeneous manycore architectures executing mixes of MiBench and PARSEC benchmarks demonstrate energy reductions of up to 23% compared to state-of-the-art alternatives. SPARTA also scales with low overhead, enabling energy savings in large-scale architectures with up to hundreds of cores.
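The allocation idea in the abstract can be illustrated with a small sketch. This is not the published SPARTA implementation; the task and core characteristics (names, `ips`, `speedup`, `power` values) are hypothetical stand-ins for the sensor-derived data the paper describes.

```python
# Greedy throughput-aware allocation on a heterogeneous manycore:
# visit the most demanding tasks first and give each one the free core
# that maximizes its instructions-per-Joule.

def allocate(tasks, cores):
    """Map each task name to the core name maximizing energy efficiency."""
    assignment = {}
    free = list(cores)
    # Prioritize tasks by their sensed instruction throughput demand.
    for task in sorted(tasks, key=lambda t: t["ips"], reverse=True):
        best = max(free, key=lambda c: task["ips"] * c["speedup"] / c["power"])
        assignment[task["name"]] = best["name"]
        free.remove(best)
    return assignment

tasks = [{"name": "t0", "ips": 9e8}, {"name": "t1", "ips": 2e8}]
cores = [{"name": "big0", "speedup": 2.0, "power": 1.5},
         {"name": "little0", "speedup": 1.0, "power": 0.4}]
print(allocate(tasks, cores))
```

With these invented numbers the efficient little core wins the heavy task; a real policy would also weigh the performance constraint the paper mentions.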
Embedded Systems for Real-Time Multimedia | 2016
Hossein Tajik; Bryan Donyanavard; Nikil D. Dutt
Many multimedia applications exhibit phasic behavior. Phasic behavior has been studied primarily in the context of code execution. However, temporal variation in an application's memory usage can deviate from its program behavior, providing opportunities to exploit memory phases for more efficient use of on-chip memory resources. In this work, we define memory phases as opposed to program phases and illustrate the potential disparity between them. We propose mechanisms for lightweight online memory-phase detection. Additionally, we demonstrate their utility by deploying these techniques to share distributed on-chip scratchpad memories (SPMs) in multicore platforms. The information gathered during memory phases is used to prioritize memory pages in a multicore platform without any prior knowledge of the running applications. By exploiting memory-phasic behavior, we achieve up to 45% improvement in memory access latency on a set of multimedia applications.
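One plausible way to sketch lightweight online phase detection of the kind described here (the windowing scheme and threshold are invented for illustration, not taken from the paper): compare each window's memory-access rate against the running average of the current phase and declare a new phase when the relative deviation is too large.

```python
# Hypothetical online memory-phase detector over per-window access rates.

def detect_phases(access_rates, threshold=0.3):
    """Return the window index at which each memory phase begins."""
    boundaries = [0]
    phase = [access_rates[0]]           # rates seen in the current phase
    for i, rate in enumerate(access_rates[1:], start=1):
        avg = sum(phase) / len(phase)
        if abs(rate - avg) > threshold * avg:  # relative deviation test
            boundaries.append(i)        # phase change detected
            phase = [rate]
        else:
            phase.append(rate)
    return boundaries

# Synthetic trace: a low-intensity phase followed by a high-intensity one.
print(detect_phases([10, 11, 10, 30, 32, 31]))  # → [0, 3]
```

A runtime could use such boundaries to re-evaluate SPM page placement only at phase changes, keeping detection overhead low.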
IET Computers & Digital Techniques | 2016
Aviral Shrivastava; Nikil D. Dutt; Jian Cai; Majid Shoushtari; Bryan Donyanavard; Hossein Tajik
Software Programmable Memories, or SPMs, are raw on-chip memories that are managed not implicitly by the processor hardware but explicitly by software. For example, while caches fetch data from memory automatically and maintain coherence with other caches, data movement between SPMs and other memories is managed explicitly through software instructions. SPMs make the design of on-chip memories simpler, more scalable, and more power efficient, but also place an additional burden on the programming of SPM-based processors. Traditionally, SPMs have been utilised in embedded systems, especially multimedia and gaming systems, but research on SPM-based systems has recently seen increased interest as a means to solve the memory scaling challenges of many-core architectures. This study presents an overview of the state-of-the-art in SPM management techniques in many-core processors, summarises some recent research on SPM-based systems, and outlines future research directions in this field.
ACM Transactions on Embedded Computing Systems | 2016
Hossein Tajik; Bryan Donyanavard; Nikil D. Dutt; Janmartin Jahn; Jörg Henkel
Distributed scratchpad memories (SPMs) in embedded many-core systems require careful data placement to achieve good performance. Applications mapped to these platforms have varying memory requirements based on their runtime behavior, resulting in under- or overutilization of the local SPMs. We propose SPMPool, which shares the available on-chip SPMs among concurrently executing applications in order to reduce overall memory access latency. By pooling SPM resources, we can dynamically assign memory resources left underutilized by idle cores or low memory usage. SPMPool is the first workload-aware SPM mapping solution for many-cores that dynamically allocates data at runtime, using profiled data, to address the unpredictable set of concurrently executing applications. Our experiments on workloads with varying interapplication memory intensity show that SPMPool can achieve up to a 76% reduction in memory access latency for configurations ranging from 16 to 256 cores, compared to the traditional approach that limits executing cores to their local SPMs.
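The pooling idea can be sketched in a few lines. This is a toy illustration, not the SPMPool algorithm: the application names, page counts, and greedy order are all invented assumptions.

```python
# Toy SPM pooling: pages freed by idle or low-usage cores form a shared
# pool, granted greedily to the most memory-hungry applications first.

def pool_spm(free_pages, demands):
    """demands maps app -> SPM pages wanted; returns pages granted per app."""
    grants = {}
    for app, want in sorted(demands.items(), key=lambda kv: -kv[1]):
        give = min(want, free_pages)   # grant what the pool can still supply
        grants[app] = give
        free_pages -= give
    return grants

print(pool_spm(free_pages=6, demands={"decoder": 4, "filter": 1, "mixer": 3}))
```

Once the pool is exhausted, remaining accesses would fall back to off-chip memory, which is exactly the latency the paper's sharing scheme tries to avoid.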
Architectural Support for Programming Languages and Operating Systems | 2018
Amir M. Rahmani; Bryan Donyanavard; Tiago Mück; Kasra Moazzemi; Axel Jantsch; Onur Mutlu; Nikil D. Dutt
Resource management strategies for many-core systems need to enable sharing of resources such as power, processing cores, and memory bandwidth while coordinating the priority and significance of system- and application-level objectives at runtime in a scalable and robust manner. State-of-the-art approaches use heuristics or machine learning for resource management, but unfortunately lack formalism in providing robustness against unexpected corner cases. While recent efforts deploy classical control-theoretic approaches with some guarantees and formalism, they lack scalability and autonomy to meet changing runtime goals. We present SPECTR, a new resource management approach for many-core systems that leverages formal supervisory control theory (SCT) to combine the strengths of classical control theory with state-of-the-art heuristic approaches to efficiently meet changing runtime goals. SPECTR is a scalable and robust control architecture and a systematic design flow for hierarchical control of many-core systems. SPECTR leverages SCT techniques such as gain scheduling to allow autonomy for individual controllers. It facilitates automatic synthesis of the high-level supervisory controller and its property verification. We implement SPECTR on an Exynos platform containing ARM's big.LITTLE-based heterogeneous multiprocessor (HMP) and demonstrate that SPECTR's use of SCT is key to managing multiple interacting resources (e.g., chip power and processing cores) in the presence of competing objectives (e.g., satisfying QoS vs. power capping). The principles of SPECTR are easily applicable to any resource type and objective as long as the management problem can be modeled using dynamical systems theory (e.g., difference equations), discrete-event dynamic systems, or fuzzy dynamics.
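A minimal sketch of the hierarchical structure described above, in the spirit of SPECTR but with all gains, modes, and set-points invented for illustration: a supervisor reschedules the gain of a low-level integral controller when the runtime goal changes.

```python
# Toy two-level control: a supervisor applies gain scheduling to a
# low-level integral controller depending on the active system goal.

class PowerController:
    def __init__(self, gain):
        self.gain, self.level = gain, 0.0

    def step(self, target, measured):
        # Integral action: accumulate the tracking error scaled by the gain.
        self.level += self.gain * (target - measured)
        return self.level

def supervise(controller, mode):
    # Supervisor: aggressive tracking under a power cap, gentle
    # adjustments when the goal is QoS optimization.
    controller.gain = 0.8 if mode == "power_cap" else 0.2

ctrl = PowerController(gain=0.2)
supervise(ctrl, "power_cap")
print(ctrl.step(target=4.0, measured=5.0))  # gain 0.8 → level -0.8
```

A real SCT supervisor is synthesized from a discrete-event model and verified, as the abstract notes; this sketch only shows the mode-switching shape of the hierarchy.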
ACM Transactions on Embedded Computing Systems | 2018
Majid Shoushtari; Bryan Donyanavard; Luis Angel D. Bathen; Nikil D. Dutt
Traditional approaches for managing software-programmable memories (SPMs) do not support sharing of distributed on-chip memory resources and, consequently, miss the opportunity to better utilize those resources. Managing on-chip memory in many-core embedded systems with distributed SPMs requires runtime support to share memory resources between threads with different memory demands running concurrently. Runtime SPM managers cannot rely on prior knowledge of the dynamically changing mix of threads that will execute, and therefore should be designed to enable SPM allocation for any unpredictable mix of threads contending for on-chip memory space. This article proposes ShaVe-ICE, an operating-system-level solution, along with hardware support, to virtualize and ultimately share SPM resources across a many-core embedded system to reduce average memory latency. We present a number of simple allocation policies to improve performance and energy. Experimental results show that sharing SPMs can reduce the average execution time of the workload by up to 19.5% and reduce the dynamic energy consumed in the memory subsystem by up to 14%.
Rapid System Prototyping | 2017
Tiago Mück; Bryan Donyanavard; Nikil D. Dutt
Heterogeneous multiprocessors (HMPs) are becoming pervasive in modern embedded platforms (e.g., mobile devices). These platforms often provide better power-performance tradeoffs than their homogeneous predecessors; however, novel and intelligent resource management policies are required to manage the added complexity of heterogeneous platforms and exploit their power-performance benefits. In this paper we propose PoliCym, a framework for prototyping, validating, and deploying resource management policies for heterogeneous platforms. PoliCym provides two main benefits to resource management policy developers and to the research community: 1) a trace-based offline simulator allows policies to be quickly prototyped, debugged, and validated on top of arbitrary platform configurations; and 2) a lightweight sensing-actuation interface allows the same policies to be efficiently deployed on Linux-based systems without implementation changes or additional development cycles. We evaluate our lightweight interface in terms of overhead and validate the PoliCym offline simulator for an ARM big.LITTLE-based HMP platform running Linux.
International Conference on Hardware/Software Codesign and System Synthesis | 2017
Bryan Donyanavard; Amir Mahdi Hosseini Monazzah; Tiago Mück; Nikil D. Dutt
Studies have shown that memory and computational needs vary independently across applications. Recent work has explored using hybrid memory technology (SRAM+NVM) in the on-chip memories of chip multiprocessors (CMPs) to support the varied needs of diverse workloads. Such works suggest architectural modifications that require supplemental management in the memory hierarchy. Instead, we propose to deploy hybrid memory in a manner that integrates seamlessly with the existing heterogeneous multicore (HMP) architectural model, and therefore requires no architectural modification, only the integration of different memory technologies on-chip. We evaluate platforms with a combination of fast (SRAM cache) and slow (STT-MRAM cache) core-types for mobile workloads.
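Because the hybrid memory here appears as fast and slow core-types rather than a new architecture, existing HMP schedulers could treat it as a placement decision. A hedged sketch of that idea (the miss-rate metric and threshold are invented, not from the paper):

```python
# Toy core-type selection for a hybrid-cache HMP: map tasks by their
# measured cache-miss intensity to SRAM-cache (fast) or STT-MRAM-cache
# (slow but dense, low-leakage) cores.

def pick_core_type(misses_per_kilo_instr, threshold=20):
    """Memory-intensive tasks favor fast SRAM-cache cores; compute-bound
    tasks tolerate the slower STT-MRAM cache."""
    return "sram" if misses_per_kilo_instr > threshold else "stt_mram"

print([pick_core_type(m) for m in (5, 42)])  # → ['stt_mram', 'sram']
```

This mirrors how existing HMP runtimes already steer tasks between big and little cores, which is the seamless-integration point the abstract makes.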
Design, Automation, and Test in Europe | 2018
Armin Sadighi; Bryan Donyanavard; Thawra Kadeed; Kasra Moazzemi; Tiago Mück; Ahmed Nassar; Amir M. Rahmani; Thomas Wild; Nikil D. Dutt; Rolf Ernst; Andreas Herkersdorf; Fadi J. Kurdahi
Design, Automation, and Test in Europe | 2018
Bryan Donyanavard; Amir M. Rahmani; Tiago Mück; Kasra Moazzemi; Nikil D. Dutt