Francesco Paterna | Researchain

Publication

Featured research published by Francesco Paterna.

Design Automation Conference | 2013

Workload and user experience-aware dynamic reliability management in multicore processors

Pietro Mercati; Andrea Bartolini; Francesco Paterna; Tajana Simunic Rosing; Luca Benini

Reliability is a major concern for nanoscale CMOS circuits. Degradation phenomena such as Electromigration, Negative Bias Temperature Instability, and Time Dependent Dielectric Breakdown worsen with transistor scaling. Dynamic Reliability Management (DRM) techniques reduce reliability loss at runtime by constraining operating points, but they face the challenge of reducing user experience degradation while meeting a lifetime target. In this work we propose a sensor-based hierarchical controller for multicore processor DRM, exploiting the major gap between the time scales of workload variations and reliability loss. We improve performance and user experience by locally relaxing reliability-induced operating point constraints, while meeting them over the large time windows relevant for reliability. With respect to the state of the art, our solution guarantees timely execution of 100% of latency-critical applications and achieves a 4% performance improvement over the whole lifetime.

IEEE Transactions on Computers | 2012

Variability-Aware Task Allocation for Energy-Efficient Quality of Service Provisioning in Embedded Streaming Multimedia Applications

Francesco Paterna; Andrea Acquaviva; Alberto Caprara; Francesco Papariello; Giuseppe Desoli; Luca Benini

Multimedia streaming applications running on next-generation parallel multiprocessor arrays in sub-45 nm technology face new challenges related to device and process variability, leading to performance and power variations across the cores. In this context, Quality of Service (QoS), as well as energy efficiency, could be severely impacted by variability. In this work, we propose a runtime variability-aware workload distribution technique for enhancing real-time predictability and energy efficiency based on an innovative Linear-Programming + Bin-Packing formulation which can be solved in linear time. We demonstrate our approach on the virtual prototype of a next-generation industrial multicore platform running representative multimedia applications. Experimental results confirm that our technique compensates for variability, while improving energy efficiency and minimizing deadline violations in the presence of performance and power variations across the cores. The proposed policy can save up to 33 percent of energy with respect to the state-of-the-art policies and 65 percent of energy with respect to a variability-unaware task allocation policy while providing better QoS.
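
As a rough illustration of the bin-packing side of such a formulation, the sketch below places tasks on cores whose speeds differ because of variability. It is a plain first-fit-decreasing heuristic, not the authors' Linear-Programming + Bin-Packing formulation, and it ignores energy entirely; the names `allocate_tasks`, `task_work`, `speed_factors`, and `deadline` are hypothetical.

```python
def allocate_tasks(task_work, speed_factors, deadline):
    """First-fit-decreasing allocation of tasks to cores of unequal speed.

    task_work     -- work per task, in time units on a nominal-speed core
    speed_factors -- per-core relative speed (1.0 = nominal); assumed known
    deadline      -- time budget that the load on every core must respect
    Returns (assignment, unplaced), where assignment[i] lists the task
    indices placed on core i and unplaced lists tasks that fit nowhere.
    """
    busy = [0.0] * len(speed_factors)          # accumulated run time per core
    assignment = [[] for _ in speed_factors]
    unplaced = []

    # Consider the largest tasks first, as in first-fit decreasing.
    order = sorted(range(len(task_work)), key=lambda t: task_work[t], reverse=True)
    for t in order:
        for core, speed in enumerate(speed_factors):
            run_time = task_work[t] / speed    # slower cores take longer
            if busy[core] + run_time <= deadline:
                busy[core] += run_time
                assignment[core].append(t)
                break
        else:
            unplaced.append(t)                 # would miss the deadline on every core
    return assignment, unplaced


if __name__ == "__main__":
    # Illustrative numbers only: one nominal core and one half-speed core.
    print(allocate_tasks([4, 3, 2, 1], [1.0, 0.5], deadline=8))
```

A slow core fills up sooner because each task's run time is scaled by its speed factor, which is the basic effect a variability-aware allocator has to account for.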

Design, Automation, and Test in Europe | 2014

A Linux-governor based Dynamic Reliability Manager for Android mobile devices

Pietro Mercati; Andrea Bartolini; Francesco Paterna; Tajana Simunic Rosing; Luca Benini

Reliability is a major concern in multiprocessors. Dynamic Reliability Management (DRM) aims at trading off processor performance with lifetime. State-of-the-art publications study only theory supported by simulation. This paper presents the first complete software implementation, working on real hardware, of a low-overhead, Android-compatible, workload-aware DRM Governor for mobile multiprocessors. We discuss the design challenges and the run-time overhead involved. We show the effectiveness of our governor in guaranteeing the predefined target lifetime and show that it achieves up to 100% lifetime improvement with respect to traditional governors, while providing comparable performance for critical applications.

Design, Automation, and Test in Europe | 2009

Adaptive idleness distribution for non-uniform aging tolerance in multiprocessor systems-on-chip

Francesco Paterna; Luca Benini; Andrea Acquaviva; Francesco Papariello; Giuseppe Desoli; Mauro Olivieri

In deep submicron designs of MultiProcessor Systems-on-Chip (MPSoC) architectures, uncompensated within-die process variations and aging effects will lead to an increasing uncertainty and unbalancing of expected core lifetimes. In this paper we present an adaptive workload allocation strategy for run-time compensation of variation- and aging-induced unbalanced core lifetimes by means of core activity duty cycling. The proposed technique regulates the percentage of idle time on short-expected-life cores to meet the platform lifetime target with minimum performance degradation. Experiments have been conducted on a multiprocessor simulator of a next-generation industrial MPSoC platform for multimedia applications made of a general purpose processor and programmable accelerators.
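
The duty-cycling idea can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that a core ages only while it is active, so idling a fraction of the time stretches its lifetime by 1 / (1 - idle). This aging model and the names `required_idle_fraction` and `idle_plan` are assumptions, not taken from the paper.

```python
def required_idle_fraction(expected_life, target_life):
    """Smallest idle fraction that stretches a core's lifetime to the target.

    Illustrative assumption: wear accrues only while the core is active,
    so lifetime with idling = expected_life / (1 - idle_fraction).
    """
    if expected_life <= 0 or target_life <= 0:
        raise ValueError("lifetimes must be positive")
    if expected_life >= target_life:
        return 0.0                      # core already meets the target
    return 1.0 - expected_life / target_life


def idle_plan(expected_lives, target_life):
    """Per-core idle fractions; only short-lived cores are throttled."""
    return [required_idle_fraction(life, target_life) for life in expected_lives]


if __name__ == "__main__":
    # Hypothetical expected lifetimes in years under full activity.
    print(idle_plan([5.0, 12.0, 8.0], target_life=10.0))
```

Cores that already meet the target get no forced idleness, which is how the performance cost stays confined to the short-lived cores.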

International Symposium on System-on-Chip | 2011

Exploring instruction caching strategies for tightly-coupled shared-memory clusters

Daniele Bortolotti; Francesco Paterna; Christian Pinto; Andrea Marongiu; Martino Ruggiero; Luca Benini

Several Chip-Multiprocessor designs today leverage tightly-coupled computing clusters as a building block. These clusters consist of a fairly large number N of simple cores, featuring fast communication through a shared multibanked L1 data memory and ≈ 1 Instruction-Per-Cycle (IPC) per core. Thus, aggregated I-fetch bandwidth approaches ƒ * N, where ƒ is the cluster clock frequency. An effective instruction cache architecture is key to support this I-fetch bandwidth. In this paper we compare two main architectures for instruction caching targeting tightly coupled CMP clusters: (i) private instruction caches per core and (ii) shared instruction cache per cluster. We developed a cycle-accurate model of the tightly coupled cluster with several configurable architectural parameters for exploration, plus a programming environment targeted at efficient data-parallel computing. We conduct an in-depth study of the two architectural templates based on the use of both synthetic microbenchmarks and real program workloads. Our results provide useful insights and guidelines for designers.
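
To make the ƒ * N estimate tangible, the short sketch below computes the aggregate instruction-fetch rate of a cluster. The function names, the example clock and core count, and the 4-byte instruction size are illustrative assumptions, not figures from the paper.

```python
def aggregate_fetch_rate(clock_hz, n_cores, ipc_per_core=1.0):
    """Instructions fetched per second by the whole cluster: f * N * IPC."""
    return clock_hz * n_cores * ipc_per_core


def fetch_bandwidth_bytes(clock_hz, n_cores, ipc_per_core=1.0, instr_bytes=4):
    """Fetch bandwidth in bytes per second, assuming fixed-size instructions."""
    return aggregate_fetch_rate(clock_hz, n_cores, ipc_per_core) * instr_bytes


if __name__ == "__main__":
    # Hypothetical cluster: 16 cores at 500 MHz, about 1 IPC each.
    print(aggregate_fetch_rate(500e6, 16))       # instructions per second
    print(fetch_bandwidth_bytes(500e6, 16))      # bytes per second
```

Because the demand grows linearly with the number of cores, the instruction cache organization (private per core or shared per cluster) has to sustain a fetch rate that scales with N.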

Design, Automation, and Test in Europe | 2014

Ambient variation-tolerant and inter components aware thermal management for mobile system on chips

Francesco Paterna; Joe Zanotelli; Tajana Simunic Rosing

In this work we measure and study two key aspects of the thermal behavior of smartphones: 1) thermal interaction between the components on the printed circuit board and 2) the influence of the phone's ambient temperature, which is subject to large variations. The measurements on the smartphone running typical workloads show that the heat generated by the communication subsystem and the high temperatures on the back cover of the phone can increase the SoC temperature by as much as 17°C. None of the run-time thermal management studies presented to date considered this interaction, as there was no model available. We design a thermal model that captures this thermal dependency and a policy able to avoid thermal emergencies while minimizing the impact on performance.

Computing Frontiers | 2010

Variability-tolerant run-time workload allocation for MPSoC energy minimization under real-time constraints

Francesco Paterna; Andrea Acquaviva; Alberto Caprara; Francesco Papariello; Giuseppe Desoli; Luca Benini

Multicore architectures will be adopted in the sub-50nm CMOS technology nodes for virtually all application domains with energy efficiency requirements exceeding 10GOPS/Watt. Unfortunately, future technology nodes will be increasingly affected by variation phenomena, and multicore architectures will be impacted in many ways by the variability of the underlying silicon fabrics [1, 6, 8]. Our architectural target is an advanced prototype of an industrial multicore platform for post-2014 set-top-box products, featuring a single CPU coordinator and an array of programmable VLIW hardware accelerators with multi-threading support. Next-generation set-top-boxes will support very high resolution, high-frame-rate video rendering with complex 3D GUIs and stereoscopic visualization support [2]. These applications require extensive image processing and enhancement functions which are embarrassingly parallel and will be distributed on the VLIW accelerator array as a large number of barrier-synchronized tasks. Accelerators are nominally homogeneous, but unfortunately variability causes significant perturbations on their performance and power consumption. We define a two-phase approach based on linear programming and bin packing. Thanks to these steps, the technique performs task allocation exploiting the awareness of performance and power variations of the cores, thus minimizing deadline misses and improving energy efficiency of the platform with respect to a variation-blind approach. In this work we consider variability effects acting independently on critical path delay, leakage power, and dynamic power [3]. Variability distribution data have been obtained through the VAM tool

International Conference on Computer Design | 2014

Dynamic Variability Management in Mobile Multicore Processors under Lifetime Constraints

Pietro Mercati; Francesco Paterna; Andrea Bartolini; Luca Benini; Tajana Simunic Rosing

Variability is a key issue in modern multiprocessors, resulting in performance and lifetime uncertainty, and high design margins. The margins can be reduced by exposing variability to software and then adapting at runtime. In this work we use sensors to monitor the variable operating conditions and the degradation rate. Based on the sensor data, our variability-aware OS scheduling algorithm assigns the workload to the cores and sets the power/performance tradeoffs to meet the mobile processor's lifetime constraints while adjusting to variability and improving the overall performance. We implement our algorithm in Android OS on a mobile phone and show that it achieves up to 160% performance improvement over the state of the art while meeting the lifetime constraints.

Design, Automation, and Test in Europe | 2011

An efficient on-line task allocation algorithm for QoS and energy efficiency in multicore multimedia platforms

Francesco Paterna; Andrea Acquaviva; Alberto Caprara; Francesco Papariello; Giuseppe Desoli; Luca Benini

The impact of variability on sub-45nm CMOS multimedia platforms makes it hard to provide application QoS guarantees, as the speed variations across the cores may cause sub-optimal and sample-dependent utilization of the available resources and energy budget. These effects can be compensated by an efficient allocation of the workload at run-time. In the context of multimedia applications, a critical objective is to compensate for core speed variability while meeting time constraints without impacting the energy consumption. In this paper we present a new approach to compute optimal task allocations at run-time. The proposed strategy exploits an efficient and scalable implementation to find on-line the best possible solution in a tightly bounded time. Experimental results demonstrate the effectiveness of compensation both in terms of deadline miss rate and energy savings. Results have been compared with those obtained by applying state-of-the-art techniques to a multithreaded MPEG2 decoder. The validation has been performed on a cycle-accurate virtual prototype of a next-generation industrial multicore platform that has been extended with process variability models.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2017

WARM: Workload-Aware Reliability Management in Linux/Android

Pietro Mercati; Francesco Paterna; Andrea Bartolini; Luca Benini; Tajana Simunic Rosing

With CMOS scaling beyond 14 nm, reliability is a major concern for IC manufacturers. Reliability-aware design has a non-negligible overhead and cannot account for user experience in mobile devices. An alternative is dynamic reliability management (DRM), which counteracts degradation by adapting the operating conditions at runtime. In this paper, for the first time we formulate DRM as an optimization problem that accounts for reliability, temperature and performance. We develop an optimal policy for multicores using convex optimization, and show that it is not feasible to implement on real systems. For this reason, we propose workload-aware reliability management (WARM), a fast DRM technique adapting to diverse workload requirements to trade reliability and user experience. WARM is implemented and tested on a real Android device. WARM approximates the solution of the convex solver within 5% on average, while executing more than
