Athanasios Chatzidimitriou
National and Kapodistrian University of Athens
Publications
Featured research published by Athanasios Chatzidimitriou.
IEEE International Symposium on Workload Characterization | 2015
Sotiris Tselonis; Athanasios Chatzidimitriou; Nikos Foutris; Dimitris Gizopoulos
Fault injection on microarchitectural structures modeled in performance simulators is an effective method for assessing microprocessor reliability in early design stages. Compared to lower-level fault injection approaches, it is orders of magnitude faster and allows execution of large portions of workloads to study the effect of faults on the final program output. Moreover, for many important hardware components it delivers more accurate reliability estimates than analytical methods, which are fast but are known to significantly overestimate a structure's vulnerability to faults. This paper investigates the effectiveness of microarchitectural fault injection for x86 and ARM microprocessors in a differential way: by developing and comparing two fault injection frameworks on top of the most popular performance simulators, MARSS and Gem5. The injectors, called MaFIN and GeFIN (for MARSS-based and Gem5-based Fault Injector, respectively), are designed for accurate reliability studies and deliver several contributions, among which: (a) reliability studies for a wide set of fault models on major hardware structures (for different sizes and organizations), (b) a study of the reliability sensitivity of microarchitecture structures for the same ISA (x86) implemented on two different simulators, (c) a study of the reliability of workloads and microarchitectures for the two most popular ISAs (ARM vs. x86). For the workloads of our experimental study we analyze the common trends observed in the CPU reliability assessments produced by the two injectors. We also explain the sources of difference when the tools provide diverging reliability reports. Both the common trends and the differences are attributed to fundamental implementation choices of the simulators and are supported by the benchmarks' runtime statistics. The insights of our analysis can guide the selection of the most appropriate tool for hardware reliability studies (and thus decision-making for protection mechanisms) on certain microarchitectures for the popular x86 and ARM ISAs.
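The basic loop behind a fault injection campaign of the kind described above can be illustrated with a toy sketch. This is not MaFIN or GeFIN code: the register file, the `run_workload` function, and the two outcome classes are hypothetical stand-ins chosen only to show how single bit flips in a modeled structure are classified against a fault-free (golden) run.

```python
import random

def run_workload(regs):
    # Hypothetical toy "workload": only the first four registers feed the
    # output, so faults in the other registers are masked.
    return sum(regs[:4])

def inject_bit_flip(regs, reg_index, bit):
    # Return a copy of the register file with one bit flipped.
    faulty = list(regs)
    faulty[reg_index] ^= (1 << bit)
    return faulty

def campaign(num_injections, num_regs=16, width=32, seed=0):
    rng = random.Random(seed)
    golden_regs = [rng.getrandbits(width) for _ in range(num_regs)]
    golden_out = run_workload(golden_regs)  # fault-free reference output
    counts = {"masked": 0, "sdc": 0}
    for _ in range(num_injections):
        reg_index = rng.randrange(num_regs)
        bit = rng.randrange(width)
        out = run_workload(inject_bit_flip(golden_regs, reg_index, bit))
        # Same output as the golden run: masked. Otherwise the output is
        # silently corrupted (SDC). Crashes are not modeled in this toy.
        counts["masked" if out == golden_out else "sdc"] += 1
    return counts

if __name__ == "__main__":
    print(campaign(1000))
```

In this toy setup roughly a quarter of the injections corrupt the output, because 4 of the 16 registers are live. A real microarchitectural injector would also distinguish crashes, timeouts, and other outcome classes.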
International Symposium on Performance Analysis of Systems and Software | 2016
Athanasios Chatzidimitriou; Dimitris Gizopoulos
The increasing density and complexity of modern microprocessors, which are driven by manufacturing technology scaling, significantly affect their reliability. Reliability evaluation during the early design stages is a challenging process for microprocessor designers. Statistical fault injection on microarchitecture simulators is commonly used, among other techniques, since it can deliver early and accurate reliability estimations for many important microprocessor hardware structures. However, full-system microarchitectural simulators have relatively low simulation throughput. Thus, the number of injection experiments that can be performed during a fault injection campaign can be limited, which reduces the statistical significance of the reliability assessment. Aiming to boost the throughput of microarchitecture-level fault injection, we present in this paper a multi-faceted microarchitecture-level toolset for reliability assessment of modern microprocessors. The framework is built around the Gem5 simulator and provides several modes of operation which employ acceleration features for all stages of a fault-injection-based reliability assessment campaign. The tool throughput can be traded off against the accuracy of the delivered reliability assessments, allowing architects to make informed decisions about the most suitable error protection mechanisms for any given microarchitecture and workload by studying the reports delivered by the toolset. We provide experimental results of the different modes of the toolset for both the x86 and ARM out-of-order models of Gem5. Our experimental results show that up to 8x acceleration of the fault injection campaigns can be achieved with less than 0.5 percentile points of accuracy loss.
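The link between the number of injection experiments and statistical significance mentioned above can be made concrete with a widely used finite-population sample-size formula for statistical fault injection. The sketch below is an illustration, not the toolset's own code; the function name, parameter names, and default values are assumptions.

```python
import math

def sample_size(population, error_margin, t=1.96, p=0.5):
    """Number of injections needed for a given error margin.

    population   : total number of possible faults (e.g. bits x cycles)
    error_margin : desired margin of error, e.g. 0.01 for +/-1%
    t            : normal quantile for the confidence level (1.96 ~ 95%)
    p            : assumed failure probability; 0.5 is the conservative choice
    """
    n = population / (1 + error_margin ** 2 * (population - 1) / (t ** 2 * p * (1 - p)))
    return math.ceil(n)

if __name__ == "__main__":
    # Hypothetical fault population of 10**12 (bits x cycles).
    for e in (0.03, 0.01, 0.005):
        print(f"margin {e:.3f}: {sample_size(10**12, e)} injections")
```

Halving the error margin roughly quadruples the required number of injections, which is why simulator throughput directly limits the achievable accuracy of a campaign.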
VLSI Test Symposium | 2017
Athanasios Chatzidimitriou; Sotiris Tselonis; Dimitris Gizopoulos
Technology evolution has raised serious reliability concerns, as transistor dimensions shrink and modern microprocessors become denser and more vulnerable to faults. Reliability studies have proposed a plethora of methodologies for assessing system vulnerability which, however, rely heavily on traditional reliability metrics that solely express failure rate over time. Although Failures In Time (FIT) is a very strong and representative reliability metric, it may fail to offer an objective comparison of highly diverse systems, such as CPUs against GPUs or other accelerators, which are often employed to execute the same algorithms implemented for these platforms.
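For reference, FIT counts expected failures per 10^9 device-hours. A common way to estimate the FIT of a hardware structure is to scale a raw per-bit FIT rate by the structure size and by its vulnerability factor (the fraction of faults that actually affect the outcome). The sketch below assumes that standard formulation; all numeric values are hypothetical and not taken from the paper.

```python
def structure_fit(raw_fit_per_bit, num_bits, vulnerability):
    # FIT of one structure: raw rate per bit, scaled by the number of bits
    # and by the fraction of faults that are not masked.
    return raw_fit_per_bit * num_bits * vulnerability

def mttf_hours(fit):
    # FIT is failures per 1e9 hours, so mean time to failure is 1e9 / FIT.
    return 1e9 / fit

if __name__ == "__main__":
    # Hypothetical values: 32 KB structure, 1e-4 raw FIT per bit, 20% vulnerability.
    bits = 32 * 1024 * 8
    fit = structure_fit(1e-4, bits, 0.20)
    print(f"FIT = {fit:.3f}, MTTF = {mttf_hours(fit):.3e} hours")
```

Because FIT expresses only a failure rate over time, two platforms with equal FIT can still differ in how many failures occur per completed run of the same algorithm if their execution times differ.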
International Symposium on Microarchitecture | 2017
George Papadimitriou; Athanasios Chatzidimitriou; Dimitris Gizopoulos; Peter Lawthers; Shidhartha Das
In this paper, we present the first automated system-level analysis of multicore CPUs based on ARMv8 64-bit architecture (8-core, 28nm X-Gene 2 micro-server by AppliedMicro) when pushed to operate in scaled voltage conditions. We report detailed system-level effects including SDCs, corrected/uncorrected errors and application/system crashes. Our study reveals large voltage margins (that can be harnessed for energy savings) and also large $V_{min}$ variation among the 8 cores of the CPU chip, among 3 different chips (a nominal rated and two sigma chips), and among different benchmarks. Apart from the $V_{min}$ analysis, we propose a new composite metric (severity) that aggregates the behavior of cores when undervolted and can support system operation and design protection decisions. Our undervolting characterization findings are the first reported analysis for an enterprise-class 64-bit ARMv8 platform and we highlight key differences with previous studies on x86 platforms. We utilize the results of the system characterization along with performance counter information to measure the accuracy of prediction models for the behavior of benchmarks running on particular cores. Finally, we discuss how the detailed characterization and the prediction results can be effectively used to support design and system software decisions to harness voltage margins for energy efficiency while preserving operation correctness. Our findings show that, on average, 19.4% energy saving can be achieved without compromising performance, while with a 25% performance reduction, the energy saving rises to 38.8%.
CCS Concepts: • Hardware → Power and energy → Power estimation and optimization; • Hardware → Robustness → Hardware reliability → Process, voltage and temperature variations
International On-Line Testing Symposium | 2017
George Papadimitriou; Athanasios Chatzidimitriou; Charalampos Magdalinos; Dimitris Gizopoulos
International Test Conference | 2016
Alessandro Vallero; Alessandro Savino; Gianfranco Michele Maria Politano; S. Di Carlo; Athanasios Chatzidimitriou; Sotiris Tselonis; Dimitris Gizopoulos; Marc Riera; Ramon Canal; Antonio González; Maha Kooli; A. Bosio; G. Di Natale
Dependable Systems and Networks | 2017
Athanasios Chatzidimitriou; Dimitris Gizopoulos; Maurizio Iacaruso; Mauro Pipponzi; Riccardo Mariani; Stefano Di Carlo
International On-Line Testing Symposium | 2016
George N. Papadimitriou; Athanasios Chatzidimitriou; Dimitris Gizopoulos; Ronny Morad
Archive | 2019
Lev Mukhanov; Konstantinos Tovletoglou; Georgios Karakonstantis; George N. Papadimitriou; Athanasios Chatzidimitriou; Dimitris Gizopoulos; Shidhartha Das
In this paper, we explore the pessimistic voltage guardbands of two multicore x86-64 microprocessor chips that belong to different microarchitectures (one ultra-low power and one high-performance microprocessor), when programs are executed on individual cores of the CPU chips. We also examine the energy and temperature gains as positive effects of lowering the voltage in both chips while preserving the functional correctness of programs. The behavior of the cores was examined while executing 8 different workloads from the SPEC CPU2006 suite. Our differential experimental study is performed on two state-of-the-art x86-64 microprocessors: an ultra-low power Intel Core i5-4200U and a high-performance Intel Core i7-3970X. Based on the results, the cores on each microprocessor chip behave differently for different workloads when undervolted, and the voltage guardbands are more than 15% below the nominal voltage levels. We show that energy efficiency can be increased by a maximum of 20% and that temperature can be reduced by up to 25%.
IEEE Transactions on Device and Materials Reliability | 2017
George N. Papadimitriou; Athanasios Chatzidimitriou; Dimitris Gizopoulos; Ronny Morad
System reliability estimation during early design phases facilitates informed decisions for the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction layers (technology, circuit, microarchitecture, software) are factored into such an estimation model, the delivered reliability reports must be excessively pessimistic and thus lead to unacceptably expensive, over-designed systems. We propose a scalable, cross-layer methodology and supporting suite of tools for accurate but fast estimations of computing system reliability. The backbone of the methodology is a component-based Bayesian model, which effectively calculates system reliability based on the masking probabilities of individual hardware and software components, considering their complex interactions. Our detailed experimental evaluation for different technologies, microarchitectures, and benchmarks demonstrates that the proposed model delivers very accurate reliability estimations (FIT rates) compared to statistically significant but slow fault injection campaigns at the microarchitecture level.