Dimitrios Rodopoulos
Katholieke Universiteit Leuven
Publication
Featured research published by Dimitrios Rodopoulos.
IEEE Transactions on Device and Materials Reliability | 2014
Dimitrios Rodopoulos; Pieter Weckx; Michail Noltsis; Francky Catthoor; Dimitrios Soudris
Bias Temperature Instability (BTI) is a major concern for the reliability of deca-nanometer to nanometer devices. Older modeling approaches fail to capture time-dependent device variability or maintain a crude view of the device's stress. Previously, a two-state atomistic model was introduced, which is based on gate stack defect kinetics. Its complexity has prevented seamless integration in simulations of large device inventories over typical system lifetimes. In this paper, we present an approach that alleviates this complexity. We introduce a novel signal representation for the gate stress. Using this format, atomistic BTI simulations require fewer model iterations while exhibiting minimal accuracy degradation. We also enable full temperature and supply voltage dependency, since these attributes are far from constant in modern integrated systems. The proposed simulation methodology retains both the atomistic property and the workload memory that remain major differentiators of defect-based BTI simulation, in comparison to state-of-the-art approaches.
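The idea of a two-state defect driven by a compact stress representation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's model: each defect is either empty or charged, it can only capture charge while the gate is stressed (time constant `tau_c`) and only emit while relaxed (time constant `tau_e`), and the gate stress is given as run-length segments `(stressed, duration)`. The function name and parameters are hypothetical.

```python
import math

def occupancy_after(p0, segments, tau_c, tau_e):
    """Advance a defect's occupancy probability over stress segments.

    p0       : initial probability that the defect is charged
    segments : iterable of (stressed: bool, duration: float) pairs
    tau_c    : assumed capture time constant (same unit as duration)
    tau_e    : assumed emission time constant (same unit as duration)
    """
    p = p0
    for stressed, duration in segments:
        if stressed:
            # Under stress, the empty fraction decays exponentially.
            p = 1.0 - (1.0 - p) * math.exp(-duration / tau_c)
        else:
            # Under relaxation, the charged fraction decays exponentially.
            p = p * math.exp(-duration / tau_e)
    return p
```

Under these simplifying assumptions, two consecutive stressed segments of length 1 give the same result as one stressed segment of length 2, so a signal format that merges equal-state intervals needs one model update per segment rather than one per clock cycle.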
IEEE Computer Architecture Letters | 2015
Dimitrios Rodopoulos; Francky Catthoor; Dimitrios Soudris
As technology nodes approach deca-nanometer dimensions, many phenomena threaten the binary correctness of processor operation. Computer architects typically enhance their designs with reliability, availability and serviceability (RAS) schemes to correct such errors, in many cases at the cost of extra clock cycles, which, in turn, leads to processor performance variability. The goal of the current paper is to absorb this variability using Dynamic Voltage and Frequency Scaling (DVFS). A closed-loop implementation is proposed, which configures the clock frequency based on observed metrics that encapsulate performance variability due to RAS mechanisms. That way, performance dependability and predictability are achieved. We simulate the transient and steady-state behavior of our approach, reporting responsiveness within less than 1 ms. We also assess our idea using the power model of a real processor and report a maximum energy overhead of roughly 10 percent for dependable performance in the presence of RAS temporal overheads.
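A closed loop of this kind can be sketched as a simple integral controller. This is only an illustration under stated assumptions, not the controller from the paper: the observed metric is taken to be the rate of useful cycles per second, RAS correction is modeled as a fixed fraction `ras_overhead` of lost cycles, and the names `run_control_loop`, `gain`, `f_min` and `f_max` are hypothetical.

```python
def clamp(value, low, high):
    """Keep value inside the platform's allowed frequency range."""
    return max(low, min(high, value))

def run_control_loop(target_rate, ras_overhead, f_start, f_min, f_max,
                     gain=0.5, steps=50):
    """Adjust clock frequency until useful throughput meets target_rate.

    target_rate  : desired useful cycles per second
    ras_overhead : assumed fraction of cycles lost to RAS correction (0..1)
    Returns the list of frequencies visited, starting with f_start.
    """
    freq = f_start
    history = [freq]
    for _ in range(steps):
        # Observed metric: cycles that did useful work at this frequency.
        measured_rate = freq * (1.0 - ras_overhead)
        error = target_rate - measured_rate
        # Integral action: accumulate the correction into the frequency.
        freq = clamp(freq + gain * error, f_min, f_max)
        history.append(freq)
    return history
```

With a 10 percent cycle overhead, the loop settles at the frequency where `freq * 0.9` equals the target, provided that frequency lies inside `[f_min, f_max]` and the gain is small enough for the iteration to converge.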
International Conference on IC Design and Technology | 2014
Dimitrios Rodopoulos; Dimitrios Stamoulis; Grigorios Lyras; Dimitrios Soudris; Francky Catthoor
Prior art on Bias Temperature Instability (BTI) and Random Telegraph Noise (RTN) shows their importance for digital system reliability. Reaction-diffusion models align poorly with deca-nanometer dimension experiments. Modern atomistic models capture time-zero and time-dependent effects but are complicated and constrained by system memory. We propose an atomistic BTI/RTN transient simulator that can be massively threaded across any many-core platform with a hypervisor. Compared to a commercial reference, we achieve a maximum speedup of 7x with no accuracy degradation and simulate circuits with more than 100,000 transistors. We deterministically inspect the initial stages of circuit operation, correlate delay effects with the logic depth and hint towards optimal design and simulation practices.
International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation | 2014
Dimitrios Rodopoulos; Giorgos Chatzikonstantis; Andreas Pantelopoulos; Dimitrios Soudris; Chris I. De Zeeuw; Christos Strydis
Biologically accurate neuron simulations are increasingly important in research related to brain activity. They are computationally intensive and feature data and task parallelism. In this paper, we present a case study for the mapping of a biologically accurate inferior-olive (InfOli) neural cell simulator on a many-core research platform. The Single-Chip Cloud Computer (SCC) is an experimental processor created by Intel Labs. The target neurons provide a major input to the cerebellum and are involved in motor skills and space perception. We exploit task- and data-partitioning, scaling the simulation over more than 40,000 neurons. The voltage- and frequency-scaling capabilities of the chip are explored, achieving more than 20% energy savings with negligible performance degradation. Four platform configurations are evaluated and a mapping with balanced workload and constant voltage and frequency is formally derived as optimal.
IEEE Computer Architecture Letters | 2011
Kostas Siozios; Dimitrios Rodopoulos; Dimitrios Soudris
Detailed thermal analysis is usually performed exclusively at design time since it is a computationally intensive task. In this paper, we introduce a novel methodology for fast, yet accurate, thermal analysis. The introduced methodology is supported in software by a new open source tool that enables hierarchical thermal analysis with adaptive levels of granularity. Experimental results prove the efficiency of our approach, since it leads to an average reduction of the execution overhead of up to 70% with a penalty in accuracy ranging between 2% and 8%.
ACM Computing Surveys | 2017
Georgia Psychou; Dimitrios Rodopoulos; Mohamed M. Sabry; Tobias Gemmeke; David Atienza; Tobias G. Noll; Francky Catthoor
Nanoscale technology nodes bring reliability concerns back to the center stage of digital system design. A systematic classification of approaches that increase system resilience in the presence of functional hardware (HW)-induced errors is presented, dealing with higher system abstractions, such as the (micro)architecture, the mapping, and platform software (SW). The field is surveyed in a systematic way based on nonoverlapping categories, which add insight into the ongoing work by exposing similarities and differences. HW and SW solutions are discussed in a similar fashion so that interrelationships become apparent. The properties of the presented categories are illustrated by representative literature examples. Moreover, it is demonstrated how hybrid schemes can be decomposed into their primitive components.
Great Lakes Symposium on VLSI | 2016
Dimitrios Stamoulis; Simone Corbetta; Dimitrios Rodopoulos; Pieter Weckx; Peter Debacker; Brett H. Meyer; Ben Kaczer; Praveen Raghavan; Dimitrios Soudris; Francky Catthoor; Zeljko Zilic
Atomistic-based approaches accurately model Bias Temperature Instability phenomena, but they suffer from prolonged execution times, preventing their seamless integration in system-level analysis flows. In this paper we present a comprehensive flow that combines the accuracy of Capture Emission Time (CET) maps with the efficiency of the Compact Digital Waveform (CDW) representation. That way, we capture the true workload-dependent BTI-induced degradation of selected CPU components. First, we show that existing works that assume constant stress patterns fail to account for workload dependency leading to fundamental estimation errors. Second, we evaluate the impact of different real workloads on selected CPU sub-blocks from a commercial processor design. To the best of our knowledge, this is the first work that combines atomistic property and true workload-dependency for variability analysis.
Computing Frontiers | 2016
George Chatzikonstantis; Dimitrios Rodopoulos; Sofia Nomikou; Christos Strydis; Chris I. De Zeeuw; Dimitrios Soudris
The development of physiologically plausible neuron models comes with increased complexity, which poses a challenge for many-core computing. In this work, we have chosen an extension of the demanding Hodgkin-Huxley model for the neurons of the Inferior Olivary Nucleus, an area of vital importance for motor skills. The computing fabric of choice is an Intel Xeon-Xeon Phi system, widely used in modern computing infrastructure. The target application is parallelized with combinations of MPI and OpenMP. The best configurations are scaled up to human InfOli numbers.
Great Lakes Symposium on VLSI | 2015
Dimitrios Stamoulis; Dimitrios Rodopoulos; Brett H. Meyer; Dimitrios Soudris; Francky Catthoor; Zeljko Zilic
In this paper, we propose EDA methodologies for efficient, datapath-wide reliability analysis under Bias Temperature Instability (BTI). The proposed EDA flow combines the efficiency of atomistic, pseudo-transient BTI modeling with the accuracy of commercial Static Timing Analysis (STA) tools. In order to reduce the transistor inventory that needs to be tracked by the STA solver, we develop a threshold-pruning methodology to identify the variation-critical part of a design. That way, we accelerate variation-aware STA iterations, with a maximum speedup of 6.82x achieved for representative benchmark circuits. We substantiate the efficiency of the proposed framework for realistic designs. For a CPU datapath, our threshold-pruning technique outperforms built-in pruning commands of the STA solver by 16.87% in terms of runtime improvement. We demonstrate the impact of BTI after three years of operation, with clock frequency degradation up to 24% and functional yield reduction below 90% for higher frequencies.
IEEE Transactions on Very Large Scale Integration Systems | 2015
Dimitrios Rodopoulos; Antonis Papanikolaou; Francky Catthoor; Dimitrios Soudris
Transient errors are a major concern for the correct operation of low-level cache memories. Aggressive integration requires effective mitigation of such errors, without extreme overheads in power, timing, or silicon area. We demonstrate a hybrid (hardware-software) scheme that mitigates bit flips in data that reside in low-level caches. The methodology is shown to be applicable in streaming applications and we illustrate that with a video decoding case study on a state-of-the-art many-core chip. The Single-Chip Cloud Computer is an experimental processor created by Intel Labs. Dedicated on-chip memories are utilized to keep safe copies of key application data, thus allowing rollbacks upon error detection. The experimental results illustrate the tradeoff between application delay, consumed energy, and output fidelity as the injected errors are corrected. When output fidelity is considered as a hard constraint, application slack used for mitigation can be reclaimed with dynamic frequency scaling. Output fidelity is guaranteed regardless of the error injection intensity and the application's timing constraints are respected up to a certain upper bound of error injection.
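The safe-copy and rollback idea in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scheme from the paper: the abstract does not say how errors are detected, so a CRC-32 checksum is used here as a stand-in, and the class `ProtectedBuffer` with its method names is hypothetical.

```python
import zlib

class ProtectedBuffer:
    """Working data plus a safe copy that allows rollback on corruption."""

    def __init__(self, data: bytes):
        self.working = bytearray(data)   # copy exposed to bit flips
        self._safe = bytes(data)         # stand-in for the dedicated memory
        self._crc = zlib.crc32(self._safe)

    def commit(self) -> None:
        """Declare the current working data correct and refresh the safe copy."""
        self._safe = bytes(self.working)
        self._crc = zlib.crc32(self._safe)

    def check_and_repair(self) -> bool:
        """Return True if corruption was detected and rolled back."""
        if zlib.crc32(self.working) == self._crc:
            return False
        # Rollback: restore the working data from the safe copy.
        self.working[:] = self._safe
        return True
```

A caller would invoke `commit()` after each validated processing step (for example, after a decoded frame) and `check_and_repair()` before reusing the data; each rollback costs time, which is the delay versus fidelity tradeoff the abstract describes.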