Publication


Featured research published by Dimitrios Stamoulis.


International Conference on IC Design and Technology | 2014

Understanding timing impact of BTI/RTN with massively threaded atomistic transient simulations

Dimitrios Rodopoulos; Dimitrios Stamoulis; Grigorios Lyras; Dimitrios Soudris; Francky Catthoor

Prior art on Bias Temperature Instability (BTI) and Random Telegraph Noise (RTN) shows their importance for digital system reliability. Reaction-diffusion models align poorly with deca-nanometer dimension experiments. Modern atomistic models capture time-zero/-dependent effects but are complicated and constrained by system memory. We propose an atomistic BTI/RTN transient simulator that can be massively threaded across any many-core platform with a hypervisor. Compared to a commercial reference, we achieve a maximum speedup of 7x with no accuracy degradation and simulate circuits with more than 100,000 transistors. We deterministically inspect the initial stages of circuit operation, correlate delay effects with the logic depth, and hint towards optimal design and simulation practices.


Great Lakes Symposium on VLSI | 2016

Capturing True Workload Dependency of BTI-induced Degradation in CPU Components

Dimitrios Stamoulis; Simone Corbetta; Dimitrios Rodopoulos; Pieter Weckx; Peter Debacker; Brett H. Meyer; Ben Kaczer; Praveen Raghavan; Dimitrios Soudris; Francky Catthoor; Zeljko Zilic

Atomistic-based approaches accurately model Bias Temperature Instability phenomena, but they suffer from prolonged execution times, preventing their seamless integration in system-level analysis flows. In this paper we present a comprehensive flow that combines the accuracy of Capture Emission Time (CET) maps with the efficiency of the Compact Digital Waveform (CDW) representation. That way, we capture the true workload-dependent BTI-induced degradation of selected CPU components. First, we show that existing works that assume constant stress patterns fail to account for workload dependency leading to fundamental estimation errors. Second, we evaluate the impact of different real workloads on selected CPU sub-blocks from a commercial processor design. To the best of our knowledge, this is the first work that combines atomistic property and true workload-dependency for variability analysis.


Great Lakes Symposium on VLSI | 2015

Efficient Reliability Analysis of Processor Datapath using Atomistic BTI Variability Models

Dimitrios Stamoulis; Dimitrios Rodopoulos; Brett H. Meyer; Dimitrios Soudris; Francky Catthoor; Zeljko Zilic

In this paper, we propose EDA methodologies for efficient, datapath-wide reliability analysis under Bias Temperature Instability (BTI). The proposed EDA flow combines the efficiency of atomistic, pseudo-transient BTI modeling with the accuracy of commercial Static Timing Analysis (STA) tools. In order to reduce the transistor inventory that needs to be tracked by the STA solver, we develop a threshold-pruning methodology to identify the variation-critical part of a design. That way, we accelerate variation-aware STA iterations, with a maximum speedup of 6.82x achieved for representative benchmark circuits. We substantiate the efficiency of the proposed framework for realistic designs. For a CPU datapath, our threshold-pruning technique outperforms built-in pruning commands of the STA solver by 16.87% in terms of runtime improvement. We demonstrate the impact of BTI after three years of operation, with clock frequency degradation up to 24% and functional yield reduction below 90% for higher frequencies.


International Symposium on Low Power Electronics and Design | 2016

Can We Guarantee Performance Requirements under Workload and Process Variations?

Dimitrios Stamoulis; Diana Marculescu

Modern many-core systems must cope with a wide range of heterogeneity due to both manufacturing process variations and extreme requirements of multi-application, multithreaded workloads. The latter is increasingly challenging in the context of different performance constraints per multithreaded application. Existing thread mapping methods primarily focus on maximizing performance under a global power budget, failing to provide thread- and application-specific performance guarantees. This paper provides a comprehensive approach for variation- and workload-aware thread mapping on heterogeneous multi-core systems that satisfies per-application performance requirements and is manufacturing process variation-aware, while providing an analysis of its robustness to uncertainties in the power and performance models. We formulate the variation-aware mapping problem as a constrained 0-1 integer linear program (ILP) and we propose a heuristic-based algorithm for efficiently solving it. Compared with an optimal solver, our method produces results less than 10% away from optimum on average, with four orders of magnitude improvement in runtime. Moreover, the newly proposed method is robust to model uncertainty and in meeting per application performance requirements, while agnostic approaches result in performance bound violations (up to 100% in many cases).
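The abstract above frames thread mapping as a constrained 0-1 assignment problem solved by a heuristic. The following is a minimal sketch of that general idea only, not the authors' algorithm: it assumes one thread per core, a hypothetical minimum-frequency requirement per thread, and hypothetical per-core maximum frequency and power values standing in for process variation. The function name `greedy_map` and all numbers are illustrative assumptions.

```python
from typing import Dict, List, Optional


def greedy_map(
    thread_req: List[float],
    core_freq: List[float],
    core_power: List[float],
) -> Optional[Dict[int, int]]:
    """Greedily assign each thread to one core (0-1 assignment).

    thread_req[t]  : minimum frequency thread t needs (hypothetical units)
    core_freq[c]   : maximum frequency core c can sustain under variation
    core_power[c]  : power drawn by core c when active
    Returns {thread: core}, or None if some thread cannot be satisfied.
    """
    free = set(range(len(core_freq)))
    mapping: Dict[int, int] = {}
    # Place the most demanding threads first, since they have the fewest options.
    order = sorted(range(len(thread_req)), key=lambda t: -thread_req[t])
    for t in order:
        feasible = [c for c in free if core_freq[c] >= thread_req[t]]
        if not feasible:
            return None  # performance requirement cannot be met
        # Among feasible cores, pick the lowest-power one (ties broken by index).
        best = min(feasible, key=lambda c: (core_power[c], c))
        mapping[t] = best
        free.remove(best)
    return mapping


# Illustrative values only.
result = greedy_map(
    thread_req=[2.0, 1.0],
    core_freq=[1.2, 2.5, 2.1],
    core_power=[1.0, 3.0, 2.5],
)
infeasible = greedy_map(thread_req=[3.0], core_freq=[1.0], core_power=[1.0])
```

A greedy pass like this gives no optimality guarantee; it only illustrates how per-thread performance constraints can be enforced as hard feasibility checks while power acts as the selection criterion.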


International Conference on Computer Aided Design | 2016

Exploring aging deceleration in FinFET-based multi-core systems

Ermao Cai; Dimitrios Stamoulis; Diana Marculescu

Power and thermal issues are the main constraints for high-performance multi-core systems. As the current technology of choice, FinFET is observed to have lower delay under higher temperature in super-threshold voltage region, an effect called temperature effect inversion (TEI). While it has been shown that system performance can be improved under power constraints, as technology aggressively scales down to sub-20nm nodes, thermal issues also emerge as important reliability concerns throughout the system lifetime. To the best of our knowledge, we are the first to provide a comprehensive evaluation of both TEI and aging effects on the performance and power of FinFET-based multi-core systems with multiple voltage/frequency levels. Our experimental results show that aging effects can be reduced by up to 53.59% by exploiting the TEI effect. Based on a combined multivariate objective for power and aging, this work proposes an aging-aware algorithm, dubbed AgingMin, to select the optimal TEI-aware voltage/frequency operation points for decelerating the aging effect. Experimental results show that AgingMin improves the classic 10-year system lifetime by an average of 1.61 years while introducing less than 1% power overhead when compared to existing state-of-the-art techniques.


Workshop on Cyber Physical Systems | 2017

Enhancing precipitation models by capturing multivariate and multiscale climate dynamics

Ruizhou Ding; Dimitrios Stamoulis; Kartikeya Bhardwaj; Diana Marculescu; Radu Marculescu

To improve precipitation predictions and enable accurate rainfall models for smart water networks, it is imperative to account for multivariate and multiscale precipitation dynamics with long-memory temporal relationships, long-range spatial dependencies, and low-frequency variability. While prior art has motivated the use of complex networks to capture these trends, existing work is limited to specific climate phenomena (such as El Niño) and regions. In this paper we employ a comprehensive assessment of complex networks with respect to multivariate dynamics across multiple temporal and spatial scales (multiscale). Our work is the first to incorporate both carbon emissions and precipitation anomalies data into a multivariate analysis. By effectively substantiating the ability of complex networks to capture multivariate and multiscale climate dynamics, we postulate their potential as reanalysis and assessment tools to enhance regional water models of arbitrary range and granularity, and to eventually enable reliable smart water systems.


International Conference on Electronics, Circuits, and Systems | 2014

Linear regression techniques for efficient analysis of transistor variability

Dimitrios Stamoulis; Dimitrios Rodopoulos; Brett H. Meyer; Dimitrios Soudris; Zeljko Zilic

Prior art on time-zero/-dependent variability shows its importance for digital system reliability throughout a typical integrated circuit (IC) lifetime. Timing analysis results could be questionable if the impact of such variations is not taken properly into consideration. Modern models can accurately capture transistor variability but they suffer from prolonged execution times. In this paper, we employ linear regression analysis to accelerate transistor variability estimation. Compared to commercial transistor-level Static Timing Analysis (STA) tools, we achieve a 4.63× average speedup and a 3.56× average memory usage reduction for standard cells and ISCAS85 benchmark circuits, with negligible accuracy degradation.
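As a rough illustration of the regression idea named in the abstract above, the sketch below fits an ordinary least-squares line and uses it to predict a delay value instead of re-running a slow simulation. It is not the authors' flow: the choice of threshold-voltage shift as the predictor, the helper name `fit_line`, and all sample values are assumptions made for the example.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training samples: threshold-voltage shift (mV) vs. gate delay (ps).
vth_shift_mv = [0.0, 10.0, 20.0, 30.0]
delay_ps = [50.0, 52.0, 54.0, 56.0]

slope, intercept = fit_line(vth_shift_mv, delay_ps)
# Predict the delay for an unseen shift without a new simulation run.
predicted = slope * 25.0 + intercept
```

Once fitted on a handful of detailed simulations, a model of this kind costs a single multiply-add per prediction, which is where the runtime saving of a regression-based approach would come from.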


Archive | 2019

Time-Efficient Modeling and Simulation of True Workload Dependency for BTI-Induced Degradation in Processor-Level Platform Specifications

Simone Corbetta; Pieter Weckx; Dimitrios Rodopoulos; Dimitrios Stamoulis; Francky Catthoor

As semiconductor technology nodes scale to the deep submicron range, bias temperature instability (BTI)-induced degradation threatens the functional and parametric correctness of a digital design. In order to mitigate the negative consequences, a platform-level analysis should be enabled early in the platform design trajectory. Unfortunately, today’s industrial EDA flows can only achieve this for large netlists by abstracting most of the workload-dependent impact. Yet BTI is very workload-dependent in nature due to the partial recovery of the induced degradation during stress. In this chapter, we propose a global analysis flow which captures all of the relevant workload dependency, while still sustaining an acceptable execution time for platform netlists containing millions of devices.


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2017

Profit: priority and power/performance optimization for many-core systems

Zhuo Chen; Dimitrios Stamoulis; Diana Marculescu

As power density emerges as the main constraint for many-core systems, controlling power consumption under the thermal design power while maximizing the performance becomes increasingly critical. To dynamically save power, dynamic voltage frequency scaling techniques have proved to be effective and are widely available commercially. Meanwhile, systems have certain performance constraints that the applications should satisfy to ensure quality of service. In this paper, we present an online distributed reinforcement learning (OD-RL)-based DVFS control algorithm for many-core system performance improvement under both power and performance constraints. At the finer grain, a per-core RL method is used to learn the optimal control policy of the voltage/frequency (VF) levels in a model-free manner. At the coarser grain, an efficient global power budget reallocation algorithm is used to maximize the overall performance. The experiments show that compared to the state-of-the-art algorithms: 1) OD-RL produces up to 98% less budget overshoot; 2) up to 23% higher energy efficiency; and 3) two orders of magnitude speedup over state-of-the-art techniques for systems with hundreds of cores. Furthermore, priority-aware OD-RL can better satisfy performance constraints than OD-RL with: 1)


Design, Automation, and Test in Europe | 2018

HyperPower: Power- and memory-constrained hyper-parameter optimization for neural networks

Dimitrios Stamoulis; Ermao Cai; Da-Cheng Juan; Diana Marculescu

17.8×

Collaboration


Dive into Dimitrios Stamoulis's collaborations.

Top Co-Authors

Diana Marculescu

Carnegie Mellon University

Dimitrios Rodopoulos

Katholieke Universiteit Leuven

Dimitrios Soudris

National Technical University of Athens

Ermao Cai

Carnegie Mellon University

Da-Cheng Juan

Carnegie Mellon University

Zhuo Chen

Carnegie Mellon University

Pieter Weckx

Katholieke Universiteit Leuven
