Nikos Foutris
National and Kapodistrian University of Athens
Publication
Featured research published by Nikos Foutris.
International Symposium on Microarchitecture | 2011
Nikos Foutris; Dimitris Gizopoulos; Mihalis Psarakis; Xavier Vera; Antonio González
Microprocessor design validation is a time-consuming and costly task that tends to be a bottleneck in the release of new architectures. The validation step that detects the vast majority of design bugs is the one that stresses the silicon prototypes by applying huge numbers of random tests. Despite its bug detection capability, this step is constrained by the extreme computing needs of random test simulation to extract the bug-free memory image for comparison with the actual silicon image. We propose a self-checking method that accelerates silicon validation and significantly increases the number of applied random tests to improve bug detection efficiency and reduce time-to-market. Analysis of four major ISAs (ARM, MIPS, PowerPC, and x86) reveals their inherent diversity: more than three quarters of the instructions can be replaced with equivalent instructions. We exploit this property in post-silicon validation and propose a methodology for the generation of random tests that detect bugs by comparing the results of equivalent instructions. We support our bug detection method in hardware with a lightweight mechanism which, in case of a mismatch, replays the random test, replacing the offending instruction with its equivalent. Our bug detection method and corresponding hardware significantly accelerate the post-silicon validation process. Evaluation of the method on an x86 microprocessor model demonstrates its efficiency over simulation-based and self-checking alternatives, in terms of bug detection capabilities and validation time speedup.
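To make the equivalence idea concrete, here is a minimal Python sketch that checks each "instruction" against a hand-written equivalent formulation over random operands. The simplified 32-bit semantics and the equivalence pairs are illustrative assumptions, not the paper's actual equivalence classes or test generator.

```python
# A minimal sketch of equivalence-based self-checking random tests.
# The 32-bit semantics and the pairs below are illustrative assumptions,
# not the paper's actual equivalence classes.
import random

MASK = 0xFFFFFFFF

# instruction -> (reference semantics, equivalent formulation)
EQUIVALENT_PAIRS = {
    "add":  (lambda a, b: (a + b) & MASK,
             lambda a, b: (a - ((-b) & MASK)) & MASK),  # ADD == SUB of negated operand
    "shl1": (lambda a, b: (a << 1) & MASK,
             lambda a, b: (a + a) & MASK),              # shift-left-by-1 == self-add
    "xor":  (lambda a, b: a ^ b,
             lambda a, b: (a | b) & ~(a & b) & MASK),   # XOR == OR bits minus AND bits
}

def run_random_tests(n_tests=10_000, seed=0):
    """Apply random operands; a mismatch between an instruction and its
    equivalent flags a candidate design bug without a golden simulation."""
    rng = random.Random(seed)
    for _ in range(n_tests):
        a, b = rng.getrandbits(32), rng.getrandbits(32)
        for name, (insn, equiv) in EQUIVALENT_PAIRS.items():
            if insn(a, b) != equiv(a, b):
                # The paper's hardware would replay the test with the
                # offending instruction replaced by its equivalent.
                return f"mismatch in {name}: a={a:#x} b={b:#x}"
    return "all tests consistent"

print(run_random_tests())
```

The appeal of the scheme is visible even in this toy: no fault-free reference image is needed, since the two formulations check each other.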
International Test Conference | 2010
Nikos Foutris; Mihalis Psarakis; Dimitris Gizopoulos; Andreas Apostolakis; Xavier Vera; Antonio González
Instruction-based or software-based self-testing (SBST) is a scalable functional testing paradigm that has gained increasing acceptance in the testing of single-threaded uniprocessors. Recent computer architecture trends towards chip multiprocessing and multithreading have raised new challenges in the test process. In this paper, we present a novel self-test optimization strategy for multithreaded, multicore microprocessor architectures and apply it to both manufacturing testing (execution from on-chip cache memory) and post-silicon validation (execution from main memory) setups. The proposed self-test program execution optimization aims to: (a) take maximum advantage of the execution parallelism provided by multiple threads and multiple cores, (b) preserve the high fault coverage that single-thread execution provides for the processor components, and (c) enhance the fault coverage of the thread-specific control logic of the multithreaded multiprocessor. The proposed multithreaded (MT) SBST methodology generates an efficient multithreaded version of the test program and schedules the resulting test threads onto the hardware threads of the processor to reduce the overall test execution time and, at the same time, increase the overall fault coverage. We demonstrate our methodology on the OpenSPARC T1 processor model, which integrates eight CPU cores, each supporting four hardware threads. The MT-SBST methodology and its scheduling algorithm significantly speed up self-test time at both the core level (3.6 times) and the processor level (6.0 times) compared with single-threaded execution, while at the same time improving the overall fault coverage. Compared with straightforward multithreaded execution, it reduces the self-test time at the core level and the processor level by 33% and 20%, respectively. Overall, MT-SBST reaches more than 91% stuck-at fault coverage for the functional units and 88% for the entire chip multiprocessor, a total of more than 1.5M logic gates.
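As a rough illustration of the scheduling problem MT-SBST addresses, the sketch below packs per-unit test routines onto hardware threads with a greedy longest-processing-time heuristic. The routine names and cycle counts are hypothetical, and the actual MT-SBST scheduler additionally accounts for coverage of thread-specific control logic, which this sketch ignores.

```python
# A minimal sketch of packing self-test routines onto hardware threads to
# minimize total self-test time; a greedy longest-processing-time heuristic
# stands in for the paper's scheduler, and all cycle counts are made up.
import heapq

def schedule_tests(test_runtimes, n_hw_threads):
    """Assign routines (name -> cycles) to threads so that the most loaded
    thread, i.e. the overall self-test time, is as short as possible."""
    heap = [(0, tid, []) for tid in range(n_hw_threads)]  # (load, id, routines)
    heapq.heapify(heap)
    for name, cycles in sorted(test_runtimes.items(), key=lambda kv: -kv[1]):
        load, tid, routines = heapq.heappop(heap)         # least-loaded thread
        heapq.heappush(heap, (load + cycles, tid, routines + [name]))
    return heap

# Hypothetical per-unit routines for one core with four hardware threads.
runtimes = {"alu": 120_000, "mul": 90_000, "div": 150_000,
            "lsu": 200_000, "fpu": 180_000, "ctl": 60_000}
plan = sorted(schedule_tests(runtimes, n_hw_threads=4), key=lambda e: e[1])
for load, tid, routines in plan:
    print(f"thread {tid}: {routines} ({load} cycles)")
print("self-test time:", max(load for load, _, _ in plan), "cycles")
```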
IEEE International Symposium on Workload Characterization | 2015
Sotiris Tselonis; Athanasios Chatzidimitriou; Nikos Foutris; Dimitris Gizopoulos
Fault injection on microarchitectural structures modeled in performance simulators is an effective method for assessing microprocessor reliability in early design stages. Compared to lower-level fault injection approaches, it is orders of magnitude faster and allows execution of large portions of workloads to study the effect of faults on the final program output. Moreover, for many important hardware components it delivers accurate reliability estimates compared to analytical methods, which are fast but are known to significantly over-estimate a structure's vulnerability to faults. This paper investigates the effectiveness of microarchitectural fault injection for x86 and ARM microprocessors in a differential way: by developing and comparing two fault injection frameworks on top of the most popular performance simulators, MARSS and Gem5. The injectors, called MaFIN and GeFIN (for MARSS-based and Gem5-based Fault Injector, respectively), are designed for accurate reliability studies and deliver several contributions, among which: (a) reliability studies for a wide set of fault models on major hardware structures (for different sizes and organizations), (b) a study of the reliability sensitivity of microarchitectural structures for the same ISA (x86) implemented on two different simulators, and (c) a study of the reliability of workloads and microarchitectures for the two most popular ISAs (ARM vs. x86). For the workloads of our experimental study, we analyze the common trends observed in the CPU reliability assessments produced by the two injectors. Also, we explain the sources of difference when diverging reliability reports are provided by the tools. Both the common trends and the differences are attributed to fundamental implementation choices of the simulators and are supported by benchmark runtime statistics. The insights of our analysis can guide the selection of the most appropriate tool for hardware reliability studies (and thus decision-making for protection mechanisms) on certain microarchitectures for the popular x86 and ARM ISAs.
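The basic mechanics shared by injectors like MaFIN and GeFIN can be conveyed with a toy campaign: pick a random (cycle, entry, bit) fault site, flip the bit during a run, and compare the program output against a fault-free reference. The toy CPU model and the two outcome classes below are drastic simplifications of what a real performance-simulator-based injector does.

```python
# A minimal sketch of a microarchitectural fault injection campaign on a
# toy register-file model; a real injector (MaFIN/GeFIN) instruments a full
# performance simulator instead. Everything here is an illustrative stand-in.
import random

MASK = 0xFFFFFFFF

def run(n_cycles=512, n_regs=8, inject=None):
    """Toy 'CPU': each cycle overwrites one register; the final 'program
    output' is the sum of the registers. `inject` is (cycle, reg, bit)."""
    regs = [0] * n_regs
    for cycle in range(n_cycles):
        if inject and inject[0] == cycle:
            _, reg, bit = inject
            regs[reg] ^= 1 << bit              # single-event upset (bit flip)
        regs[cycle % n_regs] = cycle & MASK    # later writes can mask the flip
    return sum(regs) & MASK

def campaign(n_runs=2_000, seed=1):
    rng = random.Random(seed)
    golden = run()                             # fault-free reference output
    outcomes = {"masked": 0, "SDC": 0}         # crash/hang detection omitted
    for _ in range(n_runs):
        fault = (rng.randrange(512), rng.randrange(8), rng.randrange(32))
        outcomes["masked" if run(inject=fault) == golden else "SDC"] += 1
    return outcomes

print(campaign())
```

Even this toy shows why masking rates depend on the structure's write behavior: flips that land in a register overwritten before the end of the run never reach the output.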
VLSI Test Symposium | 2014
Mihalis Psarakis; Nikos Foutris; Dimitris Gizopoulos
Forthcoming many-core processors are expected to be highly unreliable due to their high design complexity and aggressive manufacturing technology scaling. Online functional testing is an attractive low-cost error detection solution. A functional error detection scheme for many-core architectures can easily employ existing techniques from single-core microprocessors and exploit the available massive parallelism to reduce the total test execution time. However, the straightforward execution of test programs on such parallel architectures does not achieve the maximum theoretical speedup due to severe congestion on common hardware resources, especially the shared memory and the interconnection network. In this paper, we first identify the memory hierarchy parameters of many-core architectures that slow down the execution of parallel test programs. Then, we study typical test programs to identify which of their parts can be parallelized to improve performance. Finally, we propose a test program parallelization methodology for many-core architectures to accelerate online detection of permanent faults. We evaluate the proposed methodology on a popular many-core architecture, Intel's Single-chip Cloud Computer (SCC), showing a speedup of up to 47.6X compared to a serial test program execution approach.
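The congestion effect can be seen with a deliberately crude bandwidth model: if every core's test program has a compute phase plus a memory phase and all cores share a fixed memory bandwidth, the naive parallel speedup falls well short of the core count. All numbers below are illustrative assumptions, not SCC measurements.

```python
# A deliberately crude model of why naive parallel self-test execution does
# not reach the ideal speedup: memory requests from all cores contend for a
# fixed shared bandwidth. Every number here is an illustrative assumption.

def parallel_test_time(n_cores, compute_cycles, mem_requests, bw_per_cycle):
    """All cores start together; compute overlaps perfectly, but the memory
    phase stretches with the total request load on the shared bandwidth."""
    return compute_cycles + (n_cores * mem_requests) / bw_per_cycle

N, COMPUTE, MEM_REQS, BW = 48, 1_000_000, 200_000, 4
serial = N * (COMPUTE + MEM_REQS / BW)            # one core at a time
parallel = parallel_test_time(N, COMPUTE, MEM_REQS, BW)
print(f"serial:   {serial:,.0f} cycles")
print(f"parallel: {parallel:,.0f} cycles "
      f"(speedup {serial / parallel:.1f}x, ideal {N}x)")
```

With these placeholder numbers the naive parallel run achieves roughly a 15x speedup against an ideal 48x, which is the gap the paper's parallelization methodology targets.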
International Conference on Computer Design | 2013
Nikos Foutris; Dimitris Gizopoulos; John Kalamatianos; Vilas Sridharan
A growing portion of the silicon area of modern high-performance microprocessors is dedicated to components that increase performance but do not determine functional correctness. Permanent hardware faults in these components can lead to performance fluctuation (not necessarily degradation) but do not produce functional errors. Although this fact has been identified previously, extensive research has not yet been conducted to accurately classify and quantify permanent faults in these components over a set of CPU benchmarks or to measure the magnitude of the performance impact. Depending on the results of such studies, performance-related components of microprocessors can be disabled at fine or coarse granularities, salvaging microprocessor functionality at different performance levels. This paper analyzes the impact of permanent faults in the arrays and control logic of key microprocessor performance components such as the branch predictor, branch target buffer, return address stack, and data and instruction prefetchers. We apply a statistically safe fault injection campaign for single faults in performance components on a modified version of the cycle-accurate x86 architectural simulator PTLsim running the SPEC CPU2006 suite. Our evaluation reveals significant differences in the effect of faults and their performance impacts across the components as well as within each component (different fields). We classify faults for all components and analyze their IPC impact in the arrays and control logic. Our analysis shows that a very large fraction (44% to 96%) of permanent faults in these components leads only to performance fluctuation. This confirms the intuition that such faults cause no functional errors; however, in many cases a single fault in a performance component can significantly degrade microprocessor performance (2-20% average IPC reduction for SPEC CPU2006).
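For the "statistically safe" campaign size, a standard finite-population sample-size formula is commonly used in fault injection studies; the sketch below applies it, with the fault-site population and the confidence parameters as illustrative assumptions rather than the paper's exact setup.

```python
# Sizing a statistically safe fault injection campaign with the standard
# finite-population sample-size formula often used in injection studies.
# The fault-site count below is an illustrative assumption.

def injections_needed(population, margin=0.01, z=2.576, p=0.5):
    """population: number of candidate fault sites;
    margin: accepted error margin; z=2.576 for 99% confidence;
    p=0.5 is the worst-case (most conservative) outcome probability."""
    return population / (1 + margin**2 * (population - 1) / (z**2 * p * (1 - p)))

# e.g. single permanent faults across a 4K-entry, 64-bit branch target buffer:
sites = 4096 * 64
print(f"{injections_needed(sites):,.0f} injections")
# The required sample stays bounded (~16.6K here) however large `sites` grows,
# which is what makes statistical injection campaigns tractable.
```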
International On-Line Testing Symposium | 2014
Nikos Foutris; Sotiris Tselonis; Dimitris Gizopoulos
Forthcoming technologies hold the promise of a significant increase in integration density, performance and functionality. However, a dramatic change in microprocessor reliability is also expected. Developing mechanisms for early and accurate reliability estimation will save significant design effort and resources, and consequently will positively impact products' time-to-market (TTM). In this paper, we propose a versatile architecture-level fault injection framework, built on top of a state-of-the-art x86 microprocessor simulator, for thorough and fast characterization of a wide range of hardware components with respect to various fault models.
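One way such a framework stays versatile is by describing every supported fault model through a single descriptor that the injector evaluates each cycle. The sketch below is an illustrative shape for such a descriptor, not the framework's actual interface; all field names are assumptions.

```python
# An illustrative (not the framework's actual) uniform fault descriptor that
# covers transient, intermittent, and permanent fault models in one place.
from dataclasses import dataclass

@dataclass
class Fault:
    component: str        # e.g. "register_file", "L1D", "ROB"
    entry: int            # array row / entry index
    bit: int              # bit position within the entry
    model: str            # "transient" | "intermittent" | "permanent"
    start_cycle: int      # first cycle the fault is active
    duration: int = 1     # active cycles (ignored for permanent faults)
    stuck_value: int = 0  # permanent faults: stuck-at-0 or stuck-at-1

    def active(self, cycle: int) -> bool:
        if self.model == "permanent":
            return cycle >= self.start_cycle
        return self.start_cycle <= cycle < self.start_cycle + self.duration

    def corrupt(self, word: int, cycle: int) -> int:
        """Return the value the simulator should see at this cycle."""
        if not self.active(cycle):
            return word
        if self.model == "permanent":  # force the bit to the stuck value
            return (word & ~(1 << self.bit)) | (self.stuck_value << self.bit)
        return word ^ (1 << self.bit)  # transient/intermittent: bit flip

# e.g. a stuck-at-1 bit in entry 12 of the register file from cycle 0 on:
f = Fault("register_file", entry=12, bit=5, model="permanent",
          start_cycle=0, stuck_value=1)
print(hex(f.corrupt(0x0, cycle=100)))  # 0x20
```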
International On-Line Testing Symposium | 2013
Nikos Foutris; Dimitris Gizopoulos; Mihalis Psarakis; Antonis M. Paschalis
Multicore architectures are employed in the majority of computing domains (general-purpose microprocessors as well as specialized high-performance architectures such as network processors). Online error detection in such chips can employ effective techniques from single-core microprocessors; however, careful test scheduling is needed to minimize the overall chip test execution time, which can increase significantly due to congestion on the common hardware resources shared by the cores. In this paper, we analyze the most important aspects of online error detection and scheduling in multiprocessor chips and evaluate test execution time for several different configurations of Intel's SCC architecture.
Microprocessors and Microsystems | 2015
Alessandro Vallero; Sotiris Tselonis; Nikos Foutris; Maha Kooli; Alessandro Savino; Gianfranco Michele Maria Politano; Alberto Bosio; G. Di Natale; Dimitris Gizopoulos; S. Di Carlo
Advanced computing systems realized in forthcoming technologies hold the promise of a significant increase in computational capabilities. However, the same path that is leading technologies toward these remarkable achievements is also making electronic devices increasingly unreliable. Developing new methods to evaluate the reliability of these systems at an early design stage has the potential to save costs, produce optimized designs and have a positive impact on the product time-to-market. The CLERECO European FP7 research project addresses early reliability evaluation with a cross-layer approach across different computing disciplines, computing system layers and computing market segments. The fundamental objective of the project is to investigate in depth a methodology to assess system reliability early in the design cycle of the future systems of the emerging computing continuum. This paper presents a general overview of the CLERECO project, focusing on the main tools and models under development that could be of interest to the research community and engineering practice.
International On-Line Testing Symposium | 2015
Alessandro Vallero; Alessandro Savino; Sotiris Tselonis; Nikos Foutris; Gianfranco Michele Maria Politano; Dimitris Gizopoulos; S. Di Carlo
Analyzing the impact of software execution on the reliability of a complex digital system is an increasingly challenging task. Current approaches mainly rely on time-consuming fault injection experiments, which prevents their use in the early stages of the design process, when fast estimations are required to make design decisions. To cope with these limitations, this paper proposes a statistical reliability analysis model based on Bayesian Networks. The proposed approach is able to estimate system reliability considering both the hardware and the software layers of a system, in the presence of transient and permanent hardware faults. When the reliability of a digital system is analyzed, the hardware resources of the processor and the instructions of program traces are used to build a Bayesian Network. The probability that input errors alter both the correct behavior of the system and the output of the program is then computed. According to the experimental results presented in this paper, the Bayesian Network model provides accurate reliability estimations in a very short time. It can therefore be a valid alternative to fault injection, especially in the early stages of the design.
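The flavor of the computation can be conveyed with a hand-rolled three-node probability chain instead of a full Bayesian Network library. The chain and all conditional probabilities below are made-up placeholders; the real model conditions on actual processor resources and program-trace instructions.

```python
# A minimal sketch of the cross-layer estimation: chain conditional
# probabilities from a raw hardware fault to a visible program failure.
# The three-node chain and all probabilities are made-up placeholders.
import math

p_fault = 1e-4             # P(fault in a hardware resource during the run)
p_arch_given_fault = 0.3   # P(architectural error | fault), e.g. bit is read
p_fail_given_arch = 0.2    # P(corrupted output | architectural error),
                           # i.e. the software does not mask the error

# Marginalizing over the chain fault -> architectural error -> failure:
p_failure = p_fault * p_arch_given_fault * p_fail_given_arch
print(f"P(output corruption, one resource) = {p_failure:.2e}")  # 6.00e-06

# Combining several independently failing resources at the system level:
per_resource = [6e-6, 2e-6, 9e-6]  # hypothetical per-resource failure probs
p_system = 1 - math.prod(1 - p for p in per_resource)
print(f"P(system failure) = {p_system:.2e}")
```

Because these are closed-form products rather than simulated runs, the estimate is essentially instantaneous, which is the speed advantage the abstract claims over fault injection.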
VLSI Test Symposium | 2016
Nikos Foutris; Athanasios Chatzidimitriou; Dimitris Gizopoulos; John Kalamatianos; Vilas Sridharan
High-performance microprocessors employ data prefetchers to mitigate the ever-growing gap between CPU computing rates and memory latency. Technology scaling along with low-voltage operation exacerbates the likelihood and rate of hard (permanent) faults in the technologies used by prefetchers, such as SRAM and flip-flop arrays. Faulty prefetch behavior does not affect correctness but can be detrimental to performance. Hard faults in data prefetchers (unlike their soft counterparts, which are rare) can cause significant single-thread performance degradation and lead to large performance variability across otherwise identical cores. In this paper, we characterize both of these aspects in depth for microprocessors suffering from multiple hard faults in their data prefetcher components. Our study reveals fault scenarios in the prefetcher table that can degrade IPC by more than 17%, while faults in the prefetch input and request queues can reduce IPC by up to 24% and 26%, respectively, compared to fault-free operation. Moreover, we find that a faulty data prefetcher can substantially increase the performance variability across identical cores: the standard deviation of IPC loss for different benchmarks can be more than 4.5%.
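The variability metric itself is plain descriptive statistics over per-core IPC-loss samples; the sketch below shows the computation on made-up numbers with the same shape as the study's measurements (per-core fault maps, per-benchmark losses).

```python
# A minimal sketch of the variability metric: per-benchmark IPC loss under a
# core's particular prefetcher fault map, and its spread across otherwise
# identical cores. All percentages below are made-up placeholders.
import statistics

# ipc_loss[core] = % IPC reduction vs. fault-free, one value per benchmark
ipc_loss = {
    "core0": [0.5, 3.1, 17.2, 1.0],   # e.g. faults in the prefetcher table
    "core1": [0.2, 0.4, 2.5, 0.9],    # e.g. faults in rarely used entries
    "core2": [8.0, 24.0, 5.5, 12.3],  # e.g. faults in the request queue
}

for core, losses in ipc_loss.items():
    print(f"{core}: mean IPC loss {statistics.mean(losses):.1f}%")

# Core-to-core variability for each benchmark (columns of the table above):
for i, column in enumerate(zip(*ipc_loss.values())):
    print(f"benchmark {i}: cross-core stdev {statistics.stdev(column):.1f}%")
```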