Sotirios Xydis

National Technical University of Athens

Publication

Featured research published by Sotirios Xydis.

IEEE Transactions on Circuits and Systems I: Regular Papers | 2014

An Optimized Modified Booth Recoder for Efficient Design of the Add-Multiply Operator

Kostas Tsoumanis; Sotirios Xydis; Constantinos Efstathiou; Nikolaos Moschopoulos; Kiamal Z. Pekmestzi

Complex arithmetic operations are widely used in Digital Signal Processing (DSP) applications. In this work, we focus on optimizing the design of the fused Add-Multiply (FAM) operator for increasing performance. We investigate techniques to implement the direct recoding of the sum of two numbers in its Modified Booth (MB) form. We introduce a structured and efficient recoding technique and explore three different schemes by incorporating them in FAM designs. Comparing them with the FAM designs which use existing recoding schemes, the proposed technique yields considerable reductions in terms of critical delay, hardware complexity and power consumption of the FAM unit.
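As a minimal sketch of what "Modified Booth (MB) form" means here, the snippet below recodes a two's-complement integer into radix-4 MB digits in {-2, -1, 0, 1, 2} and uses them to compute (A + B) · X. It illustrates only the conventional baseline (add first, then recode the sum), not the direct sum-to-MB recoding technique proposed in the paper. The function names `mb_recode` and `fused_add_multiply` are hypothetical and chosen for illustration.

```python
def mb_recode(y: int, n_bits: int) -> list[int]:
    """Radix-4 Modified Booth digits of y, least significant digit first.

    y is interpreted as an n_bits-wide two's-complement value; n_bits must be even.
    Each digit is d_j = -2*y[2j+1] + y[2j] + y[2j-1], with y[-1] = 0.
    """
    if n_bits % 2 != 0:
        raise ValueError("n_bits must be even")
    if not -(1 << (n_bits - 1)) <= y < (1 << (n_bits - 1)):
        raise ValueError("y does not fit in n_bits")
    # Python's >> on negative ints is arithmetic, so this yields two's-complement bits.
    bits = [(y >> i) & 1 for i in range(n_bits)]
    digits = []
    prev = 0  # y[-1]
    for j in range(n_bits // 2):
        low, high = bits[2 * j], bits[2 * j + 1]
        digits.append(-2 * high + low + prev)
        prev = high
    return digits


def fused_add_multiply(a: int, b: int, x: int, n_bits: int) -> int:
    """Compute (a + b) * x by summing MB-selected partial products.

    Baseline scheme (assumption for illustration): the sum is formed by an
    ordinary addition and only then recoded. Two extra bits keep the width
    even and leave room for the carry out of the addition.
    """
    digits = mb_recode(a + b, n_bits + 2)
    return sum(d * x * 4**j for j, d in enumerate(digits))
```

For example, `mb_recode(6, 4)` gives the digits `[-2, 2]`, since -2 + 2 · 4 = 6, so a 4-bit multiplier operand needs only two partial products.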

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2015

SPIRIT: Spectral-Aware Pareto Iterative Refinement Optimization for Supervised High-Level Synthesis

Sotirios Xydis; Gianluca Palermo; Vittorio Zaccaria; Cristina Silvano

Supervised high-level synthesis (HLS) is a new class of design problems where exploration strategies play the role of supervisor for tuning an HLS engine. The complexity of the problem is increased due to the large set of tunable parameters exposed by the “new wave” of HLS tools that include not only architectural alternatives but also compiler transformations. In this paper, we develop a novel exploration approach, called spectral-aware Pareto iterative refinement, that exploits response surface models (RSMs) and spectral analysis for predicting the quality of the design points without resorting to costly architectural synthesis procedures. We show that the target solution space can be accurately modeled through RSMs, thus enabling a speedup of the overall exploration without compromising the quality of results. Furthermore, we introduce the usage of spectral techniques to find high-variance regions of the design space that require analysis for improving the RSMs' prediction accuracy.
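The Pareto step that such an exploration loop relies on can be sketched as a plain non-dominated filter. This is only a generic building block under the assumption that every objective (for example latency and area) is minimized; it does not reproduce the RSM training or the spectral analysis described above, and the names `dominates` and `pareto_front` are hypothetical.

```python
from typing import Sequence


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p is no worse than q in every objective and strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(points: Sequence[Sequence[float]]) -> list:
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (latency, area) estimates for four design points.
designs = [(10, 5), (8, 7), (12, 4), (9, 9)]
front = pareto_front(designs)
```

Here `(9, 9)` is dropped because `(8, 7)` is better in both objectives, while the other three points trade latency against area and all remain on the front.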

International Conference on Modern Circuits and Systems Technologies | 2016

ECG signal analysis and arrhythmia detection on IoT wearable medical devices

Dimitra Azariadi; Vasileios Tsoutsouras; Sotirios Xydis; Dimitrios Soudris

Healthcare is one of the most rapidly expanding application areas of the Internet of Things (IoT) technology. IoT devices can be used to enable remote health monitoring of patients with chronic diseases such as cardiovascular diseases (CVD). In this paper, we develop an algorithm for ECG analysis and classification for heartbeat diagnosis, and implement it on an IoT-based embedded platform. This algorithm is our proposal for a wearable ECG diagnosis device, suitable for 24-hour continuous monitoring of the patient. We use the Discrete Wavelet Transform (DWT) for the ECG analysis, and a Support Vector Machine (SVM) classifier. The best classification accuracy achieved is 98.9%, for a feature vector of size 18 and 2493 support vectors. Different implementations of the algorithm on the Galileo board demonstrate that the computational cost is low enough for the ECG analysis and classification to be performed in real time.
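To make the DWT step concrete, the following is a minimal sketch of a multi-level decomposition using the Haar wavelet. The abstract does not say which wavelet family, how many levels, or which coefficients form the 18-element feature vector, so the Haar choice, the function names `haar_step` and `haar_wavedec`, and the sample values are all assumptions for illustration. The SVM classification stage is not shown.

```python
import math


def haar_step(signal: list[float]) -> tuple[list[float], list[float]]:
    """One DWT level with the orthonormal Haar wavelet.

    Returns (approximation, detail) coefficients, each half the input length.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_wavedec(signal: list[float], levels: int) -> tuple[list[float], list[list[float]]]:
    """Repeatedly decompose the approximation band.

    Returns the final approximation and the detail bands, finest level first.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical samples standing in for one segmented heartbeat.
beat = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_wavedec(beat, levels=2)
```

In a pipeline of this kind, a subset of the resulting coefficients would typically be concatenated into the feature vector passed to the classifier.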

Design, Automation, and Test in Europe | 2014

Voltage island management in near threshold manycore architectures to mitigate dark silicon

Cristina Silvano; Gianluca Palermo; Sotirios Xydis; Ioannis S. Stamelakos

The power-wall problem, driven by the stagnation of supply voltages in deep-submicron technology nodes, is now the major scaling barrier for moving towards the manycore era. Although technology scaling enables extreme volumes of computational power, power budget violations will permit only a limited portion to be actually exploited, leading to the so-called dark silicon. Near-Threshold voltage Computing (NTC) has emerged as a promising approach to overcome the manycore power-wall, at the expense of reduced performance and higher sensitivity to process variations. Given that several application domains operate under specific performance constraints, performance sustainability is considered a major issue for the wide adoption of NTC. Thus, in this paper, we investigate how performance guarantees can be ensured when moving towards NTC manycores through variability-aware voltage and frequency allocation schemes. We propose three aggressive NTC voltage tuning and allocation strategies, showing that super-threshold voltage computing (STC) performance can be efficiently sustained or even optimized at the NTC regime. Finally, we show that NTC is highly dependent on the underlying workload characteristics, delivering average power gains of 65% for thread-parallel workloads and up to 90% for process-parallel workloads, while offering an extensive analysis on the effects of different voltage tuning/allocation strategies and voltage regulator configurations.

International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2010

Custom multi-threaded Dynamic Memory Management for Multiprocessor System-on-Chip platforms

Sotirios Xydis; Alexandros Bartzas; Iraklis Anagnostopoulos; Dimitrios Soudris; Kiamal Z. Pekmestzi

We address the problem of custom Dynamic Memory Management (DMM) in Multi-Processor System-on-Chip (MPSoC) architectures. Customization is enabled through the definition of a design space that captures in a global, modular and parameterized manner the primitive building blocks of multi-threaded DMM. A systematic exploration methodology is proposed to efficiently traverse the design space. Customized Pareto DMM configurations are automatically generated through the development of software tools implementing the proposed methodology. Experimental evaluation based on a real-life multithreaded dynamic network application shows that the proposed methodology delivers higher quality (application-specific) solutions in comparison with state-of-the-art dynamic memory managers, together with a 62% reduction in exploration runtime.

IEEE Computer Society Annual Symposium on VLSI | 2010

2PARMA: Parallel Paradigms and Run-Time Management Techniques for Many-Core Architectures

Cristina Silvano; William Fornaciari; S. Crespi Reghizzi; Giovanni Agosta; Gianluca Palermo; Vittorio Zaccaria; Patrick Bellasi; Fabrizio Castro; Simone Corbetta; A. Di Biagio; E. Speziale; Michele Tartara; David Siorpaes; Heiko Hübert; Benno Stabernack; Jens Brandenburg; Martin Palkovic; Praveen Raghavan; Chantal Ykman-Couvreur; Alexandros Bartzas; Sotirios Xydis; Dimitrios Soudris; Torsten Kempf; Gerd Ascheid; Rainer Leupers; Heinrich Meyr; J. Ansari; P. Mähönen; Bart Vanthournout

The main goals of the 2PARMA project are: the definition of a parallel programming model combining component-based and single-instruction multiple-thread approaches, instruction set virtualisation based on portable byte-code, run-time resource management policies and mechanisms as well as design space exploration methodologies for many-core computing architectures.

Integration | 2009

Designing coarse-grain reconfigurable architectures by inlining flexibility into custom arithmetic data-paths

Sotirios Xydis; George Economakos; Kiamal Z. Pekmestzi

This paper introduces a design technique for coarse-grained reconfigurable architectures targeting digital signal processing (DSP) applications. The design procedure is analyzed in detail and an area-time-power efficient reconfigurable kernel architecture is presented. The proposed technique inlines flexibility into custom carry-save (CS) arithmetic datapaths exploiting a stable and canonical interconnection scheme. The canonical interconnection is revealed by a transformation, called uniformity transformation, imposed on the basic architectures of CS-multipliers and CS-chain-adders/subtractors. Experimental results including quantitative and qualitative comparisons with existing reconfigurable arithmetic cores and exploration results of the proposed reconfigurable architecture are provided.

IEEE Transactions on Very Large Scale Integration Systems | 2011

High Performance and Area Efficient Flexible DSP Datapath Synthesis

Sotirios Xydis; George Economakos; Dimitrios Soudris; Kiamal Z. Pekmestzi

This paper presents a new methodology for the synthesis of high performance flexible datapaths, targeting computationally intensive digital signal processing kernels of embedded applications. The proposed methodology is based on a novel coarse-grained reconfigurable/flexible architectural template, which enables the combined exploitation of the horizontal and vertical parallelism along with the operation chaining opportunities found in the application's behavioral description. Efficient synthesis techniques exploiting these architectural optimization concepts from a higher level of abstraction are presented and analyzed. Extensive experimentation showed average latency and area reductions of up to 33.9% and 53.9%, respectively, and higher hardware area utilization, compared to previously published high performance coarse-grained reconfigurable datapaths.

IEEE Embedded Systems Letters | 2011

Custom Microcoded Dynamic Memory Management for Distributed On-Chip Memory Organizations

Iraklis Anagnostopoulos; Sotirios Xydis; Alexandros Bartzas; Zhonghai Lu; Dimitrios Soudris; Axel Jantsch

Multiprocessor systems-on-chip (MPSoCs) have attracted significant attention since they are recognized as a scalable paradigm to interconnect and organize a high number of cores. Current multicore embedded systems exhibit increased levels of dynamic behavior, leading to unexpected memory footprint variations unknown at design time. Dynamic memory management (DMM) is a promising solution for such types of dynamic systems. Although some efficient dynamic memory managers have been proposed for conventional bus-based MPSoC platforms, there are no DMM solutions regarding the constraints and the opportunities delivered by the physical distribution of multiple memory nodes of the platform. In this work, we address the problem of providing customized microcoded DMM on MPSoC platforms with distributed memory organization. Customization is enabled at application- and platform-level. Results show that customized microcoded DMM can serve approximately 7× more allocation requests compared to pure distributed memory platforms and perform 25% faster than the corresponding high-level implementation in C language.

IFIP/IEEE International Conference on Very Large Scale Integration | 2013

A framework for Compiler Level statistical analysis over customized VLIW architecture

Amir Hossein Ashouri; Vittorio Zaccaria; Sotirios Xydis; Gianluca Palermo; Cristina Silvano

Very Long Instruction Word (VLIW) application-specific processors represent an attractive solution for embedded computing, offering significant computational power with reduced hardware complexity. However, they impose higher compiler complexity since the instructions are executed in parallel based on the static compiler schedule. Therefore, finding a promising set of compiler transformations and defining their effects have a significant impact on the overall system performance. The proposed methodology provides the designer with an integrated framework to automatically (i) generate optimized application-specific VLIW architectural configurations and (ii) analyze compiler level transformations, enabling application-specific compiler tuning over customized VLIW system architectures. We base this analysis on a Design of Experiments (DoE) procedure that captures in a statistical manner the higher order effects among different sets of activated compiler transformations. Applying the proposed methodology to real-case embedded application scenarios, we show that (i) only a limited set of compiler transformations affects performance with a high confidence level (over 95%) and (ii) using them we can achieve gains of 16% to 23% in comparison to the default optimization levels.
