Publications


Featured research published by Aristidis Sotiropoulos.


International Parallel and Distributed Processing Symposium | 2001

Minimizing completion time for loop tiling with computation and communication overlapping

Georgios I. Goumas; Aristidis Sotiropoulos; Nectarios Koziris

This paper proposes a new method for minimizing the execution time of nested for-loops using a tiling transformation. In our approach, we are interested not only in tile size and shape according to the required communication-to-computation ratio, but also in overall completion time. We select a time hyperplane to execute different tiles much more efficiently by exploiting the inherent overlapping between communication and computation phases among successive, atomic tile executions. We assign tiles to processors according to the tile space boundaries, thus considering the iteration space bounds. Our schedule considerably reduces overall completion time under the assumption that some part of every communication phase can be efficiently overlapped with atomic, pure tile computations. The overall schedule resembles a pipelined datapath where computations are no longer interleaved with sends and receives to non-local processors. Experimental results on a cluster of Pentium machines using various MPI send primitives show that the total completion time is significantly reduced.
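
The overlapping described here can be pictured with non-blocking MPI calls. The following is a minimal sketch, assuming a one-dimensional pipeline of tiles and hypothetical compute_tile/pack_boundary helpers; it is not the paper's code, only an illustration of how one tile's outgoing boundary data can stay in flight while the next tile is computed.

#include <mpi.h>

#define NTILES           64
#define BOUNDARY_DOUBLES 1024

/* Hypothetical helpers standing in for the tile body and data packing. */
extern void compute_tile(int t, const double *in, double *out);
extern void pack_boundary(const double *tile, double *buf);

void overlapped_tile_schedule(int prev_rank, int next_rank, double *tile)
{
    double send_buf[BOUNDARY_DOUBLES];
    double recv_buf[2][BOUNDARY_DOUBLES];                 /* double-buffered */
    MPI_Request send_req = MPI_REQUEST_NULL, recv_req[2];

    /* Pre-post the receive for the first tile's incoming boundary. */
    MPI_Irecv(recv_buf[0], BOUNDARY_DOUBLES, MPI_DOUBLE,
              prev_rank, 0, MPI_COMM_WORLD, &recv_req[0]);

    for (int t = 0; t < NTILES; t++) {
        int cur = t & 1, nxt = (t + 1) & 1;

        /* Post the receive for the next tile so it progresses during compute. */
        if (t + 1 < NTILES)
            MPI_Irecv(recv_buf[nxt], BOUNDARY_DOUBLES, MPI_DOUBLE,
                      prev_rank, t + 1, MPI_COMM_WORLD, &recv_req[nxt]);

        /* The boundary data this tile depends on must have arrived. */
        MPI_Wait(&recv_req[cur], MPI_STATUS_IGNORE);

        /* While the tile is computed, the send posted in the previous
         * iteration and the receive posted above make progress. */
        compute_tile(t, recv_buf[cur], tile);

        /* Reuse the send buffer only after the previous send has completed,
         * then ship this tile's boundary to the successor without blocking. */
        MPI_Wait(&send_req, MPI_STATUS_IGNORE);
        pack_boundary(tile, send_buf);
        MPI_Isend(send_buf, BOUNDARY_DOUBLES, MPI_DOUBLE,
                  next_rank, t, MPI_COMM_WORLD, &send_req);
    }
    MPI_Wait(&send_req, MPI_STATUS_IGNORE);
}

Replacing the MPI_Isend with a blocking send turns the same loop into the interleaved (non-overlapped) schedule that the paper compares against.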


Conference on High Performance Computing (Supercomputing) | 2002

Pipelined Scheduling of Tiled Nested Loops onto Clusters of SMPs Using Memory Mapped Network Interfaces

Maria Athanasaki; Aristidis Sotiropoulos; Georgios Tsoukalas; Nectarios Koziris

This paper describes the performance benefits attained using enhanced network interfaces to achieve low latency communication. We present a novel, pipelined scheduling approach which takes advantage of the DMA communication mode to send data to other nodes while the CPUs are performing calculations. We also use zero-copy communication through pinned-down physical memory regions, provided by the NIC's driver modules. Our testbed concerns the parallel execution of tiled nested loops on a cluster of SMP nodes with a single PCI-SCI NIC inside each node. In order to schedule tiles, we apply a hyperplane-based grouping transformation to the tiled space, so as to group together independent neighboring tiles and assign them to the same SMP node. Experimental evaluation illustrates that memory mapped NICs with enhanced communication features enable the use of a more advanced pipelined (overlapping) schedule, which considerably improves performance compared to an ordinary blocking schedule implemented with conventional, CPU and kernel bounded, communication primitives.
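
A rough sketch of the node-level pipeline described here is shown below. The dma_remote_write_start/dma_wait wrappers and the OpenMP-parallel group computation are assumptions standing in for the NIC driver's DMA interface; they are not the actual SISCI calls used in the paper.

#include <omp.h>
#include <stddef.h>

/* Hypothetical wrappers over the NIC driver's DMA mode; the source buffer
 * is assumed to live in pinned-down physical memory exported to the NIC. */
typedef struct dma_handle dma_handle;
extern dma_handle *dma_remote_write_start(const void *pinned_src,
                                          int dest_node, size_t bytes);
extern void dma_wait(dma_handle *h);

/* Hypothetical per-tile computation and group size query. */
extern void compute_tile_in_group(int group, int tile, double *pinned_out);
extern int tiles_per_group(int group);

void pipelined_group_schedule(int ngroups, int dest_node,
                              double *pinned_buf[2], size_t bytes)
{
    dma_handle *inflight = NULL;

    for (int g = 0; g < ngroups; g++) {
        double *out = pinned_buf[g & 1];     /* double-buffer pinned memory */
        int ntiles = tiles_per_group(g);

        /* All CPUs of the SMP node compute the tiles of group g while the
         * DMA engine is still draining group g-1 in the background. */
        #pragma omp parallel for schedule(static)
        for (int t = 0; t < ntiles; t++)
            compute_tile_in_group(g, t, out);

        /* The previous transfer must finish before its buffer is reused in
         * the next iteration; by now it has overlapped with a full group. */
        if (inflight)
            dma_wait(inflight);
        inflight = dma_remote_write_start(out, dest_node, bytes);
    }
    if (inflight)
        dma_wait(inflight);
}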


SpringerPlus | 2016

An overview of platforms for cloud based development

George Fylaktopoulos; Georgios I. Goumas; Michael Skolarikis; Aristidis Sotiropoulos; Ilias Maglogiannis

This paper provides an overview of state-of-the-art technologies for software development in cloud environments. The surveyed systems cover the whole spectrum of cloud-based development, including integrated programming environments, code repositories, software modeling, composition and documentation tools, and application management and orchestration. In this work we evaluate the existing cloud development ecosystem based on a wide range of characteristics such as applicability (e.g. programming and database technologies supported), productivity enhancement (e.g. editor capabilities, debugging tools), support for collaboration (e.g. repository functionality, version control) and post-development application hosting, and we compare the surveyed systems. The survey shows that software engineering in the cloud era has taken its initial steps and has the potential to provide concrete implementation and execution environments for cloud-based applications. However, a number of important challenges need to be addressed for this approach to be viable. These challenges are discussed in the article, and the conclusion is drawn that although several steps have been made, a compact and reliable solution does not yet exist.


Journal of Parallel and Distributed Computing | 2003

A pipelined schedule to minimize completion time for loop tiling with computation and communication overlapping

Nectarios Koziris; Aristidis Sotiropoulos; Georgios I. Goumas

This paper proposes a new method for minimizing the execution time of nested for-loops using a tiling transformation. In our approach, we are interested not only in tile size and shape according to the required communication-to-computation ratio, but also in overall completion time. We select a time hyperplane to execute different tiles much more efficiently by exploiting the inherent overlapping between communication and computation phases among successive, atomic tile executions. We assign tiles to processors according to the tile space boundaries, thus considering the iteration space bounds. Our schedule considerably reduces overall completion time under the assumption that some part of every communication phase can be efficiently overlapped with atomic, pure tile computations. The overall schedule resembles a pipelined datapath where computations are no longer interleaved with sends and receives to nonlocal processors. We survey the application of our schedule to modern communication architectures. We conducted two sets of experiments, one using MPI primitives over FastEthernet and one using the SISCI API over an SCI network. In both cases, the total completion time is significantly reduced.


International Parallel and Distributed Processing Symposium | 2002

Enhancing the performance of tiled loop execution onto clusters using memory mapped network interfaces and pipelined schedules

Aristidis Sotiropoulos; Georgios Tsoukalas; Nectarios Koziris

This paper describes the performance benefits attained using enhanced network interfaces to achieve low latency communication. Our experimental testbed concerns the parallel execution of tiled nested loops on a Linux PC cluster with PCI-SCI NICs (Dolphin D330). Tiles necessarily exchange data and should also have a large computational grain, so that their parallel execution becomes beneficial. We schedule tiles much more efficiently by exploiting the inherent overlapping between communication and computation phases among successive, atomic tile executions. The applied nonblocking schedule resembles a pipelined datapath where computation phases are overlapped with communication ones, instead of being interleaved with them. We use DMA communication mode to remote write (send) data to other nodes, while the host CPU is computing all iterations within each tile. We achieve zero-copy communication through pinned-down physical memory regions for DMA (PCI segments exported to the SCI global space). Results illustrate that when using enhanced communication features such as DMA transfers, memory-mapped interfaces and zero-copy mechanisms, overall performance is considerably better than when using conventional, CPU and kernel bounded, communication primitives.
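
As a simplified illustration of the memory-mapped, zero-copy path, the sketch below maps a remote node's exported segment into the local address space and writes into it directly. The /dev/sci_segment0 device name is a hypothetical placeholder for the driver's export mechanism, and the plain memcpy stands in for a programmed remote write; the paper instead drives the transfer with the NIC's DMA engine.

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SEG_BYTES (1 << 20)

int main(void)
{
    /* Map the remote node's exported segment into our address space
     * (the device name is a hypothetical stand-in for the driver's export). */
    int fd = open("/dev/sci_segment0", O_RDWR);
    if (fd < 0)
        return 1;
    char *remote = mmap(NULL, SEG_BYTES, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    if (remote == MAP_FAILED)
        return 1;

    /* Pin the local buffer so its physical pages stay resident and the
     * NIC can read them without an intermediate kernel copy. */
    static char local[SEG_BYTES];
    mlock(local, SEG_BYTES);

    /* A store into the mapped region is carried to the remote node by the
     * interconnect; no send() system call or kernel buffering is involved. */
    memcpy(remote, local, SEG_BYTES);

    munlock(local, SEG_BYTES);
    munmap(remote, SEG_BYTES);
    close(fd);
    return 0;
}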


The Journal of Supercomputing | 2005

Hyperplane Grouping and Pipelined Schedules: How to Execute Tiled Loops Fast on Clusters of SMPs

Maria Athanasaki; Aristidis Sotiropoulos; Georgios Tsoukalas; Nectarios Koziris; Panayiotis Tsanakas

This paper proposes a novel approach for the parallel execution of tiled Iteration Spaces onto a cluster of SMP PC nodes. Each SMP node has multiple CPUs and a single memory mapped PCI-SCI Network Interface Card. We apply a hyperplane-based grouping transformation to the tiled space, so as to group together independent neighboring tiles and assign them to the same SMP node. In this way, intranode (intragroup) communication is annihilated. Groups are atomically executed inside each node. Nodes exchange data between successive group computations. We schedule groups much more efficiently by exploiting the inherent overlapping between communication and computation phases among successive atomic group executions. The applied non-blocking schedule resembles a pipelined datapath, where group computation phases are overlapped with communication ones, instead of being interleaved with them. Our experimental results illustrate that the proposed method outperforms previous approaches involving blocking communication or conventional grouping schemes.
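
The grouping idea can be sketched as follows, under strong simplifying assumptions: a two-dimensional tile space, unit dependence vectors along the axes, and a chunking of each wavefront hyperplane into groups of NCPUS neighboring tiles. The mapping below is illustrative, not the paper's exact transformation.

#include <stdio.h>

#define EXTENT 6     /* tiles per dimension      */
#define NCPUS  2     /* CPUs per SMP node        */
#define NNODES 2     /* SMP nodes in the cluster */

int main(void)
{
    /* Tiles on the hyperplane i + j = s are mutually independent under
     * unit axis dependences; NCPUS consecutive tiles on it form a group,
     * executed atomically on one SMP node with one tile per CPU. */
    for (int i = 0; i < EXTENT; i++) {
        for (int j = 0; j < EXTENT; j++) {
            int step  = i + j;            /* wavefront (time) hyperplane  */
            int group = i / NCPUS;        /* chunk of neighbors on it     */
            int node  = group % NNODES;   /* group -> SMP node            */
            int cpu   = i % NCPUS;        /* tile  -> CPU inside the node */
            printf("tile (%d,%d): step %d, group %d, node %d, cpu %d\n",
                   i, j, step, group, node, cpu);
        }
    }
    return 0;
}

Because a tile's neighbor along j lands on the same node, that dependence is served through shared memory, which is the sense in which intranode (intragroup) communication disappears; only the boundary along i crosses the interconnect.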


Future Generation Computer Systems | 2018

A distributed modular platform for the development of cloud based applications

George Fylaktopoulos; Michael Skolarikis; I. Papadopoulos; Georgios I. Goumas; Aristidis Sotiropoulos; Ilias Maglogiannis

In this paper we describe the CIRANO platform, a modular Integrated Development Environment (IDE) for cloud based applications. The proposed platform is built to support Model Driven Development (MDD) and team collaboration, facilitating the rapid development of advanced applications in the cloud. The paper first presents the state of the art in the field of cloud IDEs and then describes the design, implementation and technical details of the CIRANO platform. The main features of the proposed platform are presented in two case studies concerning the development of an application from scratch and the porting of an existing application. The paper discusses the findings in comparison with existing tools and proposes extensions of the platform as future work.


Cluster Computing and the Grid | 2002

Efficient Utilization of Memory Mapped NICs onto Clusters using Pipelined Schedules

Aristidis Sotiropoulos; Georgios Tsoukalas; Nectarios Koziris

This paper describes the performance benefits attained using enhanced network interfaces to achieve low latency communication. We make use of DMA communication mode to send data to other nodes while the CPU performs useful calculations. Zero-copy communication is achieved through pinned-down physical memory regions, provided by the NICs' driver modules. Our testbed concerns the parallel execution of tiled nested loops on a Linux PC cluster with PCI-SCI NICs (Dolphin D330). Tiles necessarily exchange data and should also have a large computational grain, so that their parallel execution becomes beneficial. We schedule tiles much more efficiently by exploiting the inherent overlapping between communication and computation phases among successive, atomic tile executions. The applied nonblocking schedule resembles a pipelined datapath where computation phases are overlapped with communication ones, instead of being interleaved with them. Experimental evaluation illustrates that when using enhanced communication features such as DMA transfers, memory-mapped interfaces and zero-copy mechanisms, overall performance is considerably improved compared to using conventional, CPU and kernel bounded, communication primitives.


International Conference on Parallel Processing | 2002

A pipelined execution of tiled nested loops on SMPs with computation and communication overlapping

Maria Athanasaki; Aristidis Sotiropoulos; Georgios Tsoukalas; Nectarios Koziris

This paper proposes a novel approach for the parallel execution of tiled iteration spaces onto a cluster of SMP PC nodes. Each SMP node has multiple CPUs and a single memory mapped PCI-SCI network interface card. We apply a hyperplane-based grouping transformation to the tiled space, so as to group together independent neighboring tiles and assign them to the same SMP node. In this way, intranode (intragroup) communication is annihilated. Groups are atomically executed inside each node. Nodes exchange data between successive group computations. We schedule groups much more efficiently by exploiting the inherent overlapping between communication and computation phases among successive atomic group executions. The applied non-blocking schedule resembles a pipelined datapath where group computation phases are overlapped with communication ones, instead of being interleaved with them. Our experimental results illustrate that the proposed method outperforms previous approaches involving blocking communication or conventional grouping schemes.


Grid Computing | 2007

Global-scale peer-to-peer file services with DFS

Antony Chazapis; Georgios Tsoukalas; Georgios Verigakis; Kornilios Kourtis; Aristidis Sotiropoulos; Nectarios Koziris

Collaboration


Dive into Aristidis Sotiropoulos's collaborations.

Top Co-Authors

Nectarios Koziris
National Technical University of Athens

Georgios Tsoukalas
National Technical University of Athens

Georgios I. Goumas
National Technical University of Athens

Maria Athanasaki
National Technical University of Athens

Antony Chazapis
National Technical University of Athens

Georgios Verigakis
National and Kapodistrian University of Athens

Kornilios Kourtis
National Technical University of Athens

Panayiotis Tsanakas
National Technical University of Athens