Georgios Smaragdos | Researchain

Publication

Featured research published by Georgios Smaragdos.

field programmable gate arrays | 2014

FPGA-based biophysically-meaningful modeling of olivocerebellar neurons

Georgios Smaragdos; Sebastian Isaza; Martijn F. van Eijk; Ioannis Sourdis; Christos Strydis

The Inferior-Olivary nucleus (ION) is a well-charted region of the brain, heavily associated with sensorimotor control of the body. It comprises ION cells with unique properties which facilitate sensory processing and motor-learning skills. Various simulation models of ION-cell networks have been written in an attempt to unravel their mysteries. However, simulations become rapidly intractable when biophysically plausible models and meaningful network sizes (>=100 cells) are modeled. To overcome this problem, in this work we port a highly detailed ION cell network model, originally coded in Matlab, onto an FPGA chip. It was first converted to ANSI C code and extensively profiled. It was then translated to HLS C code for the Xilinx Vivado toolflow, and various algorithmic and arithmetic optimizations were applied. The design was implemented in a Virtex 7 (XC7VX485T) device and can simulate a 96-cell network at real-time speed, yielding a speedup of x700 compared to the original Matlab code and x12.5 compared to the reference C implementation running on an Intel Xeon 2.66GHz machine with 20GB RAM. For a 1,056-cell network (non-real-time), an FPGA speedup of x45 against the C code can be achieved, demonstrating the design's usefulness in accelerating neuroscience research. Limited by the available on-chip memory, the FPGA can maximally support a 14,400-cell network (non-real-time) with online parameter configurability for cell state and network size. The maximum throughput of the FPGA ION-network accelerator can reach 2.13 GFLOPS.

design, automation, and test in europe | 2015

Accelerating complex brain-model simulations on GPU platforms

H.A. Du Nguyen; Zaid Al-Ars; Georgios Smaragdos; Christos Strydis

The Inferior Olive (IO) in the brain, in conjunction with the cerebellum, is responsible for crucial sensorimotor-integration functions in humans. In this paper, we simulate a computationally challenging IO neuron model consisting of three compartments per neuron in a network arrangement on GPU platforms. Several GPU platforms of the two latest NVIDIA GPU architectures (Fermi, Kepler) have been used to simulate large-scale IO-neuron networks. These networks have been ported to 4 diverse GPU platforms, and the implementation has been optimized, achieving a 3x speedup compared to its unoptimized version. The effect of GPU L1-cache and thread block size, as well as the impact of the application's numerical precision on performance, have been evaluated and the best configurations have been chosen. In effect, a maximum speedup of 160x has been achieved with respect to a reference CPU platform.

international parallel and distributed processing symposium | 2014

A Dependable Coarse-Grain Reconfigurable Multicore Array

Georgios Smaragdos; Danish Anis Khan; Ioannis Sourdis; Christos Strydis; Alirad Malek; Stavros Tzilis

Recent trends in semiconductor technology have dictated the constant reduction of device size. One negative effect stemming from the reduction in size and increased complexity is the reduced device reliability. This paper is centered around the matter of permanent fault tolerance and graceful system degradation in the presence of permanent faults. We take advantage of the natural redundancy of homogeneous multicores following a sparing strategy to reuse functional pipeline stages of faulty cores. This is done by incorporating reconfigurable interconnects next to which the cores of the system are placed, providing the flexibility to redirect the data-flow from the faulty pipeline stages of damaged cores to spare (still) functional ones. Several micro-architectural changes are introduced to decouple the processor stages and allow them to be interchangeable. The proposed approach is a clear departure from previous ones by offering full flexibility as well as highly graceful performance degradation at reasonable costs. More specifically, our coarse-grain fault-tolerant multicore array provides up to ×4 better availability compared to a conventional multicore and up to ×2 higher probability to deliver at least one functioning core in high fault densities. For our benchmarks, our design (synthesized for STM 65nm SP technology) incurs a total execution-time overhead for the complete system ranging from ×1.37 to ×3.3 compared to a (baseline) non-fault-tolerant system, depending on the permanent-fault density. The area overhead is 19.5% and the energy consumption, without incorporating any power/energy-saving technique, is estimated on average to be 20.9% higher compared to the baseline, unprotected design.
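The sparing idea in this abstract (reusing functional pipeline stages of faulty cores) can be illustrated with a small probability sketch. This is not the paper's analysis: it assumes every pipeline stage fails independently with the same probability `p`, that the reconfigurable interconnect itself is fault-free, and that any working stage can substitute for a faulty stage of the same type. The function names and parameters are hypothetical.

```python
def p_core_redundancy(p, n_cores, n_stages):
    """P(at least one core has ALL of its own stages fault-free).

    Models a conventional multicore: a core with any faulty stage is lost.
    Assumes independent stage faults with probability p (illustrative only).
    """
    p_core_ok = (1.0 - p) ** n_stages
    return 1.0 - (1.0 - p_core_ok) ** n_cores


def p_stage_sparing(p, n_cores, n_stages):
    """P(every stage type has at least one fault-free instance in the array).

    Models stage-level sparing: one working core can be assembled as long as
    each stage type survives somewhere. Interconnect faults are ignored.
    """
    p_stage_type_ok = 1.0 - p ** n_cores
    return p_stage_type_ok ** n_stages


if __name__ == "__main__":
    # Hypothetical 4-core array with 4 pipeline stages per core.
    for p in (0.05, 0.2, 0.5, 0.8):
        print(p, p_core_redundancy(p, 4, 4), p_stage_sparing(p, 4, 4))
```

Under these assumptions stage-level sparing can never do worse than whole-core redundancy, because a fully fault-free core already supplies one working instance of every stage type. The sketch leaves out the area and execution-time overheads that the abstract quantifies.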

defect and fault tolerance in vlsi and nanotechnology systems | 2014

A probabilistic analysis of resilient reconfigurable designs

Alirad Malek; Stavros Tzilis; Danish Anis Khan; Ioannis Sourdis; Georgios Smaragdos; Christos Strydis

Reconfigurable hardware can be employed to tolerate permanent faults. Hardware components comprising a System-on-Chip can be partitioned into a handful of substitutable units interconnected with reconfigurable wires to allow isolation and replacement of faulty parts. This paper offers a probabilistic analysis of reconfigurable designs estimating for different fault densities the average number of fault-free components that can be constructed as well as the probability to guarantee a particular availability of components. Considering the area overheads of reconfigurability, we evaluate the resilience of various reconfigurable designs with different granularities. Based on this analysis, we conduct a comprehensive design-space exploration to identify the granularity mixes that maximize the fault-tolerance of a system. Our findings reveal that mixing fine-grain logic with a coarse-grain sparing approach tolerates up to 3× more permanent faults than component redundancy and 2× more than any other purely coarse-grain solution. Component redundancy is preferable at low fault densities, while coarse-grain and mixed-grain reconfigurability maximize availability at medium and high fault densities, respectively.

biomedical circuits and systems conference | 2014

ESL design of customizable real-time neuron networks

Martijn F. van Eijk; Carlo Galuzzi; Amir Zjajo; Georgios Smaragdos; Christos Strydis; Rene van Leuken

In this paper, we present the design and implementation of an Inferior-Olivary Nucleus (ION) network on an FPGA device. Compared with existing neuron networks, the proposed design allows easy customization of the network topology and the implementation of existing as well as ad-hoc topologies, in order to explore different levels of connectivity between the cells. Starting from the model of an ION cell, the model has been optimized and an ION network has been designed and implemented in multiple steps. By using the Xilinx Vivado Suite, the design has been synthesized and mapped on a Virtex 7 XC7VX550T FPGA device. Experimental results show that a network of 48 ION cells can be simulated in brain real-time using double floating-point arithmetic, which allows precise simulation of the network's behavior.

international symposium on performance analysis of systems and software | 2016

Performance analysis of accelerated biophysically-meaningful neuron simulations

Georgios Smaragdos; Georgios Chatzikonstantis; Sofia Nomikou; Dimitrios Rodopoulos; Ioannis Sourdis; Dimitrios Soudris; Chris I. De Zeeuw; Christos Strydis

In-vivo and in-vitro experiments are routinely used in neuroscience to unravel brain functionality. Although they are a powerful experimentation tool, they are also time-consuming and, often, restrictive. Computational neuroscience attempts to solve this by using biologically-plausible and biophysically-meaningful neuron models, most prominent among which are the conductance-based models. Their computational complexity calls for accelerator-based computing to mount large-scale or real-time neuroscientific experiments. In this paper, we analyze and draw conclusions on the class of conductance models by using a representative modeling application of the inferior olive (InfOli), an important part of the olivocerebellar brain circuit. We conduct an extensive profiling session to identify the computational and data-transfer requirements of the application under various realistic use cases. The application is, then, ported onto two acceleration nodes, an Intel Xeon Phi and a Maxeler Vectis Data Flow Engine (DFE). We evaluate the performance scalability and resource requirements of the InfOli application on the two target platforms. The analysis of InfOli, which is a real-life neuroscientific application, can serve as a useful guide for porting a wide range of similar workloads on platforms like the Xeon Phi or the Maxeler DFEs. As accelerators are increasingly populating High-Performance Computing (HPC) infrastructure, the current paper provides useful insight on how to optimally use such nodes to run complex and relevant neuron modeling workloads.
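To give a sense of what a conductance-based model involves per time step, here is a minimal sketch of a single-compartment neuron integrated with forward Euler. It uses the textbook Hodgkin-Huxley squid-axon equations and parameters, not the InfOli application analyzed in the paper; the function names, step size, and input current are assumptions chosen for illustration.

```python
import math

# Textbook Hodgkin-Huxley parameters (assumed; not the inferior-olive model).
C_M = 1.0                               # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3       # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387   # reversal potentials, mV


def _x_over_1_minus_exp(x, y):
    """x / (1 - exp(-x / y)), using the limit value y when x is ~0."""
    if abs(x) < 1e-7:
        return y
    return x / (1.0 - math.exp(-x / y))


def rates(v):
    """Voltage-dependent opening/closing rates of the m, h, n gates (1/ms)."""
    a_m = 0.1 * _x_over_1_minus_exp(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _x_over_1_minus_exp(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n


def simulate(i_ext=10.0, n_steps=5000, dt=0.01):
    """Return the membrane-voltage trace (mV) for a constant input current.

    i_ext is in uA/cm^2 and dt in ms; gates start at their steady state.
    """
    v = -65.0
    a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
    m, h, n = a_m / (a_m + b_m), a_h / (a_h + b_h), a_n / (a_n + b_n)
    trace = []
    for _ in range(n_steps):
        a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
        # Ionic currents through sodium, potassium and leak conductances.
        i_ion = (G_NA * m ** 3 * h * (v - E_NA)
                 + G_K * n ** 4 * (v - E_K)
                 + G_L * (v - E_L))
        dv = (i_ext - i_ion) / C_M
        # Forward-Euler update of the gates and the membrane voltage.
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        v += dt * dv
        trace.append(v)
    return trace


def count_spikes(trace, threshold=0.0):
    """Count upward crossings of the threshold voltage."""
    return sum(1 for a, b in zip(trace, trace[1:]) if a < threshold <= b)
```

Each step evaluates several exponentials per cell. Multi-compartment cells and inter-cell coupling, as in the models discussed here, add further per-step work on top of this.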

IEEE Micro | 2016

Resilient Chip Multiprocessors with Mixed-Grained Reconfigurability

Ioannis Sourdis; Danish Anis Khan; Alirad Malek; Stavros Tzilis; Georgios Smaragdos; Christos Strydis

This article presents a chip multiprocessor (CMP) design that mixes coarse- and fine-grained reconfigurability to increase core availability of safety-critical embedded systems in the presence of hard errors. The authors conducted a comprehensive design-space exploration to identify the granularity mixes that maximize CMP fault tolerance and minimize performance and energy overheads. The authors added fine-grained reconfigurable logic to a coarse-grained sparing approach. Their resulting design can tolerate 3 times more hard errors than core redundancy and 1.5 times more than any other purely coarse-grained solution.

international conference on supercomputing | 2014

Real-Time Olivary Neuron Simulations on Dataflow Computing Machines

Georgios Smaragdos; Craig Davies; Christos Strydis; Ioannis Sourdis; Catalin Bogdan Ciobanu; Oskar Mencer; Chris I. De Zeeuw

The Inferior-Olivary nucleus (ION) is a well-charted brain region, heavily associated with the sensorimotor control of the body. It comprises neural cells with unique properties which facilitate sensory processing and motor-learning skills. Simulations of such neurons become rapidly intractable when biophysically plausible models and meaningful network sizes (at least in the order of some hundreds of cells) are modeled. To overcome this problem, we accelerate a highly detailed ION network model using a Maxeler Dataflow Computing Machine. The design simulates a 330-cell network at real-time speed and achieves maximum throughputs of 24.7 GFLOPS. The Maxeler machine, integrating a Virtex-6 FPGA, yields speedups of ×92-102 and ×2-8 compared to a reference C implementation running on an Intel Xeon 2.66GHz and a pure Virtex-7 FPGA implementation, respectively.

cluster computing and the grid | 2016

mCluster: A Software Framework for Portable Device-Based Volunteer Computing

Dimitris Theodoropoulos; Grigorios Chrysos; Iosif Koidis; George Charitopoulos; Emmanouil Pissadakis; Antonis Varikos; Dionisios N. Pnevmatikatos; Georgios Smaragdos; Christos Strydis; Nikos Zervos

Recent market forecasts predict that the portable computing trend will vastly spread, as by 2020 there will be more than 3 billion LTE device users worldwide. Motivated by this fact, many companies and research institutes have already launched research projects that utilize portable devices, voluntarily provided by users, to perform the required computations. Many such projects employ Berkeley's BOINC middleware, since it can support a large variety of stationary and mobile devices. However, currently available BOINC high-level APIs either do not support portable devices or lack advanced processing capabilities (such as inter-node task dependencies) and/or ease of use. To resolve these issues, we propose the mCluster software framework for application execution powered by the BOINC middleware on portable devices. mCluster adopts a task-based programming model that requires simple, pragma-based annotations of the application software, in order to dynamically resolve task dependencies. To evaluate our framework, we have mapped a scientific application from the neuroscience domain on a small-scale network of portable devices. mCluster significantly reduces the required programming effort and complexity to efficiently map BOINC-powered applications with task dependencies on portable devices compared to previous approaches.

Journal of Neural Engineering | 2017

BrainFrame: a node-level heterogeneous accelerator platform for neuron simulations

Georgios Smaragdos; Georgios Chatzikonstantis; Rahul Kukreja; Harry Sidiropoulos; Dimitrios Rodopoulos; Ioannis Sourdis; Zaid Al-Ars; Christoforos Kachris; Dimitrios Soudris; Chris I. De Zeeuw; Christos Strydis

OBJECTIVE: The advent of high-performance computing (HPC) in recent years has led to its increasing use in brain studies through computational models. The scale and complexity of such models are constantly increasing, leading to challenging computational requirements. Even though modern HPC platforms can often deal with such challenges, the vast diversity of the modeling field does not permit a homogeneous acceleration platform to effectively address the complete array of modeling requirements. APPROACH: In this paper we propose and build BrainFrame, a heterogeneous acceleration platform that incorporates three distinct acceleration technologies: an Intel Xeon-Phi CPU, an NVIDIA GP-GPU and a Maxeler Dataflow Engine. The PyNN software framework is also integrated into the platform. As a challenging proof of concept, we analyze the performance of BrainFrame on different experiment instances of a state-of-the-art neuron model, representing the inferior-olivary nucleus using a biophysically-meaningful, extended Hodgkin-Huxley representation. The model instances take into account not only the neuronal-network dimensions but also different network-connectivity densities, which can drastically affect the workload's performance characteristics. MAIN RESULTS: The combined use of different HPC technologies demonstrates that BrainFrame is better able to cope with the modeling diversity encountered in realistic experiments while at the same time running on significantly lower energy budgets. Our performance analysis clearly shows that the model directly affects performance and all three technologies are required to cope with all the model use cases. SIGNIFICANCE: The BrainFrame framework is designed to transparently configure and select the appropriate back-end accelerator technology for use per simulation run. The PyNN integration provides a familiar bridge to the vast number of models already available. Additionally, it gives a clear roadmap for extending the platform support beyond the proof of concept, with improved usability and directly useful features to the computational-neuroscience community, paving the way for wider adoption.
