Dimitris Theodoropoulos

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Dimitris Theodoropoulos is active.

Explore More

Publication

Featured researches published by Dimitris Theodoropoulos.

IEEE Transactions on Multimedia | 2011

Multi-Core Platforms for Beamforming and Wave Field Synthesis

Dimitris Theodoropoulos; Georgi Kuzmanov; Georgi Gaydadjiev

Immersive-Audio technologies are widely used to build experimental and commercial audio systems. However, most of them are based on standard PCs, which introduce performance limitations and excessive power consumption. To address these drawbacks, we explore the implementation prospectives of two Immersive-Audio technologies: the beamforming (BF) and the wave field synthesis (WFS). We target two popular multi-core platforms, namely graphic processor units (GPUs) and field programmable gate arrays (FPGAs). We identify the most computationally intensive parts of both applications and employ the CUDA environment to map them onto a Quadro FX1700, a GeForce 8600GT, a GTX275, and a GTX460 GPU. Furthermore, we design our custom multi-core hardware accelerators for both algorithms and map them onto Virtex6 FPGAs. Both GPU and FPGA implementations are compared against OpenMP-annotated software running on a Core2 Duo at 3.0 GHz. Experimental results suggest that middle-range GPUs process data equally well as the Core2 Duo for the BF, and approximately two times faster for the WFS. However, high-end GPU and FPGA solutions provide an order of magnitude better performance for BF, and approximately two orders of magnitude better performance for WFS than the Core2 Duo. Ultimately, single-chip GPU and FPGA implementations can provide more power-effective solutions, since they can drive more complex microphone and loudspeaker setups than PC-based approaches.

symposium on application specific processors | 2009

A reconfigurable beamformer for audio applications

Dimitris Theodoropoulos; Georgi Kuzmanov; Georgi Gaydadjiev

Beamforming is a signal processing technique that improves the signal strength received from a specific location. It is already used for many decades in telecommunications, while over the last years, it has been adopted by the audio research society, mostly to enhance speech recognition. In this paper, we propose a scalable organization for a hardware time-invariant beamformer that can be used in small handheld devices and complete 3D-audio systems. Our design can be configured according to the number of input channels. Furthermore, all critical internal modules, such as decimators, FIR filters and interpolators can be adjusted to support various input sampling rates. We developed a hardware prototype in VHDL targeting the Xilinx ML410 board incorporating Virtex4 FX60 FPGA. Following a constrained approach regarding FPGA resource utilization, our hardware prototype occupies 21% of the aforementioned FPGA when instantiating 16 beamforming modules, and consumes approximately 2 Watts of power. Furthermore, our design achieves a speedup of 28 compared to an OMP-annotated software implementation running on a Pentium D at 3.4 GHz. We also compared our design against prior related work. Results suggest that it can extract an audio source up to 11 times faster compared to a reconfigurable adaptive beamformer, and up to 19 times faster compared to DSP implementations.

Microprocessors and Microsystems | 2013

DeSyRe: On-demand system reliability

Ioannis Sourdis; Christos Strydis; Antonino Armato; Christos-Savvas Bouganis; Babak Falsafi; Georgi Gaydadjiev; Sebastian Isaza; Alirad Malek; R. Mariani; Dionisios N. Pnevmatikatos; Dhiraj K. Pradhan; Gerard K. Rauwerda; Robert M. Seepers; Rishad Ahmed Shafik; Kim Sunesen; Dimitris Theodoropoulos; Stavros Tzilis; Michalis Vavouras

The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect-/fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.

design, automation, and test in europe | 2009

Algorithms for the automatic extension of an instruction-set

Carlo Galuzzi; Dimitris Theodoropoulos; Roel Meeuws; Koen Bertels

In this paper, two general algorithms for the automatic generation of instruction-set extensions are presented. The basic instruction set of a reconfigurable architecture is specialized with new application-specific instructions. The paper proposes two methods for the generation of convex multiple input multiple output instructions, under hardware resource constraints, based on a two-step clustering process. Initially, the application is partitioned in single-output instructions of variable size and then, selected clusters are combined in convex multiple output clusters following different policies. Our results on well-known kernels show that the extended instructions-set allows to execute applications more efficiently and needing fewer cycles. Our results show that a significant overall application speed-up is achieved even for large kernels (for ADPCM decoder the speed-up is up to x2.2 and for TWOFISH encoder the speedup is up to x5.5).

applied reconfigurable computing | 2009

CCproc: A Custom VLIW Cryptography Co-processor for Symmetric-Key Ciphers

Dimitris Theodoropoulos; Alexandros Siskos; Dionisios N. Pnevmatikatos

In this paper, we present CCProc, a flexible cryptography co-processor for symmetric-key algorithms. Based on an extensive analysis of many symmetric-key ciphers, including the five AES finalists, we designed an Instruction Set Architecture tailored to symmetric-key ciphers and built a hardware processor prototype by using the VHDL language. The design was mapped on FPGAs and ASIC. Results show a small-area design, while also supporting many ciphers. Besides flexibility, a 4-core FPGA design can achieve up to 615 Mbits/sec at 95 MHz for Rijndael.

international conference on embedded computer systems architectures modeling and simulation | 2015

The AXIOM project (Agile, eXtensible, fast I/O Module)

Dimitris Theodoropoulos; Dionisios N. Pnevmatikatos; Carlos Álvarez; Eduard Ayguadé; Javier Bueno; Antonio Filgueras; Daniel Jiménez-González; Xavier Martorell; Nacho Navarro; Carlos Segura; Carles Fernández; David Oro; Javier R. Saeta; Paolo Gai; Antonio Rizzo; Roberto Giorgi

The AXIOM project (Agile, eXtensible, fast I/O Module) aims at researching new software/hardware architectures for the future Cyber-Physical Systems (CPSs). These systems are expected to react in real-time, provide enough computational power for the assigned tasks, consume the least possible energy for such task (energy efficiency), scale up through modularity, allow for an easy programmability across performance scaling, and exploit at best existing standards at minimal costs.

Microprocessors and Microsystems | 2016

The AXIOM software layers

Carlos Álvarez; Eduard Ayguadé; Jaume Bosch; Javier Bueno; Artem Cherkashin; Antonio Filgueras; Daniel Jiménez-González; Xavier Martorell; Nacho Navarro; Miquel Vidal; Dimitris Theodoropoulos; Dionisios N. Pnevmatikatos; Davide Catani; David Oro; Carles Fernández; Carlos Segura; Javier Rodríguez; Javier Hernando; Claudio Scordino; Paolo Gai; Pierluigi Passera; Alberto Pomella; Nicola Bettin; Antonio Rizzo; Roberto Giorgi

People and objects will soon share the same digital network for information exchange in a world named as the age of the cyber-physical systems. The general expectation is that people and systems will interact in real-time. This poses pressure onto systems design to support increasing demands on computational power, while keeping a low power envelop. Additionally, modular scaling and easy programmability are also important to ensure these systems to become widespread. The whole set of expectations impose scientific and technological challenges that need to be properly addressed. The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this aim, an innovative ARM and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated with key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics).

digital systems design | 2015

The AXIOM Software Layers

Carlos Álvarez; Eduard Ayguadé; Javier Bueno; Antonio Filgueras; Daniel Jiménez-González; Xavier Martorell; Nacho Navarro; Dimitris Theodoropoulos; Dionisios N. Pnevmatikatos; Davide Catani; Claudio Scordino; Paolo Gai; Carlos Segura; Carles Fernández; David Oro; Javier R. Saeta; Pierluigi Passera; Alberto Pomella; Antonio Rizzo; Roberto Giorgi

field programmable gate arrays | 2010

A 3d-audio reconfigurable processor

Dimitris Theodoropoulos; Georgi Kuzmanov; Georgi Gaydadjiev

Various multimedia communication systems based on 3D-Audio algorithms have been proposed by researchers from the acoustic data processing domain. However, all systems reported in the literature follow a PC-based approach that introduces processing bottlenecks and excessive power consumption. In order to alleviate these problems, we propose a reconfigurable 3D-Audio processor that can record and render sound sources concurrently. Audio recording and rendering are performed by two hardware accelerators exploiting the beamforming and the Wave Field Synthesis algorithms. The theoretical scalability of the proposed processor is explored with respect to systems consisting of different microphone and loudspeaker arrays configurations. A working FPGA prototype is compared against a software implementation on a Core2 Duo system. Results suggest that the proposed reconfigurable hardware solution can process data up to 2.4x faster than the software approach, while power consumption is approximately 7 Watts according to the Xilinx XPower report.

international parallel and distributed processing symposium | 2009

Reconfigurable accelerator for WFS-based 3D-audio

Dimitris Theodoropoulos; Georgi Kuzmanov; Georgi Gayd

In this paper, we propose a reconfigurable and scalable hardware accelerator for 3D-audio systems based on the Wave Field Synthesis technology. Previous related work reveals that WFS sound systems are based on using standard PCs. However, two major obstacles are the relative low number of real-time sound sources that can be processed and the high power consumption. The proposed accelerator alleviates these limitations by its performance and energy efficient design. We propose a scalable organization comprising multiple rendering units (RUs), each of them independently processing audio samples. The processing is done in an environment of continuously varying number of sources and speakers. We provide a comprehensive study on the design trade-offs with respect to this multiplicity of sources and speakers. A hardware prototype of our proposal was implemented on a Virtex4FX60 FPGA operating at 200 MHz. A single RU can achieve up to 7× WFS processing speedup compared to a software implementation running on a Pentium D at 3.4 GHz, while consuming, according to Xilinx XPower, approximately 3 W of power only.

Explore More