Kostas Tsoumanis

National Technical University of Athens

Publication

Featured research published by Kostas Tsoumanis.

IEEE Transactions on Very Large Scale Integration Systems | 2016

Design-Efficient Approximate Multiplication Circuits Through Partial Product Perforation

Georgios Zervakis; Kostas Tsoumanis; Sotirios Xydis; Dimitrios Soudris; Kiamal Z. Pekmestzi

Approximate computing has received significant attention as a promising strategy to decrease power consumption of inherently error tolerant applications. In this paper, we focus on hardware-level approximation by introducing the partial product perforation technique for designing approximate multiplication circuits. We prove in a mathematically rigorous manner that in partial product perforation, the imposed errors are bounded and predictable, depending only on the input distribution. Through extensive experimental evaluation, we apply the partial product perforation method on different multiplier architectures and expose the optimal architecture-perforation configuration pairs for different error constraints. We show that, compared with the respective exact design, the partial product perforation delivers reductions of up to 50% in power consumption, 45% in area, and 35% in critical delay. In addition, the product perforation method is compared with the state-of-the-art approximation techniques, i.e., truncation, voltage overscaling, and logic approximation, showing that it outperforms them in terms of power dissipation and error.

IEEE Transactions on Very Large Scale Integration Systems | 2016

Flexible DSP Accelerator Architecture Exploiting Carry-Save Arithmetic

Kostas Tsoumanis; Sotirios Xydis; Georgios Zervakis; Kiamal Z. Pekmestzi

Hardware acceleration has proven an extremely promising implementation strategy for the digital signal processing (DSP) domain. Rather than adopting a monolithic application-specific integrated circuit design approach, in this brief, we present a novel accelerator architecture comprising flexible computational units that support the execution of a large set of operation templates found in DSP kernels. We differ from previous works on flexible accelerators by enabling computations to be aggressively performed with carry-save (CS) formatted data. Advanced arithmetic design concepts, i.e., recoding techniques, are utilized, enabling CS optimizations to be performed in a larger scope than in previous approaches. Extensive experimental evaluations show that the proposed accelerator architecture delivers average gains of up to 61.91% in area-delay product and 54.43% in energy consumption compared with state-of-the-art flexible datapaths.
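As a rough illustration of the carry-save (CS) format mentioned in the abstract above, the following Python sketch models 3:2 compression on plain integers. It is a behavioral toy, not the accelerator architecture from the paper; the function names `carry_save_add` and `accumulate_carry_save` are hypothetical and chosen only for this example.

```python
def carry_save_add(a, b, c):
    """3:2 compression: reduce three non-negative integers to a (sum, carry) pair.

    The pair satisfies sum + carry == a + b + c, and no carry ripples
    across bit positions while it is being formed.
    """
    partial_sum = a ^ b ^ c                      # per-bit sum, carries ignored
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, moved to its weight
    return partial_sum, carry


def accumulate_carry_save(values):
    """Add many operands while keeping the running total in carry-save form."""
    s, c = 0, 0
    for v in values:
        s, c = carry_save_add(s, c, v)
    return s + c                                 # single carry-propagate addition at the end
```

The point of the format is visible in `accumulate_carry_save`: every intermediate step is carry-free, and only the final conversion needs a carry-propagate adder.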

Great Lakes Symposium on VLSI | 2015

Approximate Multiplier Architectures Through Partial Product Perforation: Power-Area Tradeoffs Analysis

Georgios Zervakis; Kostas Tsoumanis; Sotirios Xydis; Nicholas Axelos; Kiamal Z. Pekmestzi

Approximate computing has received significant attention as a promising strategy to decrease power consumption of inherently error-tolerant applications. Hardware approximation mainly targets arithmetic units, e.g. adders and multipliers. In this paper, we design new approximate hardware multipliers and propose the Partial Product Perforation technique, which omits a number of consecutive partial products by perforating their generation. Through extensive experimental evaluation, we apply the partial product perforation method on different multiplier architectures and expose the optimal configurations for different error values. We show that the partial product perforation delivers reductions of up to 50% in power consumption, 45% in area and 35% in critical delay. Also, the product perforation method is compared with state-of-the-art works on approximate computing that consider the Voltage Over-Scaling (VOS) and logic approximation (i.e. design of approximate compressors) techniques, outperforming them in terms of power dissipation by up to 17% and 20% on average respectively. Finally, with respect to the aforementioned gains, the error value delivered by the proposed product perforation method is smaller by 70% and 99% than the VOS and logic approximation methods respectively.
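The idea of omitting consecutive partial products can be sketched in a few lines of Python. This is a minimal behavioral model, assuming an unsigned multiplier whose partial products are the simple terms a_i · b · 2^i (the paper also evaluates other multiplier architectures, which this sketch does not cover). The names `perforated_multiply` and `perforation_error`, and the parameters `j` (first skipped partial product) and `k` (number skipped), are illustrative assumptions.

```python
def perforated_multiply(a, b, n, j, k):
    """Approximate a*b for an unsigned n-bit a, skipping partial products j .. j+k-1."""
    product = 0
    for i in range(n):
        if j <= i < j + k:
            continue               # perforated: this partial product is never generated
        if (a >> i) & 1:
            product += b << i      # partial product a_i * b * 2^i
    return product


def perforation_error(a, b, j, k):
    """Exact error a*b - approximation: b times the skipped bit field of a."""
    skipped_bits = (a >> j) & ((1 << k) - 1)
    return (skipped_bits << j) * b
```

For example, with a = 0b1011, b = 0b0110, n = 4, j = 0 and k = 2, the sketch returns 48 instead of the exact 66, and the error of 18 equals b times the two skipped low bits of a. In this model the error depends only on the operands and never exceeds b · (2^k − 1) · 2^j.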

International Midwest Symposium on Circuits and Systems | 2015

Delta DICE: A Double Node Upset resilient latch

Nikolaos Eftaxiopoulos; Nicholas Axelos; Georgios Zervakis; Kostas Tsoumanis; Kiamal Z. Pekmestzi

In this paper we propose the novel Delta DICE latch that is tolerant to SNUs (Single Node Upsets) and DNUs (Double Node Upsets). The latch comprises three DICE cells in a delta interconnection topology, providing enough redundant nodes to guarantee resilience to conventional SNUs, as well as DNUs due to charge sharing. Simulation results demonstrated that in terms of power dissipation and propagation delay, the Delta DICE latch outperforms BISER-based latches that are SNU or DNU tolerant and provides DNU resilience at a small energy×delay penalty compared to other SNU tolerant cells.

International Conference on Electronics, Circuits, and Systems | 2014

Fused modulo 2^n − 1 add-multiply unit

Kostas Tsoumanis; Kiamal Z. Pekmestzi; Constantinos Efstathiou

Complex arithmetic operations are widely used in Digital Signal Processing (DSP) applications. Aiming to increase performance, in this work we focus on optimizing the design of the modulo 2^n - 1 Add-Multiply (AM) operation. We incorporate in the design the direct recoding of the sum of two numbers in its Modified Booth (MB) form. Compared to the conventional design of first instantiating a modulo 2^n - 1 adder and then driving its output to a modulo 2^n - 1 multiplier, the proposed fused AM design yields considerable reductions in terms of critical delay, area complexity and power consumption.
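A functional reference model can make the Add-Multiply operation concrete. The Python sketch below computes (a + b) · x mod 2^n − 1 by recoding the sum into radix-4 Modified Booth digits and accumulating the resulting partial products. Note the assumption: it forms the sum first and recodes it afterwards, which is the conventional flow, whereas the paper's contribution is a direct recoding that avoids this intermediate adder. The names `booth_radix4_digits` and `add_multiply_mod` are hypothetical.

```python
def booth_radix4_digits(y, n):
    """Modified Booth digits (each in -2..2, least significant first) of an unsigned n-bit y."""
    width = n + 1 + ((n + 1) % 2)          # even bit width that leaves a zero sign bit

    def bit(i):
        return (y >> i) & 1 if i >= 0 else 0

    return [-2 * bit(2 * j + 1) + bit(2 * j) + bit(2 * j - 1)
            for j in range(width // 2)]


def add_multiply_mod(a, b, x, n):
    """Reference model of (a + b) * x modulo 2**n - 1 via Booth-recoded partial products."""
    m = (1 << n) - 1
    y = (a + b) % m                        # conventional step: form the sum before recoding
    acc = 0
    for j, d in enumerate(booth_radix4_digits(y, n)):
        acc = (acc + d * x * pow(4, j, m)) % m   # partial product d_j * x * 4^j
    return acc
```

Such a model is mainly useful as a golden reference when checking a hardware description, since it says nothing about delay, area or power.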

International Symposium on Low Power Electronics and Design | 2015

Hybrid approximate multiplier architectures for improved power-accuracy trade-offs

Georgios Zervakis; Sotirios Xydis; Kostas Tsoumanis; Dimitrios Soudris; Kiamal Z. Pekmestzi

Approximate computing forms a promising design alternative for inherently error resilient applications, trading accuracy for power savings. In this paper, we exploit multi-level approximation, i.e. at the algorithmic, the logic and the circuit level, to design low power approximate arithmetic architectures for hardware multipliers. Motivated by the limited power savings that approximation techniques can achieve in isolation, we explore hybrid methods that simultaneously apply more than one technique from different layers. We introduce the concept of perforation for approximate arithmetic circuit design and we explore the newly defined design space of hybrid designs, showing that it leads to lower power consumption at every examined error range. To address the increased complexity of the target design space, we introduce a heuristic optimization technique and the corresponding design framework that automatically generates hybrid low-power approximate multipliers requiring a small number of design evaluations, i.e. synthesis, simulation, power and timing analysis. Through extensive experimentation, we show that the proposed techniques converge towards optimal solutions and deliver approximate designs that are always more efficient with respect to state-of-the-art approaches. Power savings of 11% are reported for small error bounds and more than 30% in case of more relaxed error constraints.

Intelligent Decision Technologies | 2014

A high radix Montgomery multiplier with concurrent error detection

Georgios Zervakis; Nikolaos Eftaxiopoulos; Kostas Tsoumanis; Nicholas Axelos; Kiamal Z. Pekmestzi

Modular multiplication is essential in cryptographic algorithms (e.g. RSA), as it determines the performance of the entire cryptographic operation and its reliability is crucial for the system security. In this paper, we propose a high-radix Montgomery Modular Multiplication (MMM) implementation and conduct an exploration to find the optimal radix. Also, a concurrent error detection circuit with 99.9% detection rate, small area and power overheads (2.24% and 1.46% respectively) is proposed to protect the MMM against fault attacks and natural faults.
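To show what "high-radix" means for Montgomery Modular Multiplication, here is a minimal Python sketch of the textbook digit-serial algorithm in radix 2^k. It is not the circuit proposed in the paper and omits the concurrent error detection entirely. The function name and the parameters `k` (bits per digit) and `num_digits` are assumptions for illustration; the sketch needs an odd modulus, operands smaller than the modulus, 2^(k · num_digits) larger than the modulus, and Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_multiply(a, b, modulus, k, num_digits):
    """Return a * b * R^-1 mod modulus, where R = 2**(k * num_digits).

    The multiplier a is consumed one radix-2**k digit per iteration,
    so a larger k means fewer iterations with more work in each.
    """
    mask = (1 << k) - 1
    n_prime = (-pow(modulus, -1, 1 << k)) & mask   # -modulus^-1 mod 2^k (modulus must be odd)
    s = 0
    for i in range(num_digits):
        a_i = (a >> (k * i)) & mask                # i-th radix-2^k digit of a
        s += a_i * b
        q = ((s & mask) * n_prime) & mask          # makes s + q*modulus divisible by 2^k
        s = (s + q * modulus) >> k                 # exact division by the radix
    return s - modulus if s >= modulus else s      # one conditional subtraction suffices
```

Because the result carries a factor of R^-1, multiplying it by R modulo the modulus recovers the ordinary product a · b mod modulus, which is a convenient way to check the sketch.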

Intelligent Decision Technologies | 2014

An independent dual gate SOI FinFET soft-error resilient memory cell

Nikolaos Eftaxiopoulos; Nicholas Axelos; Georgios Zervakis; Kostas Tsoumanis; Kiamal Z. Pekmestzi

In this paper we present an 8T footless storage element, the FFDICE (FinFET DICE), a dual interlocked structure using Independent Gate SOI FinFET transistors that exhibits soft error resilience characteristics. Compared to the conventional DICE cell, the proposed design achieves area savings by dispensing with the four NMOS driver transistors, while retaining excellent tolerance to single node upsets and similar resilience to multiple node upsets. Of significance to modern designs that apply voltage scaling techniques to achieve power savings, simulation results on Static Voltage Noise Margin and Static Current Noise Margin metrics show that the proposed cell exhibits excellent stability across an examined voltage range of 0.75V to 1V.

Intelligent Decision Technologies | 2014

Modulo 2^n + 1 addition and multiplication for redundant operands

Kostas Tsoumanis; Constantinos Efstathiou; Kiamal Z. Pekmestzi

Complex arithmetic operations are widely used in Digital Signal Processing (DSP) applications. Keeping the intermediate results in a redundant representation (e.g. carry-save) is a common technique to speed up chained arithmetic operations due to the elimination of the intermediate parallel additions which occupy significant area and largely increase the overall critical delay. Thus, arithmetic units with operands in a redundant representation are of considerable practical interest. In this work, we propose an efficient modulo 2^n + 1 addition unit with one or both operands in the redundant carry-save representation and, also, we introduce an efficient modulo 2^n + 1 multiplier with one of the two operands in the redundant carry-save form.

International Conference on Modern Circuits and Systems Technologies | 2016

Fused modulo 2^n + 1 Add-Multiply unit for diminished-1 operands

Kostas Tsoumanis; Kiamal Z. Pekmestzi; Constantinos Efstathiou

Complex arithmetic operations dominate Digital Signal Processing (DSP) applications, heavily degrading their performance. Aiming to accelerate Residue Number System-based DSP applications, we optimize the design of the Add-Multiply (AM) operation with modulo 2^n + 1 diminished-1 operands by incorporating the direct recoding of the sum of two numbers in its Modified Booth form. Compared to the conventional allocation of an adder and a subsequent multiplier, the proposed fused AM design yields delay, area and power gains.
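The diminished-1 representation may be unfamiliar, so the Python sketch below illustrates only its addition rule, not the fused Add-Multiply unit of the paper. It assumes the usual convention in which a nonzero residue x in [1, 2^n] is stored as the n-bit value x − 1 and zero is tracked separately; the function names are hypothetical, and returning `None` for a zero result is a simplification of the zero flag a hardware unit would use.

```python
def to_diminished_1(x, n):
    """Encode a nonzero residue x in [1, 2**n] as the n-bit value x - 1."""
    if not 1 <= x <= (1 << n):
        raise ValueError("x must lie in [1, 2**n]")
    return x - 1


def diminished_1_add(a_d1, b_d1, n):
    """Add two diminished-1 operands modulo 2**n + 1.

    Uses an n-bit addition with an inverted end-around carry. Returns None
    when the true sum is congruent to zero, which n bits cannot encode.
    """
    total = a_d1 + b_d1
    if total == (1 << n) - 1:
        return None                      # (a + b) mod (2^n + 1) == 0
    carry_out = total >> n               # 0 or 1
    return (total + (1 - carry_out)) & ((1 << n) - 1)
```

The inverted end-around carry is what makes the representation attractive: the modulo 2^n + 1 correction reduces to feeding the complemented carry-out back into an ordinary n-bit adder.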
