
Konstantinos Nakos

National and Kapodistrian University of Athens


Publication

Featured research published by Konstantinos Nakos.

international conference on electronics, circuits, and systems | 2006

A High Performance VLSI FFT Architecture

Konstantinos Babionitakis; Konstantinos Manolopoulos; Konstantinos Nakos; Dionysios I. Reisis; Nikolaos Vlassopoulos; Vassilios A. Chouliaras

High-performance VLSI-based FFT architectures are key to signal processing and telecommunication systems, since they meet hard real-time constraints at lower silicon area and power than CPU-based solutions. To meet these goals, this paper presents a novel VLSI FFT architecture that combines three consecutive radix-4 stages into a 64-point FFT engine. Cascading these 64-point FFT engines yields an improved architecture with three characteristics. First, it can efficiently accommodate large input data sets in real time. Second, it simplifies processing requirements because of the radix-4 calculations. Finally, it reduces memory requirements and latency to one third of those of the fully unfolded radix-4 architecture. Two implementations validate the efficiency of the architecture: an FPGA implementation of a 4096-point FFT achieving a throughput of 4096 points per 20.48 μs, and a VLSI implementation sustaining a throughput of 4096 points per 3.89 μs.
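As an illustration of how three radix-4 stages add up to a 64-point transform (64 = 4 × 4 × 4), here is a minimal software sketch. It is a generic recursive radix-4 decimation-in-time FFT, not the hardware architecture of the paper; the function names `fft_radix4`, `dft` and `throughput_msps` are hypothetical and chosen only for this example. The last lines convert the two reported figures into samples per second.

```python
import cmath

def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4:
        raise ValueError("length must be a power of 4")
    q = n // 4
    # Four interleaved sub-transforms of length n/4.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(q):
        # Twiddle factors W_n^(r*k) are applied before the butterfly.
        t = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)]
        # Radix-4 butterfly: the only multipliers are 1, -j, -1 and j.
        for m in range(4):
            out[k + m * q] = sum(t[r] * (-1j) ** (r * m) for r in range(4))
    return out

def dft(x):
    """Direct O(n^2) DFT, used here only as a reference for checking."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * i * k / n) for i in range(n))
            for k in range(n)]

def throughput_msps(points, microseconds):
    """Points per microsecond equals millions of samples per second."""
    return points / microseconds

# A 64-point input passes through exactly three levels of radix-4 recursion.
samples = [complex(i % 7, -(i % 5)) for i in range(64)]
max_err = max(abs(a - b) for a, b in zip(fft_radix4(samples), dft(samples)))

fpga_msps = throughput_msps(4096, 20.48)  # about 200 Msamples/s
vlsi_msps = throughput_msps(4096, 3.89)   # roughly 1053 Msamples/s
```

The only multipliers inside the radix-4 butterfly are ±1 and ±j, which is one common reason radix-4 stages are considered cheap to process.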

international conference on electronics, circuits, and systems | 2006

An Efficient H.264 VLSI Advanced Video Encoder

Konstantinos Babionitakis; George Lentaris; Konstantinos Nakos; Dionysios I. Reisis; Nikolaos Vlassopoulos; Gregory Doumenis; George Georgakarakos; John Sifnaios

Advances in video technology have increased the need for H.264/AVC encoders with real-time performance. To meet this need, the paper presents a VLSI H.264/AVC encoder architecture together with the design and implementation details of its modules. The encoder design complies with the reference software encoder of the standard and follows the baseline profile, level 3.0. The encoder can serve as an IP core and/or a stand-alone solution targeting low-area applications. The architecture achieves a maximum throughput of 30 frames/sec at a frame size of 1024×768. Results and performance measurements of the entire encoder have been validated on FPGA and in 0.18 μm VLSI technology.

international conference on electronics, circuits, and systems | 2008

Addressing technique for parallel memory accessing in radix-2 FFT processors

Konstantinos Nakos; Dionysios I. Reisis; Nikolaos Vlassopoulos

This paper presents an efficient technique for addressing in radix-2 FFT architectures. The novel addressing organization provides parallel load and store of the data involved in a radix-2 butterfly computation. The addressing scheme is based on a permutation of the FFT data, which minimizes the address-generating circuit and the butterfly processor control. The paper proves the correctness of the technique and includes an FPGA implementation.
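To make the goal of parallel load and store concrete, the sketch below uses the well-known parity-based bank assignment for in-place radix-2 FFTs. This is an assumption chosen for illustration and is not necessarily the permutation proposed in the paper. The two operands of a radix-2 butterfly differ in exactly one address bit, so their bit parities differ and they always fall into different memory banks. The names `parity`, `bank_and_offset` and `butterfly_pairs` are hypothetical.

```python
def parity(addr):
    """Return the XOR of all bits of addr (0 or 1)."""
    return bin(addr).count("1") & 1

def bank_and_offset(addr):
    """Map a data index to (bank, offset within the bank)."""
    return parity(addr), addr >> 1

def butterfly_pairs(n_points, stage):
    """Operand index pairs of an in-place radix-2 stage: they differ in bit `stage`."""
    bit = 1 << stage
    return [(a, a | bit) for a in range(n_points) if not a & bit]

n_points = 16
n_stages = n_points.bit_length() - 1  # log2(16) = 4

# Every butterfly reads its two operands from different banks.
conflict_free = all(
    parity(a) != parity(b)
    for stage in range(n_stages)
    for a, b in butterfly_pairs(n_points, stage)
)

# The mapping is one-to-one, so no two indices share a storage location.
locations = {bank_and_offset(a) for a in range(n_points)}
```

With two single-port banks, this property lets both operands of a butterfly be fetched in the same cycle and written back to the same locations.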

Integration | 2008

Customization of an embedded RISC CPU with SIMD extensions for video encoding: A case study

Vassilios A. Chouliaras; Vincent M. Dwyer; Shahrukh Agha; Jose Luis Nunez-Yanez; Dionysios I. Reisis; Konstantinos Nakos; Konstantinos Manolopoulos

This work presents a detailed case study in customizing a configurable, extensible, 32-bit RISC processor with vector/SIMD instruction extensions for the efficient execution of block-based video-coding algorithms utilizing a proprietary co-design environment. In addition to the default Full-Search motion estimation of the MPEG-2 Test Model 5, fourteen fast ME algorithms were implemented in both scalar and vector form. Results demonstrate a reduction of up to 68% in the dynamic instruction count of the full search-based encoder whereas the fast motion estimation algorithms achieved a reduction in instruction count of nearly 90%, both accelerated via three 128-bit vector/SIMD instructions when compared to the scalar, reference implementation of the standard. We address in detail the profiling, vectorization and the development of these vector instruction set extensions, discuss in depth the implementation of a parametric vector accelerator that implements these instructions and show the introduction of that accelerator into a 32-bit RISC processor pipeline, in a closely-coupled configuration.

international conference on electronics, circuits, and systems | 2007

High Performance 16K, 64K, 256K complex points VLSI Systolic FFT Architectures

Konstantinos Manolopoulos; Konstantinos Nakos; Dionysios I. Reisis; Nikolaos Vlassopoulos; Vassilios A. Chouliaras

To improve the efficiency of real-time Fourier transform computations with large input data sets, this paper presents the design and VLSI implementation of 16 K, 64 K and 256 K complex-point fast Fourier transform (FFT) systolic architectures. These organizations are deeply pipelined to maximize the operating frequency, and they decompose the transforms into 64-point FFT computations to minimize the buffer size between consecutive stages. The resulting organizations achieve real-time performance in testing and observation applications. They use simple processing elements and are scalable with respect to operating frequency and data width. Validation on FPGA showed operation at 250 MHz and 125 MHz for the 16 K and 64 K architectures, with throughputs of 1 Gs/s and 500 Ms/s, respectively. The VLSI implementations of the proposed 16 K, 64 K and 256 K architectures achieve post-route clock frequencies of 352, 256.5 and 188 MHz, respectively, and can sustain throughputs of 1.4 Gs/s, 1 Gs/s and 188 Ms/s.

international conference on electronics, circuits, and systems | 2009

Efficient cascaded VLSI FFT architecture for OFDM systems

Vassilios A. Chouliaras; Panagiotis Galiatsatos; Konstantinos Nakos; Dionysios I. Reisis; Nikolaos Vlassopoulos

This paper presents a throughput-efficient cascaded FFT architecture suitable for OFDM telecommunication applications. The design exploits a technique that parallelizes the radix-2 butterfly computations to increase the throughput by a factor of 2, while keeping the VLSI area complexity equal to that of single-path delay feedback architectures. A 2048 complex-point radix-2 implementation in 0.13 μm TSMC technology validates the results.

international conference on electronics circuits and systems | 2003

A VLSI architecture for minimizing the transmission power in OFDM transceivers

Konstantinos Babionitakis; Y. Dagres; Konstantinos Nakos; Dionysios I. Reisis

This paper presents a VLSI architecture for optimizing the transmission power required in turbo-coded Orthogonal Frequency Division Multiplexing modems. The technique adapts the transmission parameters according to the Quality of Service requirements. CORDIC computations are used to reduce the VLSI area. The architecture performs at wire speed, uses minimal area and has shown a performance gain in an indoor wireless application. An implementation using Field Programmable Gate Array technology has validated the results.

Journal of Signal Processing Systems | 2018

Parallel Memory Accessing for FFT Architectures

V. Kitsakis; Konstantinos Nakos; Dionysios I. Reisis; Nikolaos Vlassopoulos

This paper introduces an efficient technique for parallel data addressing in FFT architectures performing in-place computations. The novel addressing organization provides parallel load and store of the data involved in radix-r butterfly computations and leads to an efficient architecture when r is a power of 2. The addressing scheme is based on a permutation of the FFT data, which improves the address-generating circuit and the butterfly processor control. Moreover, the proposed technique is suitable for mixed-radix applications, especially for radixes that are powers of 2, and for straightforward continuous-flow implementation. The paper presents the technique and the resulting FFT architecture and shows the advantages of the architecture compared to previously published results. Implementations of the in-place radix-8 FFT architectures with input sizes of 64 and 512 complex points on a Xilinx Virtex-7 VC707 FPGA validate the results.
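The requirement behind parallel access in a radix-r in-place FFT can be sketched as follows. The r operands of one butterfly have indices that differ in a single base-r digit, so assigning each index to bank (sum of its base-r digits) mod r spreads them over r distinct banks. This digit-sum rule is a classic textbook scheme used here as an assumption; it is not claimed to be the permutation of the paper. The helper names are hypothetical, and the sizes 64 and 512 with radix 8 are taken from the abstract.

```python
def digits(addr, radix, n_digits):
    """Base-`radix` digits of addr, least significant first."""
    out = []
    for _ in range(n_digits):
        out.append(addr % radix)
        addr //= radix
    return out

def bank(addr, radix, n_digits):
    """Bank index: digit sum modulo the radix."""
    return sum(digits(addr, radix, n_digits)) % radix

def butterfly_groups(radix, n_digits, stage):
    """Yield the operand indices of each butterfly; they differ only in digit `stage`."""
    weight = radix ** stage
    for base in range(radix ** n_digits):
        if (base // weight) % radix == 0:
            yield [base + d * weight for d in range(radix)]

def is_conflict_free(radix, n_digits):
    """True if every butterfly at every stage touches each bank exactly once."""
    return all(
        sorted(bank(a, radix, n_digits) for a in group) == list(range(radix))
        for stage in range(n_digits)
        for group in butterfly_groups(radix, n_digits, stage)
    )

ok_64 = is_conflict_free(8, 2)    # 64 points = 8^2
ok_512 = is_conflict_free(8, 3)   # 512 points = 8^3
```

Because each butterfly touches every bank exactly once, all r operands can be loaded, and later stored in place, in a single memory cycle.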

International Journal of Computers and Applications | 2007

Thread-parallel MPEG-2 and MPEG-4 encoders for shared-memory System-on-Chip multiprocessors

Vassilios A. Chouliaras; Tr Jacobs; Jose Luis Nunez-Yanez; Konstantinos Manolopoulos; Konstantinos Nakos; Dionysios I. Reisis

This work focuses on speeding up MPEG-2 and MPEG-4 encoding by using thread parallelism on shared-memory System-on-Chip (SoC) multiprocessors. The performance improvement of the MPEG encoders is demonstrated by reducing the dynamic instruction count across multiple processor contexts and then mapping the encoders onto a configurable SoC multiprocessor. The resulting reduction in the dynamic instruction count of the parallelized MPEG-2 TM5 encoder reaches a maximum of 95% for 32 processor contexts, and that of the MPEG-4 XviD encoder a maximum of 83% for 16 processor contexts, both compared to the sequential encoder. To realize the parallelized encoders, we present a configurable, N-way, extensible, bus-based, cache-coherent SoC multiprocessor augmented with data-parallel coprocessors, and we give the VLSI implementation for the 2-way and 4-way configurations.

Journal of Real-time Image Processing | 2008

A real-time motion estimation FPGA architecture

Konstantinos Babionitakis; Gregory Doumenis; George Georgakarakos; George Lentaris; Konstantinos Nakos; Dionysios I. Reisis; Ioannis Sifnaios; Nikolaos Vlassopoulos
