Georgi Gaydadjiev | Researchain

Publication

Featured research published by Georgi Gaydadjiev.

IEEE Transactions on Computers | 2004

The MOLEN polymorphic processor

Stamatis Vassiliadis; Stephan Wong; Georgi Gaydadjiev; Koen Bertels; Georgi Kuzmanov; Elena Moscu Panainte

In this paper, we present a polymorphic processor paradigm incorporating both general-purpose and custom computing processing. The proposal incorporates an arbitrary number of programmable units, exposes the hardware to the programmers/designers, and allows them to modify and extend the processor functionality at will. To achieve these attributes, we present a new programming paradigm, a new instruction set architecture, a microcode-based microarchitecture, and a compiler methodology. The programming paradigm, in contrast with conventional programming paradigms, allows general-purpose conventional code and hardware descriptions to coexist in a program. In our proposal, for a given instruction set architecture, a one-time instruction set extension of eight instructions is sufficient to implement the reconfigurable functionality of the processor. We propose a microarchitecture based on reconfigurable hardware emulation to allow high-speed reconfiguration and execution. To prove the viability of the proposal, we experimented with the MPEG-2 encoder and decoder and a Xilinx Virtex II Pro FPGA. We have implemented three operations: SAD, DCT, and IDCT. The overall attainable application speedup is between 2.64 and 3.18 for the MPEG-2 encoder and between 1.56 and 1.94 for the decoder, representing between 93 percent and 98 percent of the theoretically obtainable speedups.

field programmable gate arrays | 2005

64-bit floating-point FPGA matrix multiplication

Yong Dou; Stamatis Vassiliadis; Georgi Kuzmanov; Georgi Gaydadjiev

We introduce a 64-bit ANSI/IEEE Std 754-1985 floating-point design of a hardware matrix multiplier optimized for FPGA implementations. A general block matrix multiplication algorithm, applicable to arbitrary matrix sizes, is proposed. The algorithm potentially enables optimum performance by exploiting the data locality and reusability inherent in the general matrix multiplication scheme while considering the limitations of the I/O bandwidth and the local storage volume. We implement a scalable linear array of processing elements (PEs) supporting the proposed algorithm in the Xilinx Virtex II Pro technology. Synthesis results confirm a superior performance-area ratio compared to related recent works. Assuming the same FPGA chip, the same amount of local memory, and the same I/O bandwidth, our design outperforms related proposals by at least 1.7X and up to 18X while consuming the least reconfigurable resources. A total of 39 PEs can be integrated into the xc2vp125-7 FPGA, reaching a performance of, e.g., 15.6 GFLOPS with 1600 KB local memory and 400 MB/s external memory bandwidth.
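
The block scheme described in this abstract can be illustrated in software. The following is a minimal Python sketch of generic blocked (tiled) matrix multiplication, not the authors' FPGA design. The function name `blocked_matmul` and the `block` parameter are assumptions made for illustration; `block` loosely stands in for the amount of local storage available. Each loaded tile is reused across several result elements, which is the data-locality effect the abstract refers to.

```python
def blocked_matmul(a, b, block=2):
    """Multiply a (n x m) by b (m x p) one tile at a time.

    Illustrative only: `block` is a hypothetical tile size, not a
    parameter taken from the paper.
    """
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0.0] * p for _ in range(n)]
    # Walk over tiles of the result (i0, j0) and of the inner dimension (k0).
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                # Accumulate the contribution of one tile pair.
                for i in range(i0, min(i0 + block, n)):
                    for k in range(k0, min(k0 + block, m)):
                        a_ik = a[i][k]  # reused for a full row of the tile
                        for j in range(j0, min(j0 + block, p)):
                            c[i][j] += a_ik * b[k][j]
    return c
```

The result does not depend on the tile size; only the order of memory accesses changes, so `block` can be tuned to the available local memory without affecting correctness.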

vlsi test symposium | 1996

March LR: a test for realistic linked faults

A. J. van de Goor; Georgi Gaydadjiev; V. G. Mikitjuk; Vyacheslav N. Yarmolik

Many march tests have already been designed to cover faults of different fault models. The complexity of these tests increases when linked faults are taken into consideration. This paper gives an overview of the most important and commonly used fault models, including the industry's popular disturb fault model. The fault coverage of march tests is analysed in a novel way, i.e., in terms of their detection capabilities for simple faults and linked faults, whereby the infinite class of linked faults has been reduced to a set of realistic linked faults. Thereafter the paper presents a methodology to design tests for realistic linked faults, resulting in the new tests March LR, March LRD and March LRDD. These new tests will be shown to be more efficient and to offer a higher fault coverage than comparable existing tests.
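
For readers unfamiliar with march tests, the sketch below shows how such a test is applied to a memory. It uses MATS+, a classic simple march test, and is not March LR or any of the tests proposed in the paper. The `Memory` class, its stuck-at fault model, and the `run_march` helper are hypothetical names introduced for illustration; linked faults are not modelled.

```python
class Memory:
    """Bit-oriented memory model with optional stuck-at cells (hypothetical)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> forced value

    def write(self, addr, value):
        # A stuck-at cell ignores the written value.
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


# MATS+: {any order (w0); ascending (r0, w1); descending (r1, w0)}
MATS_PLUS = [
    ("any", ["w0"]),
    ("up", ["r0", "w1"]),
    ("down", ["r1", "w0"]),
]


def run_march(memory, size, test):
    """Apply a march test and return the addresses whose reads mismatched."""
    failures = set()
    for order, ops in test:
        addrs = range(size - 1, -1, -1) if order == "down" else range(size)
        for addr in addrs:
            # All operations of a march element are applied to one cell
            # before moving on to the next cell.
            for op in ops:
                value = int(op[1])
                if op[0] == "w":
                    memory.write(addr, value)
                elif memory.read(addr) != value:
                    failures.add(addr)
    return failures
```

Running `run_march(Memory(8, stuck_at={3: 1}), 8, MATS_PLUS)` flags address 3, because the `r0` read in the second element observes a 1. A march test is described entirely by its list of elements, so a different test can be tried by swapping that list.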

field-programmable logic and applications | 2007

DWARV: Delftworkbench Automated Reconfigurable VHDL Generator

Yana Yankova; Georgi Kuzmanov; Koen Bertels; Georgi Gaydadjiev; Yi Lu; Stamatis Vassiliadis

In this paper, we present the DWARV C-to-VHDL generation toolset. The toolset provides support for a broad range of application domains. It exploits the operation parallelism available in the algorithms. Our designs are generated with a view to actual hardware/software co-execution on a real hardware platform. The experiments carried out on the MOLEN polymorphic processor prototype suggest overall application speedups between 1.4x and 6.8x, corresponding to 13% to 94% of the theoretically achievable maximums, as determined by Amdahl's law.
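
The theoretical maximum mentioned in this abstract follows from Amdahl's law. The sketch below is a minimal illustration of that law, assuming a single accelerated kernel; the function names and the example values are hypothetical and are not measurements from the paper.

```python
def amdahl_speedup(accelerated_fraction, kernel_speedup):
    """Overall speedup when a fraction of the run time is sped up.

    accelerated_fraction: share of the original run time spent in the
        accelerated kernel (0.0 to 1.0).
    kernel_speedup: how many times faster the kernel itself becomes.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be within [0, 1]")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


def amdahl_limit(accelerated_fraction):
    """Upper bound on the speedup as the kernel speedup grows without limit."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be within [0, 1]")
    if accelerated_fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - accelerated_fraction)


# Hypothetical example: 90% of the run time is in the kernel,
# and the kernel runs 10 times faster in hardware.
achieved = amdahl_speedup(0.9, 10.0)
share_of_limit = achieved / amdahl_limit(0.9)
```

The ratio `share_of_limit` is the kind of "percentage of the theoretically achievable maximum" that the abstract reports: the achieved speedup divided by the bound that the non-accelerated part of the program imposes.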

design, automation, and test in europe | 2008

Transparent reconfigurable acceleration for heterogeneous embedded applications

Antonio Carlos Schneider Beck; Mateus B. Rutzig; Georgi Gaydadjiev; Luigi Carro

Embedded systems are becoming increasingly complex. Besides the additional processing capabilities, they are characterized by a high diversity of computational models coexisting in a single device. Although reconfigurable architectures have already been shown to be a potential solution for such systems, they deliver significant speedups only for very specific dataflow-oriented kernels. Furthermore, reconfigurable fabric is still held back by the need for special tools and compilers, clearly not sustaining backward software compatibility. In this paper, we propose a new technique to optimize both dataflow and control-flow oriented code in a totally transparent process, without the need for any modification of the source or binary code. To that end, we have developed a Binary Translation algorithm implemented in hardware, which works in parallel with a MIPS processor. The proposed mechanism is responsible for transforming sequences of instructions at runtime to be executed on a dynamic coarse-grain reconfigurable array, supporting speculative execution. Executing the MIBench suite, we show performance improvements of up to 2.5 times, while reducing the required energy by 1.7 times, using trivial hardware resources.

applied reconfigurable computing | 2007

Architectural exploration of the ADRES coarse-grained reconfigurable array

Frank Bouwens; Mladen Berekovic; Andreas Kanstein; Georgi Gaydadjiev

Reconfigurable computational architectures are envisioned to deliver power efficient, high performance, flexible platforms for embedded systems design. The coarse-grained reconfigurable architecture ADRES (Architecture for Dynamically Reconfigurable Embedded Systems) and its compiler offer a tool flow to design sparsely interconnected 2D array processors with an arbitrary number of functional units, register files and interconnection topologies. This article presents an architectural exploration methodology and its results for the first implementation of the ADRES architecture on a 90nm standard-cell technology. We analyze performance, energy and power trade-offs for two typical kernels from the multimedia and wireless domains: IDCT and FFT. Architecture instances of different sizes and interconnect structures are evaluated with respect to their power versus performance trade-offs. An optimized architecture is derived. A detailed power breakdown for the individual components of the selected architecture is presented.

memory technology, design and testing | 2004

The state-of-art and future trends in testing embedded memories

Said Hamdioui; Georgi Gaydadjiev; A. J. van de Goor

According to the International Technology Roadmap for Semiconductors (ITRS 2001), embedded memories will continue to dominate the growing system-on-chip (SoC) content in the coming years, approaching 94% in about 10 years. Therefore the memory yield will have a dramatic impact on the overall defect-per-million (DPM) level, and hence on the overall SoC yield. Meeting a high memory yield requires understanding memory designs, modelling their faulty behaviors in the presence of defects, and designing adequate test and diagnosis strategies as well as efficient repair schemes. This paper presents the state of the art in memory testing, including fault modeling, test design, built-in self-test (BIST) and built-in self-repair (BISR). Further research challenges and opportunities in enabling the testing of (embedded) memories that use deep-submicron technologies are discussed.

high performance embedded architectures and compilers | 2008

Architecture enhancements for the ADRES coarse-grained reconfigurable array

Frank Bouwens; Mladen Berekovic; Bjorn De Sutter; Georgi Gaydadjiev

Reconfigurable architectures provide power efficiency, flexibility and high performance for next-generation embedded multimedia devices. ADRES, the IMEC Coarse-Grained Reconfigurable Array architecture, and its compiler DRESC enable the design of reconfigurable 2D array processors with arbitrary functional units, register file organizations and interconnection topologies. This creates an enormous design space, making it difficult to find optimized architectures. Therefore, architectural explorations aiming at energy and performance trade-offs become a major effort. In this paper we investigate the influence of register file partitions, register file sizes and the interconnection topology of ADRES. We analyze power, performance and energy-delay trade-offs using IDCT and FFT as benchmarks while targeting 90nm technology. We also quantitatively explore the influence of several hierarchical optimizations for power by applying specific hardware techniques, i.e., clock gating and operand isolation. As a result, we propose an enhanced architecture instantiation that improves performance by 60-70% and reduces energy by 50%.

international conference / workshop on embedded computer systems: architectures, modeling and simulation | 2004

The Molen Programming Paradigm

Stamatis Vassiliadis; Georgi Gaydadjiev; Koen Bertels; Elena Moscu Panainte

In this paper we present the Molen programming paradigm, which is a sequential consistency paradigm for programming Custom Computing Machines (CCM). The programming paradigm allows for modularity and provides mechanisms for explicit parallel execution. Furthermore, it requires only a few instructions to be added to an architectural instruction set while allowing an almost arbitrary number of op-codes per user to be used in a CCM. A number of programming examples and a discussion are provided in order to clarify the operation, sequence control and parallelism of the proposed programming paradigm.

design, automation, and test in europe | 2009

Assessing fat-tree topologies for regular network-on-chip design under nanoscale technology constraints

Daniele Ludovici; F. Gilabert; Simone Medardoni; Crispín Gómez; María Engracia Gómez; Pedro López; Georgi Gaydadjiev; Davide Bertozzi

Most past evaluations of fat-trees for on-chip interconnection networks rely on oversimplified or even unrealistic architecture and traffic pattern assumptions, and very few layout analyses are available to relieve practical feasibility concerns in nanoscale technologies. This work aims at providing an in-depth assessment of the physical synthesis efficiency of fat-trees and at extrapolating silicon-aware performance figures to back-annotate into the system-level performance analysis. A 2D mesh is used as a reference architecture for comparison, and a 65 nm technology is targeted by our study. Finally, in an attempt to mitigate the implementation cost of k-ary n-tree topologies, we also review an alternative unidirectional multi-stage interconnection network which is able to simplify the fat-tree architecture with minimal impact on performance.
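
To give a feel for the implementation cost mentioned in this abstract, the sketch below counts the switches in a k-ary n-tree and in a 2D mesh with one switch per node. It relies on the standard k-ary n-tree definition (k^n end nodes, n stages of k^(n-1) switches each); the function names are hypothetical, and the counts say nothing about the paper's area, wiring or performance results.

```python
def kary_ntree_size(k, n):
    """Node and switch counts of a k-ary n-tree (standard definition)."""
    if k < 2 or n < 1:
        raise ValueError("need k >= 2 and n >= 1")
    return {
        "end_nodes": k ** n,             # processing nodes at the leaves
        "stages": n,                     # switch levels in the tree
        "switches": n * k ** (n - 1),    # k^(n-1) switches per stage
    }


def mesh_2d_size(rows, cols):
    """Node and switch counts of a 2D mesh with one switch per node."""
    if rows < 1 or cols < 1:
        raise ValueError("need at least one row and one column")
    return {"end_nodes": rows * cols, "switches": rows * cols}


# Three ways to connect 16 end nodes.
binary_tree = kary_ntree_size(2, 4)   # 16 end nodes, 32 switches
quad_tree = kary_ntree_size(4, 2)     # 16 end nodes, 8 switches
mesh = mesh_2d_size(4, 4)             # 16 end nodes, 16 switches
```

Switch count alone is a coarse metric: a higher-radix tree uses fewer but larger switches, which is one reason a layout-aware comparison of the kind the abstract describes is needed.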
