
Martin Danek

Czech Technical University in Prague


Publication

Featured research published by Martin Danek.

international conference on artificial neural networks | 2001

A Generalisable Measure of Self-Organisation and Emergence

W. Andy Wright; Robert E. Smith; Martin Danek; Phillip Greenway

In adaptive systems that involve large numbers of entities, emergent global behaviours that arise from localised interactions are a critical concept. Understanding and shaping emergence may be essential to such systems' success. To aid this understanding, this paper introduces a measure drawn from non-linear systems theory. The paper discusses how this measure can be used to reinforce self-organising behaviours in adaptive systems. Further, it is shown that the measure can be successfully employed as feedback to a system employing evolutionary computation (EC), and that this feedback can be used to design desired self-organising behaviours into an approximation of a biologically plausible collective system.

field-programmable logic and applications | 2008

Increasing the level of abstraction in FPGA-based designs

Martin Danek; Jiri Kadlec; Roman Bartosinski; Lukas Kohout

Traditional design techniques for FPGAs are based on hardware description languages, with functional and post-place-and-route simulation as a means to check design correctness and remove detected errors. As designs grow in complexity, it is necessary to introduce new design approaches that raise the level of abstraction while maintaining the efficiency of hardware computation that we are used to today. This paper presents one such methodology that builds upon existing research in multithreading, object composability and encapsulation, partial runtime reconfiguration, and self-adaptation. The methodology is based on currently available FPGA design tools. The efficiency of the methodology is evaluated on basic vector and matrix operations.

field-programmable logic and applications | 2007

Accelerating Microblaze Floating Point Operations

Jiri Kadlec; Roman Bartosinski; Martin Danek

The MicroBlaze processor serves in many FPGA designs as the central 32-bit CPU with access to the global off-chip memory and peripherals. MicroBlaze provides FSL links for up to 8 coprocessors. We present two MicroBlaze designs. The first design works with 8 PicoBlaze-based accelerators for pipelined, single-precision floating-point vector-oriented operations, and delivers over 1.2 GFLOPs. The second design uses 4 similar double-precision accelerators and delivers 600 MFLOPs. The acceleration results are documented on batch computation of a finite impulse response filter. Each PicoBlaze soft core can be re-programmed by MicroBlaze. This provides a framework for a partial dynamic change of the functionality of accelerators. This program change can be done via the FSL link in parallel with the current computation of the accelerator.
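
The batch computation of a finite impulse response (FIR) filter mentioned in this abstract can be illustrated in software. The following is a minimal Python sketch, not the authors' accelerator firmware: the function names `fir_filter` and `fir_filter_batched`, the tap values, and the batch size are assumptions made for illustration. It shows that filtering the input in fixed-size batches gives the same output as filtering the whole signal at once, provided the last few samples of each batch are carried over as history.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def fir_filter_batched(samples, taps, batch_size):
    """Same filter, computed batch by batch (hypothetical helper).

    The last len(taps) - 1 input samples of each batch are kept as history
    so that the first outputs of the next batch see the correct past inputs.
    """
    history = [0.0] * (len(taps) - 1)
    out = []
    for start in range(0, len(samples), batch_size):
        batch = samples[start:start + batch_size]
        extended = history + batch
        offset = len(history)
        for n in range(len(batch)):
            acc = 0.0
            for k, h in enumerate(taps):
                acc += h * extended[offset + n - k]
            out.append(acc)
        if history:
            history = extended[-len(history):]
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
taps = [0.5, 0.5]  # two-tap moving average, chosen only for illustration
whole = fir_filter(signal, taps)
batched = fir_filter_batched(signal, taps, batch_size=2)
print(whole)  # [0.5, 1.5, 2.5, 3.5, 4.5]
```

In a hardware setting each batch would be handed to an accelerator as a vector operation; here the inner loop simply stands in for that dot product.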

design and diagnostics of electronic circuits and systems | 2010

Instruction set extensions for multi-threading in LEON3

Martin Danek; Leos Kafka; Lukas Kohout; Jaroslav Sykora

This paper describes instruction set extensions for a variant of multi-threading called micro-threading for the LEON3 SPARCv8 processor. We present the architecture of the developed processor and its key blocks: the cache controller, register file, and thread scheduler. The processor has been implemented in a Xilinx Virtex2Pro FPGA. The extensions are evaluated in terms of extra resources needed, and the overall performance of the developed processor is evaluated on a simple DSP computation typical for embedded systems.

design and diagnostics of electronic circuits and systems | 2012

The architecture and the technology characterization of an FPGA-based customizable Application-Specific Vector Processor

Jaroslav Sykora; Lukas Kohout; Roman Bartosinski; Leos Kafka; Martin Danek; Petr Honzik

The traditional approach to IP core design is to use simulations with test vectors. This is not feasible when dealing with complex function cores such as the Image Segmentation case-study algorithm in this paper. An algorithm developer needs to carry out experiments on large real-world data sets, with fast turn-around times, and in real time to facilitate performance tuning and incremental development. We propose a methodology called Application-Specific Vector Processor (ASVP). The ASVP approach first constructs a programmable architecture customized for a given application, then employs software techniques to develop firmware that implements the algorithm. Our sample implementation that supports the Image Segmentation kernel is capable of 332 MFLOPs, 400 MFLOPs, and 250 MFLOPs per coprocessor core in Virtex 5, Virtex 6 and Spartan 6 technologies, respectively. The core size is roughly 1500 slices, depending on the configuration and technology.

international conference on evolvable systems | 2008

Self-Adaptive Networked Entities for Building Pervasive Computing Architectures

Martin Danek; Jean-Marc Philippe; Petr Honzik; Christian Gamrat; Roman Bartosinski

This paper presents a framework for building and modeling new-generation self-adaptive systems. The first part of the paper proposes an architecture of a self-adaptive networked entity that forms the basic element of the approach. The second part describes a modeling environment based on Matlab/Simulink and one possible implementation of the self-adaptive networked entity. A physical realization of the proposed system is demonstrated on the computation of a simple FIR filter in several FPGAs acting as hardware in the loop in Matlab/Simulink.

digital systems design | 2012

Reducing Instruction Issue Overheads in Application-Specific Vector Processors

Jaroslav Sykora; Roman Bartosinski; Lukas Kohout; Martin Danek; Petr Honzik

The traditional approach to IP core design is to use simulations with test vectors. This is not feasible when dealing with complex function cores such as the Image Segmentation case-study algorithm in this paper. An algorithm developer needs to carry out experiments on large real-world data sets, with fast turn-around times, and in real time to facilitate performance tuning and incremental development. Previously we proposed a methodology called Application-Specific Vector Processor (ASVP). The ASVP approach first constructs a programmable architecture customized for a given application, then employs software techniques to develop firmware that implements the algorithm. In our setting we employ an embedded simple scalar CPU (8-bit PicoBlaze 3) to control a floating-point vector processing unit (VPU) by issuing wide (horizontally encoded) instructions to it. In this work we dramatically reduce the overhead of the wide-instruction issue (in one case by 13x) by implementing a new two-level configuration table. The table stores frequently used vector definitions (in Level 1) and vector instructions (in Level 2), pre-loading them quickly into the issue buffer. A configuration in the issue buffer can be further modified before being sent to the processing unit. This ensures the architecture stays general and fully customizable.
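
The two-level configuration table described in this abstract can be pictured with a small software model. The sketch below is an illustration under stated assumptions, not the authors' hardware design: the class name `ConfigTable`, its method names, and the vector fields (`base`, `length`, `stride`) are all hypothetical. It shows only the idea that Level 1 holds vector definitions, Level 2 holds vector instructions that refer to them, and a pre-loaded issue-buffer entry can be modified before it is sent on without altering the stored table.

```python
class ConfigTable:
    """Toy model of a two-level configuration table (all names hypothetical)."""

    def __init__(self):
        self.level1 = {}  # vector definitions: id -> {"base", "length", "stride"}
        self.level2 = {}  # vector instructions: id -> {"op", "operands"}

    def define_vector(self, vid, base, length, stride=1):
        self.level1[vid] = {"base": base, "length": length, "stride": stride}

    def define_instruction(self, iid, op, operand_ids):
        self.level2[iid] = {"op": op, "operands": list(operand_ids)}

    def preload(self, iid):
        """Build an issue-buffer entry from the stored instruction.

        Vector definitions are copied, so later edits to the entry
        do not change the table itself.
        """
        instr = self.level2[iid]
        return {
            "op": instr["op"],
            "operands": [dict(self.level1[v]) for v in instr["operands"]],
        }


table = ConfigTable()
table.define_vector("a", base=0, length=8)
table.define_vector("b", base=64, length=8)
table.define_instruction("add_ab", "vadd", ["a", "b"])

# Pre-load a frequently used instruction, then adjust one field before issue.
entry = table.preload("add_ab")
entry["operands"][0]["base"] = 16

issued = []           # stands in for the vector processing unit's input
issued.append(entry)
```

In this model the controlling CPU names a single table entry instead of writing every field of the wide instruction, which is the kind of saving the abstract refers to; the sketch makes no claim about the actual reduction factor.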

Archive | 2012

UTLEON3: Exploring Fine-Grain Multi-Threading in FPGAs

Martin Danek; Leos Kafka; Lukas Kohout; Jaroslav Sykora; Roman Bartosinski

This book describes the specification, microarchitecture, VHDL implementation and evaluation of a SPARC v8 CPU with fine-grain multi-threading, called micro-threading. The CPU, named UTLEON3, is an alternative platform for exploring CPU multi-threading that is compatible with the industry-standard GRLIB package. The processor microarchitecture was designed to map the data-flow scheme efficiently onto the classical von Neumann pipelined processing used in common processors, while retaining full binary compatibility with existing legacy programs.

design and diagnostics of electronic circuits and systems | 2010

Reconfigurable hardware objects for image processing on FPGAs

Jan Kloub; Petr Honzik; Martin Danek

Embedded systems are getting more complex, which is why a high level of abstraction is required during the development process. High-abstraction methods simplify the implementation of complex computation systems and shorten the time to market. This paper presents an implementation of a graphic computing element (GCE) which can be used as a runtime-parametrized building block in image processing applications in FPGAs. In terms of the object-oriented model, the GCE encapsulates its internal data representation and the rules for manipulating it. Several basic image processing operations have been implemented (Sobel edge detection, Gaussian and mean filtering, etc.). These operations are invoked as GCE methods. Because of the high spatial dependency of image data in image processing, an efficient image data reuse method has been implemented.
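
The object-oriented view in this abstract, where an element encapsulates image data and exposes filters as methods, can be sketched in software. The following Python example is an assumption-laden illustration rather than the GCE itself: the class name `GraphicElement`, the use of |gx| + |gy| as the Sobel magnitude, and leaving border pixels at zero are all choices made here for brevity.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
BOX = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]


class GraphicElement:
    """Hypothetical software analogue of an element with filter methods."""

    def __init__(self, pixels):
        # Internal data representation, hidden behind the methods below.
        self._pixels = [list(row) for row in pixels]

    def _correlate(self, kernel):
        """Apply a 3x3 kernel to interior pixels; border pixels stay 0."""
        h, w = len(self._pixels), len(self._pixels[0])
        out = [[0] * w for _ in range(h)]
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                out[y][x] = sum(
                    kernel[j][i] * self._pixels[y + j - 1][x + i - 1]
                    for j in range(3)
                    for i in range(3)
                )
        return out

    def sobel(self):
        """Edge strength approximated as |gx| + |gy|."""
        gx = self._correlate(SOBEL_X)
        gy = self._correlate(SOBEL_Y)
        return [[abs(a) + abs(b) for a, b in zip(rx, ry)]
                for rx, ry in zip(gx, gy)]

    def mean(self):
        """3x3 mean filter."""
        return [[v / 9 for v in row] for row in self._correlate(BOX)]


# A 4x4 image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
edges = GraphicElement(image).sobel()
print(edges[1])  # [0, 40, 40, 0]
```

Neighbouring 3x3 windows share six of their nine pixels, which is the spatial dependency that makes data reuse worthwhile; this sketch re-reads every pixel and does not model any reuse scheme.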

automation, robotics and control systems | 2011

Analysis of execution efficiency in the microthreaded processor UTLEON3

Jaroslav Sykora; Leos Kafka; Martin Danek; Lukas Kohout

We analyse the impact of long-latency instructions, the family blocksize parameter, and the thread switch modifier on the execution efficiency of families of threads in a single-core configuration of the UTLEON3 processor, which implements the SVP microthreading model. The analysis is supported by code execution in an FPGA implementation of the processor. By classifying long-latency operations as either pipelined (e.g. floating-point operations) or non-pipelined (e.g. cache faults), we show that the blocksize parameter, which controls resource utilization in the microthreaded processor, has profound effects when the latency is pipelined, i.e. increasing the blocksize can improve the performance. In the non-pipelined long-latency case the efficiency reaches its maximum even with a small value of blocksize, beyond which it cannot improve due to occupancy of an exclusive resource (memory bus congestion). The conclusions drawn in this paper can be used to optimize code compilation for the microthreaded processor. As the compiler specifies the blocksize parameter for each family of threads individually, it can optimize the register file utilization of the processor.
