Marcin Pietron

AGH University of Science and Technology

Publication

Featured research published by Marcin Pietron.

automation, robotics and control systems | 2013

Comparison of GPU and FPGA implementation of SVM algorithm for fast image segmentation

Marcin Pietron; Maciej Wielgosz; Dominik Zurek; Ernest Jamro; Kazimierz Wiatr

This paper presents preliminary implementation results of the SVM (Support Vector Machine) algorithm. SVM is a classification method that allows us to extract selected objects from a picture and assign them to an appropriate class. Consequently, a black-and-white image reflecting occurrences of the desired feature is derived from the original picture fed into the classifier. This work focuses primarily on the FPGA and GPU implementation aspects of the algorithm, as well as on a comparison of hardware and software performance. A human skin classifier was used as an example and implemented on an Intel Xeon E5645 2.40 GHz, a Xilinx Virtex-5 LX220 and an Nvidia Tesla M2090. It is worth emphasizing that in the FPGA implementation the critical hardware components were designed using an HDL (Hardware Description Language), whereas the less demanding or standard ones, such as communication interfaces, FIFOs and FSMs, were implemented in Impulse C. This approach allowed us both to cut design time and to preserve the high performance of the hardware classification module. In the GPU implementation, the whole algorithm is implemented in CUDA.

international joint conference on rough sets | 2016

Formal Analysis of HTM Spatial Pooler Performance Under Predefined Operation Conditions

Marcin Pietron; Maciej Wielgosz; Kazimierz Wiatr

This paper introduces a mathematical formalism for the Spatial Pooler (SP) of Hierarchical Temporal Memory (HTM), with special consideration for its hardware implementation. The performance of an HTM network and its ability to learn and adjust to the problem at hand are governed by a large set of parameters. Most of the parameters are codependent, which makes creating efficient HTM-based solutions challenging. It requires profound knowledge of the settings and their impact on the performance of the system. Consequently, this paper introduces a set of formulas intended to facilitate the design process by supplementing the tedious trial-and-error method with a tool for choosing initial parameters that enable quick learning convergence. This is especially important in hardware implementations, which are constrained by the limited resources of a platform.

international conference on conceptual structures | 2015

GPGPU for Difficult Black-box Problems

Marcin Pietron; Aleksander Byrski; Marek Kisiel-Dorohinicki

Difficult black-box problems arise in many scientific and industrial areas. In this paper, the efficient use of a hardware accelerator to implement dedicated solvers for such problems is discussed and studied, using the Golomb Ruler problem as an example. The actual solution of the problem is based on evolutionary and memetic algorithms accelerated on GPGPU. The presented results show that GPGPU outperforms CPU in some memetic algorithms, which can be used as part of a hybrid algorithm for finding near-optimal solutions of the Golomb Ruler problem. The presented research is part of building a heterogeneous parallel algorithm for the difficult black-box Golomb Ruler problem.
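
As a rough illustration of the benchmark named in this abstract: a Golomb ruler is a set of integer marks in which every pairwise difference is distinct. The sketch below counts repeated differences, a quantity a population-based solver could use as a penalty term. The function names and the penalty formulation are assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations


def golomb_violations(marks):
    """Count pairwise differences that repeat; 0 means a valid Golomb ruler."""
    seen = set()
    violations = 0
    for a, b in combinations(sorted(marks), 2):
        diff = b - a
        if diff in seen:
            violations += 1  # this distance was already measured by another pair
        else:
            seen.add(diff)
    return violations


def ruler_length(marks):
    """Length of the ruler: distance between the first and last mark."""
    return max(marks) - min(marks)


if __name__ == "__main__":
    print(golomb_violations([0, 1, 4, 9, 11]), ruler_length([0, 1, 4, 9, 11]))
    print(golomb_violations([0, 1, 2, 4]))
```

Each candidate ruler can be scored independently of the others, which is one reason evaluation of a whole population maps naturally onto many GPU threads.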

Computer Science | 2013

Accelerating SELECT WHERE and SELECT JOIN queries on a GPU

Marcin Pietron; Paweł Russek; Kazimierz Wiatr

This paper presents implementations of a few selected SQL operations using the CUDA programming framework on the GPU platform. Nowadays, the GPU's parallel architecture gives a high speed-up on certain problems. Therefore, the number of non-graphical problems that can be run and sped up on the GPU is still increasing. In particular, there has been a lot of research in data mining on GPUs. In many cases it proves the advantage of offloading processing from the CPU to the GPU. At the beginning of our project we chose the set of SELECT WHERE and SELECT JOIN instructions as the most common operations used in databases. We parallelized these SQL operations using three main mechanisms in CUDA: thread group hierarchy, shared memories, and barrier synchronization. Our results show that the implemented highly parallel SELECT WHERE and SELECT JOIN operations on the GPU platform can be significantly faster than the sequential ones in a database system run on the CPU.
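
One common way to structure a data-parallel SELECT WHERE is stream compaction: evaluate the predicate per row, compute an exclusive prefix sum over the resulting flags, and scatter matching rows to their output slots. The sketch below emulates those three phases sequentially in Python. It is an assumed pattern for illustration only; the abstract does not state that the paper uses this scheme, and `select_where` is a hypothetical name.

```python
from itertools import accumulate


def select_where(rows, predicate):
    """Emulate a three-phase parallel filter: flag, scan, scatter."""
    # Phase 1: each "thread" evaluates the predicate for one row.
    flags = [1 if predicate(row) else 0 for row in rows]

    # Phase 2: exclusive prefix sum gives each matching row its output index.
    inclusive = list(accumulate(flags))
    offsets = [0] + inclusive[:-1] if flags else []

    # Phase 3: matching rows are written to disjoint slots, so no write conflicts.
    output = [None] * sum(flags)
    for i, row in enumerate(rows):
        if flags[i]:
            output[offsets[i]] = row
    return output


if __name__ == "__main__":
    table = [(1, "a"), (5, "b"), (7, "c"), (2, "d")]
    # Roughly: SELECT * FROM table WHERE first_column > 3
    print(select_where(table, lambda row: row[0] > 3))
```

On a GPU, a barrier would separate the phases within a thread block, and shared memory would typically hold the per-block flags and partial sums.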

International Journal of Applied Mathematics and Computer Science | 2010

Loop profiling tool for HPC code inspection as an efficient method of FPGA based acceleration

Marcin Pietron; Pawel Russek; Kazimierz Wiatr

This paper presents research on FPGA-based acceleration of HPC applications. The most important goal is to extract code that can be sped up. A major drawback is the lack of a tool that could do this. HPC applications usually consist of a large amount of complex source code. This is one of the reasons why the process of acceleration should be as automated as possible. Another reason is to make use of HLLs (High Level Languages) such as Mitrion-C (Mohl, 2006). HLLs were invented to make the development of HPRC applications faster. Loop profiling is one of the steps for checking whether inserting HLL code into an existing HPC source code can accelerate the application. Hence the most important step in achieving acceleration is to extract the most time-consuming code and its data dependencies, which makes the code easier to pipeline and parallelize. Data dependency also gives information on how to implement algorithms in an FPGA circuit with minimal reinitialization during the execution of the algorithms.

Journal of Computational Science | 2017

Leveraging heterogeneous parallel platform in solving hard discrete optimization problems with metaheuristics

Marcin Pietron; Aleksander Byrski; Marek Kisiel-Dorohinicki

The research reported in the paper deals with difficult black-box problems solved by means of popular metaheuristic algorithms implemented on up-to-date parallel, multi-core, and many-core platforms. In consecutive publications we try to show how particular population-based techniques may further benefit from employing dedicated hardware such as GPGPU or FPGA, to which different parts of the computation are delegated in order to speed it up. The main contribution of this paper is an experimental study focused on profiling different ways of implementing the Scatter Search algorithm, especially delegating some of its selected components to GPGPU. As a result, concise know-how related to the implementation of a population-based metaheuristic similar to Scatter Search is presented, using a difficult discrete optimization problem, namely Golomb Ruler, as a benchmark.

international conference on agents and artificial intelligence | 2016

Parallel Implementation of Spatial Pooler in Hierarchical Temporal Memory

Marcin Pietron; Maciej Wielgosz; Kazimierz Wiatr

Hierarchical Temporal Memory (HTM) is a structure that models some of the structural and algorithmic properties of the neocortex. HTM is a biological model based on the memory-prediction theory of the brain. HTM is a method for discovering and learning observed input patterns and sequences, building increasingly complex models. HTM combines and extends approaches used in sparse distributed memory, Bayesian networks, and spatial and temporal clustering algorithms, using a tree-shaped hierarchy of neural networks. It is a relatively new model of the deep learning process, which is a very efficient technique in artificial intelligence algorithms. HTM, like other deep learning models (Boltzmann machines, deep belief networks, etc.), has a structure that can be efficiently processed by parallel machines. Modern multi-core processors with wide vector processing units (SSE, AVX) and GPGPUs are platforms that can tremendously speed up learning, classifying or clustering algorithms based on deep learning models (e.g. CUDA Toolkit 7.0). The current bottleneck of this new, flexible artificial intelligence model is efficiency. This article focuses on parallel processing of HTM learning algorithms on parallel hardware platforms. This work is the first on the implementation of the HTM architecture and its algorithms in hardware accelerators. The article does not study the quality of the algorithm.
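
To make the parallel structure concrete, here is a minimal sketch of two Spatial Pooler steps as they are commonly described: a per-column overlap computation followed by global k-winners inhibition. The data layout (a dict of synapse permanences per column), the threshold value and the function names are assumptions for illustration; the paper's actual data structures and kernels are not given in the abstract.

```python
def column_overlaps(input_bits, permanences, connected_threshold=0.5):
    """For each column, count connected synapses that see an active input bit."""
    scores = []
    for column in permanences:  # one column per (hypothetical) GPU thread
        score = sum(
            1
            for index, permanence in column.items()
            if permanence >= connected_threshold and input_bits[index]
        )
        scores.append(score)
    return scores


def active_columns(scores, k):
    """Global inhibition: keep the k columns with the highest non-zero overlap."""
    ranked = sorted(range(len(scores)), key=lambda c: (-scores[c], c))
    return sorted(c for c in ranked[:k] if scores[c] > 0)


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 0]
    # Each dict maps an input index to a synapse permanence (made-up values).
    perms = [
        {0: 0.6, 1: 0.7, 2: 0.4},
        {2: 0.8, 3: 0.9},
        {4: 0.9, 5: 0.9},
        {0: 0.5, 3: 0.2},
    ]
    scores = column_overlaps(bits, perms)
    print(scores, active_columns(scores, k=2))
```

The overlap of one column does not depend on any other column, so that step parallelizes trivially; the inhibition step needs a reduction or sort across columns.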

federated conference on computer science and information systems | 2016

Using Spatial Pooler of Hierarchical Temporal Memory for object classification in noisy video streams

Maciej Wielgosz; Marcin Pietron; Kazimierz Wiatr

This paper focuses on analyzing the ability of the Spatial Pooler (SP) of Hierarchical Temporal Memory (HTM) to facilitate object classification in noisy video streams. In particular, we seek to determine whether employing SP as a component of the video system increases overall robustness to noise. We have implemented our own version of HTM and applied it to object recognition tasks under various testing conditions. The system is composed of a video preprocessing block, a dimensionality reduction section which contains SP, a histogram collecting module and an SVM classifier. Our experiments involve assessing the performance of two different system setups (i.e. a version featuring SP and one without it) under various noise conditions with 32-frame video files. In order to make the tests fair and repeatable, the videos of several 3-D geometric shapes were artificially generated. Subsequently, Gaussian noise of varying intensity was introduced to the videos, making them more indistinct. Such an approach mimics real-life scenarios where the system is taught ideal objects and then, under normal working conditions, faces the challenge of detecting noisy ones. The results of the experiments reveal the superiority of the solution featuring the Spatial Pooler over the one without it. Furthermore, the system with SP also performed better in the experiment without a noise component introduced and achieved a mean F1-score of 0.91 in ten trials.

Computer Science | 2013

Comparison of Hybrid Sorting Algorithms Implemented on Different Parallel Hardware Platforms

Dominik Zurek; Marcin Pietron; Maciej Wielgosz; Kazimierz Wiatr

Sorting is a common problem in computer science. There are a lot of well-known sorting algorithms created for sequential execution on a single processor. Recently, hardware platforms have made it possible to create widely parallel algorithms. Standard processors consist of multiple cores, and hardware accelerators such as GPUs are available. Graphics cards, with their parallel architecture, offer new possibilities to speed up many algorithms. In this paper we describe the results of implementing a few different sorting algorithms on GPU cards and multicore processors. Then a hybrid algorithm is presented, which consists of parts executed on both platforms, the standard CPU and the GPU.
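
One plausible shape for such a hybrid sort is to split the input, sort each part on a different device, and merge the sorted parts. The sketch below uses Python's built-in `sorted` as a stand-in for both the GPU and the CPU sort, so it only illustrates the split-and-merge structure. The `gpu_share` parameter and the function name are hypothetical, and the paper's actual partitioning scheme is not described in the abstract.

```python
import heapq


def hybrid_sort(values, gpu_share=0.5):
    """Split the input, sort both parts independently, then merge them."""
    split = int(len(values) * gpu_share)

    # Stand-in for a GPU sort of the first part.
    gpu_part = sorted(values[:split])
    # Stand-in for a multicore CPU sort of the remainder.
    cpu_part = sorted(values[split:])

    # Merging two sorted runs is linear in the total length.
    return list(heapq.merge(gpu_part, cpu_part))


if __name__ == "__main__":
    print(hybrid_sort([5, 3, 9, 1, 4, 8], gpu_share=0.5))
```

In a real system the two sorts would run concurrently, and the share given to each device would be tuned so that both finish at about the same time.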

international conference on conceptual structures | 2017

Toward hybrid platform for evolutionary computations of hard discrete problems

Dominik Żurek; Kamil Piętak; Marcin Pietron; Marek Kisiel-Dorohinicki

The memetic agent-based paradigm, which combines evolutionary computation and local search techniques, is one of the promising meta-heuristics for solving large and hard discrete problems such as Low Autocorrelation Binary Sequence (LABS) or optimal Golomb ruler (OGR). In the paper, as a follow-up to previous research, a concept of a hybrid agent-based evolutionary systems platform, which spreads computations among CPU and GPU, is briefly introduced. The main part of the paper presents an efficient parallel GPU implementation of a LABS local optimization strategy. As a means of comparison, the speed-up of the GPU implementation over sequential and parallel CPU versions is shown. This constitutes a promising step toward building a hybrid platform that combines evolutionary meta-heuristics with highly efficient local optimization of chosen discrete problems.
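
For reference, the LABS objective is usually defined over a sequence of +1/-1 values as the sum of squared aperiodic autocorrelations, E(s) = sum over k of C_k(s)^2 with C_k(s) = sum over i of s_i * s_(i+k). The sketch below computes this energy and performs one steepest-descent single-flip step. The single-flip neighbourhood is an assumed example of a local search move; the abstract does not specify which local optimization strategy the paper implements.

```python
def labs_energy(seq):
    """Sum of squared aperiodic autocorrelations of a +1/-1 sequence."""
    n = len(seq)
    return sum(
        sum(seq[i] * seq[i + k] for i in range(n - k)) ** 2
        for k in range(1, n)
    )


def best_single_flip(seq):
    """Try flipping each position once and keep the lowest-energy result."""
    best_seq, best_energy = list(seq), labs_energy(seq)
    for i in range(len(seq)):  # each flip could be evaluated by its own thread
        candidate = list(seq)
        candidate[i] = -candidate[i]
        energy = labs_energy(candidate)
        if energy < best_energy:
            best_seq, best_energy = candidate, energy
    return best_seq, best_energy


if __name__ == "__main__":
    print(labs_energy([1, 1, 1, -1, -1, 1, -1]))  # a length-7 Barker sequence
    print(best_single_flip([1, 1, 1, 1]))
```

All single-flip neighbours can be evaluated independently, which is the kind of regular, fine-grained work that suits a GPU.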
