Dariusz Mrozek

Silesian University of Technology

Publication

Featured research published by Dariusz Mrozek.

Journal of Molecular Modeling | 2014

Parallel implementation of 3D protein structure similarity searches using a GPU and the CUDA

Dariusz Mrozek; Miłosz Brożek; Bożena Małysiak-Mrozek

Searching for similar 3D protein structures is one of the primary processes employed in the field of structural bioinformatics. However, the computational complexity of this process means that it is constantly necessary to search for new methods that can perform such a process faster and more efficiently. Finding molecular substructures that complex protein structures have in common is still a challenging task, especially when entire databases containing tens or even hundreds of thousands of protein structures must be scanned. Graphics processing units (GPUs) and general purpose graphics processing units (GPGPUs) can perform many time-consuming and computationally demanding processes much more quickly than a classical CPU can. In this paper, we describe the GPU-based implementation of the CASSERT algorithm for 3D protein structure similarity searching. This algorithm is based on the two-phase alignment of protein structures when matching fragments of the compared proteins. The GPU (GeForce GTX 560Ti: 384 cores, 2GB RAM) implementation of CASSERT (“GPU-CASSERT”) parallelizes both alignment phases and yields an average 180-fold increase in speed over its CPU-based, single-core implementation on an Intel Xeon E5620 (2.40GHz, 4 cores). In this paper, we show that massive parallelization of the 3D structure similarity search process on many-core GPU devices can reduce the execution time of the process, allowing it to be performed in real time. GPU-CASSERT is available at: http://zti.polsl.pl/dmrozek/science/gpucassert/cassert.htm.

Bioinformatics | 2014

Cloud4Psi: cloud computing for 3D protein structure similarity searching

Dariusz Mrozek; Bożena Małysiak-Mrozek; Artur Kłapciński

Summary: Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT), are still time-consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed a cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Availability and implementation: Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. Contact: [email protected]

Grid Computing | 2015

Scaling Ab Initio Predictions of 3D Protein Structures in Microsoft Azure Cloud

Dariusz Mrozek; Paweł Gosk; Bożena Małysiak-Mrozek

Computational methods for protein structure prediction allow us to determine the three-dimensional structure of a protein based solely on its amino acid sequence. These methods are a very important alternative to costly and slow experimental methods, like X-ray crystallography or Nuclear Magnetic Resonance. However, conventional calculations of protein structure are time-consuming and require ample computational resources, especially when carried out with the use of ab initio methods that rely on physical forces and interactions between atoms in a protein. Fortunately, at the present stage of the development of computer science, such huge computational resources are available from public cloud providers on a pay-as-you-go basis. We have designed and developed a scalable and extensible system, called Cloud4PSP, which enables predictions of 3D protein structures in the Microsoft Azure commercial cloud. The system makes use of the Warecki-Znamirowski method as a sample procedure for protein structure prediction, and this prediction method was used to test the scalability of the system. The efficiency tests showed good acceleration of predictions when scaling the system vertically and horizontally. In the paper, we show the system architecture that allowed us to achieve such good results, the Cloud4PSP processing model, and the results of the scalability tests. At the end of the paper, we try to answer which of the scaling techniques, scaling out or scaling up, is better for solving such computational problems with the use of cloud computing.

Computer Networks and ISDN Systems | 2013

CASSERT: A Two-Phase Alignment Algorithm for Matching 3D Structures of Proteins

Dariusz Mrozek; Bożena Małysiak-Mrozek

Protein structure alignment allows assessment of protein similarities and leads to knowledge of the nature of proteins themselves. In this paper, we present a new version of the two-phase alignment algorithm for matching protein structures, called CASSERT. The algorithm can be used to scan databases of protein structures when searching for protein similarities. The effectiveness of CASSERT was studied by comparing its results to those returned by the DALI algorithm. The tests performed confirm that the CASSERT algorithm exhibits high effectiveness in protein structure similarity searching and can be a useful tool in the identification of proteins and their functions.

Intelligent Information Systems | 2016

An efficient and flexible scanning of databases of protein secondary structures

Dariusz Mrozek; Bartek Socha; Stanisław Kozielski; Bożena Małysiak-Mrozek

Protein secondary structures describe protein construction in terms of regular spatial shapes, including alpha-helices, beta-strands, and loops, which a protein amino acid chain can adopt in some of its regions. This information supports protein classification, functional annotation, and 3D structure prediction. The relevance of this information and the scope of its practical applications create a need for its effective storage and processing. Relational databases, widely used in commercial systems in recent years, are one of the serious alternatives: honed by years of experience, enriched with developed technologies, equipped with the declarative SQL query language, and accepted by the large community of programmers. Unfortunately, relational database management systems are not designed for efficient storage and processing of biological data, such as protein secondary structures. In this paper, we present a new search method implemented in the search engine of the PSS-SQL language. PSS-SQL allows formulation of queries against a relational database in order to find proteins having secondary structures similar to the structural pattern specified by a user. In the paper, we show how the search process can be accelerated by multiple scanning of the Segment Index and parallel implementation of the alignment procedure using multiple threads working on multiple-core CPUs.
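The idea of scanning secondary structures for a user-specified pattern with several threads can be illustrated with a minimal Python sketch. It is not PSS-SQL and does not reproduce the Segment Index from the paper: the run-length segment encoding, the pattern format of (kind, min_len, max_len) tuples, and the function names to_segments, matches, and scan are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def to_segments(ss):
    """Run-length encode a secondary structure string, e.g. 'HHHEE' -> [('H', 3), ('E', 2)]."""
    return [(kind, len(list(run))) for kind, run in groupby(ss)]


def matches(segments, pattern):
    """Return True if some run of consecutive segments fits the pattern.

    pattern is a list of (kind, min_len, max_len) tuples (hypothetical format).
    """
    k = len(pattern)
    for start in range(len(segments) - k + 1):
        window = segments[start:start + k]
        if all(kind == p_kind and lo <= length <= hi
               for (kind, length), (p_kind, lo, hi) in zip(window, pattern)):
            return True
    return False


def scan(database, pattern, workers=4):
    """Return ids of proteins whose segments match the pattern, checked on a thread pool."""
    items = list(database.items())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(lambda item: matches(to_segments(item[1]), pattern), items))
    return [pid for (pid, _), hit in zip(items, flags) if hit]


# Toy database: H = helix, E = strand, C = coil/loop (made-up sequences).
database = {"p1": "CCHHHHHCCEEEC", "p2": "CCCCEEEECC", "p3": "HHHHCEEE"}
pattern = [("H", 3, 10), ("C", 1, 3), ("E", 2, 5)]
hits = scan(database, pattern)
```

In CPython the global interpreter lock limits how much pure-Python work threads can speed up, so a process pool would be closer to the multi-core behaviour the abstract describes; the thread pool here only shows the structure of the parallel scan.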

IEEE International Conference on Fuzzy Systems | 2007

An Optimal Alignment of Proteins Energy Characteristics with Crisp and Fuzzy Similarity Awards

Dariusz Mrozek; Bożena Małysiak; Stanisław Kozielski

We discuss the usage of constant and fuzzy similarity awards while establishing an optimal alignment between energy characteristics of two compared protein energy profiles. A single protein energy profile is a set of energy characteristics for various types of energy. The energy profile is determined for a given protein structure. We use these profiles to find protein molecules of the same structural protein family and to inspect conformational modifications in their molecular structures as an effect of biochemical reactions or environmental influences. Energy profiles are obtained in a computational process based on molecular mechanics theory. Afterwards, these profiles can be stored in a special-purpose database (EDB) and used by the search engine to find similar fragments of protein structures on the energy level. To optimize the alignment path, we use a modified, energy-adapted Smith-Waterman method with one of the tested similarity awards.
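The difference between a constant (crisp) and a fuzzy similarity award can be sketched in Python on top of a textbook Smith-Waterman local alignment over numeric sequences. This is not the authors' modified, energy-adapted method: the tolerance values, the linear gap penalty, the shape of the fuzzy award, and the names crisp_award, fuzzy_award, and smith_waterman are assumptions chosen for illustration.

```python
def crisp_award(a, b, tol=0.5):
    """Constant award: +1 if the two energy values differ by at most tol, else -1."""
    return 1.0 if abs(a - b) <= tol else -1.0


def fuzzy_award(a, b, core=0.5, support=2.0):
    """Graded award: +1 inside the core, falling linearly to -1 at the support."""
    d = abs(a - b)
    if d <= core:
        return 1.0
    if d >= support:
        return -1.0
    return 1.0 - 2.0 * (d - core) / (support - core)


def smith_waterman(x, y, award, gap=1.0):
    """Return the best local alignment score between numeric sequences x and y."""
    rows, cols = len(x) + 1, len(y) + 1
    h = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            h[i][j] = max(
                0.0,                                       # start a new local alignment
                h[i - 1][j - 1] + award(x[i - 1], y[j - 1]),  # align x[i-1] with y[j-1]
                h[i - 1][j] - gap,                         # gap in y
                h[i][j - 1] - gap,                         # gap in x
            )
            best = max(best, h[i][j])
    return best


# Made-up energy values for two short profile fragments.
profile_a = [1.0, 2.0, 3.0]
profile_b = [1.2, 2.1, 9.0]
crisp_score = smith_waterman(profile_a, profile_b, crisp_award)
fuzzy_score = smith_waterman(profile_a, profile_b, fuzzy_award)
```

With the crisp award, a pair of values either counts fully or is penalised fully; the fuzzy award instead gives partial credit to values that are close but outside the core tolerance, which changes how far an alignment extends through noisy regions.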

Information Sciences | 2016

HDInsight4PSi: Boosting performance of 3D protein structure similarity searching with HDInsight clusters in Microsoft Azure cloud

Dariusz Mrozek; Paweł Daniłowicz; Bożena Małysiak-Mrozek

3D protein structure similarity searching is one of the important processes performed in structural bioinformatics, since it allows for protein function identification and reconstruction of phylogeny for weakly related organisms. Due to the complexity of 3D protein structures and the exponential growth of protein structures in public repositories, like the Protein Data Bank, the process is time-consuming and requires increased computational resources. This makes it necessary to prepare computer systems that can deal with such huge volumes of macromolecular data. In this paper, we show how 3D protein structure similarity searching can be performed in parallel by distributing MapReduce jobs on the HDInsight cluster in the Microsoft Azure commercial cloud. Our solution combines the use of two important computing paradigms that have gained popularity in recent years: Hadoop/MapReduce and Cloud computing. Our experiments performed with the use of the whole repository of protein structures from the Protein Data Bank confirm that such a technological fusion is very beneficial and can be successfully applied when performing time-consuming computations over biological data. Moreover, appropriate preparation of data reduces the time needed for computations and significantly accelerates the similarity searching.
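The MapReduce pattern mentioned in the abstract can be sketched in plain Python without Hadoop or HDInsight. The sketch splits a database into chunks, scores every candidate against the query in a map phase, and keeps the best hits per query in a reduce phase. The scoring function toy_score is a placeholder that compares strings position by position and is not a structural alignment method; all names and data below are hypothetical.

```python
from collections import defaultdict


def toy_score(query, candidate):
    """Placeholder similarity: share of matching positions (NOT a real structural aligner)."""
    longest = max(len(query), len(candidate))
    if longest == 0:
        return 0.0
    matching = sum(1 for a, b in zip(query, candidate) if a == b)
    return matching / longest


def map_phase(query_id, query, chunk):
    """Emit (query_id, (candidate_id, score)) for every candidate in one database chunk."""
    for cand_id, cand in chunk:
        yield query_id, (cand_id, toy_score(query, cand))


def reduce_phase(pairs, top_k=2):
    """Group emitted pairs by query id and keep the top_k best-scoring candidates."""
    grouped = defaultdict(list)
    for key, hit in pairs:
        grouped[key].append(hit)
    return {key: sorted(hits, key=lambda h: h[1], reverse=True)[:top_k]
            for key, hits in grouped.items()}


def run_job(query_id, query, database, n_chunks=2, top_k=2):
    """Simulate a job: split the database, map each chunk, then reduce all results."""
    chunks = [database[i::n_chunks] for i in range(n_chunks)]
    emitted = [pair for chunk in chunks for pair in map_phase(query_id, query, chunk)]
    return reduce_phase(emitted, top_k)


# Made-up database entries encoded as strings for the placeholder score.
database = [("A", "HHHEEE"), ("B", "HHHHHH"), ("C", "CCCCCC")]
result = run_job("q", "HHHEEE", database)
```

On a real cluster each chunk would be processed by a separate mapper task on a different node; here the chunks are processed sequentially, so the sketch shows only the data flow, not the speed-up.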

Archive | 2014

Beyond Databases, Architectures, and Structures

Stanisław Kozielski; Dariusz Mrozek; Pawel Kasprowski; Bożena Małysiak-Mrozek; Daniel Kostrzewa

The paper considers the problem of predicting a probability distribution. We take into account an extrapolation model based on the evolution of quantiles. Any concrete model may be used that allows tracking and extrapolating the boundaries of buckets of an equi-height histogram. This histogram with p+1 boundaries is equivalent to p-quantiles. Using such a baseline extrapolation model, we may obtain lines of locations of bucket boundaries that may intersect in the future. To avoid intersections and to extend (in time) the correctness of the results, we propose to use a model of a continuous dynamical system with viscous resistance forces for obtaining improved lines of locations. The proposed model yields lines with unchanged or very similar shapes (compared to the results from the baseline extrapolation model) but without any intersections. This approach is helpful when a previously used baseline extrapolation model remains valid for too short a time. The work was inspired by the problem of predicting an attribute value distribution used for query selectivity estimation. However, the proposed method may be applied beyond the query optimization problem.
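The relation between an equi-height histogram and quantiles can be shown with a short Python sketch based on the standard library. It computes the p+1 bucket boundaries for p buckets as the minimum, the p-1 inner quantile cut points, and the maximum. The function name equi_height_boundaries and the choice of the "inclusive" interpolation method are assumptions; the extrapolation model and the dynamical system from the abstract are not covered.

```python
from statistics import quantiles


def equi_height_boundaries(values, p):
    """Return the p+1 boundaries of an equi-height histogram with p buckets."""
    data = sorted(values)
    # quantiles() returns the p-1 inner cut points that split the data into p equal-count parts.
    inner = quantiles(data, n=p, method="inclusive")
    return [data[0]] + inner + [data[-1]]


# Made-up attribute values: the integers 0..100.
boundaries = equi_height_boundaries(range(101), 4)
```

Each bucket between two neighbouring boundaries holds roughly the same number of values, which is why tracking how the boundaries move over time is the same as tracking how the quantiles evolve.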

International Conference on Computational Collective Intelligence | 2010

Improving performance of protein structure similarity searching by distributing computations in hierarchical multi-agent system

Alina Momot; Bożena Małysiak-Mrozek; Stanisław Kozielski; Dariusz Mrozek; Łukasz Hera; Michał Momot

Since protein structure similarity searching is very complex and time-consuming, one of the possible acceleration methods is parallelization by distributing the calculation across multiple computers. In the paper, we present a theoretical model of a hierarchical multi-agent system dedicated to the task of protein structure similarity searching. We also show the results of several numerical experiments confirming the suitability of such distribution for the similarity searching performed for the Muconate Lactonizing Enzyme (PDB ID = 1MUC) from the Protein Data Bank (PDB) against a database containing almost a thousand randomly chosen molecules.

International Conference of the IEEE Engineering in Medicine and Biology Society | 2010

PSS-SQL: Protein Secondary Structure - Structured Query Language

Dariusz Mrozek; Dominika Wieczorek; Bożena Małysiak-Mrozek; Stanisław Kozielski

Secondary structure representation of proteins provides important information regarding the general construction and shape of a protein. This representation is often used in protein similarity searching. Since existing commercial database management systems do not offer integrated exploration methods for biological data, e.g., at the level of the SQL language, structural similarity searching is usually performed by external tools. In the paper, we present our newly developed PSS-SQL language, which allows searching a database in order to identify proteins having a secondary structure similar to the structure specified by the user in a PSS-SQL query. We thereby provide a simple and declarative language for protein structure similarity searching.
