
Publication


Featured research published by Jaroslav Zendulka.


PLOS Computational Biology | 2014

PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations

Jaroslav Bendl; Jan Štourač; Ondrej Salanda; Antonín Pavelka; Eric D. Wieben; Jaroslav Zendulka; Jan Brezovsky; Jiri Damborsky

Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement are hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, which significantly improved prediction performance and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp.
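
The consensus idea in this abstract can be illustrated with a minimal sketch. It assumes a plain weighted vote over per-tool calls; the weights and the example calls are hypothetical, and this is not the published PredictSNP scoring scheme. Tools that return no result for a mutation are skipped, so the vote is still defined when some tools are silent.

```python
from typing import Dict, Optional


def consensus_vote(predictions: Dict[str, Optional[int]],
                   weights: Dict[str, float]) -> float:
    """Weighted vote over per-tool calls.

    predictions maps a tool name to +1 (deleterious), -1 (neutral),
    or None when the tool returned no result for this mutation.
    weights maps a tool name to a non-negative weight (hypothetical values;
    tools without an entry default to 1.0).
    Returns a score in [-1, 1]; positive leans deleterious, 0.0 is undecided.
    """
    total = 0.0
    weight_sum = 0.0
    for tool, call in predictions.items():
        if call is None:
            continue  # skip tools with no prediction
        w = weights.get(tool, 1.0)
        total += w * call
        weight_sum += w
    if weight_sum == 0.0:
        return 0.0  # no tool produced a call
    return total / weight_sum


# Hypothetical calls for one mutation; MAPP gives no result here.
calls = {"SIFT": 1, "PolyPhen-2": 1, "MAPP": None, "SNAP": -1}
tool_weights = {"SIFT": 0.75, "PolyPhen-2": 0.75, "MAPP": 0.5, "SNAP": 0.5}
score = consensus_vote(calls, tool_weights)
```

In this toy case two tools outvote one, so the score is positive; a real consensus would derive its weights from benchmark performance rather than fixing them by hand.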


PLOS Computational Biology | 2016

PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions.

Jaroslav Bendl; Miloš Musil; Jan Štourač; Jaroslav Zendulka; Jiří Damborský; Jan Brezovský

An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools’ predictions and their consensus scores in a user-understandable format tailored to the specific features of different categories of variations. To enable comprehensive evaluation of variants, the predictions are complemented with annotations from eight databases. The web server is freely available to the community at http://loschmidt.chemi.muni.cz/predictsnp2.
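
The category-optimal decision thresholds mentioned in this abstract can be sketched as follows. The selection criterion (maximum accuracy on labelled data), the function names and the toy scores are assumptions made for illustration; the abstract does not state how the thresholds were actually chosen.

```python
from typing import Dict, List, Tuple

# (variant category, numeric tool score, is the variant deleterious?)
Sample = Tuple[str, float, bool]


def accuracy_at(threshold: float, scored: List[Tuple[float, bool]]) -> float:
    """Fraction of variants classified correctly when score >= threshold means deleterious."""
    correct = sum((score >= threshold) == label for score, label in scored)
    return correct / len(scored)


def fit_thresholds(samples: List[Sample]) -> Dict[str, float]:
    """Pick, for each category separately, the observed score that maximises accuracy."""
    by_category: Dict[str, List[Tuple[float, bool]]] = {}
    for category, score, label in samples:
        by_category.setdefault(category, []).append((score, label))
    thresholds = {}
    for category, scored in by_category.items():
        candidates = sorted({score for score, _ in scored})
        thresholds[category] = max(candidates, key=lambda t: accuracy_at(t, scored))
    return thresholds


def classify(category: str, score: float, thresholds: Dict[str, float]) -> bool:
    """Turn a numeric score into a binary call using the category's own threshold."""
    return score >= thresholds[category]


# Toy data: the same score scale behaves differently in the two categories.
samples = [
    ("missense", 0.2, False), ("missense", 0.4, False),
    ("missense", 0.6, True), ("missense", 0.9, True),
    ("regulatory", 0.1, False), ("regulatory", 0.3, True),
    ("regulatory", 0.5, True),
]
thresholds = fit_thresholds(samples)
```

With these toy values a score of 0.4 is called deleterious for a regulatory variant but neutral for a missense variant, which is the kind of category dependence a single global threshold would miss.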


NOSTRADAMUS | 2013

Wavelet Based Feature Extraction for Clustering of Be Stars

Pavla Bromová; Petr Skoda; Jaroslav Zendulka

The goal of our work is to create a feature extraction method for classification of Be stars. Be stars are characterized by prominent emission lines in their spectrum. We focus on the automated classification of Be stars based on typical shapes of their emission lines. We aim to design a reduced, specific set of features characterizing and discriminating the shapes of Be lines. In this paper, we present a feature extraction method based on the wavelet transform and its power spectrum. Both the discrete and continuous wavelet transform are used. Different feature vectors are created and compared on clustering of Be star spectra from the archive of the Astronomical Institute of the Academy of Sciences of the Czech Republic. The clustering is performed using the k-means algorithm. The results of our method are promising and encourage more detailed analysis.
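
A minimal sketch of the general pipeline described here (wavelet-based power features followed by k-means) might look as follows. The Haar wavelet, the per-level mean-power feature, the deterministic centroid initialisation and the toy spectra are all assumptions for illustration; the paper's own feature vectors are not specified in the abstract.

```python
import math
from typing import List, Sequence


def haar_power_features(signal: Sequence[float]) -> List[float]:
    """Mean squared Haar detail coefficient at each decomposition level.

    The signal length must be a power of two (at least 2).
    """
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two, at least 2")
    approx = list(signal)
    features = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        features.append(sum(d * d for d in detail) / len(detail))
    return features


def kmeans(points: List[List[float]], k: int, iterations: int = 20) -> List[int]:
    """Plain Lloyd's k-means, initialised with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy "spectra": two flat continua and two with a single emission-like peak.
spectra = [
    [1.0] * 8,
    [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0],
    [2.0] * 8,
    [1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0],
]
feature_vectors = [haar_power_features(s) for s in spectra]
labels = kmeans(feature_vectors, k=2)
```

Because the features measure power per scale rather than position, the two peaked spectra end up with nearly identical feature vectors even though their peaks sit in different bins.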


Nucleic Acids Research | 2017

FireProt: web server for automated design of thermostable proteins

Miloš Musil; Jan Štourač; Jaroslav Bendl; Jan Brezovský; Zbyněk Prokop; Jaroslav Zendulka; Tomáš Martínek; David Bednář; Jiří Damborský

There is a continuous interest in increasing protein stability to enhance the usability of proteins in numerous biomedical and biotechnological applications. A number of in silico tools for the prediction of the effect of mutations on protein stability have been developed recently. However, only single-point mutations with a small effect on protein stability are typically predicted with the existing tools and have to be followed by laborious protein expression, purification, and characterization. Here, we present FireProt, a web server for the automated design of multiple-point thermostable mutant proteins that combines structural and evolutionary information in its calculation core. FireProt utilizes sixteen tools and three protein engineering strategies for making reliable protein designs. The server is complemented with an interactive, easy-to-use interface that allows users to directly analyze and optionally modify designed thermostable mutants. FireProt is freely available at http://loschmidt.chemi.muni.cz/fireprot.


Bioinformatics | 2017

pqsfinder: an exhaustive and imperfection-tolerant search tool for potential quadruplex-forming sequences in R

Jiří Hon; Tomáš Martínek; Jaroslav Zendulka; Matej Lexa

Motivation: G-quadruplexes (G4s) are one of the non-B DNA structures easily observed in vitro and assumed to form in vivo. The latest experiments with G4-specific antibodies and G4-unwinding helicase mutants confirm this conjecture. These four-stranded structures have also been shown to influence a range of molecular processes in cells. As G4s are intensively studied, it is often desirable to screen DNA sequences and pinpoint the precise locations where they might form. Results: We describe and have tested a newly developed Bioconductor package for identifying potential quadruplex-forming sequences (PQS). The package is easy-to-use, flexible and customizable. It allows for sequence searches that accommodate possible divergences from the optimal G4 base composition. A novel aspect of our research was the creation and training (parametrization) of an advanced scoring model which resulted in increased precision compared to similar tools. We demonstrate that the algorithm behind the searches has a 96% accuracy on 392 currently known and experimentally observed G4 structures. We also carried out searches against the recent G4-seq data to verify how well we can identify the structures detected by that technology. The correlation with pqsfinder predictions was 0.622, higher than the correlation 0.491 obtained with the second best G4Hunter. Availability and implementation: http://bioconductor.org/packages/pqsfinder/. This paper is based on pqsfinder-1.4.1. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
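
For orientation, the sketch below shows what screening a sequence for potential quadruplex-forming sequences means in its simplest form. It uses the widely known regular-expression folding rule (four runs of at least three guanines separated by loops of 1 to 7 bases), written here in Python. It is not the pqsfinder algorithm: it tolerates no imperfections, applies no scoring model, and scans only the strand it is given.

```python
import re
from typing import List, Tuple

# Four G-runs (>= 3 guanines each) separated by three loops of 1-7 bases.
PQS_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_pqs(sequence: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, matched sequence) for non-overlapping matches.

    Positions are 0-based and end-exclusive, as in Python slicing.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


hits = find_pqs("TTGGGAGGGTGGGAGGGTT")
```

A strict pattern like this misses G4s with bulges or mismatches in the G-runs, which is the gap an imperfection-tolerant search is meant to close.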


Cooperative Information Systems | 2003

Mining Association Rules from Relational Data – Average Distance Based Method

Vladimír Bartík; Jaroslav Zendulka

The paper describes a new method for association rule discovery in relational databases that contain both quantitative and categorical attributes. Most of the methods developed in the past are based on an initial equi-depth discretization of quantitative attributes, which causes a loss of information. Distance-based methods are an alternative that tries to respect the semantics of the data. The basic idea of the new method is to process categorical and quantitative attributes separately. The first step finds frequent itemsets containing only values of categorical attributes, and then quantitative attributes are processed one by one. Discretization of values during the processing of quantitative attributes is distance-based. A new measure called average distance is introduced for this purpose. The paper describes the method and the results of several experiments on real-world data.


International Journal of Machine Learning and Computing | 2013

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm

Martin Hlosta; Rostislav Stríž; Jan Kupčík; Jaroslav Zendulka; Tomáš Hruška

Imbalance in data classification is a frequently discussed problem that is not well handled by classical classification techniques. The problem we tackled was to learn a binary classification model from large data with an accuracy constraint for the minority class. We propose a new meta-learning method that creates initial models using cost-sensitive learning by logistic regression and uses these models as initial chromosomes for a genetic algorithm. The method has been successfully tested on a large real-world data set from our internet security research. Experiments prove that our method always leads to better results than using logistic regression or a genetic algorithm alone. Moreover, this method produces an easily understandable classification model.


Central European Journal of Computer Science | 2012

Mining moving object data

Jaroslav Zendulka; Martin Pešek

Currently many devices provide information about moving objects and location-based services that accumulate a huge volume of moving object data, including trajectories. This paper deals with two useful analysis tasks: mining moving object patterns and trajectory outlier detection. We also present our experience with the TOP-EYE trajectory outlier detection algorithm, which we applied to two real-world data sets.


Database and Expert Systems Applications | 2007

Visual Surveillance Metadata Management

Petr Chmelar; Jaroslav Zendulka

The paper deals with a solution for visual surveillance metadata management. Data coming from many cameras is annotated using computer vision units to produce metadata representing moving objects in their states. It is assumed that the data is often uncertain and noisy and that some states are missing. The solution consists of the following three layers: (a) Data cleaning layer - improves the quality of the data by smoothing it and by filling in missing states in short sequences referred to as tracks, which represent a composite state of a moving object in a spatiotemporal subspace followed by one camera, (b) Data integration layer - assigns a global identity to tracks that represent the same object, (c) Persistence layer - manages the metadata in a database so that it can be used for online identification and offline querying, analyzing and mining. A Kalman filter technique is used to solve (a), and a classification based on the moving object's state and its visual properties is used in (b). An object model for layer (c) is presented too.
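
The data cleaning step (a), smoothing noisy states and filling in missing ones with a Kalman filter, can be illustrated with a minimal scalar sketch. The random-walk state model, the noise parameters `q` and `r`, and the function name are assumptions chosen for brevity; the abstract does not give the state model actually used.

```python
from typing import List, Optional


def kalman_fill(measurements: List[Optional[float]],
                q: float = 0.01, r: float = 1.0,
                x0: float = 0.0, p0: float = 1.0) -> List[float]:
    """Scalar Kalman filter over a track with possibly missing measurements.

    Assumed model: x_t = x_{t-1} + process noise (variance q),
                   z_t = x_t + measurement noise (variance r).
    A missing measurement (None) is filled with the predicted state.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state carries over, the uncertainty grows.
        x_pred, p_pred = x, p + q
        if z is None:
            # No observation: keep the prediction as the filled-in state.
            x, p = x_pred, p_pred
        else:
            # Update: blend prediction and measurement by the Kalman gain.
            gain = p_pred / (p_pred + r)
            x = x_pred + gain * (z - x_pred)
            p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates


# Hypothetical track of one coordinate with two missing states.
track = [5.0, 5.2, None, 4.9, None, 5.1]
smoothed = kalman_fill(track, x0=5.0)
```

A tracker for real camera data would more plausibly use a multi-dimensional state with position and velocity, so that gaps are bridged along the direction of motion rather than held constant.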


Knowledge Based Systems | 2018

Are we meeting a deadline? classification goal achievement in time in the presence of imbalanced data

Martin Hlosta; Zdenek Zdrahal; Jaroslav Zendulka

This paper addresses the problem of a finite set of entities which are required to achieve a goal within a predefined deadline. For example, a group of students is supposed to submit homework by a specified cutoff. Further, we are interested in predicting which entities will achieve the goal within the deadline. The predictive models are built based only on the data from that population. The predictions are computed at various time instants by taking into account updated data about the entities. The first contribution of the paper is a formal description of the problem. The important characteristic of the proposed method for model building is the use of the properties of entities that have already achieved the goal. We call such an approach “Self-Learning”. Since typically only a few entities have achieved the goal at the beginning and their number gradually grows, the problem is inherently imbalanced. To mitigate the curse of imbalance, we improved the Self-Learning method by tackling information loss and by several sampling techniques. The original Self-Learning and the modifications have been evaluated in a case study for predicting submission of the first assessment in distance higher education courses. The results show that the proposed improvements outperform the specified two baseline models and the original Self-Learner, and also that the best results are achieved if domain-driven techniques are utilised to tackle the imbalance problem. We also showed that these improvements are statistically significant using the Wilcoxon signed-rank test.

Collaboration



Top Co-Authors

Michal Šebek

Brno University of Technology

Petr Chmelar

Brno University of Technology

Tomáš Martínek

Brno University of Technology
