Publication


Featured research published by Andrea Szabóová.


BMC Bioinformatics | 2012

Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search

Andrea Szabóová; Ondřej Kuželka; Filip Železný; Jakub Tolar

We contribute a novel, ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad-hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.
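The Monte-Carlo exploration described in this abstract can be illustrated with a much simplified sketch. It is not the authors' procedure: the function name `ball_histogram`, the uniform sampling of ball centers inside the bounding box, and the input format (a list of residue type and coordinate pairs) are all assumptions made for illustration. The sketch samples balls of a fixed radius and records how often each combination of counts of the selected amino-acid types falls inside a ball.

```python
import random
from collections import Counter


def ball_histogram(residues, template, radius, n_samples, seed=0):
    """Estimate a histogram of residue-count tuples by random ball sampling.

    residues: list of (residue_type, (x, y, z)) pairs (hypothetical format).
    template: tuple of residue types whose counts are tracked.
    Returns a dict mapping count tuples to their sampled frequency.
    """
    rng = random.Random(seed)
    r2 = radius * radius
    xs, ys, zs = zip(*(pos for _, pos in residues))
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))
    hist = Counter()
    for _ in range(n_samples):
        # Assumption: ball centers are drawn uniformly from the bounding box.
        center = tuple(rng.uniform(a, b) for a, b in zip(lo, hi))
        counts = Counter()
        for kind, pos in residues:
            inside = sum((p - c) ** 2 for p, c in zip(pos, center)) <= r2
            if kind in template and inside:
                counts[kind] += 1
        hist[tuple(counts[k] for k in template)] += 1
    return {key: value / n_samples for key, value in hist.items()}


toy_protein = [("ARG", (0.0, 0.0, 0.0)), ("LYS", (1.0, 0.0, 0.0)), ("GLY", (0.0, 1.0, 0.0))]
features = ball_histogram(toy_protein, ("ARG", "LYS"), radius=10.0, n_samples=50)
```

In this toy call the radius is larger than the whole structure, so every sampled ball contains one ARG and one LYS. With a realistic radius, the resulting frequencies could serve as a feature vector for a standard classifier.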


European Conference on Machine Learning | 2011

Gaussian logic for predictive classification

Ondřej Kuželka; Andrea Szabóová; Matěj Holec; Filip Železný

We describe a statistical relational learning framework called Gaussian Logic that works efficiently with combinations of relational and numerical data. The framework assumes that, for a fixed relational structure, the numerical data can be modelled by a multivariate normal distribution. We demonstrate how the Gaussian Logic framework can be applied to predictive classification problems. In experiments, we first show an application of the framework for the prediction of DNA-binding propensity of proteins. Next, we show how the Gaussian Logic framework can be used to find motifs describing highly correlated gene groups in gene-expression data, which are then used in a set-level-based classification method.
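The modelling assumption in this abstract (numerical data following a multivariate normal distribution for a fixed relational structure) can be illustrated with a minimal sketch. It covers only the Gaussian part, not the relational machinery of the framework. It assumes that two numerical values have already been extracted per example for one fixed structure, that classes have equal priors, and that one Gaussian is fitted per class; the function names are hypothetical.

```python
import math


def fit_gaussian(rows):
    """Fit a bivariate normal (mean and covariance) to a list of (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows) / n
    syy = sum((y - my) ** 2 for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows) / n
    return (mx, my), (sxx, sxy, syy)


def log_density(point, model):
    """Log-density of a bivariate normal, using the closed-form 2x2 inverse."""
    (mx, my), (sxx, sxy, syy) = model
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    mahalanobis = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (mahalanobis + math.log(det)) - math.log(2 * math.pi)


def classify(point, models):
    """Pick the class whose Gaussian gives the point the highest density."""
    return max(models, key=lambda label: log_density(point, models[label]))


models = {
    "binding": fit_gaussian([(0, 0), (1, 0), (0, 1), (1, 1)]),
    "non-binding": fit_gaussian([(5, 5), (6, 5), (5, 6), (6, 6)]),
}
label = classify((0.5, 0.5), models)
```

The class labels and toy data are invented for illustration; the sketch breaks down if a class has a singular covariance matrix.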


Proteome Science | 2012

Prediction of DNA-binding proteins from relational features

Andrea Szabóová; Ondřej Kuželka; Filip Železný; Jakub Tolar

Background: The process of protein-DNA binding has an essential role in the biological processing of genetic information. We use relational machine learning to predict DNA-binding propensity of proteins from their structures. Automatically discovered structural features are able to capture some characteristic spatial configurations of amino acids in proteins. Results: Prediction based only on structural relational features already achieves results competitive with existing methods based on physicochemical properties on several protein datasets. Predictive performance is further improved when structural features are combined with physicochemical features. Moreover, the structural features provide some insights not revealed by physicochemical features. Our method is able to detect common spatial substructures. We demonstrate this in experiments with zinc finger proteins. Conclusions: We introduced a novel approach for DNA-binding propensity prediction using relational machine learning, which could potentially also be used for protein function prediction in general.


Inductive Logic Programming | 2012

Bounded Least General Generalization

Ondřej Kuželka; Andrea Szabóová; Filip Železný

We study a generalization of Plotkin’s least general generalization. We introduce a novel concept called bounded least general generalization w.r.t. a set of clauses and show an instance of it for which polynomial-time reduction procedures exist. We demonstrate the practical utility of our approach in experiments on several relational learning datasets.
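For background, the classical least general generalization of two first-order terms (anti-unification), which this paper generalizes, can be sketched as follows. This is the textbook term-level operation, not the bounded variant over clauses introduced in the paper. The term encoding (tuples for compound terms, strings for constants) and the variable naming scheme are assumptions made for illustration.

```python
def lgg(s, t, table=None):
    """Least general generalization of two terms by anti-unification.

    Compound terms are tuples ("functor", arg1, ...); constants are strings.
    Each distinct pair of differing subterms maps to one shared variable.
    """
    if table is None:
        table = {}
    if s == t:
        return s
    same_shape = (
        isinstance(s, tuple)
        and isinstance(t, tuple)
        and s[0] == t[0]
        and len(s) == len(t)
    )
    if same_shape:
        # Same functor and arity: generalize argument by argument.
        return (s[0],) + tuple(lgg(a, b, table) for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:
        table[(s, t)] = f"V{len(table)}"
    return table[(s, t)]


general = lgg(("p", ("f", "a"), "a"), ("p", ("f", "b"), "b"))
# general == ("p", ("f", "V0"), "V0")
```

Reusing the same variable for a repeated pair of subterms is what keeps the result least general: here both occurrences of the pair (a, b) map to `V0`.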


Journal of Intelligent Information Systems | 2014

A method for reduction of examples in relational learning

Ondřej Kuželka; Andrea Szabóová; Filip Železný

Feature selection methods often improve the performance of attribute-value learning. We explore whether also in relational learning, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster in the case of nFOIL, achieving an order-of-magnitude speed up on one of the data sets, and more accurate in the case of Aleph.


NFMCP'12: Proceedings of the First International Conference on New Frontiers in Mining Complex Patterns | 2012

Reducing examples in relational learning with bounded-treewidth hypotheses

Ondřej Kuželka; Andrea Szabóová; Filip Železný

Feature selection methods often improve the performance of attribute-value learning. We explore whether also in relational learning, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. The bounded treewidth bias can be replaced by other assumptions such as acyclicity with similar benefits. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster for nFOIL, achieving an order-of-magnitude speed up on one of the data sets, and more accurate for Aleph.


Bioinformatics and Biomedicine | 2012

Extending the ball-histogram method with continuous distributions and an application to prediction of DNA-binding proteins

Ondřej Kuželka; Andrea Szabóová; Filip Železný

We introduce a novel method for prediction of DNA-binding propensity of proteins which extends our recently introduced ball-histogram method (Szabóová et al. 2012). Unlike the original ball-histogram method, it allows handling of continuous properties of protein regions. In experiments on four datasets of proteins, we show that the method improves upon the original ball-histogram method as well as other existing methods in terms of predictive accuracy.


International Symposium on Bioinformatics Research and Applications | 2011

Prediction of DNA-binding propensity of proteins by the ball-histogram method

Andrea Szabóová; Ondřej Kuželka; E Sergio Morales; Filip Železný; Jakub Tolar

We contribute a novel, ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad-hoc set of features describing the charged patches of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of charged amino acids, capturing joint probabilities of specified amino acids occurring at certain distances from each other. This exploration yields a model for the prediction of DNA-binding propensity. We validate our method in prediction experiments, achieving favorable accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.


Bioinformatics and Biomedicine | 2011

Brdicka curve — A new source of biomarkers

Lenka Vyslouzilova; Vojtech Adam; Andrea Szabóová; Olga Stepankova; Rene Kizek; Jiri Anyz

This paper analyzes voltammograms resulting from the Brdicka reaction, the graphs currently used most often to determine the content of metallothioneins (MT) in tissue samples. We describe our search for typical patterns in these curves that would make it possible to distinguish among voltammograms produced by samples taken from different body parts. We suggest a compact representation of the information contained in the graphs, based on the simple Haar wavelet transformation. The resulting representation is successfully tested on the classification of real data obtained from 8 rats and 9 of their body parts. The preliminary experiments suggest that the derived attributes of Brdicka curves are good candidates for numerical biomarkers, with an important advantage: the process leading to their calculation can be fully automated.
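The wavelet-based representation mentioned in this abstract can be illustrated with a minimal Haar transform. The sketch uses the unnormalized averaging and differencing variant and requires the input length to be a power of two; whether the paper uses this normalization, and how a voltammogram is sampled into a sequence, are assumptions.

```python
def haar_transform(signal):
    """Full Haar decomposition using pairwise averages and half-differences.

    Returns [overall average, coarsest detail, ..., finest details].
    Assumes len(signal) is a power of two.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    out = [float(v) for v in signal]
    length = n
    while length > 1:
        half = length // 2
        averages = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        details = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        # Keep the averages in front for the next, coarser pass.
        out[:length] = averages + details
        length = half
    return out


coefficients = haar_transform([4, 6, 10, 12])
# coefficients == [8.0, -3.0, -1.0, -1.0]
```

Keeping only the first few coefficients gives a compact, coarse summary of a curve, which is one way such coefficients could serve as derived attributes for classification.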


Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine | 2011

Gaussian logic and its applications in bioinformatics

Ondřej Kuželka; Andrea Szabóová; Filip Železný

We describe a novel statistical relational learning framework that works efficiently with combinations of relational and numerical data, which is especially valuable in bioinformatics applications. We show how this model can be applied to modelling of gene expression data and to problems from proteomics.

Collaboration


Dive into Andrea Szabóová's collaboration.

Top Co-Authors

Filip Železný, Czech Technical University in Prague

Ondřej Kuželka, Czech Technical University in Prague

Jakub Tolar, University of Minnesota

Jiri Anyz, Czech Technical University in Prague

Lenka Vyslouzilova, Czech Technical University in Prague

Matěj Holec, Czech Technical University in Prague

Olga Stepankova, Czech Technical University in Prague

Rene Kizek, University of Veterinary and Pharmaceutical Sciences Brno