Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ondřej Kuželka is active.

Publication


Featured research published by Ondřej Kuželka.


Machine Learning | 2011

Block-wise construction of tree-like relational features with monotone reducibility and redundancy

Ondřej Kuželka; Filip Železný

We describe an algorithm for constructing a set of tree-like conjunctive relational features by combining smaller conjunctive blocks. Unlike traditional level-wise approaches, which preserve the monotonicity of frequency, our block-wise approach preserves the monotonicity of feature reducibility and redundancy, properties that are important in propositionalization employed in the context of classification learning. With pruning based on these properties, our block-wise approach efficiently scales to features including tens of first-order atoms, far beyond the reach of state-of-the-art propositionalization or inductive logic programming systems.
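The block-combination step can be pictured on a toy scale: a block is a small conjunction of first-order atoms with one distinguished input variable, and larger tree-like features arise by plugging blocks into the open variables of other blocks. The Python sketch below shows only these combination mechanics, using hypothetical predicates (bond, atom_type); the reducibility- and redundancy-based pruning that makes the actual approach scale is not reproduced here.

```python
# A minimal sketch of block-wise feature construction (illustration only).
# A "block" is a conjunction of atoms, each atom a tuple (predicate, arg, ...),
# with variables written as strings starting with '?'.  Tree-like features are
# built by plugging a block's input variable into an open variable of another
# feature.  The paper's reducibility/redundancy pruning is NOT implemented.
from itertools import count

_fresh = count()

def rename_apart(atoms):
    """Give a block fresh variable names so it can be reused safely."""
    mapping = {}
    def ren(term):
        if term.startswith('?'):
            if term not in mapping:
                mapping[term] = f'?V{next(_fresh)}'
            return mapping[term]
        return term
    return [tuple([a[0]] + [ren(t) for t in a[1:]]) for a in atoms], mapping

def plug(feature, open_var, block, block_input):
    """Attach `block` to `feature` by unifying the block's input variable
    with one open variable of the feature (keeps the tree-like shape)."""
    renamed, mapping = rename_apart(block)
    new_input = mapping.get(block_input, block_input)
    return feature + [tuple(open_var if t == new_input else t for t in atom)
                      for atom in renamed]

# Two hypothetical blocks over a molecular domain:
b1 = [('bond', '?X', '?Y')]            # input variable ?X, open variable ?Y
b2 = [('atom_type', '?A', 'carbon')]   # input variable ?A

feature = plug(b1, '?Y', b2, '?A')
print(feature)   # [('bond', '?X', '?Y'), ('atom_type', '?Y', 'carbon')]
```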


BMC Bioinformatics | 2012

Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search

Andrea Szabóová; Ondřej Kuželka; Filip Železný; Jakub Tolar

We contribute a novel ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA-binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.
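The sampling idea at the heart of the ball-histogram method can be sketched with a toy Monte-Carlo loop: place a ball of fixed radius at randomly chosen residues, count how many amino acids with the selected properties fall inside, and use the normalized histogram of these counts as a feature vector. The Python sketch below assumes a simplified protein representation (residue labels with 3-D coordinates) and a hand-picked property set; the automatic template search described in the paper is not reproduced.

```python
import random
from collections import Counter

def ball_histogram(residues, properties, radius=8.0, samples=5000, max_count=10):
    """Monte-Carlo sketch of a ball histogram.

    residues   -- list of (label, (x, y, z)) tuples, e.g. ('ARG', (1.2, 0.4, 7.9))
    properties -- set of residue labels treated as matching the chosen property
    Returns a normalized histogram over the number of matching residues
    found inside randomly placed balls of the given radius.
    """
    counts = Counter()
    coords = [c for _, c in residues]
    for _ in range(samples):
        # Centre each ball on a randomly chosen residue of the structure.
        cx, cy, cz = random.choice(coords)
        inside = sum(
            1
            for label, (x, y, z) in residues
            if label in properties
            and (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2
        )
        counts[min(inside, max_count)] += 1
    return [counts[k] / samples for k in range(max_count + 1)]

# Toy usage: two positively charged residue types as the selected property.
protein = [('ARG', (0.0, 0.0, 0.0)), ('LYS', (3.0, 1.0, 0.0)),
           ('ALA', (9.0, 9.0, 9.0)), ('ARG', (2.0, 2.0, 1.0))]
features = ball_histogram(protein, properties={'ARG', 'LYS'}, samples=1000)
print(features)  # e.g. mass concentrated on counts 0 and 3 for this toy protein
```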


International Conference on Machine Learning | 2009

Block-wise construction of acyclic relational features with monotone irreducibility and relevancy properties

Ondřej Kuželka; Filip Železný

We describe an algorithm for constructing a set of acyclic conjunctive relational features by combining smaller conjunctive blocks. Unlike traditional level-wise approaches, which preserve the monotonicity of frequency, our block-wise approach preserves a form of monotonicity of the irreducibility and relevancy properties of features, which are important in propositionalization employed in the context of classification learning. With pruning based on these properties, our block-wise approach efficiently scales to features including tens of first-order literals, far beyond the reach of state-of-the-art propositionalization or inductive logic programming systems.


European Conference on Machine Learning | 2011

Gaussian logic for predictive classification

Ondřej Kuželka; Andrea Szabóová; Matěj Holec; Filip Železný

We describe a statistical relational learning framework called Gaussian Logic capable of working efficiently with combinations of relational and numerical data. The framework assumes that, for a fixed relational structure, the numerical data can be modelled by a multivariate normal distribution. We demonstrate how the Gaussian Logic framework can be applied to predictive classification problems. In experiments, we first show an application of the framework to the prediction of DNA-binding propensity of proteins. Next, we show how the Gaussian Logic framework can be used to find motifs describing highly correlated gene groups in gene-expression data, which are then used in a set-level-based classification method.
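The central modelling assumption, that numerical data attached to a fixed relational structure follow a multivariate normal distribution, can be illustrated with a simple likelihood-based classifier: fit one Gaussian per class to the numeric vectors extracted for a relational pattern and label new vectors by comparing log-likelihoods. The Python sketch below (using numpy and scipy) shows only this Gaussian-modelling step on assumed toy data; the relational machinery of the actual framework is not shown.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_class_gaussians(vectors_by_class):
    """Fit one multivariate normal per class to the numeric vectors extracted
    for a fixed relational pattern (toy stand-in for per-structure models)."""
    models = {}
    for label, vectors in vectors_by_class.items():
        data = np.asarray(vectors)
        mean = data.mean(axis=0)
        # A small ridge keeps the estimated covariance well-conditioned.
        cov = np.cov(data, rowvar=False) + 1e-6 * np.eye(data.shape[1])
        models[label] = (mean, cov)
    return models

def classify(models, x):
    """Pick the class whose Gaussian assigns the highest log-likelihood to x."""
    return max(models, key=lambda label: multivariate_normal.logpdf(
        x, mean=models[label][0], cov=models[label][1]))

# Toy usage with two classes of 3-dimensional numeric vectors.
rng = np.random.default_rng(0)
train = {
    'binding':     rng.normal(loc=[1.0, 2.0, 0.0], scale=0.5, size=(50, 3)),
    'non_binding': rng.normal(loc=[-1.0, 0.0, 1.0], scale=0.5, size=(50, 3)),
}
models = fit_class_gaussians(train)
print(classify(models, [0.9, 1.8, 0.2]))   # expected: 'binding'
```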


Proteome Science | 2012

Prediction of DNA-binding proteins from relational features

Andrea Szabóová; Ondřej Kuželka; Filip Železný; Jakub Tolar

Background: The process of protein-DNA binding has an essential role in the biological processing of genetic information. We use relational machine learning to predict the DNA-binding propensity of proteins from their structures. Automatically discovered structural features are able to capture some characteristic spatial configurations of amino acids in proteins. Results: Prediction based only on structural relational features already achieves results competitive with existing methods based on physicochemical properties on several protein datasets. Predictive performance is further improved when structural features are combined with physicochemical features. Moreover, the structural features provide some insights not revealed by physicochemical features. Our method is able to detect common spatial substructures. We demonstrate this in experiments with zinc finger proteins. Conclusions: We introduced a novel approach to DNA-binding propensity prediction using relational machine learning, which could potentially also be used for protein function prediction in general.


Inductive Logic Programming | 2012

Bounded Least General Generalization

Ondřej Kuželka; Andrea Szabóová; Filip Železný

We study a generalization of Plotkin’s least general generalization. We introduce a novel concept called bounded least general generalization w.r.t. a set of clauses and show an instance of it for which polynomial-time reduction procedures exist. We demonstrate the practical utility of our approach in experiments on several relational learning datasets.
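For orientation, Plotkin's least general generalization itself is easy to sketch: the LGG of two clauses collects the LGGs of all pairs of compatible literals, anti-unifying their arguments so that equal terms are kept and differing term pairs are consistently replaced by variables. The Python sketch below implements this classical operation for function-free literals; the bounded variant w.r.t. a set of clauses and the polynomial-time reduction procedures studied in the paper are not reproduced.

```python
from itertools import count

def lgg_clauses(clause1, clause2):
    """Plotkin's least general generalization of two function-free clauses.
    A clause is a set of literals; a literal is a tuple (predicate, arg1, ...).
    Constants are plain strings, variables are strings starting with '?'."""
    var_for_pair = {}          # consistent mapping from term pairs to variables
    fresh = count()

    def lgg_term(s, t):
        if s == t:
            return s
        if (s, t) not in var_for_pair:
            var_for_pair[(s, t)] = f'?V{next(fresh)}'
        return var_for_pair[(s, t)]

    result = set()
    for lit1 in clause1:
        for lit2 in clause2:
            # Only literals with the same predicate and arity are compatible.
            if lit1[0] == lit2[0] and len(lit1) == len(lit2):
                result.add((lit1[0],) + tuple(
                    lgg_term(s, t) for s, t in zip(lit1[1:], lit2[1:])))
    return result

# Toy usage on two ground clauses describing small molecules.
c1 = {('bond', 'a1', 'a2'), ('carbon', 'a1'), ('oxygen', 'a2')}
c2 = {('bond', 'b1', 'b2'), ('carbon', 'b1'), ('carbon', 'b2')}
print(lgg_clauses(c1, c2))
# e.g. {('bond', '?V0', '?V1'), ('carbon', '?V0'), ('carbon', '?V2')}
# (variable numbering may differ between runs)
```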


Inductive Logic Programming | 2010

Seeing the world through homomorphism: an experimental study on reducibility of examples

Ondřej Kuželka; Filip Železný

We study reducibility of examples in several typical inductive logic programming benchmarks. The notion of reducibility that we use is related to θ-reduction, commonly used to reduce hypotheses in ILP. Whereas examples are usually not reducible on their own, they often become implicitly reducible when the language for constructing hypotheses is fixed. We show that the number of ground facts in a dataset can be almost halved for some real-world molecular datasets. Furthermore, we study the impact this has on the popular ILP system Aleph.
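The θ-reduction that this notion of reducibility builds on can be illustrated directly: a clause is reducible if some substitution maps it into a proper subset of its own literals, and the standard reduction loop tries to drop literals one at a time while such a substitution still exists. The Python sketch below implements this classical clause reduction with a small backtracking θ-subsumption check; the language-dependent reduction of ground examples studied in the paper is not reproduced.

```python
def subsumes(c, d):
    """Return True if clause c θ-subsumes clause d, i.e. some substitution
    maps every literal of c onto a literal of d.  Clauses are lists of
    (predicate, arg1, ...) tuples; variables are strings starting with '?'."""
    def match(lit, target, subst):
        if lit[0] != target[0] or len(lit) != len(target):
            return None
        new = dict(subst)
        for s, t in zip(lit[1:], target[1:]):
            if s.startswith('?'):
                if new.get(s, t) != t:
                    return None
                new[s] = t
            elif s != t:
                return None
        return new

    def search(remaining, subst):
        if not remaining:
            return True
        head, *rest = remaining
        for target in d:
            extended = match(head, target, subst)
            if extended is not None and search(rest, extended):
                return True
        return False

    return search(list(c), {})

def theta_reduce(clause):
    """Greedily drop literals as long as the clause still θ-subsumes the
    smaller clause (classical θ-reduction, used here as an illustration)."""
    current = list(clause)
    changed = True
    while changed:
        changed = False
        for lit in list(current):
            candidate = [l for l in current if l != lit]
            if candidate and subsumes(current, candidate):
                current = candidate
                changed = True
                break
    return current

# Toy usage: the second bond literal is redundant under θ-reduction.
clause = [('bond', '?X', '?Y'), ('bond', '?X', '?Z'), ('carbon', '?Y')]
print(theta_reduce(clause))   # [('bond', '?X', '?Y'), ('carbon', '?Y')]
```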


International Conference on Machine Learning | 2008

Fast estimation of first-order clause coverage through randomization and maximum likelihood

Ondřej Kuželka; Filip Železný

In inductive logic programming, θ-subsumption is a widely used coverage test. Unfortunately, testing θ-subsumption is NP-complete, which represents a crucial efficiency bottleneck for many relational learners. In this paper, we present a probabilistic estimator of clause coverage, based on a randomized restarted search strategy. Under a distribution assumption, our algorithm can estimate clause coverage without having to decide subsumption for all examples. We implement this algorithm in the program ReCovEr. On generated graph data and real-world datasets, we show that ReCovEr provides reasonably accurate estimates while achieving dramatic runtime improvements compared to a state-of-the-art algorithm.
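The estimation idea can be illustrated without the full ReCovEr machinery: the maximum-likelihood estimate of a coverage probability from independent subsumption tests is simply the observed success fraction, so a rough coverage estimate can be obtained by running a (possibly budget-limited) subsumption test on a random sample of examples only. The Python sketch below is such a simplified illustration with a stand-in coverage test; it does not reproduce the restarted-search statistics that ReCovEr exploits.

```python
import math
import random

def estimate_coverage(covers, examples, sample_size, confidence_z=1.96):
    """Estimate the fraction of `examples` covered by a clause from a random
    sample, instead of deciding the (NP-complete) subsumption test for all.

    covers      -- any coverage test, e.g. a budget-limited θ-subsumption check
    sample_size -- number of examples actually tested
    Returns the maximum-likelihood estimate (the sample proportion) together
    with a rough normal-approximation confidence interval."""
    sample = random.sample(examples, min(sample_size, len(examples)))
    hits = sum(1 for example in sample if covers(example))
    p_hat = hits / len(sample)
    half_width = confidence_z * math.sqrt(p_hat * (1 - p_hat) / len(sample))
    return p_hat, (max(0.0, p_hat - half_width), min(1.0, p_hat + half_width))

# Toy usage: examples are integers and the "clause" covers multiples of three.
examples = list(range(10_000))
covers = lambda example: example % 3 == 0
estimate, interval = estimate_coverage(covers, examples, sample_size=200)
print(estimate, interval)   # roughly 0.33, plus or minus a few percent
```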


Journal of Intelligent Information Systems | 2014

A method for reduction of examples in relational learning

Ondřej Kuželka; Andrea Szabóová; Filip Železný

Feature selection methods often improve the performance of attribute-value learning. We explore whether, in relational learning as well, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded-treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster in the case of nFOIL, achieving an order-of-magnitude speed-up on one of the data sets, and more accurate in the case of Aleph.


NFMCP'12: Proceedings of the First International Conference on New Frontiers in Mining Complex Patterns | 2012

Reducing examples in relational learning with bounded-treewidth hypotheses

Ondřej Kuželka; Andrea Szabóová; Filip Železný

Feature selection methods often improve the performance of attribute-value learning. We explore whether, in relational learning as well, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded-treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. The bounded-treewidth bias can be replaced by other assumptions, such as acyclicity, with similar benefits. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster for nFOIL, achieving an order-of-magnitude speed-up on one of the data sets, and more accurate for Aleph.

Collaboration


Dive into Ondřej Kuželka's collaborations.

Top Co-Authors

Filip Železný
Czech Technical University in Prague

Andrea Szabóová
Czech Technical University in Prague

Gustav Šourek
Czech Technical University in Prague

Jakub Tolar
University of Minnesota

Martin Svatoš
Czech Technical University in Prague

Matěj Holec
Czech Technical University in Prague

Radomír Černoch
Czech Technical University in Prague

Roman Barták
Charles University in Prague

Jan Ramon
Katholieke Universiteit Leuven