Publication


Featured research published by Michiel Stock.


Research in Microbiology | 2013

Exploration and prediction of interactions between methanotrophs and heterotrophs

Michiel Stock; Sven Hoefman; Frederiek-Maarten Kerckhof; Nico Boon; Paul De Vos; Bernard De Baets; Kim Heylen; Willem Waegeman

Methanotrophs can form the basis of a methane-driven food web on which heterotrophic microorganisms can feed. In return, these heterotrophs can stimulate growth of methanotrophs in co-culture by providing growth additives. However, only a few specific interactions are currently known. We incubated nine methanotrophs with 25 heterotrophic strains in a pairwise miniaturized co-cultivation setup. Through principal component analysis and k-means clustering, methanotrophs and heterotrophs could be grouped according to their interaction behaviour, suggesting strain-dependent methanotroph-heterotroph complementarity. Co-cultivation significantly enhanced the growth parameters of three methanotrophs. This was most pronounced for Methylomonas sp. M5, with a threefold increase in maximum density and a fourfold increase in maximum increase in density in co-culture with Cupriavidus taiwanensis LMG 19424. In contrast, co-cultivation with Methylobacterium radiotolerans LMG 2269 and Pseudomonas aeruginosa LMG 12228 inhibited growth of most methanotrophs. Functional genomic analysis suggested the importance of vitamin metabolism for co-cultivation success. The generated data set was then successfully exploited as a proof-of-principle for predictive modelling of co-culture responses based on other interactions of the same heterotrophs and methanotrophs, yielding an area under the receiver operating characteristic curve of 0.73 with 50% of the values missing for the maximum increase in density parameter. As such, these modelling-based tools were shown to hold great promise in reducing the amount of data that needs to be generated when conducting large co-cultivation studies.
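
A minimal sketch of the unsupervised part of this analysis: projecting pairwise interaction profiles with PCA and grouping strains with k-means. The response matrix, number of components and number of clusters below are illustrative assumptions, not the study's data or settings.

```python
# Sketch (not the authors' code): group strains by their interaction profiles
# with PCA + k-means. The 9 x 25 response matrix and 3 clusters are placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
responses = rng.normal(size=(9, 25))     # methanotrophs x heterotrophs (synthetic)

scores = PCA(n_components=2).fit_transform(responses)   # project interaction profiles
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)
print(labels)                            # cluster assignment per methanotroph
```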


Machine Learning | 2013

Efficient regularized least-squares algorithms for conditional ranking on relational data

Tapio Pahikkala; Antti Airola; Michiel Stock; Bernard De Baets; Willem Waegeman

In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranking loss functions. We show theoretically that learning with the ranking loss is likely to generalize better than with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we also show empirically that incorporating symmetry or reciprocity properties can improve the generalization performance.
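
A hedged sketch of the simplest instantiation of this setting: regularized least-squares on tensor-product pairwise features, then ranking all candidate objects conditioned on a query object. The data, dimensions and ridge penalty are placeholders, and the paper's efficient algorithms and ranking loss are not reproduced here.

```python
# Sketch: conditional ranking with a regularized least-squares model on
# tensor-product pairwise features. All data below are synthetic.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
U = rng.normal(size=(20, 5))     # features of "conditioning" objects
V = rng.normal(size=(30, 4))     # features of objects to be ranked
Y = rng.normal(size=(20, 30))    # observed relation values (placeholder)

# tensor-product (Kronecker) pairwise features for every (u, v) pair
pairs = np.einsum("ip,jq->ijpq", U, V).reshape(20 * 30, 5 * 4)
model = Ridge(alpha=1.0).fit(pairs, Y.ravel())

# condition on the first object and rank all 30 candidates by predicted score
query = np.kron(U[0], V)                     # pairwise features for (u_0, v_j)
ranking = np.argsort(-model.predict(query))
print(ranking[:5])
```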


IEEE Transactions on Fuzzy Systems | 2012

A Kernel-Based Framework for Learning Graded Relations From Data

Willem Waegeman; Tapio Pahikkala; Antti Airola; Tapio Salakoski; Michiel Stock; Bernard De Baets

Driven by a large number of potential applications in areas such as bioinformatics, information retrieval, and social network analysis, the problem setting of inferring relations between pairs of data objects has recently been investigated intensively in the machine learning community. To this end, current approaches typically consider datasets containing crisp relations so that standard classification methods can be adopted. However, relations between objects like similarities and preferences are often expressed in a graded manner in real-world applications. A general kernel-based framework for learning relations from data is introduced here. It extends existing approaches because both crisp and graded relations are considered, and it unifies existing approaches because different types of graded relations can be modeled, including symmetric and reciprocal relations. This framework establishes important links between recent developments in fuzzy set theory and machine learning. Its usefulness is demonstrated through various experiments on synthetic and real-world data. The results indicate that incorporating domain knowledge about relations improves the predictive performance.
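
One concrete way the symmetry and reciprocity properties mentioned above can be built into a pairwise kernel is sketched below: the symmetric variant sums the two Kronecker terms and the reciprocal (antisymmetric) variant takes their difference. The object kernel and the chosen pairs are illustrative only, not the paper's experimental setup.

```python
# Sketch: symmetric and reciprocal pairwise kernels built from an object kernel k.
import numpy as np

def pairwise_kernel(k_ac, k_bd, k_ad, k_bc, kind="symmetric"):
    if kind == "symmetric":
        return 0.5 * (k_ac * k_bd + k_ad * k_bc)   # enforces f(a, b) =  f(b, a)
    return 0.5 * (k_ac * k_bd - k_ad * k_bc)       # enforces f(a, b) = -f(b, a)

rng = np.random.default_rng(8)
X = rng.normal(size=(6, 3))
k = X @ X.T                                        # object-level (linear) kernel

# kernel value between the pairs (0, 1) and (2, 3)
print(pairwise_kernel(k[0, 2], k[1, 3], k[0, 3], k[1, 2], kind="symmetric"))
print(pairwise_kernel(k[0, 2], k[1, 3], k[0, 3], k[1, 2], kind="reciprocal"))
```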


European Conference on Machine Learning | 2014

A two-step learning approach for solving full and almost full cold start problems in dyadic prediction

Tapio Pahikkala; Michiel Stock; Antti Airola; Tero Aittokallio; Bernard De Baets; Willem Waegeman

Dyadic prediction methods operate on pairs of objects (dyads), aiming to infer labels for out-of-sample dyads. We consider the full and almost full cold start problem in dyadic prediction, a setting that occurs when neither object in an out-of-sample dyad has been observed during training, or when one of them has been observed only a few times. A popular approach for addressing this problem is to train a model that makes predictions based on a pairwise feature representation of the dyads, or, in case of kernel methods, based on a tensor product pairwise kernel. As an alternative to such a kernel approach, we introduce a novel two-step learning algorithm that borrows ideas from the fields of pairwise learning and spectral filtering. We show theoretically that the two-step method is very closely related to the tensor product kernel approach, and experimentally that it yields a slightly better predictive performance. Moreover, unlike existing tensor product kernel methods, the two-step method allows closed-form solutions for training and parameter selection via cross-validation estimates both in the full and almost full cold start settings, making the approach much more efficient and straightforward to implement.
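
A hedged sketch of a two-step kernel ridge regression of the kind described: one ridge step over the row objects followed by one over the column objects, giving closed-form coefficients that can score a fully cold-start dyad. Kernels, labels and the two regularisation parameters are placeholders, not the paper's setup.

```python
# Sketch: two-step kernel ridge regression for (almost) full cold-start dyads.
import numpy as np

def two_step_krr(K, G, Y, lam_row=1.0, lam_col=1.0):
    """Return coefficients W such that f(u, v) = k_u @ W @ g_v."""
    n, m = Y.shape
    step1 = np.linalg.solve(K + lam_row * np.eye(n), Y)       # ridge over rows
    W = np.linalg.solve(G + lam_col * np.eye(m), step1.T).T   # ridge over columns
    return W

rng = np.random.default_rng(2)
X_u, X_v = rng.normal(size=(15, 6)), rng.normal(size=(12, 6))
K, G = X_u @ X_u.T, X_v @ X_v.T                 # linear kernels for illustration
Y = rng.normal(size=(15, 12))

W = two_step_krr(K, G, Y)
x_new_u, x_new_v = rng.normal(size=6), rng.normal(size=6)     # fully unseen dyad
score = (X_u @ x_new_u) @ W @ (X_v @ x_new_v)                 # cold-start prediction
print(score)
```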


Nucleic Acids Research | 2016

miSTAR: miRNA target prediction through modeling quantitative and qualitative miRNA binding site information in a stacked model structure

Gert Van Peer; Ayla De Paepe; Michiel Stock; Jasper Anckaert; Pieter-Jan Volders; Jo Vandesompele; Bernard De Baets; Willem Waegeman

In microRNA (miRNA) target prediction, typically two levels of information need to be modeled: the number of potential miRNA binding sites present in a target mRNA and the genomic context of each individual site. Single model structures insufficiently cope with this complex training data structure, consisting of feature vectors of unequal length as a consequence of the varying number of miRNA binding sites in different mRNAs. To circumvent this problem, we developed a two-layered, stacked model, in which the influence of binding site context is separately modeled. Using logistic regression and random forests, we applied the stacked model approach to a unique data set of 7990 probed miRNA–mRNA interactions, thereby including the largest number of miRNAs in model training to date. Compared to lower-complexity models, a particular stacked model, named miSTAR (miRNA stacked model target prediction; www.mi-star.org), displays a higher general performance and precision on top scoring predictions. More importantly, our model outperforms published and widely used miRNA target prediction algorithms. Finally, we highlight flaws in cross-validation schemes for evaluation of miRNA target prediction models and adopt a fairer and more stringent approach.
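
A hedged sketch of the stacked idea: a first-layer model scores each candidate binding site from its context features, and a second-layer model acts on fixed-length aggregates of those scores per miRNA–mRNA pair, which sidesteps the unequal feature-vector lengths. All features, site counts and labels below are synthetic, and this is not the miSTAR implementation.

```python
# Sketch of a two-layer stacked model: per-site scoring, then per-pair aggregation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)

# layer 1: per-site context features and per-site labels (synthetic)
site_X, site_y = rng.normal(size=(500, 8)), rng.integers(0, 2, 500)
site_model = LogisticRegression(max_iter=1000).fit(site_X, site_y)

def pair_features(sites):
    """Aggregate a variable number of site scores into a fixed-length vector."""
    scores = site_model.predict_proba(sites)[:, 1]
    return [len(scores), scores.max(), scores.mean(), scores.sum()]

# layer 2: one row per miRNA-mRNA pair, variable number of sites per pair
pairs = [rng.normal(size=(rng.integers(1, 6), 8)) for _ in range(200)]
pair_y = rng.integers(0, 2, 200)
pair_X = np.array([pair_features(s) for s in pairs])
stacked = RandomForestClassifier(n_estimators=100, random_state=0).fit(pair_X, pair_y)
print(stacked.predict_proba(pair_X[:3])[:, 1])
```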


Neural Computation | 2018

A Comparative Study of Pairwise Learning Methods Based on Kernel Ridge Regression

Michiel Stock; Tapio Pahikkala; Antti Airola; Bernard De Baets; Willem Waegeman

Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain state-of-the-art predictive performance, but their theoretical behavior remains underexplored in the machine learning literature. In this work we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter arise naturally as special cases of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods.
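
A hedged sketch of the closed-form instantiation the paper centres on: Kronecker kernel ridge regression computed through the eigendecompositions of the row and column kernels, so the pairwise kernel matrix is never formed explicitly. Kernels, labels and the regularisation parameter are illustrative.

```python
# Sketch: Kronecker kernel ridge regression via eigendecompositions.
import numpy as np

def kron_krr(K, G, Y, lam=1.0):
    """Dual coefficients A with in-sample predictions F = K @ A @ G."""
    lam_k, U = np.linalg.eigh(K)           # K = U diag(lam_k) U^T
    lam_g, V = np.linalg.eigh(G)           # G = V diag(lam_g) V^T
    filt = 1.0 / (np.outer(lam_k, lam_g) + lam)   # spectral filter of the pair kernel
    return U @ ((U.T @ Y @ V) * filt) @ V.T

rng = np.random.default_rng(4)
X_u, X_v = rng.normal(size=(10, 3)), rng.normal(size=(8, 3))
K, G = X_u @ X_u.T, X_v @ X_v.T
Y = rng.normal(size=(10, 8))

A = kron_krr(K, G, Y, lam=0.5)
F = K @ A @ G                              # fitted values on the training dyads
print(np.abs(F - Y).mean())
```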


Scientific Reports | 2017

Linear filtering reveals false negatives in species interaction data

Michiel Stock; Timothée Poisot; Willem Waegeman; Bernard De Baets

Species interaction datasets, often represented as sparse matrices, are usually collected through observation studies targeted at identifying species interactions. Due to the extensive required sampling effort, species interaction datasets usually contain many false negatives, often leading to bias in derived descriptors. We show that a simple linear filter can be used to detect false negatives by scoring interactions based on the structure of the interaction matrices. On 180 different datasets of various sizes, sparsities and ecological interaction types, we found that on average in about 75% of the cases, a false negative interaction got a higher score than a true negative interaction. Furthermore, we show that this filter is very robust, even when the interaction matrix contains a very large number of false negatives. Our results demonstrate that unobserved interactions can be detected in species interaction datasets, even without resorting to information about the species involved.
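
A hedged sketch of a linear filter of this kind: each cell of the binary interaction matrix is rescored as a weighted combination of its own value, its row mean, its column mean and the overall connectance, and zeros with high scores are flagged as candidate false negatives. The equal weights and the synthetic matrix below are assumptions for illustration, not the weights used in the paper.

```python
# Sketch: linear filter over a 0/1 interaction matrix to flag likely false negatives.
import numpy as np

def linear_filter(Y, alpha=(0.25, 0.25, 0.25, 0.25)):
    a1, a2, a3, a4 = alpha
    return (a1 * Y
            + a2 * Y.mean(axis=1, keepdims=True)   # row marginal
            + a3 * Y.mean(axis=0, keepdims=True)   # column marginal
            + a4 * Y.mean())                       # overall connectance

rng = np.random.default_rng(5)
Y = (rng.random((30, 40)) < 0.2).astype(float)     # sparse 0/1 interaction matrix
scores = linear_filter(Y)

# zeros with the highest filter scores are candidate false negatives
zeros = np.argwhere(Y == 0)
top = zeros[np.argsort(-scores[Y == 0])][:5]
print(top)
```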


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2014

Identification of functionally related enzymes by learning-to-rank methods

Michiel Stock; Thomas Fober; Eyke Hüllermeier; Serghei Glinca; Gerhard Klebe; Tapio Pahikkala; Antti Airola; Bernard De Baets; Willem Waegeman

Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually departs from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar enzymes, while information about the biological function of annotated database enzymes is ignored. In this work, we show that rankings of that kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes. This is in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We consider the Enzyme Commission (EC) classification hierarchy for obtaining annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for a set of similarity measures that exploit information about small cavities in the surface of enzymes.
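
A hedged sketch of the underlying idea: rather than ranking database enzymes by a single raw similarity to the query, learn from EC-annotated enzymes how several similarity measures relate to shared function and rank by the learned score. The similarity matrices, EC labels and the logistic regression stand-in below are synthetic illustrations, not the paper's kernel-based learning-to-rank method.

```python
# Sketch: re-rank database enzymes with a model trained on EC-class agreement.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n = 40
ec = rng.integers(0, 4, n)                         # placeholder EC class labels
sims = {}
for name in ("sequence", "structure", "cavity"):   # several similarity measures
    s = rng.random((n, n))
    sims[name] = (s + s.T) / 2                     # symmetrise each matrix

pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
X = np.array([[sims[m][i, j] for m in sims] for i, j in pairs])
y = np.array([int(ec[i] == ec[j]) for i, j in pairs])   # functionally related?
model = LogisticRegression(max_iter=1000).fit(X, y)

# rank database enzymes 1..n-1 for query enzyme 0 by predicted relatedness
q = np.array([[sims[m][0, j] for m in sims] for j in range(1, n)])
print(np.argsort(-model.predict_proba(q)[:, 1])[:5] + 1)
```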


Briefings in Bioinformatics | 2018

Algebraic shortcuts for leave-one-out cross-validation in supervised network inference

Michiel Stock; Tapio Pahikkala; Antti Airola; Willem Waegeman; Bernard De Baets

Supervised machine learning techniques have traditionally been very successful at reconstructing biological networks, such as protein-ligand interaction, protein-protein interaction and gene regulatory networks. Many supervised techniques for network prediction use linear models on a possibly nonlinear pairwise feature representation of edges. Recently, much emphasis has been placed on the correct evaluation of such supervised models. It is vital to distinguish between using a model to either predict new interactions in a given network or to predict interactions for a new vertex not present in the original network. This distinction matters because (i) the performance might dramatically differ between the prediction settings and (ii) tuning the model hyperparameters to obtain the best possible model depends on the setting of interest. Specific cross-validation schemes need to be used to assess the performance in such different prediction settings. In this work we discuss a state-of-the-art kernel-based network inference technique called two-step kernel ridge regression. We show that this regression model can be trained efficiently, with a time complexity scaling with the number of vertices rather than the number of edges. Furthermore, this framework leads to a series of cross-validation shortcuts that allow one to rapidly estimate the model performance for any relevant network prediction setting. This allows computational biologists to fully assess the capabilities of their models. The machine learning techniques with the algebraic shortcuts are implemented in the RLScore software package: https://github.com/aatapa/RLScore.
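
The flavour of shortcut involved can be illustrated with the classical leave-one-out identity for kernel ridge regression, where the LOO residuals follow from the hat matrix without any retraining; the paper derives analogous shortcuts for two-step kernel ridge regression and the different network cross-validation settings. The data and regularisation below are placeholders, and this is only a generic illustration, not the paper's derivation.

```python
# Sketch: leave-one-out residuals for kernel ridge regression without retraining,
# e_i = (y_i - yhat_i) / (1 - H_ii), with H the hat matrix K (K + lam I)^{-1}.
import numpy as np

rng = np.random.default_rng(7)
X, y = rng.normal(size=(50, 4)), rng.normal(size=50)
K, lam = X @ X.T, 1.0

H = K @ np.linalg.inv(K + lam * np.eye(50))          # hat matrix of KRR
y_hat = H @ y
loo_residuals = (y - y_hat) / (1 - np.diag(H))       # LOO residuals, no retraining

# brute-force check for the first point
mask = np.arange(50) != 0
alpha = np.linalg.solve(K[np.ix_(mask, mask)] + lam * np.eye(49), y[mask])
pred0 = K[0, mask] @ alpha
print(loo_residuals[0], y[0] - pred0)                # should agree
```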


European Conference on Machine Learning | 2016

Exact and efficient top-K inference for multi-target prediction by querying separable linear relational models

Michiel Stock; Krzysztof Dembczyński; Bernard De Baets; Willem Waegeman

Many complex multi-target prediction problems that concern large target spaces are characterised by a need for efficient prediction strategies that avoid the computation of predictions for all targets explicitly. Examples of such problems emerge in several subfields of machine learning, such as collaborative filtering, multi-label classification, dyadic prediction and biological network inference. In this article we analyse efficient and exact algorithms for computing the top-K predictions in the above problem settings, using a general class of models that we refer to as separable linear relational models. We show how to use those inference algorithms, which are modifications of well-known information retrieval methods, in a variety of machine learning settings. Furthermore, we study the possibility of scoring items incompletely, while still retaining an exact top-K retrieval. Experimental results in several application domains reveal that the so-called threshold algorithm is very scalable, often performing many orders of magnitude more efficiently than the naive approach.
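
A hedged sketch of a threshold algorithm (Fagin-style) for exact top-K retrieval under a separable linear model, where an item's score is the dot product of a non-negative query vector with its feature vector: sorted lists per feature are scanned in parallel and scanning stops once no unseen item can still enter the top K. The data are synthetic and this is only a compact illustration of the inference strategy, not the paper's implementation.

```python
# Sketch: threshold algorithm for exact top-K dot-product retrieval.
import heapq
import numpy as np

def threshold_top_k(items, query, k):
    n, d = items.shape
    # one list per feature, item indices sorted by decreasing feature value
    sorted_lists = [np.argsort(-items[:, j]) for j in range(d)]
    seen, heap = set(), []                     # heap holds the current top-k
    for depth in range(n):
        # best possible score of any still-unseen item at this depth
        threshold = sum(query[j] * items[sorted_lists[j][depth], j] for j in range(d))
        for j in range(d):
            i = sorted_lists[j][depth]
            if i not in seen:
                seen.add(i)
                score = float(query @ items[i])
                if len(heap) < k:
                    heapq.heappush(heap, (score, i))
                elif score > heap[0][0]:
                    heapq.heapreplace(heap, (score, i))
        if len(heap) == k and heap[0][0] >= threshold:
            break                              # no unseen item can enter the top-k
    return sorted(heap, reverse=True)

rng = np.random.default_rng(6)
items, query = rng.random((1000, 5)), rng.random(5)   # non-negative, so monotone
print(threshold_top_k(items, query, k=3))
```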
