Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Stefan Maetschke is active.

Publication


Featured researches published by Stefan Maetschke.


Genome Medicine | 2012

Gene regulatory network inference: evaluation and application to ovarian cancer allows the prioritization of drug targets

Piyush B. Madhamshettiwar; Stefan Maetschke; Melissa J. Davis; Antonio Reverter; Mark A. Ragan

BackgroundAltered networks of gene regulation underlie many complex conditions, including cancer. Inferring gene regulatory networks from high-throughput microarray expression data is a fundamental but challenging task in computational systems biology and its translation to genomic medicine. Although diverse computational and statistical approaches have been brought to bear on the gene regulatory network inference problem, their relative strengths and disadvantages remain poorly understood, largely because comparative analyses usually consider only small subsets of methods, use only synthetic data, and/or fail to adopt a common measure of inference quality.MethodsWe report a comprehensive comparative evaluation of nine state-of-the art gene regulatory network inference methods encompassing the main algorithmic approaches (mutual information, correlation, partial correlation, random forests, support vector machines) using 38 simulated datasets and empirical serous papillary ovarian adenocarcinoma expression-microarray data. We then apply the best-performing method to infer normal and cancer networks. We assess the druggability of the proteins encoded by our predicted target genes using the CancerResource and PharmGKB webtools and databases.ResultsWe observe large differences in the accuracy with which these methods predict the underlying gene regulatory network depending on features of the data, network size, topology, experiment type, and parameter settings. Applying the best-performing method (the supervised method SIRENE) to the serous papillary ovarian adenocarcinoma dataset, we infer and rank regulatory interactions, some previously reported and others novel. For selected novel interactions we propose testable mechanistic models linking gene regulation to cancer. Using network analysis and visualization, we uncover cross-regulation of angiogenesis-specific genes through three key transcription factors in normal and cancer conditions. Druggabilty analysis of proteins encoded by the 10 highest-confidence target genes, and by 15 genes with differential regulation in normal and cancer conditions, reveals 75% to be potential drug targets.ConclusionsOur study represents a concrete application of gene regulatory network inference to ovarian cancer, demonstrating the complete cycle of computational systems biology research, from genome-scale data analysis via network inference, evaluation of methods, to the generation of novel testable hypotheses, their prioritization for experimental validation, and discovery of potential drug targets.


Briefings in Bioinformatics | 2014

Supervised, semi-supervised and unsupervised inference of gene regulatory networks

Stefan Maetschke; Piyush B. Madhamshettiwar; Melissa J. Davis; Mark A. Ragan

Inference of gene regulatory network from expression data is a challenging task. Many methods have been developed to this purpose but a comprehensive evaluation that covers unsupervised, semi-supervised and supervised methods, and provides guidelines for their practical application, is lacking. We performed an extensive evaluation of inference methods on simulated and experimental expression data. The results reveal low prediction accuracies for unsupervised techniques with the notable exception of the Z-SCORE method on knockout data. In all other cases, the supervised approach achieved the highest accuracies and even in a semi-supervised setting with small numbers of only positive samples, outperformed the unsupervised techniques.


BMC Bioinformatics | 2009

Exploiting structural and topological information to improve prediction of RNA-protein binding sites

Stefan Maetschke; Zheng Yuan

BackgroundRNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominately rely on sequence data. It is, however, known that interface residue propensity is closely correlated with structural properties. In this paper we systematically study information obtained from sequences and structures and compare their contributions in this prediction problem. Particularly, different geometrical and network topological properties of protein structures are evaluated to improve interface residue prediction accuracy.ResultsWe have quantified the impact of structural information on the prediction accuracy in comparison to the purely sequence based approach using two machine learning techniques: Naïve Bayes classifiers and Support Vector Machines. The highest AUC of 0.83 was achieved by a Support Vector Machine, exploiting PSI-BLAST profile, accessible surface area, betweenness-centrality and retention coefficient as input features. Taking into account that our results are based on a larger non-redundant data set, the prediction accuracy is considerably higher than reported in previous, comparable studies. A protein-RNA interface predictor (PRIP) and the data set have been made available at http://www.qfab.org/PRIP.ConclusionGraph-theoretic properties of residue contact maps derived from protein structures such as betweenness-centrality can supplement sequence or structure features to improve the prediction accuracy for binding residues in RNA-protein interactions. While Support Vector Machines perform better on this task, Naïve Bayes classifiers also have been found to achieve good prediction accuracies but require much less training time and are an attractive choice for large scale predictions.


Bioinformatics | 2012

Gene Ontology-driven inference of protein–protein interactions using inducers

Stefan Maetschke; Martin Simonsen; Melissa J. Davis; Mark A. Ragan

MOTIVATION Protein-protein interactions (PPIs) are pivotal for many biological processes and similarity in Gene Ontology (GO) annotation has been found to be one of the strongest indicators for PPI. Most GO-driven algorithms for PPI inference combine machine learning and semantic similarity techniques. We introduce the concept of inducers as a method to integrate both approaches more effectively, leading to superior prediction accuracies. RESULTS An inducer (ULCA) in combination with a Random Forest classifier compares favorably to several sequence-based methods, semantic similarity measures and multi-kernel approaches. On a newly created set of high-quality interaction data, the proposed method achieves high cross-species prediction accuracies (Area under the ROC curve ≤ 0.88), rendering it a valuable companion to sequence-based methods. AVAILABILITY Software and datasets are available at http://bioinformatics.org.au/go2ppi/ CONTACT [email protected].


Proteins | 2007

Identifying novel peroxisomal proteins.

John Hawkins; Donna Mahony; Stefan Maetschke; Mark Wakabayashi; Rohan D. Teasdale; Mikael Bodén

Peroxisomes are small subcellular compartments responsible for a range of essential metabolic processes. Efforts in predicting peroxisomal protein import are challenged by species variation and sparse sequence data sets with experimentally confirmed localization.


Bioinformatics | 2010

A visual framework for sequence analysis using n-grams and spectral rearrangement

Stefan Maetschke; Karin S. Kassahn; Jasmyn A. Dunn; Siew Ping Han; Eva Z. Curley; Katryn J. Stacey; Mark A. Ragan

MOTIVATION Protein sequences are often composed of regions that have distinct evolutionary histories as a consequence of domain shuffling, recombination or gene conversion. New approaches are required to discover, visualize and analyze these sequence regions and thus enable a better understanding of protein evolution. RESULTS Here, we have developed an alignment-free and visual approach to analyze sequence relationships. We use the number of shared n-grams between sequences as a measure of sequence similarity and rearrange the resulting affinity matrix applying a spectral technique. Heat maps of the affinity matrix are employed to identify and visualize clusters of related sequences or outliers, while n-gram-based dot plots and conservation profiles allow detailed analysis of similarities among selected sequences. Using this approach, we have identified signatures of domain shuffling in an otherwise poorly characterized family, and homology clusters in another. We conclude that this approach may be generally useful as a framework to analyze related, but highly divergent protein sequences. It is particularly useful as a fast method to study sequence relationships prior to much more time-consuming multiple sequence alignment and phylogenetic analysis. AVAILABILITY A software implementation (MOSAIC) of the framework described here can be downloaded from http://bioinformatics.org.au/mosaic/ CONTACT [email protected] SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.


asia-pacific bioinformatics conference | 2005

BLOMAP: An encoding of amino acids which improves signal peptide cleavage site prediction

Stefan Maetschke; Michael W. Towsey; Mikael Bodén

Research on cleavage site prediction for signal peptides has focused mainly on the application of different classification algorithms to achieve improved prediction accuracies. This paper addresses the fundamental issue of amino acid encoding to present amino acid sequences in the most beneficial way for machine learning algorithms. A comparison of several standard encoding methods shows, that for cleavage site prediction the frequently used orthonormal encoding is inferior compared to other methods. The best results are achieved with a new encoding method named BLOMAP – based on the BLOSUM62 substitution matrix – using a Naive Bayes classifier.


Bioinformatics | 2014

Characterizing cancer subtypes as attractors of Hopfield networks

Stefan Maetschke; Mark A. Ragan

MOTIVATION Cancer is a heterogeneous progressive disease caused by perturbations of the underlying gene regulatory network that can be described by dynamic models. These dynamics are commonly modeled as Boolean networks or as ordinary differential equations. Their inference from data is computationally challenging, and at least partial knowledge of the regulatory network and its kinetic parameters is usually required to construct predictive models. RESULTS Here, we construct Hopfield networks from static gene-expression data and demonstrate that cancer subtypes can be characterized by different attractors of the Hopfield network. We evaluate the clustering performance of the network and find that it is comparable with traditional methods but offers additional advantages including a dynamic model of the energy landscape and a unification of clustering, feature selection and network inference. We visualize the Hopfield attractor landscape and propose a pruning method to generate sparse networks for feature selection and improved understanding of feature relationships.


Bioinformatics | 2012

Automatic selection of reference taxa for protein–protein interaction prediction with phylogenetic profiling

Martin Simonsen; Stefan Maetschke; Mark A. Ragan

MOTIVATION Phylogenetic profiling methods can achieve good accuracy in predicting protein-protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most-informative RT is becoming increasingly difficult. Previous studies on the selection of RT have provided guidelines for manual taxon selection, and for eliminating closely related taxa. However, no general strategy for automatic selection of RT is currently available. RESULTS We present three novel methods for automating the selection of RT, using machine learning based on known protein-protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy.


Journal of Clinical Bioinformatics | 2012

mCOPA : analysis of heterogeneous features in cancer expression data

Chenwei Wang; Alperen Taciroglu; Stefan Maetschke; Colleen C. Nelson; Mark A. Ragan; Melissa J. Davis

BackgroundCancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset.ResultsWe compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more-informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours.ConclusionsWe demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis. mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.

Collaboration


Dive into the Stefan Maetschke's collaboration.

Top Co-Authors

Avatar

Mark A. Ragan

University of Queensland

View shared research outputs
Top Co-Authors

Avatar

Melissa J. Davis

Walter and Eliza Hall Institute of Medical Research

View shared research outputs
Top Co-Authors

Avatar

Mikael Bodén

University of Queensland

View shared research outputs
Top Co-Authors

Avatar

Michael W. Towsey

Queensland University of Technology

View shared research outputs
Top Co-Authors

Avatar

James M. Hogan

Queensland University of Technology

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Antonio Reverter

Commonwealth Scientific and Industrial Research Organisation

View shared research outputs
Top Co-Authors

Avatar

Chenwei Wang

Queensland University of Technology

View shared research outputs
Top Co-Authors

Avatar

Colleen C. Nelson

Queensland University of Technology

View shared research outputs
Researchain Logo
Decentralizing Knowledge