Maike Tech | Researchain

Publication

Featured research published by Maike Tech.

Nucleic Acids Research | 2009

Orphelia: predicting genes in metagenomic sequencing reads

Katharina Hoff; Thomas Lingner; Peter Meinicke; Maike Tech

Metagenomic sequencing projects yield numerous sequencing reads of a diverse range of uncultivated and mostly yet unknown microorganisms. In many cases, these sequencing reads cannot be assembled into longer contigs. Thus, gene prediction tools that were originally developed for whole-genome analysis are not suitable for processing metagenomes. Orphelia is a program for predicting genes in short DNA sequences that is available through a web server application (http://orphelia.gobics.de). Orphelia utilizes prediction models that were created with machine learning techniques on the basis of a wide range of annotated genomes. In contrast to other methods for metagenomic gene prediction, Orphelia has fragment length-specific prediction models for the two most popular sequencing techniques in metagenomics, chain termination sequencing and pyrosequencing. These models ensure highly specific gene predictions.

BMC Bioinformatics | 2008

Gene prediction in metagenomic fragments: A large scale machine learning approach

Katharina Hoff; Maike Tech; Thomas Lingner; Rolf Daniel; Burkhard Morgenstern; Peter Meinicke

Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. The short length of the fragments (Sanger sequencing, for example, yields fragments of 700 bp on average) and the unknown phylogenetic origin of most fragments require approaches to gene prediction that are different from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions.

Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability.

Conclusion: Large scale machine learning methods are well-suited for gene prediction in metagenomic DNA fragments. In particular, the combination of linear discriminants and neural networks is promising and should be considered for integration into metagenomic analysis pipelines. The data sets can be downloaded from the URL provided (see Availability and requirements section).
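The two-stage idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, only monocodon usage and GC-content are computed, and a single logistic unit stands in for the artificial neural network of the second stage. All weights are assumed to come from a separate training step that is not shown.

```python
import math
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so feature vectors are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def monocodon_usage(orf):
    """Relative frequency of each codon, read in frame from position 0."""
    orf = orf.upper()
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]


def linear_score(weights, features, bias=0.0):
    """Stage 1 (sketch): a linear discriminant reduces a feature vector to one score."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def gene_probability(stage1_scores, orf_length, gc, weights, bias=0.0):
    """Stage 2 (assumption): one logistic unit instead of a neural network.

    Combines the stage-1 scores with ORF length and fragment GC-content
    and maps the result to a value between 0 and 1.
    """
    inputs = list(stage1_scores) + [orf_length, gc]
    return 1.0 / (1.0 + math.exp(-linear_score(weights, inputs, bias)))
```

In use, a candidate open reading frame would first be turned into a codon-usage vector with `monocodon_usage`, scored with trained weights via `linear_score`, and the score passed to `gene_probability` together with the ORF length and the `gc_content` of the fragment.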

BMC Bioinformatics | 2004

Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites

Peter Meinicke; Maike Tech; Burkhard Morgenstern; Rainer Merkl

Background: Kernel-based learning algorithms are among the most advanced machine learning methods and have been successfully applied to a variety of sequence classification tasks within the field of bioinformatics. Conventional kernels utilized so far do not provide an easy interpretation of the learnt representations in terms of positional and compositional variability of the underlying biological signals.

Results: We propose a kernel-based approach to datamining on biological sequences. With our method it is possible to model and analyze positional variability of oligomers of any length in a natural way. On one hand this is achieved by mapping the sequences to an intuitive but high-dimensional feature space, well-suited for interpretation of the learnt models. On the other hand, by means of the kernel trick we can provide a general learning algorithm for that high-dimensional representation because all required statistics can be computed without performing an explicit feature space mapping of the sequences. By introducing a kernel parameter that controls the degree of position-dependency, our feature space representation can be tailored to the characteristics of the biological problem at hand. A regularized learning scheme enables application even to biological problems for which only small sets of example sequences are available. Our approach includes a visualization method for transparent representation of characteristic sequence features. Thereby importance of features can be measured in terms of discriminative strength with respect to classification of the underlying sequences. To demonstrate and validate our concept on a biochemically well-defined case, we analyze E. coli translation initiation sites in order to show that we can find biologically relevant signals. For that case, our results clearly show that the Shine-Dalgarno sequence is the most important signal upstream of a start codon. The variability in position and composition we found for that signal is in accordance with previous biological knowledge. We also find evidence for signals downstream of the start codon, previously introduced as transcriptional enhancers. These signals are mainly characterized by occurrences of adenine in a region of about 4 nucleotides next to the start codon.

Conclusions: We showed that the oligo kernel can provide a valuable tool for the analysis of relevant signals in biological sequences. In the case of translation initiation sites we could clearly deduce the most discriminative motifs and their positional variation from example sequences. Attractive features of our approach are its flexibility with respect to oligomer length and position conservation. By means of these two parameters oligo kernels can easily be adapted to different biological problems.
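A position-dependent oligomer kernel of the kind described in this abstract can be sketched in a few lines of Python. This is an illustrative reading, not the published code: every shared oligomer of length `k` contributes a Gaussian term that decays with the distance between its positions in the two sequences, and `sigma` plays the role of the parameter controlling position-dependency. The function names and the constant factor `sqrt(pi) * sigma` are assumptions.

```python
import math
from collections import defaultdict


def oligo_positions(seq, k):
    """Map each oligomer of length k to the list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def oligo_kernel(s, t, k=3, sigma=1.0):
    """Gaussian-smoothed positional oligomer kernel (sketch).

    A small sigma rewards only oligomers at nearly identical positions;
    a large sigma approaches a position-independent count of shared oligomers.
    """
    pos_s = oligo_positions(s, k)
    pos_t = oligo_positions(t, k)
    total = 0.0
    for oligo, starts_s in pos_s.items():
        starts_t = pos_t.get(oligo)
        if not starts_t:
            continue
        for p in starts_s:
            for q in starts_t:
                total += math.exp(-((p - q) ** 2) / (4.0 * sigma ** 2))
    # Assumed normalisation constant; it rescales all values equally.
    return math.sqrt(math.pi) * sigma * total
```

Because all kernel values are computed directly from position lists, no explicit high-dimensional feature vector is ever built, which is the point of the kernel trick mentioned in the abstract.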

Nucleic Acids Research | 2015

iBeetle-Base: a database for RNAi phenotypes in the red flour beetle Tribolium castaneum

Jürgen Dönitz; Christian Schmitt-Engel; Daniela Grossmann; Lizzy Gerischer; Maike Tech; Michael Schoppmeier; Martin Klingler; Gregor Bucher

The iBeetle-Base (http://ibeetle-base.uni-goettingen.de) makes available annotations of RNAi phenotypes, which were gathered in a large scale RNAi screen in the red flour beetle Tribolium castaneum (iBeetle screen). In addition, it provides access to sequence information and links for all Tribolium castaneum genes. The iBeetle-Base contains the annotations of phenotypes of several thousands of genes knocked down during embryonic and metamorphic epidermis and muscle development in addition to phenotypes linked to oogenesis and stink gland biology. The phenotypes are described according to the EQM (entity, quality, modifier) system using controlled vocabularies and the Tribolium morphological ontology (TrOn). Furthermore, images linked to the respective annotations are provided. The data are searchable either for specific phenotypes using a complex ‘search for morphological defects’ or a ‘quick search’ for gene names and IDs. The red flour beetle Tribolium castaneum has become an important model system for insect functional genetics and is a representative of the most species rich taxon, the Coleoptera, which comprise several devastating pests. It is used for studying insect typical development, the evolution of development and for research on metabolism and pest control. Besides Drosophila, Tribolium is the first insect model organism where large scale unbiased screens have been performed.

New Phytologist | 2014

Verticillium transcription activator of adhesion Vta2 suppresses microsclerotia formation and is required for systemic infection of plant roots

Van-Tuan Tran; Susanna A. Braus-Stromeyer; Harald Kusch; Michael Reusche; Alexander Kaever; Anika Kühn; Oliver Valerius; Manuel Landesfeind; Kathrin Petra Aßhauer; Maike Tech; Katharina Hoff; Tonatiuh Pena‐Centeno; Mario Stanke; Volker Lipka; Gerhard H. Braus

Six transcription regulatory genes of the Verticillium plant pathogen, which reprogrammed nonadherent budding yeasts for adhesion, were isolated by a genetic screen to identify control elements for early plant infection. Verticillium transcription activator of adhesion Vta2 is highly conserved in filamentous fungi but not present in yeasts. The Magnaporthe grisea ortholog conidiation regulator Con7 controls the formation of appressoria which are absent in Verticillium species. Vta2 was analyzed by using genetics, cell biology, transcriptomics, secretome proteomics and plant pathogenicity assays. Nuclear Vta2 activates the expression of the adhesin-encoding yeast flocculin genes FLO1 and FLO11. Vta2 is required for fungal growth of Verticillium where it is a positive regulator of conidiation. Vta2 is mandatory for accurate timing and suppression of microsclerotia as resting structures. Vta2 controls expression of 270 transcripts, including 10 putative genes for adhesins and 57 for secreted proteins. Vta2 controls the level of 125 secreted proteins, including putative adhesins or effector molecules and a secreted catalase-peroxidase. Vta2 is a major regulator of fungal pathogenesis, and controls host-plant root infection and H2O2 detoxification. Verticillium impaired in Vta2 is unable to colonize plants and induce disease symptoms. Vta2 represents an interesting target for controlling the growth and development of these vascular pathogens.

Nature Communications | 2015

The iBeetle large-scale RNAi screen reveals gene functions for insect development and physiology

Christian Schmitt-Engel; Dorothea Schultheis; Nadi Ströhlein; Nicole Troelenberg; Upalparna Majumdar; Van Anh Dao; Daniela Grossmann; Tobias Richter; Maike Tech; Jürgen Dönitz; Lizzy Gerischer; Mirko Theis; Inga Schild; Jochen Trauner; Nikolaus Koniszewski; Elke Küster; Sebastian Kittelmann; Yonggang Hu; Sabrina Lehmann; Janna Siemanowski; Julia Ulrich; Kristen A. Panfilio; Reinhard Schröder; Burkhard Morgenstern; Mario Stanke; Frank Buchholz; Manfred Frasch; Siegfried Roth; Ernst A. Wimmer; Michael Schoppmeier

Genetic screens are powerful tools to identify the genes required for a given biological process. However, for technical reasons, comprehensive screens have been restricted to very few model organisms. Therefore, although deep sequencing is revealing the genes of ever more insect species, the functional studies predominantly focus on candidate genes previously identified in Drosophila, which is biasing research towards conserved gene functions. RNAi screens in other organisms promise to reduce this bias. Here we present the results of the iBeetle screen, a large-scale, unbiased RNAi screen in the red flour beetle, Tribolium castaneum, which identifies gene functions in embryonic and postembryonic development, physiology and cell biology. The utility of Tribolium as a screening platform is demonstrated by the identification of genes involved in insect epithelial adhesion. This work transcends the restrictions of the candidate gene approach and opens fields of research not accessible in Drosophila.

BMC Bioinformatics | 2006

An unsupervised classification scheme for improving predictions of prokaryotic TIS

Maike Tech; Peter Meinicke

Background: Although it is not difficult for state-of-the-art gene finders to identify coding regions in prokaryotic genomes, exact prediction of the corresponding translation initiation sites (TIS) is still a challenging problem. Recently a number of post-processing tools have been proposed for improving the annotation of prokaryotic TIS. However, inherent difficulties of these approaches arise from the considerable variation of TIS characteristics across different species. Therefore prior assumptions about the properties of prokaryotic gene starts may cause suboptimal predictions for newly sequenced genomes with TIS signals differing from those of well-investigated genomes.

Results: We introduce a clustering algorithm for completely unsupervised scoring of potential TIS, based on positionally smoothed probability matrices. The algorithm requires an initial gene prediction and the genomic sequence of the organism to perform the reannotation. As compared with other methods for improving predictions of gene starts in bacterial genomes, our approach is not based on any specific assumptions about prokaryotic TIS. Despite the generality of the underlying algorithm, the prediction rate of our method is competitive on experimentally verified test data from E. coli and B. subtilis. Regarding genomes with high G+C content, in contrast to some previously proposed methods, our algorithm also provides good performance on P. aeruginosa, B. pseudomallei and R. solanacearum.

Conclusion: On reliable test data we showed that our method provides good results in post-processing the predictions of the widely-used program GLIMMER. The underlying clustering algorithm is robust with respect to variations in the initial TIS annotation and does not require specific assumptions about prokaryotic gene starts. These features are particularly useful on genomes with high G+C content. The algorithm has been implemented in the tool »TICO« (TIs COrrector) which is publicly available from our web site.
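To make the phrase "positionally smoothed probability matrices" more concrete, the following Python sketch builds a position-specific nucleotide probability matrix from equal-length sequence windows and spreads each observed count over neighbouring positions with Gaussian weights. It is a simplified illustration under several assumptions, not the TICO algorithm: it works on single nucleotides, omits the clustering step entirely, and the function names, `sigma`, `pseudocount` and the uniform background are all hypothetical choices.

```python
import math

ALPHABET = "ACGT"


def smoothed_probability_matrix(windows, sigma=1.0, pseudocount=1.0):
    """Position-specific base probabilities with Gaussian smoothing across positions.

    All windows are assumed to have the same length and to be aligned
    relative to a candidate start codon.
    """
    length = len(windows[0])
    counts = [[0.0] * len(ALPHABET) for _ in range(length)]
    for window in windows:
        for pos, base in enumerate(window.upper()):
            if base in ALPHABET:
                counts[pos][ALPHABET.index(base)] += 1.0

    matrix = []
    for pos in range(length):
        row = [pseudocount] * len(ALPHABET)
        for other in range(length):
            # Counts at nearby positions contribute with decaying weight.
            weight = math.exp(-((pos - other) ** 2) / (2.0 * sigma ** 2))
            for b in range(len(ALPHABET)):
                row[b] += weight * counts[other][b]
        total = sum(row)
        matrix.append([value / total for value in row])
    return matrix


def log_odds_score(window, matrix, background=0.25):
    """Sum of per-position log-odds against a uniform background (assumption)."""
    score = 0.0
    for pos, base in enumerate(window.upper()):
        if base in ALPHABET:
            score += math.log(matrix[pos][ALPHABET.index(base)] / background)
    return score
```

Smoothing across positions lets a signal whose distance to the start codon varies slightly, such as a ribosome binding site, still accumulate evidence in the matrix instead of being diluted over separate columns.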

Nucleic Acids Research | 2006

TICO: a tool for postprocessing the predictions of prokaryotic translation initiation sites

Maike Tech; Burkhard Morgenstern; Peter Meinicke

Exact localization of the translation initiation sites (TIS) in prokaryotic genomes is difficult to achieve using conventional gene finders. We recently introduced the program TICO for postprocessing TIS predictions based on a completely unsupervised learning algorithm. The program can be used through our web interface, and it is also freely available as a command-line version for Linux and Windows. The latest version of our program provides a tool for visualization of the resulting TIS model. Although the underlying method is not based on any specific assumptions about characteristic sequence features of prokaryotic TIS, the prediction rates of our tool are competitive on experimentally verified test data.

in Silico Biology | 2003