Antje Krause
Max Planck Society
Publication
Featured research published by Antje Krause.
Nucleic Acids Research | 2000
Antje Krause; Jens Stoye; Martin Vingron
The SYSTERS (short for SYSTEmatic Re-Searching) protein sequence cluster set consists of the classification of all sequences from SWISS-PROT and PIR into disjoint protein family clusters and hierarchically into superfamily and subfamily clusters. The cluster set can be searched with a sequence using the SSMAL search tool or a traditional database search tool like BLAST or FASTA. Additionally a multiple alignment is generated for each cluster and annotated with domain information from the Pfam database of protein domain families. A taxonomic overview of the organisms covered by a cluster is given based on the NCBI taxonomy. The cluster set is available for querying and browsing at http://www.dkfz-heidelberg.de/tbi/services/cluster/systersform
BMC Bioinformatics | 2005
Antje Krause; Jens Stoye; Martin Vingron
Background: Searching a biological sequence database with a query sequence looking for homologues has become a routine operation in computational biology. In spite of the high degree of sophistication of currently available search routines it is still virtually impossible to identify quickly and clearly a group of sequences that a given query sequence belongs to. Results: We report on our developments in grouping all known protein sequences hierarchically into superfamily and family clusters. Our graph-based algorithms take into account the topology of the sequence space induced by the data itself to construct a biologically meaningful partitioning. We have applied our clustering procedures to a non-redundant set of about 1,000,000 sequences resulting in a hierarchical clustering which is being made available for querying and browsing at http://systers.molgen.mpg.de/. Conclusions: Comparisons with other widely used clustering methods on various data sets show the abilities and strengths of our clustering methods in producing a biologically meaningful grouping of protein sequences.
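As a rough illustration of what graph-based sequence clustering means in its simplest form, the sketch below performs single-linkage clustering: sequences are nodes, an edge joins every pair whose similarity score reaches a cutoff, and each connected component becomes a cluster. This is a generic baseline, not the SYSTERS procedure, which according to the abstract adapts to the topology of the sequence space rather than using one fixed cutoff. The function name, the sequence ids, the scores, and the cutoff are all hypothetical.

```python
from collections import defaultdict

def single_linkage_clusters(ids, scored_pairs, cutoff):
    """Return connected components of the graph whose edges are the pairs
    scoring at least `cutoff` (scores are hypothetical similarity values)."""
    parent = {i: i for i in ids}

    def find(x):
        # Walk up to the root of x's set, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_pairs:
        if score >= cutoff:
            # Merge the two sets that a and b belong to.
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for i in ids:
        groups[find(i)].add(i)
    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(g) for g in groups.values())

# Toy data: made-up sequence ids and pairwise similarity scores.
ids = ["s1", "s2", "s3", "s4", "s5"]
pairs = [("s1", "s2", 90), ("s2", "s3", 75), ("s3", "s4", 20), ("s4", "s5", 80)]
clusters = single_linkage_clusters(ids, pairs, cutoff=50)
print(clusters)
```

With a single global cutoff, one weak link is enough to merge two otherwise distinct families, which is one reason a data-driven partitioning can be preferable.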
Nucleic Acids Research | 2002
Antje Krause; Stefan A. Haas; Eivind Coward; Martin Vingron
We have integrated the protein families from SYSTERS and the expressed sequence tag (EST) clusters from our database GeneNest with SpliceNest, a new database mapping EST contigs into genomic DNA. The SYSTERS protein sequence cluster set provides an automatically generated classification of all sequences of the SWISS-PROT, TrEMBL and PIR databases into disjoint protein family and superfamily clusters. GeneNest is a database and software package for producing and visualizing gene indices from ESTs and mRNAs. Currently, the database comprises gene indices of human, mouse, Arabidopsis thaliana and zebrafish. SpliceNest is a web-based graphical tool to explore gene structure, including alternative splicing, based on a mapping of the EST consensus sequences from GeneNest to the complete human genome. The integration of SYSTERS, GeneNest and SpliceNest into one framework now permits an overall exploration of the whole sequence space covering protein, mRNA and EST sequences, as well as genomic DNA. The databases are available for querying and browsing at http://cmb.molgen.mpg.de.
Bioinformatics | 1998
Antje Krause; Martin Vingron
MOTIVATION In this paper, we introduce an iterative method of database searching and apply it to design a database clustering algorithm applicable to an entire protein database. The clustering procedure relies on the quality of the database searching routine and further improves its results based on a set-theoretic analysis of a highly redundant yet efficient to generate cluster system. RESULTS Overall, we achieve unambiguous assignment of 80% of SWISS-PROT sequences to non-overlapping sequence clusters in an entirely automatic fashion. Our results are compared to an expert-generated clustering for validation. The database searching method is fast and the clustering technique does not require time-consuming all-against-all comparison. This allows for fast clustering of large amounts of sequences. AVAILABILITY The resulting clustering for the PIR1 (Release 51) and SWISS-PROT (Release 34) databases is available over the Internet from http://www.dkfz-heidelberg.de/tbi/services/modest/browsesysters.pl.
Trends in Genetics | 2000
Stefan A. Haas; Tim Beissbarth; Eric Rivals; Antje Krause; Martin Vingron
positive feedback from these external users, we recommend our RUMMAGE annotation service to everyone who wants to get a quick and comprehensive overview of genomic sequence data. The RUMMAGE Sequence Annotation Service is available at http://gen100.imb-jena.de/~baumgart/rummage/register.html. The URL leads to a registration form that has to be submitted before the first use. This is necessary to provide a user-specific password, which ensures confidential treatment of the sequence data and the corresponding annotation results. As soon as the password is assigned, each user may run as many jobs as desired.
Nucleic Acids Research | 2004
Thomas Meinel; Antje Krause; Hannes Luz; Martin Vingron; Eike Staub
The SYSTERS project aims to provide a meaningful partitioning of the whole protein sequence space by a fully automatic procedure. A refined two-step algorithm assigns each protein to a family and a superfamily. The sequence data underlying SYSTERS release 4 now comprise several protein sequence databases derived from completely sequenced genomes (ENSEMBL, TAIR, SGD and GeneDB), in addition to the comprehensive Swiss-Prot/TrEMBL databases. The SYSTERS web server (http://systers.molgen.mpg.de) provides access to 158 153 SYSTERS protein families. To augment the automatically derived results, information from external databases like Pfam and Gene Ontology is added to the web server. Furthermore, users can retrieve pre-processed analyses of families, such as multiple alignments and phylogenetic trees. New query options comprise a batch retrieval tool for functional inference about families based on automatic keyword extraction from sequence annotations. A new access point, PhyloMatrix, allows the retrieval of phylogenetic profiles of SYSTERS families across organisms with completely sequenced genomes.
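A phylogenetic profile, as retrieved through PhyloMatrix, can be pictured as one presence/absence flag per organism for each family. The sketch below builds such profiles from a family-to-organism membership table and compares two of them by Hamming distance. It is a minimal illustration only: the family names, the organism list, the membership data, and both function names are hypothetical and do not reflect the actual PhyloMatrix data format or interface.

```python
def phylogenetic_profiles(family_members, organisms):
    """Map each family to a tuple of 0/1 flags, one per organism, in the
    order given by `organisms` (1 = family has a member in that organism)."""
    return {
        family: tuple(int(org in present) for org in organisms)
        for family, present in family_members.items()
    }

def hamming(p, q):
    """Number of organisms in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))

# Hypothetical example data, not taken from SYSTERS.
organisms = ["E. coli", "S. cerevisiae", "A. thaliana", "H. sapiens"]
family_members = {
    "famA": {"E. coli", "S. cerevisiae", "A. thaliana", "H. sapiens"},
    "famB": {"S. cerevisiae", "A. thaliana", "H. sapiens"},
    "famC": {"E. coli"},
}

profiles = phylogenetic_profiles(family_members, organisms)
for family, profile in profiles.items():
    print(family, profile)
print("famA vs famB:", hamming(profiles["famA"], profiles["famB"]))
```

Families with similar profiles tend to be gained and lost together across genomes, which is why such profiles are often used as a hint of functional linkage.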
German Conference on Bioinformatics | 1999
Antje Krause; Pierre Nicodème; Erich Bornberg-Bauer; Marc Rehmsmeier; Martin Vingron
SUMMARY We present a Web server where the SYSTERS cluster set of the non-redundant protein database consisting of sequences from SWISS-PROT and PIR is being made available for querying and browsing. The cluster set can be searched with a new sequence using the SSMAL search tool. Additionally, a multiple alignment is generated for each cluster and annotated with domain information from the Pfam protein family database. AVAILABILITY The server address is http://www.dkfz-heidelberg.de/tbi/services/cluster/systersform
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2012
Chong Wang; Peter Beyerlein; Heike Pospisil; Antje Krause; Chris D. Nugent; Werner Dubitzky
Characterization of the kinetic and conformational properties of channel proteins is a crucial element in the integrative study of congenital cardiac diseases. The proteins of the ion channels of cardiomyocytes represent an important family of biological components determining the physiology of the heart. Some computational studies aiming to understand the mechanisms of the ion channels of cardiomyocytes have concentrated on Markovian stochastic approaches. Mathematically, these approaches employ Chapman-Kolmogorov equations coupled with partial differential equations. As the scale and complexity of such subcellular and cellular models increase, the balance between efficiency and accuracy of algorithms becomes critical. We have developed a novel two-stage splitting algorithm to address efficiency and accuracy issues arising in such modeling and simulation scenarios. Numerical experiments were performed based on the incorporation of our newly developed conformational kinetic model for the rapid delayed rectifier potassium channel into the dynamic models of human ventricular myocytes. Our results show that the new algorithm significantly outperforms commonly adopted adaptive Runge-Kutta methods. Furthermore, our parallel simulations with coupled algorithms for multicellular cardiac tissue demonstrate a high linearity in the speedup of large-scale cardiac simulations.
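To make the efficiency-versus-accuracy trade-off concrete, consider the simplest Markovian channel model: a single channel switching between a closed and an open state with constant rates. Its open probability p obeys the forward equation dp/dt = alpha (1 - p) - beta p, which has a closed-form solution. The sketch below compares that solution with a fixed-step forward Euler integration. It is a toy example under stated assumptions, not the two-stage splitting algorithm of the paper and not the rapid delayed rectifier model; the rates `alpha` and `beta` and both function names are hypothetical.

```python
import math

def open_probability_exact(p0, alpha, beta, t):
    """Closed-form open probability of a two-state channel at time t."""
    p_inf = alpha / (alpha + beta)  # steady-state open probability
    return p_inf + (p0 - p_inf) * math.exp(-(alpha + beta) * t)

def open_probability_euler(p0, alpha, beta, t, steps):
    """Forward Euler integration of dp/dt = alpha*(1 - p) - beta*p."""
    dt = t / steps
    p = p0
    for _ in range(steps):
        p += dt * (alpha * (1.0 - p) - beta * p)
    return p

# Hypothetical opening and closing rates (arbitrary inverse time units).
alpha, beta = 2.0, 1.0

exact = open_probability_exact(0.0, alpha, beta, t=1.0)
coarse = open_probability_euler(0.0, alpha, beta, t=1.0, steps=10)
fine = open_probability_euler(0.0, alpha, beta, t=1.0, steps=1000)

print(f"exact={exact:.5f} coarse error={abs(coarse - exact):.5f} "
      f"fine error={abs(fine - exact):.5f}")
```

The finer grid is more accurate but costs a hundred times as many updates; in models with many channel states per cell and many cells, that cost is what motivates more careful integration schemes.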
Evolutionary Bioinformatics | 2012
Thomas Meinel; Antje Krause
In the last two decades, a large number of whole-genome phylogenies have been inferred to reconstruct the Tree of Life (ToL). Underlying data models range from gene or functionality content in species to phylogenetic gene family trees and multiple sequence alignments of concatenated protein sequences. Diversity in data models together with the use of different tree reconstruction techniques, disruptive biological effects and the steadily increasing number of genomes have led to a huge diversity in published phylogenies. Comparison of those and, moreover, identification of the impact of inference properties (underlying data model, inference technique) on particular reconstructions is almost impossible. In this work, we introduce tree topology profiling as a method to compare already published whole-genome phylogenies. This method requires visual determination of the particular topology in a drawn whole-genome phylogeny for a set of particular bacterial clans. For each clan, neighborhoods to other bacteria are collected into a catalogue of generalized alternative topologies. Particular topology alternatives found for an ordered list of bacterial clans reveal a topology profile that represents the analyzed phylogeny. To simulate the inhomogeneity of published gene content phylogenies, we generate a set of seven phylogenies using different inference techniques and the SYSTERS-PhyloMatrix data model. After tree topology profiling on a total of 54 selected published and newly inferred phylogenies, we separate artefactual from biologically meaningful phylogenies and associate particular inference results (phylogenies) with inference background (inference techniques as well as data models). Topological relationships of particular bacterial species groups are presented. With this work we introduce tree topology profiling into the scientific field of comparative phylogenomics.
International Conference on Bioinformatics | 2009
Stephanie Tscherneck; Sarah Strunk; Christian Schmidt; Paul Hammer; Ronny Amberg; Chong Wang; Rainer Gillert; Antje Krause; Gabriele Petznick; Peter Beyerlein
We are exploring a novel information-theoretic approach to analysing the relation between the number of protein interactions (interaction perplexity) and protein function. The interaction perplexity is translated into a smooth sigmoid function indicating whether or not the protein is a hub candidate (hub likelihood). The overall mutual information between the hub likelihood and the protein function exhibits a clear maximum at an interaction perplexity of 21. Moreover, the conditional probability of observing a certain protein function given the hub likelihood shows discriminative power. In addition, our findings support the view that the mutual information between hub likelihood and protein function depends on the cell compartment under investigation, with its maximum reaching up to 29% of the achievable information value at interaction perplexities ranging from 15 to 50.
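The two quantities named in this abstract can be sketched in a few lines: a logistic (sigmoid) function that turns an interaction count into a hub likelihood, and the mutual information between a thresholded hub label and a functional class. This is a minimal illustration and not the authors' pipeline. The interaction counts and function labels are made up, the `steepness` parameter and the 0.5 threshold are assumptions, and the midpoint of 21 is borrowed from the reported maximum purely as an example value.

```python
import math
from collections import Counter

def hub_likelihood(n_interactions, midpoint, steepness=1.0):
    """Logistic function of the interaction count (parameters are assumed)."""
    return 1.0 / (1.0 + math.exp(-steepness * (n_interactions - midpoint)))

def mutual_information(xs, ys):
    """Mutual information in bits between two equally long label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

# Made-up interaction counts and functional classes for eight proteins.
degrees = [2, 3, 40, 55, 1, 30, 4, 60]
functions = ["enzyme", "enzyme", "signaling", "signaling",
             "enzyme", "signaling", "enzyme", "signaling"]

# Call a protein a hub candidate when its likelihood reaches 0.5 (assumed).
is_hub = [hub_likelihood(d, midpoint=21) >= 0.5 for d in degrees]
mi = mutual_information(is_hub, functions)
print(f"mutual information: {mi:.3f} bits")
```

Sweeping `midpoint` over a range of values and recording the mutual information at each one would give a curve whose peak plays the role of the maximum described above.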