Thomas Nordahl Petersen
Technical University of Denmark
Publication
Featured research published by Thomas Nordahl Petersen.
Nature Methods | 2011
Thomas Nordahl Petersen; Søren Brunak; Gunnar von Heijne; Henrik Nielsen
We benchmarked SignalP 4.0 against SignalP 3.0 and ten other signal peptide prediction algorithms (Fig. 1). We compared prediction performance using the Matthews correlation coefficient (ref. 16), for which each sequence was counted as a true or false positive or negative. To test SignalP 4.0 performance, we did not use data that had been used in training the networks or selecting the optimal architecture, and the test data did not contain homologs to the training and optimization data (Supplementary Methods). The test set for SignalP 3.0 was also independent of the training set because we removed sequences used to construct SignalP 3.0 and their homologs from the benchmark data. For other algorithms more recent than SignalP 3.0, the benchmark data may include data used to train the methods, possibly leading to slight overestimations of their performance. Our results show that SignalP 4.0 was the best signal-peptide predictor for all three organism types (Fig. 1). This comes at a price, however, because SignalP 4.0 was not in all cases as good as SignalP 3.0 according to cleavage-site sensitivity or signal-peptide correlation when there are no transmembrane proteins present (Supplementary Results). An ideal method would have the best ...
SignalP 4.0: discriminating signal peptides from transmembrane regions
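The Matthews correlation coefficient mentioned above can be computed directly from the four per-sequence counts. The following is a minimal sketch, not the benchmark code used in the paper; the function name and the example counts are illustrative assumptions.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of sequences counted as true positive,
    true negative, false positive and false negative.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: return 0 when a row or column of the
        # confusion matrix is empty and the coefficient is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to illustrate the calculation.
example_mcc = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
print(round(example_mcc, 3))
```

A value of 1 corresponds to perfect prediction, 0 to a prediction no better than random, and -1 to complete disagreement.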
BMC Structural Biology | 2009
Bent Petersen; Thomas Nordahl Petersen; Pernille Andersen; Morten Nielsen; Claus Lundegaard
Background: Estimation of the reliability of specific real value predictions is nontrivial and the efficacy of this is often questionable. It is important to know if you can trust a given prediction and therefore the best methods associate a prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability for each prediction, in the form of a Z-score. Results: An ensemble of artificial neural networks has been trained on a set of experimentally solved protein structures to predict the relative exposure of the amino acids. The method assigns a reliability score to each surface accessibility prediction as an inherent part of the training process. This is in contrast to the most commonly used procedures where reliabilities are obtained by post-processing the output. Conclusion: The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best publicly available method, Real-SPINE. Both methods associate a reliability score with the individual predictions. However, our implementation of reliability scores in the form of a Z-score is shown to be the more informative measure for discriminating good predictions from bad ones in the entire range from completely buried to fully exposed amino acids. This is evident when comparing the Pearson's correlation coefficient for the upper 20% of predictions sorted according to reliability. For this subset, values of 0.79 and 0.74 are obtained using our method and the compared method, respectively. This tendency holds for any selected subset.
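The evaluation described above, a Pearson correlation restricted to the most reliable fraction of predictions, can be sketched as follows. This is an illustrative sketch only: in the paper the reliability is a Z-score predicted by the networks, whereas here it is simply a list of numbers supplied by the caller, and all names and values are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pearson_top_fraction(predicted, observed, reliability, fraction=0.2):
    """Correlation computed on the `fraction` of predictions with the
    highest reliability score (for example the upper 20%)."""
    order = sorted(range(len(predicted)),
                   key=lambda i: reliability[i], reverse=True)
    k = max(2, int(round(fraction * len(predicted))))  # need >= 2 points
    top = order[:k]
    return pearson([predicted[i] for i in top],
                   [observed[i] for i in top])


# Toy data (hypothetical): the three most reliable predictions are exact.
predicted = [0.1, 0.5, 0.9, 0.3, 0.7]
observed = [0.1, 0.5, 0.9, 0.9, 0.1]
reliability = [3.0, 2.0, 3.0, 0.0, 0.0]
print(pearson(predicted, observed))
print(pearson_top_fraction(predicted, observed, reliability, fraction=0.6))
```

If the reliability score is informative, the correlation on the top-ranked subset should exceed the correlation on the full set, which is the comparison the abstract reports.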
Nucleic Acids Research | 2010
Morten Nielsen; Claus Lundegaard; Ole Lund; Thomas Nordahl Petersen
CPHmodels-3.0 is a web server predicting protein 3D structure by use of single template homology modeling. The server employs a hybrid of the scoring functions of CPHmodels-2.0 and a novel remote homology-modeling algorithm. The server first attempts to model a query sequence using the fast CPHmodels-2.0 profile–profile scoring function, which is suitable for close homology modeling. The new, computationally costly remote homology-modeling algorithm is engaged only if no suitable PDB template is identified in the initial search. CPHmodels-3.0 was benchmarked in the CASP8 competition and produced models for 94% of the targets (117 out of 128); 74% of these were predicted as high-reliability models (87 out of 117). These achieved an average RMSD of 4.6 Å when superimposed on the 3D structure. The remaining 26% low-reliability models (30 out of 117) could be superimposed on the true 3D structure with an average RMSD of 9.3 Å. These performance values place the CPHmodels-3.0 method in the group of high-performing 3D prediction tools. Besides its accuracy, one of the important features of the method is its speed. For most queries, the response time of the server is <20 min. The web server is available at http://www.cbs.dtu.dk/services/CPHmodels/.
Proteins | 2000
Thomas Nordahl Petersen; Claus Lundegaard; Morten Nielsen; Henrik Bohr; Jakob Bohr; Søren Brunak; Garry P. Gippert; Ole Lund
A secondary structure prediction method involving up to 800 neural network predictions has been developed, using novel methods such as output expansion and a unique balloting procedure. An overall performance of 77.2%–80.2% (77.9%–80.6% mean per‐chain) for three‐state (helix, strand, coil) prediction was obtained when evaluated on a commonly used set of 126 protein chains. The method uses profiles made by position‐specific scoring matrices as input, while at the output level it predicts on three consecutive residues simultaneously. The predictions arise from tenfold cross‐validated training and testing of 1032 protein sequences, using a scheme with primary structure neural networks followed by structure filtering neural networks. With respect to blind prediction, this work is preliminary and awaits evaluation by CASP4. Proteins 2000;41:17–20.
Journal of Biological Chemistry | 2011
Katrine T. Schjoldager; Malene Bech Vester-Christensen; Christoffer K. Goth; Thomas Nordahl Petersen; Søren Brunak; Eric P. Bennett; Steven B. Levery; Henrik Clausen
Background: GalNAc-type O-glycosylation is emerging as a co-regulator of proprotein convertase processing of proteins. Results: O-Glycosylation within at least ±3 residues of the RXXR substrate motif for furin affected processing. Conclusion: Site-specific O-glycosylation by 20 polypeptide GalNAc transferases has wide co-regulatory functions in proprotein processing. Significance: This is the first systematic study that paves the way for wider co-regulatory functions of O-glycosylation in protein processing.
Site-specific GalNAc-type O-glycosylation is emerging as an important co-regulator of proprotein convertase (PC) processing of proteins. PC processing is crucial in regulating many fundamental biological pathways, and O-glycans in or immediately adjacent to processing sites may affect recognition and function of PCs. Thus, we previously demonstrated that deficiency in site-specific O-glycosylation in a PC site of the fibroblast growth factor, FGF23, resulted in a marked reduction in secretion of active unprocessed FGF23, which causes familial tumoral calcinosis and hyperostosis hyperphosphatemia. GalNAc-type O-glycosylation is found on serine and threonine amino acids, and up to 20 distinct polypeptide GalNAc transferases catalyze the first addition of GalNAc to proteins, making this the most complex and differentially regulated step in protein glycosylation. There is no reliable prediction model for O-glycosylation, especially of isolated sites, but serine and to a lesser extent threonine residues are frequently found adjacent to PC processing sites. In the present study we used in vitro enzyme assays and ex vivo cell models to systematically address the boundaries of the region within which site-specific O-glycosylation affects PC processing. The results demonstrate that O-glycans within at least ±3 residues of the RXXR furin cleavage site may affect PC processing, suggesting that site-specific O-glycosylation is a major co-regulator of PC processing.
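To make the "±3 residues of the RXXR motif" criterion concrete, the sketch below scans a protein sequence for RXXR motifs and lists serine and threonine residues inside the motif or within a flanking window. It is only a sequence-pattern illustration under stated assumptions: the function name, the window handling and the toy sequence are hypothetical, and finding a Ser/Thr near a motif says nothing about whether it is actually glycosylated.

```python
import re


def rxxr_sites_with_nearby_ser_thr(sequence, window=3):
    """Find RXXR motifs and Ser/Thr residues within `window` residues.

    Returns a list of (motif_start, [ser_thr_positions]) tuples using
    0-based positions. Residues inside the motif itself are included.
    """
    results = []
    # A lookahead is used so that overlapping motifs are all reported.
    for match in re.finditer(r"(?=(R..R))", sequence):
        start = match.start()
        end = start + 4  # exclusive end of the four-residue motif
        lo = max(0, start - window)
        hi = min(len(sequence), end + window)
        candidates = [i for i in range(lo, hi) if sequence[i] in "ST"]
        results.append((start, candidates))
    return results


# Toy sequence (hypothetical, not a real protein).
print(rxxr_sites_with_nearby_ser_thr("AAASRAKRTAAA"))
```

For the toy sequence the motif RAKR starts at position 4, and the serine at position 3 and the threonine at position 8 both fall inside the ±3 window.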
Experimental Neurology | 2007
Nicolaj S. Christophersen; Mette Grønborg; Thomas Nordahl Petersen; Lone Fjord-Larsen; Jesper Roland Jørgensen; Bengt Juliusson; Nikolaj Blom; Carl Rosenblad; Patrik Brundin
Affymetrix GeneChip technology and quantitative real-time PCR (Q-PCR) were used to examine changes in gene expression in the adult murine substantia nigra pars compacta (SNc) following lentiviral glial cell line-derived neurotrophic factor (GDNF) delivery in adult striatum. We identified several genes that were upregulated after GDNF treatment. Among these, the gene encoding the transmembrane protein Delta-like 1 homologue (Dlk1) was upregulated with a greater than 4-fold increase in mRNA encoding this protein. Immunohistochemistry with a Dlk1-specific antibody confirmed the observed upregulation with increased positive staining of cell bodies in the SNc and fibers in the striatum. Analysis of the developmental regulation of Dlk1 in the murine ventral midbrain showed that the upregulation of Dlk1 mRNA correlated with the generation of tyrosine hydroxylase (TH)-positive neurons. Furthermore, Dlk1 expression was analyzed in MesC2.10 cells, which are derived from embryonic human mesencephalon and capable of undergoing differentiation into dopaminergic neurons. We detected upregulation of Dlk1 mRNA and protein under conditions where MesC2.10 cells differentiate into a dopaminergic phenotype (41.7 ± 7.1% Dlk1+ cells). In contrast, control cultures subjected to default differentiation into non-dopaminergic neurons only expressed very few (3.7 ± 1.3%) Dlk1-immunopositive cells. The expression of Dlk1 in MesC2.10 cells was specifically upregulated by the addition of GDNF. Thus, our data suggest that Dlk1 expression precedes the appearance of TH in mesencephalic cells and that levels of Dlk1 are regulated by GDNF.
Scientific Reports | 2015
Thomas Nordahl Petersen; Simon Rasmussen; Henrik Hasman; Christian Carøe; Jacob Bælum; Anna Charlotte Schultz; Lasse Bergmark; Christina Aaby Svendsen; Ole Lund; Thomas Sicheritz-Pontén; Frank Møller Aarestrup
Human populations worldwide are increasingly confronted with infectious diseases and antimicrobial resistance spreading faster and appearing more frequently. Knowledge regarding their occurrence and worldwide transmission is important to control outbreaks and prevent epidemics. Here, we performed shotgun sequencing of toilet waste from 18 international airplanes arriving in Copenhagen, Denmark, from nine cities in three world regions. An average of 18.6 Gb (14.8 to 25.7 Gb) of raw Illumina paired-end sequence data was generated, cleaned, trimmed and mapped against reference sequence databases for bacteria and antimicrobial resistance genes. An average of 106,839 (0.06%) reads were assigned to resistance genes, with genes encoding resistance to tetracycline, macrolide and beta-lactam antibiotics the most abundant in all samples. We found significantly higher abundance and diversity of genes encoding antimicrobial resistance, including critically important resistance (e.g. blaCTX-M), carried on airplanes from South Asia compared to North America. Salmonella enterica and norovirus were also detected in higher amounts in samples from South Asia, whereas Clostridium difficile was most abundant in samples from North America. Our study provides a first step towards a potential novel strategy for global surveillance enabling simultaneous detection of multiple human health threatening genetic elements, infectious agents and resistance genes.
Experimental Neurology | 2006
Jesper Roland Jørgensen; Bengt Juliusson; Karen Friis Henriksen; Claus Hansen; Steen Knudsen; Thomas Nordahl Petersen; Nikolaj Blom; Åke Seiger; Lars Wahlberg
In the human embryo, from approximately 6 weeks gestational age (GA), dopaminergic (DA) neurons can be found in the ventral mesencephalon (VM). More specifically, the post-mitotic neurons are located in the ventral part of the tegmentum (VT), whereas no mature DA neurons are found in the neighboring dorsal part. We used Affymetrix HG-U133 GeneChip technology to compare genome-wide expression profiles of ventral and dorsal tegmentum from 8 weeks GA human embryos, in order to identify genes involved in specification, differentiation, and survival of mesencephalic DA (mDA) neurons. Known mDA marker genes including ALDH1A1, DAT1, VMAT2, TH, CALB1, NURR1, FOXA1, GIRK2, PITX3, RET, and DRD2 topped the list of 96 genes from HG-U133A with higher expression in VT, validating the experimental set-up. In addition, 28 probes from HG-U133B were identified, most of which are annotated to UniGene clusters with no gene associated or to genes of unknown function. Of these, the fifteen most regulated transcripts, representing changes down to 56%, could be verified by quantitative real-time PCR (Q-PCR) on a developmental series of subdissected human embryonic and fetal brain material, resulting in not only a regional but also a temporal expression profile. This revealed a distinct DA-associated profile, in particular for a putative transcription factor (FLJ45455) and the uncharacterized transmembrane proteins KIAA1145 and SLC10A4. The data presented here may help to devise cell replacement and regenerative therapies for Parkinson's disease (PD).
Proteins | 2014
Henrik Marcus Geertz-Hansen; Nikolaj Blom; Adam M. Feist; Søren Brunak; Thomas Nordahl Petersen
Obtaining optimal cofactor balance to drive production is a challenge in metabolically engineered microbial production strains. To facilitate identification of heterologous enzymes with desirable altered cofactor requirements from native content, we have developed Cofactory, a method for prediction of enzyme cofactor specificity using only primary amino acid sequence information. The algorithm identifies potential cofactor binding Rossmann folds and predicts the specificity for the cofactors FAD(H2), NAD(H), and NADP(H). The Rossmann fold sequence search is carried out using hidden Markov models whereas artificial neural networks are used for specificity prediction. Training was carried out using experimental data from protein–cofactor structure complexes. The overall performance was benchmarked against an independent evaluation set obtaining Matthews correlation coefficients of 0.94, 0.79, and 0.65 for FAD(H2), NAD(H), and NADP(H), respectively. The Cofactory method is made publicly available at http://www.cbs.dtu.dk/services/Cofactory. Proteins 2014; 82:1819–1828.
PLOS ONE | 2017
Thomas Nordahl Petersen; Oksana Lukjancenko; Martin Christen Frølund Thomsen; Maria Maddalena Sperotto; Ole Lund; Frank Møller Aarestrup; Thomas Sicheritz-Pontén
An increasing number of species and gene identification studies rely on next generation sequence analysis of either single isolate or metagenomics samples. Several methods are available to perform taxonomic annotations, and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next generation sequence data and perform reference-based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in vitro bacterial mock community sample comprising 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. A sensitivity and precision of 75% were obtained for strain level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets.
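Sensitivity and precision for taxonomy annotations, as reported above, can be computed by comparing the set of expected taxa with the set of accepted annotations. The sketch below is a generic illustration and not part of MGmapper; the function name and the strain labels are hypothetical.

```python
def sensitivity_precision(expected, predicted):
    """Sensitivity and precision of a set of predicted taxa.

    expected:  taxa known to be in the sample (e.g. a mock community)
    predicted: taxa accepted after any post-processing filter
    """
    expected, predicted = set(expected), set(predicted)
    tp = len(expected & predicted)   # correctly reported taxa
    fn = len(expected - predicted)   # taxa present but missed
    fp = len(predicted - expected)   # taxa reported but not present
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Hypothetical strain labels, used only to show the arithmetic.
expected_strains = {"strain_A", "strain_B", "strain_C", "strain_D"}
predicted_strains = {"strain_A", "strain_B", "strain_C", "strain_X"}
print(sensitivity_precision(expected_strains, predicted_strains))
```

In this toy case three of four expected strains are recovered and three of four reported strains are correct, so both measures equal 0.75.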