Torgeir R. Hvidsten
Norwegian University of Life Sciences
Publications
Featured research published by Torgeir R. Hvidsten.
Nature | 2013
Björn Nystedt; Nathaniel R. Street; Anna Wetterbom; Andrea Zuccolo; Yao-Cheng Lin; Douglas G. Scofield; Francesco Vezzi; Nicolas Delhomme; Stefania Giacomello; Andrey Alexeyenko; Riccardo Vicedomini; Kristoffer Sahlin; Ellen Sherwood; Malin Elfstrand; Lydia Gramzow; Kristina Holmberg; Jimmie Hällman; Olivier Keech; Lisa Klasson; Maxim Koriabine; Melis Kucukoglu; Max Käller; Johannes Luthman; Fredrik Lysholm; Totte Niittylä; Åke Olson; Nemanja Rilakovic; Carol Ritland; Josep A. Rosselló; Juliana Stival Sena
Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long-terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding.
Clinical Cancer Research | 2005
Jayne L. Dennis; Torgeir R. Hvidsten; Ernst Wit; Jan Komorowski; Alexandra K. Bell; Ian Downie; Jacqueline Mooney; Caroline Verbeke; Christopher Bellamy; W. Nicol Keith; Karin A. Oien
Purpose: Patients with metastatic adenocarcinoma of unknown origin are a common clinical problem. Knowledge of the primary site is important for their management, but histologically, such tumors appear similar. Better diagnostic markers are needed to enable the assignment of metastases to likely sites of origin on pathologic samples. Experimental Design: Expression profiling of 27 candidate markers was done using tissue microarrays and immunohistochemistry. In the first (training) round, we studied 352 primary adenocarcinomas, from seven main sites (breast, colon, lung, ovary, pancreas, prostate and stomach) and their differential diagnoses. Data were analyzed in Microsoft Access and the Rosetta system, and used to develop a classification scheme. In the second (validation) round, we studied 100 primary adenocarcinomas and 30 paired metastases. Results: In the first round, we generated expression profiles for all 27 candidate markers in each of the seven main primary sites. Data analysis led to a simplified diagnostic panel and decision tree containing 10 markers only: CA125, CDX2, cytokeratins 7 and 20, estrogen receptor, gross cystic disease fluid protein 15, lysozyme, mesothelin, prostate-specific antigen, and thyroid transcription factor 1. Applying the panel and tree to the original data provided correct classification in 88%. The 10 markers and diagnostic algorithm were then tested in a second, independent, set of primary and metastatic tumors and again 88% were correctly classified. Conclusions: This classification scheme should enable better prediction on biopsy material of the primary site in patients with metastatic adenocarcinoma of unknown origin, leading to improved management and therapy.
Nature | 2016
Sigbjørn Lien; Ben F. Koop; Simen Rød Sandve; Jason R. Miller; Matthew Kent; Torfinn Nome; Torgeir R. Hvidsten; Jong Leong; David R. Minkley; Aleksey V. Zimin; Fabian Grammes; Harald Grove; Arne B. Gjuvsland; Brian Walenz; Russell A. Hermansen; Kristian R. von Schalburg; Eric B. Rondeau; Alex Di Genova; Jeevan Karloss Antony Samy; Jon Olav Vik; Magnus Dehli Vigeland; Lis Caler; Unni Grimholt; Sissel Jentoft; Dag Inge Våge; Pieter J. de Jong; Thomas Moen; Matthew Baranski; Yniv Palti; Douglas W. Smith
The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.
Science | 2014
Matthias Pfeifer; Karl G. Kugler; Simen Rød Sandve; Bujie Zhan; Heidi Rudi; Torgeir R. Hvidsten; Klaus F. X. Mayer; Odd-Arne Olsen
Allohexaploid bread wheat (Triticum aestivum L.) provides approximately 20% of calories consumed by humans. Lack of genome sequence for the three homeologous and highly similar bread wheat genomes (A, B, and D) has impeded expression analysis of the grain transcriptome. We used previously unknown genome information to analyze the cell type–specific expression of homeologous genes in the developing wheat grain and identified distinct co-expression clusters reflecting the spatiotemporal progression during endosperm development. We observed no global but cell type– and stage-dependent genome dominance, organization of the wheat genome into transcriptionally active chromosomal regions, and asymmetric expression in gene families related to baking quality. Our findings give insight into the transcriptional dynamics and genome interplay among individual grain cell types in a polyploid cereal genome.
Bioinformatics | 2003
Torgeir R. Hvidsten; Astrid Lægreid; Jan Komorowski
MOTIVATION Microarray technology enables large-scale inference of the participation of genes in biological process from similar expression profiles. Our aim is to induce classificatory models from expression data and biological knowledge that can automatically associate genes with novel hypotheses of biological process. RESULTS We report a systematic supervised learning approach to predicting biological process from time series of gene expression data and biological knowledge. Biological knowledge is expressed using gene ontology and this knowledge is associated with discriminatory expression-based features to form minimal decision rules. The resulting rule model is first evaluated on genes coding for proteins with known biological process roles using cross validation. Then it is used to generate hypotheses for genes for which no knowledge of participation in biological process could be found. The theoretical foundation for the methodology based on rough sets is outlined in the paper, and its practical application demonstrated on a data set previously published by Cho et al. (Nat. Genet., 27, 48-54, 2001). AVAILABILITY The Rosetta system is available at http://www.idi.ntnu.no/~aleks/rosetta. SUPPLEMENTARY INFORMATION http://www.lcb.uu.se/~hvidsten/bioinf_cho/
Pacific Symposium on Biocomputing | 2000
Torgeir R. Hvidsten; Jan Komorowski; Arne K. Sandvik; Astrid Lægreid
We introduce a methodology for inducing predictive rule models for functional classification of gene expressions from microarray hybridisation experiments. The basic learning method is the rough set framework for rule induction. The methodology is different from the commonly used unsupervised clustering approaches in that it exploits background knowledge of gene function in a supervised manner. Genes are annotated using Ashburner's Gene Ontology and the functional classes used for learning are mined from these annotations. From the original expression data, we extract a set of biologically meaningful features that are used for learning. A rule model is induced from the data described in terms of these features. Its predictive quality is fine-tuned via cross-validation on subsets of the known genes prior to classification of unknown genes. The predictive and descriptive quality of such a rule model is demonstrated on the fibroblast serum response data previously analysed by Iyer et al. Our analysis shows that the rules are capable of representing the complex relationship between gene expressions and function, and that it is possible to put forward high quality hypotheses about the function of unknown genes.
Proceedings of the National Academy of Sciences of the United States of America | 2011
Kyoko Baba; Anna Karlberg; Julien Schmidt; Jarmo Schrader; Torgeir R. Hvidsten; László Bakó; Rishikesh P. Bhalerao
The molecular basis of short-day–induced growth cessation and dormancy in the meristems of perennial plants (e.g., forest trees growing in temperate and high-latitude regions) is poorly understood. Using global transcript profiling, we show distinct stage-specific alterations in auxin responsiveness of the transcriptome in the stem tissues during short-day–induced growth cessation and both the transition to and establishment of dormancy in the cambial meristem of hybrid aspen trees. This stage-specific modulation of auxin signaling appears to be controlled via distinct mechanisms. Whereas the induction of growth cessation in the cambium could involve induction of repressor auxin response factors (ARFs) and down-regulation of activator ARFs, dormancy is associated with perturbation of the activity of the SKP-Cullin-F-boxTIR (SCFTIR) complex, leading to potential stabilization of repressor auxin (AUX)/indole-3-acetic acid (IAA) proteins. Although the role of hormones, such as abscisic acid (ABA) and gibberellic acid (GA), in growth cessation and dormancy is well established, our data now implicate auxin in this process. Importantly, in contrast to most developmental processes in which regulation by auxin involves changes in cellular auxin contents, day-length–regulated induction of cambial growth cessation and dormancy involves changes in auxin responses rather than auxin content.
New Phytologist | 2015
David Sundell; Chanaka Mannapperuma; Sergiu Netotea; Nicolas Delhomme; Yao-Cheng Lin; Andreas Sjödin; Yves Van de Peer; Stefan Jansson; Torgeir R. Hvidsten; Nathaniel R. Street
Accessing and exploring large-scale genomics data sets remains a significant challenge to researchers without specialist bioinformatics training. We present the integrated PlantGenIE.org platform for exploration of Populus, conifer and Arabidopsis genomics data, which includes expression networks and associated visualization tools. Standard features of a model organism database are provided, including genome browsers, gene list annotation, Blast homology searches and gene information pages. Community annotation updating is supported via integration of WebApollo. We have produced an RNA-sequencing (RNA-Seq) expression atlas for Populus tremula and have integrated these data within the expression tools. An updated version of the ComPlEx resource for performing comparative plant expression analyses of gene coexpression network conservation between species has also been integrated. The PlantGenIE.org platform provides intuitive access to large-scale and genome-wide genomics data from model forest tree species, facilitating both community contributions to annotation improvement and tools supporting use of the included data resources to inform biological insight.
Proteins | 2006
Helena Strömbergsson; Andriy Kryshtafovych; Peteris Prusis; Krzysztof Fidelis; Jarl E. S. Wikberg; Jan Komorowski; Torgeir R. Hvidsten
Modeling and understanding protein–ligand interactions is one of the most important goals in computational drug discovery. To this end, proteochemometrics uses structural and chemical descriptors from several proteins and several ligands to induce interaction‐models. Here, we present a new and generalized approach in which proteins varying greatly in terms of sequence and structure are represented by a library of local substructures. Using linear regression and rule‐based learning, we combine such local substructures with chemical descriptors from the ligands to model binding affinity for a training set of hydrolase and lyase enzymes. We evaluate the predictive performance of these models using cross validation and sets of unseen ligands with unknown three‐dimensional structure. The models are shown to generalize by outperforming models using descriptors from only proteins or only ligands, or models using global structure similarities rather than local similarities. Thus, we demonstrate that this approach is capable of describing dependencies between local structural properties and ligands in otherwise dissimilar protein structures. These dependencies are often, but not always, associated with local substructures that are in contact with the ligands. Finally, we show that strongly bound enzyme–ligand complexes require the presence of particular local substructures, while weakly bound complexes may be described by the absence of certain properties. The results demonstrate that the alignment‐independent approach using local substructures is capable of describing protein–ligand interaction for largely different proteins and hence opens the way to proteochemometric analysis of the interaction space of entire proteomes. Current approaches are limited to families of closely related proteins.
Bioinformatics | 2008
Robin Andersson; Carl E.G. Bruder; Arkadiusz Piotrowski; Uwe Menzel; Helena Nord; Johanna Sandgren; Torgeir R. Hvidsten; Teresita Diaz de Ståhl; Jan P. Dumanski; Jan Komorowski
MOTIVATION Copy number profiling methods aim at assigning DNA copy numbers to chromosomal regions using measurements from microarray-based comparative genomic hybridizations. Among the proposed methods to this end, Hidden Markov Model (HMM)-based approaches seem promising since DNA copy number transitions are naturally captured in the model. Current discrete-index HMM-based approaches do not, however, take into account heterogeneous information regarding the genomic overlap between clones. Moreover, the majority of existing methods are restricted to chromosome-wise analysis. RESULTS We introduce a novel Segmental Maximum A Posteriori approach, SMAP, for DNA copy number profiling. Our method is based on discrete-index Hidden Markov Modeling and incorporates genomic distance and overlap between clones. We exploit a priori information through user-controllable parameterization that enables the identification of copy number deviations of various lengths and amplitudes. The model parameters may be inferred at a genome-wide scale to avoid overfitting of model parameters often resulting from chromosome-wise model inference. We report superior performances of SMAP on synthetic data when compared with two recent methods. When applied on our new experimental data, SMAP readily recognizes already known genetic aberrations including both large-scale regions with aberrant DNA copy number and changes affecting only single features on the array. We highlight the differences between the prediction of SMAP and the compared methods and show that SMAP accurately determines copy number changes and benefits from overlap consideration.
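As a rough illustration of the general idea behind discrete-index HMM copy number profiling, the sketch below decodes the most probable state path with the standard Viterbi algorithm. It is not SMAP: it ignores genomic distance and clone overlap, and the state names, Gaussian emission means, standard deviation, and transition probabilities are all hypothetical values chosen for the example.

```python
import math


def viterbi(observations, states, log_start, log_trans, emission_logpdf):
    """Return the most probable (maximum a posteriori) state path."""
    if not observations:
        return []
    # score[s]: best log-probability of any path that ends in state s
    score = {s: log_start[s] + emission_logpdf(s, observations[0]) for s in states}
    backpointers = []
    for x in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor state for arriving in s at this position
            best_prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = (score[best_prev] + log_trans[best_prev][s]
                            + emission_logpdf(s, x))
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def gaussian_logpdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


# Hypothetical model: three copy number states with Gaussian log2-ratio emissions.
STATES = ["loss", "normal", "gain"]
MEANS = {"loss": -0.5, "normal": 0.0, "gain": 0.4}  # assumed, not from the paper
SD = 0.15                                           # assumed measurement noise
STAY = 0.9                                          # assumed self-transition probability

LOG_START = {s: math.log(1 / 3) for s in STATES}
LOG_TRANS = {
    p: {s: math.log(STAY if p == s else (1 - STAY) / 2) for s in STATES}
    for p in STATES
}


def emission(state, x):
    return gaussian_logpdf(x, MEANS[state], SD)


def call_copy_number(log_ratios):
    """Assign a copy number state to each clone along a chromosome."""
    return viterbi(log_ratios, STATES, LOG_START, LOG_TRANS, emission)


if __name__ == "__main__":
    ratios = [0.02, -0.05, 0.01, -0.48, -0.55, -0.51, 0.03, 0.0]
    print(call_copy_number(ratios))
```

With a high self-transition probability the decoder favors contiguous segments, so an isolated noisy measurement is less likely to be called as an aberration than a run of consistent deviations. A distance-aware model would instead make the transition probabilities depend on the spacing or overlap between neighboring clones.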