Janita Thusberg
Buck Institute for Research on Aging
Publication
Featured research published by Janita Thusberg.
Human Mutation | 2011
Janita Thusberg; Ayodeji Olatubosun; Mauno Vihinen
Single nucleotide polymorphisms (SNPs) are the most common form of genetic variation in humans. The number of SNPs identified in the human genome is growing rapidly, but attaining experimental knowledge about the possible disease association of variants is laborious and time‐consuming. Several computational methods have been developed for the classification of SNPs according to their predicted pathogenicity. In this study, we have evaluated the performance of nine widely used pathogenicity prediction methods available on the Internet. The evaluated methods were MutPred, nsSNPAnalyzer, Panther, PhD‐SNP, PolyPhen, PolyPhen2, SIFT, SNAP, and SNPs&GO. The methods were tested with a set of over 40,000 pathogenic and neutral variants. We also assessed whether the type of original or substituting amino acid residue, the structural class of the protein, or the structural environment of the amino acid substitution, had an effect on the prediction performance. The performances of the programs ranged from poor (MCC 0.19) to reasonably good (MCC 0.65), and the results from the programs correlated poorly. The overall best performing methods in this study were SNPs&GO and MutPred, with accuracies reaching 0.82 and 0.81, respectively. Hum Mutat 32:1–11, 2011.
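The abstract above reports performance as accuracy and as the Matthews correlation coefficient (MCC). As a minimal sketch of how these two measures are computed from a binary confusion matrix, the following may help; the function names and the counts are hypothetical and are not taken from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all variants that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only (not data from the paper):
# tp = pathogenic predicted pathogenic, tn = neutral predicted neutral, etc.
tp, tn, fp, fn = 80, 70, 30, 20
print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.2f}")
```

MCC uses all four cells of the confusion matrix, so it is less easily inflated by class imbalance than accuracy alone, which is presumably why both figures are quoted.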
Human Mutation | 2009
Janita Thusberg; Mauno Vihinen
Many gene defects are relatively easy to identify experimentally, but obtaining information about the effects of sequence variations and elucidation of the detailed molecular mechanisms of genetic diseases will be among the next major efforts in mutation research. Amino acid substitutions may have diverse effects on protein structure and function; thus, a detailed analysis of the mutations is essential. Experimental study of the molecular effects of mutations is laborious, whereas useful and reliable information about the effects of amino acid substitutions can readily be obtained by theoretical methods. Experimentally defined structures and molecular modeling can be used as a basis for interpretation of the mutations. The effects of missense mutations can be analyzed even when the 3D structure of the protein has not been determined, although structure‐based analyses are more reliable. Structural analyses include studies of the contacts between residues, their implication for the stability of the protein, and the effects of the introduced residues. Investigations of steric and stereochemical consequences of substitutions provide insights on the molecular fit of the introduced residue. Mutations that change the electrostatic surface potential of a protein have wide‐ranging effects. Analyses of the effects of mutations on interactions with ligands and partners have been performed for elucidation of functional mutations. We have employed numerous methods for predicting the effects of amino acid substitutions. We discuss the applicability of these methods in the analysis of genes, proteins, and diseases to reveal protein structure–function relationships, which is essential to gain insights into disease genotype–phenotype correlations. Hum Mutat 0, 1–15, 2009.
Human Mutation | 2012
Ayodeji Olatubosun; Jouni Väliaho; Jani Härkönen; Janita Thusberg; Mauno Vihinen
High‐throughput sequencing data generation demands the development of methods for interpreting the effects of genomic variants. Numerous computational methods have been developed to assess the impact of variations because experimental methods are unable to cope with both the speed and volume of data generation. To harness the strength of currently available predictors, the Pathogenic‐or‐Not‐Pipeline (PON‐P) integrates five predictors to predict the probability that nonsynonymous variations affect protein function and may consequently be disease related. Random forest methodology‐based PON‐P shows consistently improved performance in cross‐validation tests and on independent test sets, providing ternary classification and statistical reliability estimate of results. Applied to missense variants in a melanoma cancer cell line, PON‐P predicts variants in 17 genes to affect protein function. Previous studies implicate nine of these genes in the pathogenesis of various forms of cancer. PON‐P may thus be used as a first step in screening and prioritizing variants to determine deleterious ones for further experimentation. Hum Mutat 33:1166–1174, 2012.
American Journal of Human Genetics | 2012
Christine Ackerman; Adam E. Locke; Eleanor Feingold; Benjamin Reshey; Karina Espana; Janita Thusberg; Sean D. Mooney; Lora J. H. Bean; Kenneth J. Dooley; Clifford L. Cua; Roger H. Reeves; Stephanie L. Sherman; Cheryl L. Maslen
About half of people with trisomy 21 have a congenital heart defect (CHD), whereas the remainder have a structurally normal heart, demonstrating that trisomy 21 is a significant risk factor but is not causal for abnormal heart development. Atrioventricular septal defects (AVSD) are the most commonly occurring heart defects in Down syndrome (DS), and ∼65% of all AVSD is associated with DS. We used a candidate-gene approach among individuals with DS and complete AVSD (cases = 141) and DS with no CHD (controls = 141) to determine whether rare genetic variants in genes involved in atrioventricular valvuloseptal morphogenesis contribute to AVSD in this sensitized population. We found a significant excess (p < 0.0001) of variants predicted to be deleterious in cases compared to controls. At the most stringent level of filtering, we found potentially damaging variants in nearly 20% of cases but fewer than 3% of controls. The variants with the highest probability of being damaging in cases only were found in six genes: COL6A1, COL6A2, CRELD1, FBLN2, FRZB, and GATA5. Several of the case-specific variants were recurrent in unrelated individuals, occurring in 10% of cases studied. No variants with an equal probability of being damaging were found in controls, demonstrating a highly specific association with AVSD. Of note, all of these genes are in the VEGF-A pathway, even though the candidate genes analyzed in this study represented numerous biochemical and developmental pathways, suggesting that rare variants in the VEGF-A pathway might contribute to the genetic underpinnings of AVSD in humans.
Proteins | 2008
Ilkka Lappalainen; Janita Thusberg; Bairong Shen; Mauno Vihinen
The authors have made a genome‐wide analysis of mutations in Src homology 2 (SH2) domains associated with human disease. Disease‐causing mutations have been detected in the SH2 domains of cytoplasmic signaling proteins Bruton tyrosine kinase (BTK), SH2D1A, Ras GTPase activating protein (RasGAP), ZAP‐70, SHP‐2, STAT1, STAT5B, and the p85α subunit of the PIP3. Mutations in the BTK, SH2D1A, ZAP70, STAT1, and STAT5B genes have been shown to cause diverse immunodeficiencies, whereas the mutations in RASA1 and PIK3R1 genes lead to basal carcinoma and diabetes, respectively. PTPN11 mutations cause Noonan syndrome and different types of cancer, depending mainly on whether the mutation is inherited or sporadic. We collected and analyzed all known pathogenic mutations affecting human SH2 domains by bioinformatics methods. Among the investigated protein properties are sequence conservation and covariance, structural stability, side chain rotamers, packing effects, surface electrostatics, hydrogen bond formation, accessible surface area, salt bridges, and residue contacts. The majority of the mutations affect positions essential for phosphotyrosine ligand binding and specificity. The structural basis of the SH2 domain diseases was elucidated based on the bioinformatic analysis. Proteins 2008.
BMC Genomics | 2014
Biao Li; Chet Seligman; Janita Thusberg; Jackson L Miller; Jim Auer; Michelle Whirl-Carrillo; Emidio Capriotti; Teri E. Klein; Sean D. Mooney
Background: Missense pharmacogenomic (PGx) variants refer to amino acid substitutions that potentially affect the pharmacokinetic (PK) or pharmacodynamic (PD) response to drug therapies. The PGx variants, as compared to disease-associated variants, have not been investigated as deeply. The ability to computationally predict future PGx variants is desirable; however, it is not clear what data sets should be used or what features are beneficial to this end. Hence we carried out a comparative characterization of PGx variants with annotated neutral and disease variants from UniProt, to test the predictive power of sequence conservation and structural information in discriminating these three groups. Results: 126 PGx variants of high quality from PharmGKB were selected and two data sets were created: one set contained 416 variants with structural and sequence information, and the other set contained 1,265 variants with sequence information only. In terms of sequence conservation, PGx variants are more conserved than neutral variants and much less conserved than disease variants. A weighted random forest was used to strike a more balanced classification for PGx variants. Generally, structural features are helpful in discriminating PGx variants from the other two groups, but classification of PGx variants from neutral polymorphisms is still much less effective than classification between disease and neutral variants. Conclusions: We found that PGx variants are much more similar to neutral variants than to disease variants in the feature space consisting of residue conservation, neighboring residue conservation, number of neighbors, and protein solvent accessibility. Such similarity poses great difficulty in the classification of PGx variants and polymorphisms.
Thrombosis Journal | 2008
Yue-Mei Fan; Pekka J. Karhunen; Mari Levula; Erkki Ilveskoski; Jussi Mikkelsson; Olli A. Kajander; Otso Järvinen; Niku Oksala; Janita Thusberg; Mauno Vihinen; Juha-Pekka Salenius; Leena Kytömäki; Juhani T. Soini; Reijo Laaksonen; Terho Lehtimäki
Background: Disturbed cellular cholesterol homeostasis may lead to accumulation of cholesterol in human atheroma plaques. Cellular cholesterol homeostasis is controlled by the sterol regulatory element-binding transcription factor 2 (SREBF-2) and the SREBF cleavage-activating protein (SCAP). We investigated whole genome expression in a series of human atherosclerotic samples from different vascular territories and studied whether the non-synonymous coding variants in the interacting domains of two genes, SREBF-2 1784G>C (rs2228314) and SCAP 2386A>G, are related to the progression of coronary atherosclerosis and the risk of pre-hospital sudden cardiac death (SCD). Methods: Whole genome expression profiling was completed in twenty vascular samples from carotid, aortic and femoral atherosclerotic plaques and six control samples from internal mammary arteries. Three hundred sudden pre-hospital deaths of middle-aged (33–69 years) Caucasian Finnish men were subjected to detailed autopsy in the Helsinki Sudden Death Study. Coronary narrowing and areas of coronary wall covered with fatty streaks or fibrotic, calcified or complicated lesions were measured and related to the SREBF-2 and SCAP genotypes. Results: Whole genome expression profiling showed a significant (p = 0.02) down-regulation of SREBF-2 in atherosclerotic carotid plaques (types IV-V), but not in the aorta or femoral arteries (p = NS for both), as compared with the histologically confirmed non-atherosclerotic tissues. In logistic regression analysis, a significant interaction between the SREBF-2 1784G>C and the SCAP 2386A>G genotype was observed on the risk of SCD (p = 0.046). Men with the SREBF-2 C allele and the SCAP G allele had a significantly increased risk of SCD (OR 2.68, 95% CI 1.07–6.71), compared to SCAP AA homologous subjects carrying the SREBF-2 C allele. Furthermore, similar trends for having complicated lesions and for the occurrence of thrombosis were found, although the results were not statistically significant. Conclusion: The results suggest that the allelic variants (SREBF-2 1784G>C and SCAP 2386A>G) in the cholesterol homeostasis regulating SREBF-SCAP pathway may contribute to SCD in early middle-aged men.
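The reported odds ratio (OR 2.68, 95% CI 1.07–6.71) can be sanity-checked with a short calculation. The sketch below assumes the interval is a Wald-type interval that is symmetric on the log scale with z = 1.96, which is usual for logistic regression output but is not stated in the abstract; the function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Recover the point estimate and SE of log(OR) implied by a CI.

    Assumes a Wald interval: exp(log(OR) +/- z * SE). Under that
    assumption the point estimate is the geometric mean of the bounds.
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    point = math.exp((log_lo + log_hi) / 2)
    se = (log_hi - log_lo) / (2 * z)
    return point, se


# Bounds quoted in the abstract above.
point, se = log_scale_summary(1.07, 6.71)
print(f"implied OR = {point:.2f}, implied SE of log(OR) = {se:.2f}")
```

Under this assumption the implied point estimate agrees with the reported 2.68 to two decimals, so the interval is consistent with a log-scale Wald construction. The lower bound close to 1 also matches the borderline interaction p-value of 0.046.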
The Scientific World Journal | 2013
Serena Catarzi; Anna Caciotti; Janita Thusberg; Rodolfo Tonin; Sabrina Malvagia; Giancarlo la Marca; Elisabetta Pasquini; Catia Cavicchi; Lorenzo Ferri; Maria Anna Donati; Federico Baronio; Renzo Guerrini; Sean D. Mooney; Amelia Morrone
Medium-chain acyl-CoA dehydrogenase deficiency (MCADD) is a disorder of fatty acid oxidation characterized by hypoglycemic crisis under fasting or during stress conditions, leading to lethargy, seizures, brain damage, or even death. Biochemical acylcarnitines data obtained through newborn screening by liquid chromatography-tandem mass spectrometry (LC-MS/MS) were confirmed by molecular analysis of the medium-chain acyl-CoA dehydrogenase (ACADM) gene. Out of 324,000 newborns screened, we identified 14 MCADD patients, in whom, by molecular analysis, we found a new nonsense c.823G>T (p.Gly275*) and two new missense mutations: c.253G>C (p.Gly85Arg) and c.356T>A (p.Val119Asp). Bioinformatics predictions based on both phylogenetic conservation and functional/structural software were used to characterize the newly identified variants. Our findings confirm the rising incidence of MCADD, whose existence is increasingly recognized due to the efficacy of an expanded newborn screening panel by LC-MS/MS, making possible early specific therapies that can prevent possible crises in at-risk infants. We noticed that the “common” p.Lys329Glu mutation only accounted for 32% of the defective alleles, while, in clinically diagnosed patients, this mutation accounted for 90% of defective alleles. Unclassified variants (UVs or VUSs) are especially critical when considering screening programs. The functional and pathogenic characterization of genetic variants presented here is required to predict their medical consequences in newborns.
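The coding-DNA and protein descriptions of the three new ACADM variants above are linked by simple codon arithmetic. As an illustrative sketch (the helper name is hypothetical, and it assumes plain coding-sequence positions numbered from the A of the ATG start codon, with no intronic offsets), each c. position can be mapped to its codon number and its position within that codon:

```python
def codon_of(c_pos):
    """Map a 1-based coding-DNA position to (codon number, base within codon).

    Assumes c_pos counts from the A of the ATG start codon, so
    positions 1-3 form codon 1, positions 4-6 form codon 2, and so on.
    """
    if c_pos < 1:
        raise ValueError("coding position must be >= 1")
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1


# Variants as written in the abstract above.
variants = [
    (823, "c.823G>T", "p.Gly275*"),
    (253, "c.253G>C", "p.Gly85Arg"),
    (356, "c.356T>A", "p.Val119Asp"),
]
for c_pos, cdna, protein in variants:
    codon, base = codon_of(c_pos)
    print(f"{cdna}: codon {codon}, base {base} of 3 ({protein})")
```

The computed codon numbers (275, 85, and 119) match the residue numbers in the protein-level names. The substitutions are also consistent with the standard genetic code: a G>C change at the first base of a glycine codon (GGN) gives an arginine codon (CGN), and a T>A change at the second base of a valine codon (GTN) gives GAN, which encodes aspartate when the third base is T or C.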
Human Mutation | 2017
Binghuang Cai; Biao Li; Nikki Kiga; Janita Thusberg; Timothy Bergquist; Yun-Ching Chen; Noushin Niknafs; Hannah Carter; Collin Tokheim; Violeta Beleva-Guthrie; Christopher Douville; Rohit Bhattacharya; Hui Ting Grace Yeo; Jean Fan; Sohini Sengupta; Dewey Kim; Melissa S. Cline; Tychele N. Turner; Mark Diekhans; Jan Zaucha; Lipika R. Pal; Chen Cao; Chen-Hsin Yu; Yizhou Yin; Marco Carraro; Manuel Giollo; Carlo Ferrari; Emanuela Leonardi; Jason Bobe; Madeleine Ball
The advent of next‐generation sequencing has dramatically decreased the cost for whole‐genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.
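ROC AUC, one of the assessment measures named above, can be read as the probability that a randomly chosen individual with the trait receives a higher predicted score than a randomly chosen individual without it. The following is a minimal sketch of that pairwise definition; the function name and the example scores are hypothetical, and it is not the challenge's actual assessment code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for individuals with the trait, 0 for those without.
    scores: predicted probabilities, where higher means more likely.
    Tied positive/negative pairs count as half a correct ranking.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four individuals (illustration only).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise form is quadratic in the number of individuals, which is fine for small cohorts; a rank-based computation would be preferable for large ones.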
International Conference on Bioinformatics | 2007
Miika Ahdesmäki; Janita Thusberg; Heikki Huttunen; Mauno Vihinen; Olli Yli-Harja
This paper presents a simple way of predicting locations in protein sequences that are prone to disease-causing mutations. These locations are found with the help of outlier analysis of recurrence quantification analysis (RQA) applied to protein solvent accessibility measurements. The detected locations are related to the deterministic patterns of the protein sequences. These deterministic patterns may be related to binding and folding and thus changes in these locations can cause disease by altering the structural and/or functional properties of the protein.