Publication


Featured research published by Thomas A. Peterson.


PLOS Pathogens | 2013

Crosstalk between the Circadian Clock and Innate Immunity in Arabidopsis

Chong Zhang; Qiguang Xie; Ryan G. Anderson; Gina Ng; Nicholas C. Seitz; Thomas A. Peterson; C. Robertson McClung; John M. McDowell; Dongdong Kong; June M. Kwak; Hua Lu

The circadian clock integrates temporal information with environmental cues in regulating plant development and physiology. Recently, the circadian clock has been shown to affect plant responses to biotic cues. To further examine this role of the circadian clock, we tested disease resistance in mutants disrupted in CCA1 and LHY, which act synergistically to regulate clock activity. We found that cca1 and lhy mutants also synergistically affect basal and resistance gene-mediated defense against Pseudomonas syringae and Hyaloperonospora arabidopsidis. Disruption of the circadian clock caused by overexpression of CCA1 or LHY also resulted in severe susceptibility to P. syringae. We identified a downstream target of CCA1 and LHY, GRP7, a key constituent of a slave oscillator regulated by the circadian clock and previously shown to influence plant defense and stomatal activity. We show that the defense role of CCA1 and LHY against P. syringae is at least partially through circadian control of stomatal aperture but is independent of defense mediated by salicylic acid. Furthermore, we found that defense activation by P. syringae infection and treatment with the elicitor flg22 can feedback-regulate clock activity. Together, these data strongly support a direct role of the circadian clock in defense control and reveal for the first time crosstalk between the circadian clock and plant innate immunity.


Journal of Molecular Biology | 2013

Towards precision medicine: advances in computational approaches for the analysis of human variants.

Thomas A. Peterson; Emily Doughty; Maricel G. Kann

Variations and similarities in our individual genomes are part of our history, our heritage, and our identity. Some human genomic variants are associated with common traits such as hair and eye color, while others are associated with susceptibility to disease or response to drug treatment. Identifying the human variations producing clinically relevant phenotypic changes is critical for providing accurate and personalized diagnosis, prognosis, and treatment for diseases. Furthermore, a better understanding of the molecular underpinning of disease can lead to development of new drug targets for precision medicine. Several resources have been designed for collecting and storing human genomic variations in highly structured, easily accessible databases. Unfortunately, a vast amount of information about these genetic variants and their functional and phenotypic associations is currently buried in the literature, accessible only by manual curation or by sophisticated text-mining technology that extracts the relevant information. In addition, the low cost of sequencing technologies coupled with increasing computational power has enabled the development of numerous computational methodologies to predict the pathogenicity of human variants. This review provides a detailed comparison of current human variant resources, including HGMD, OMIM, ClinVar, and UniProt/Swiss-Prot, followed by an overview of the computational methods and techniques used to leverage the available data to predict novel deleterious variants. We expect these resources and tools to become the foundation for understanding the molecular details of genomic variants leading to disease, which in turn will enable the promise of precision medicine.


Bioinformatics | 2011

Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature

Emily Doughty; Attila Kertesz-Farkas; Olivier Bodenreider; Gary Thompson; Asa Adadey; Thomas A. Peterson; Maricel G. Kann

MOTIVATION: A major goal of biomedical research in personalized medicine is to find relationships between mutations and their corresponding disease phenotypes. However, most of the disease-related mutational data are currently buried in the biomedical literature in textual form and lack the necessary structure to allow easy retrieval and visualization. We introduce a high-throughput computational method for the identification of relevant disease mutations in PubMed abstracts applied to prostate (PCa) and breast cancer (BCa) mutations. RESULTS: We developed the extractor of mutations (EMU) tool to identify mutations and their associated genes. We benchmarked EMU against MutationFinder, a tool to extract point mutations from text. Our results show that both methods achieve comparable performance on two manually curated datasets. We also benchmarked EMU's performance for extracting the complete mutational information and phenotype. Remarkably, we show that one of the steps in our approach, a filter based on sequence analysis, increases the precision for that task from 0.34 to 0.59 (PCa) and from 0.39 to 0.61 (BCa). We also show that this high-throughput approach can be extended to other diseases. DISCUSSION: Our method improves the current status of disease-mutation databases by significantly increasing the number of annotated mutations. We found 51 and 128 mutations manually verified to be related to PCa and BCa, respectively, that are not currently annotated for these cancer types in the OMIM or Swiss-Prot databases. EMU's retrieval performance represents a 2-fold improvement in the number of annotated mutations for PCa and BCa. We further show that our method can benefit from full-text analysis once there is an increase in Open Access availability of full-text articles. AVAILABILITY: Freely available at: http://bioinf.umbc.edu/EMU/ftp.
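As a rough illustration of the kind of pipeline this abstract describes, the sketch below extracts single-letter point mutations (such as V600E) from text with a regular expression, applies a simple sequence-based filter, and computes precision against a curated set. It is not EMU's implementation: the pattern, the function names, and the toy sequence are all assumptions made for illustration, and the filter only checks that the reported wild-type residue agrees with a supplied protein sequence.

```python
import re

# The 20 standard amino acids in single-letter code.
AA = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical pattern: wild-type residue, 1-based position, mutant residue (e.g. "V600E").
POINT_MUTATION = re.compile(rf"\b([{AA}])([1-9]\d*)([{AA}])\b")


def extract_mutations(text):
    """Return (wild_type, position, mutant) tuples found in the text."""
    return [(wt, int(pos), mt) for wt, pos, mt in POINT_MUTATION.findall(text)]


def passes_sequence_filter(mutation, protein_sequence):
    """Keep a mutation only if its wild-type residue matches the sequence at that position."""
    wt, pos, _ = mutation
    return pos <= len(protein_sequence) and protein_sequence[pos - 1] == wt


def precision(predicted, gold):
    """Fraction of predicted items that appear in the curated (gold) set."""
    predicted, gold = set(predicted), set(gold)
    return len(predicted & gold) / len(predicted) if predicted else 0.0


if __name__ == "__main__":
    found = extract_mutations("The BRAF V600E mutation and the T1A variant were reported.")
    print(found)
    toy_sequence = "MKV"  # made-up sequence, for illustration only
    print([m for m in found if passes_sequence_filter(m, toy_sequence)])
```

A filter of this kind can only raise precision when the gene, and therefore the sequence, has been assigned correctly, which is why the sketch takes the sequence as an explicit argument.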


Bioinformatics | 2010

DMDM: Domain Mapping of Disease Mutations

Thomas A. Peterson; Asa Adadey; Ivette Santana-Cruz; Yanan Sun; Andrew Winder; Maricel G. Kann

Domain mapping of disease mutations (DMDM) is a database in which each disease mutation can be displayed by its gene, protein or domain location. DMDM provides a unique domain-level view where all human coding mutations are mapped on the protein domain. To build DMDM, all human proteins were aligned to a database of conserved protein domains using a Hidden Markov Model-based sequence alignment tool (HMMer). The resulting protein-domain alignments were used to provide a domain location for all available human disease mutations and polymorphisms. The number of disease mutations and polymorphisms in each domain position is displayed alongside other relevant functional information (e.g. the binding and catalytic activity of the site and the conservation of that domain location). DMDM's protein domain view highlights molecular relationships among mutations from different diseases that might not be clearly observed with traditional gene-centric visualization tools. AVAILABILITY: Freely available at http://bioinf.umbc.edu/dmdm.


BMC Genomics | 2012

Domain landscapes of somatic mutations in cancer

Nathan L Nehrt; Thomas A. Peterson; DoHwan Park; Maricel G. Kann

Background: Large-scale tumor sequencing projects are now underway to identify genetic mutations that drive tumor initiation and development. Most studies take a gene-based approach to identifying driver mutations, highlighting genes mutated in a large percentage of tumor samples as those likely to contain driver mutations. However, this gene-based approach usually does not consider the position of the mutation within the gene or the functional context the position of the mutation provides. Here we introduce a novel method for mapping mutations to distinct protein domains, not just individual genes, in which they occur, thus providing the functional context for how the mutation contributes to disease. Furthermore, aggregating mutations from all genes containing a specific protein domain enables the identification of mutations that are rare at the gene level, but that occur frequently within the specified domain. These highly mutated domains potentially reveal disruptions of protein function necessary for cancer development. Results: We mapped somatic mutations from the protein coding regions of 100 colon adenocarcinoma tumor samples to the genes and protein domains in which they occurred, and constructed topographical maps to depict the “mutational landscapes” of gene and domain mutation frequencies. We found significant mutation frequency in a number of genes previously known to be somatically mutated in colon cancer patients including APC, TP53 and KRAS. In addition, we found significant mutation frequency within specific domains located in these genes, as well as within other domains contained in genes having low mutation frequencies. These domain “peaks” were enriched with functions important to cancer development including kinase activity, DNA binding and repair, and signal transduction. Conclusions: Using our method to create the domain landscapes of mutations in colon cancer, we were able to identify somatic mutations with high potential to drive cancer development. Interestingly, the majority of the genes involved have a low mutation frequency. Therefore, the method shows good potential for identifying rare driver mutations in current, large-scale tumor sequencing projects. In addition, mapping mutations to specific domains provides the necessary functional context for understanding how the mutations contribute to the disease, and may reveal novel or more refined gene and domain target regions for drug development.
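The aggregation idea in this abstract, pooling mutations from every gene that contains a given domain so that positions rare in any single gene can still stand out, can be sketched as follows. This is a simplified illustration rather than the authors' method: the gene names, the `domain_hits` structure, and the function name are hypothetical, and the domain position is approximated as an offset from the domain start, whereas a real analysis would take it from the alignment to the domain model.

```python
from collections import Counter


def domain_position_counts(mutations, domain_hits):
    """Count mutations per (domain, domain_position), pooled across genes.

    mutations:   iterable of (gene, protein_position) pairs, positions 1-based.
    domain_hits: dict mapping gene -> list of (domain, start, end) in protein coordinates.
    """
    counts = Counter()
    for gene, pos in mutations:
        for domain, start, end in domain_hits.get(gene, []):
            if start <= pos <= end:
                # Simplifying assumption: ungapped alignment, so position = offset + 1.
                counts[(domain, pos - start + 1)] += 1
    return counts


if __name__ == "__main__":
    # Made-up example: two genes sharing one domain family.
    hits = {
        "GENE_A": [("kinase", 10, 50)],
        "GENE_B": [("kinase", 100, 140)],
    }
    muts = [("GENE_A", 20), ("GENE_B", 110), ("GENE_B", 5)]
    print(domain_position_counts(muts, hits))
```

In this toy input each gene carries a single in-domain mutation, yet both fall on the same domain position, so the pooled count is two; the mutation outside any domain is ignored.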


Journal of the American Medical Informatics Association | 2012

Incorporating molecular and functional context into the analysis and prioritization of human variants associated with cancer

Thomas A. Peterson; Nathan L Nehrt; DoHwan Park; Maricel G. Kann

BACKGROUND AND OBJECTIVE: With recent breakthroughs in high-throughput sequencing, identifying deleterious mutations is one of the key challenges for personalized medicine. At the gene and protein level, it has proven difficult to determine the impact of previously unknown variants. A statistical method has been developed to assess the significance of disease mutation clusters on protein domains by incorporating domain functional annotations to assist in the functional characterization of novel variants. METHODS: Disease mutations aggregated from multiple databases were mapped to domains, and were classified as either cancer- or non-cancer-related. The statistical method for identifying significantly disease-associated domain positions was applied to both sets of mutations and to randomly generated mutation sets for comparison. To leverage the known function of protein domain regions, the method optionally distributes significant scores to associated functional feature positions. RESULTS: Most disease mutations are localized within protein domains and display a tendency to cluster at individual domain positions. The method identified significant disease mutation hotspots in both the cancer and non-cancer datasets. The domain significance scores (DS-scores) for cancer form a bimodal distribution with hotspots in oncogenes forming a second peak at higher DS-scores than non-cancer, and hotspots in tumor suppressors have scores more similar to non-cancers. In addition, on an independent mutation benchmarking set, the DS-score method identified mutations known to alter protein function with very high precision. CONCLUSION: By aggregating mutations with known disease association at the domain level, the method was able to discover domain positions enriched with multiple occurrences of deleterious mutations while incorporating relevant functional annotations. The method can be incorporated into translational bioinformatics tools to characterize rare and novel variants within large-scale sequencing studies.


BMC Genomics | 2013

A protein domain-centric approach for the comparative analysis of human and yeast phenotypically relevant mutations.

Thomas A. Peterson; DoHwan Park; Maricel G. Kann

Background: The body of disease mutations with known phenotypic relevance continues to increase and is expected to do so even faster with the advent of new experimental techniques such as whole-genome sequencing coupled with disease association studies. However, genomic association studies are limited by the molecular complexity of the phenotype being studied and the population size needed to have adequate statistical power. One way to circumvent this problem, which is critical for the study of rare diseases, is to study the molecular patterns emerging from functional studies of existing disease mutations. Current gene-centric analyses to study mutations in coding regions are limited by their inability to account for the functional modularity of the protein. Previous studies of the functional patterns of known human disease mutations have shown a significant tendency to cluster at protein domain positions, namely position-based domain hotspots of disease mutations. However, the limited number of known disease mutations remains the main factor hindering the advancement of mutation studies at a functional level. In this paper, we address this problem by incorporating mutations known to be disruptive of phenotypes in other species. Focusing on two evolutionarily distant organisms, human and yeast, we describe the first inter-species analysis of mutations of phenotypic relevance at the protein domain level. Results: The results of this analysis reveal that phenotypic mutations from yeast cluster at specific positions on protein domains, a characteristic previously revealed to be displayed by human disease mutations. We found over one hundred domain hotspots in yeast with approximately 50% in the exact same domain position as known human disease mutations. Conclusions: We describe an analysis using protein domains as a framework for transferring functional information by studying domain hotspots in human and yeast and relating phenotypic changes in yeast to diseases in human. This first-of-a-kind study of phenotypically relevant yeast mutations in relation to human disease mutations demonstrates the utility of a multi-species analysis for advancing the understanding of the relationship between genetic mutations and phenotypic changes at the organismal level.


Human Mutation | 2016

Regulatory Single-Nucleotide Variant Predictor Increases Predictive Performance of Functional Regulatory Variants.

Thomas A. Peterson; Matthew Mort; David Neil Cooper; Predrag Radivojac; Maricel G. Kann; Sean D. Mooney

In silico methods for detecting functionally relevant genetic variants are important for identifying genetic markers of human inherited disease. Much research has focused on protein‐coding variants since coding regions have well‐defined physicochemical and functional properties. However, many bioinformatics tools are not applicable to variants outside coding regions. Here, we increase the classification performance of our regulatory single‐nucleotide variant predictor (RSVP) for variants that cause regulatory abnormalities from an AUC of 0.90 to 0.97 by incorporating genomic regions identified by the ENCODE project into RSVP. RSVP is comparable to a recently published tool, Genome‐Wide Annotation of Variants (GWAVA); both RSVP and GWAVA perform better on regulatory variants than a traditional variant predictor, combined annotation‐dependent depletion (CADD). However, our method outperforms GWAVA on variants located at similar distances to the transcription start site as the positive set (AUC: 0.96, compared with 0.71 for GWAVA). Much of this disparity is due to RSVP's incorporation of features pertaining to the nearest gene (expression, GO terms, etc.), which are not included in GWAVA. Our findings hold out the promise of a framework for the assessment of all functional regulatory variants, providing a means to predict which rare or de novo variants are of pathogenic significance.


PLOS Computational Biology | 2017

Oncodomains: A protein domain-centric framework for analyzing rare variants in tumor samples

Thomas A. Peterson; Iris Ivy M. Gauran; Junyong Park; DoHwan Park; Maricel G. Kann; Marco Punta

The fight against cancer is hindered by its highly heterogeneous nature. Genome-wide sequencing studies have shown that individual malignancies contain many mutations that range from those commonly found in tumor genomes to rare somatic variants present only in a small fraction of lesions. Such rare somatic variants dominate the landscape of genomic mutations in cancer, yet efforts to correlate somatic mutations found in one or few individuals with functional roles have been largely unsuccessful. Traditional methods for identifying somatic variants that drive cancer are ‘gene-centric’ in that they consider only somatic variants within a particular gene and make no comparison to other similar genes in the same family that may play a similar role in cancer. In this work, we present oncodomain hotspots, a new ‘domain-centric’ method for identifying clusters of somatic mutations across entire gene families using protein domain models. Our analysis confirms that our approach creates a framework for leveraging structural and functional information encapsulated by protein domains into the analysis of somatic variants in cancer, enabling the assessment of even rare somatic variants by comparison to similar genes. Our results reveal a vast landscape of somatic variants that act at the level of domain families altering pathways known to be involved with cancer such as protein phosphorylation, signaling, gene regulation, and cell metabolism. Due to oncodomain hotspots’ unique ability to assess rare variants, we expect our method to become an important tool for the analysis of sequenced tumor genomes, complementing existing methods.


Biometrics | 2018

Empirical null estimation using zero‐inflated discrete mixture distributions and its application to protein domain data

Iris Ivy M. Gauran; Junyong Park; Johan Lim; DoHwan Park; John Zylstra; Thomas A. Peterson; Maricel G. Kann; John L. Spouge

In recent mutation studies, analyses based on protein domain positions are gaining popularity over gene-centric approaches since the latter have limitations in considering the functional context that the position of the mutation provides. This presents a large-scale simultaneous inference problem, with hundreds of hypothesis tests to consider at the same time. This article aims to select significant mutation counts while controlling a given level of Type I error via False Discovery Rate (FDR) procedures. One main assumption is that the mutation counts follow a zero-inflated model in order to account for the true zeros in the count model and the excess zeros. The class of models considered is the Zero-inflated Generalized Poisson (ZIGP) distribution. Furthermore, we assume that there exists a cut-off value such that counts smaller than this value are generated from the null distribution. We present several data-dependent methods to determine the cut-off value. We also consider a two-stage procedure based on a screening process so that mutation counts exceeding a certain value are considered significant. Simulated and protein domain data sets are used to illustrate this procedure in estimation of the empirical null using a mixture of discrete distributions. Overall, while maintaining control of the FDR, the proposed two-stage testing procedure has superior empirical power.
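For readers unfamiliar with the ZIGP family named in this abstract, the sketch below evaluates a zero-inflated generalized Poisson probability mass function. It assumes Consul's standard parameterization of the generalized Poisson, P(X = x) = θ(θ + λx)^(x−1) e^(−θ−λx) / x! with θ > 0 and 0 ≤ λ < 1, mixed with a point mass at zero of weight π; the paper's exact parameterization may differ, and the function names are hypothetical.

```python
import math


def generalized_poisson_pmf(x, theta, lam):
    """Generalized Poisson pmf (assumed Consul form), computed in log space for stability."""
    if x < 0:
        return 0.0
    log_p = (
        math.log(theta)
        + (x - 1) * math.log(theta + lam * x)
        - theta
        - lam * x
        - math.lgamma(x + 1)
    )
    return math.exp(log_p)


def zigp_pmf(x, theta, lam, pi):
    """Zero-inflated generalized Poisson: extra mass pi at zero, weight (1 - pi) elsewhere."""
    base = (1.0 - pi) * generalized_poisson_pmf(x, theta, lam)
    return pi + base if x == 0 else base


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from any data set.
    for count in range(5):
        print(count, zigp_pmf(count, theta=2.0, lam=0.3, pi=0.4))
```

Setting λ = 0 recovers the ordinary zero-inflated Poisson, and π = 0 removes the zero inflation, which gives two quick sanity checks on any implementation.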

Collaboration


Dive into Thomas A. Peterson's collaborations.

Top Co-Authors

DoHwan Park

University of Maryland

Asa Adadey

University of Maryland

John L. Spouge

National Institutes of Health

Johan Lim

Seoul National University
