Publication


Featured research published by Michael Gribskov.


Bioinformatics | 1998

Combining evidence using p-values: application to sequence homology searches

Timothy L. Bailey; Michael Gribskov

MOTIVATION: To illustrate an intuitive and statistically valid method for combining independent sources of evidence that yields a p-value for the complete evidence, and to apply it to the problem of detecting simultaneous matches to multiple patterns in sequence homology searches. RESULTS: In sequence analysis, two or more (approximately) independent measures of the membership of a sequence (or sequence region) in some class are often available. We would like to estimate the likelihood of the sequence being a member of the class in view of all the available evidence. An example is estimating the significance of the observed match of a macromolecular sequence (DNA or protein) to a set of patterns (motifs) that characterize a biological sequence family. An intuitive way to do this is to express each piece of evidence as a p-value, and then use the product of these p-values as the measure of membership in the family. We derive a formula and algorithm (QFAST) for calculating the statistical distribution of the product of n independent p-values. We demonstrate that sorting sequences by this p-value effectively combines the information present in multiple motifs, leading to highly accurate and sensitive sequence homology searches.
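The idea of turning a product of p-values into a single p-value can be illustrated with a short sketch. It uses the standard closed form for the product of n independent Uniform(0,1) variables, P(product <= p) = p * sum over i from 0 to n-1 of (-ln p)^i / i!. The function name combined_p_value is hypothetical, and this is not claimed to reproduce the QFAST algorithm itself, only the distribution it is described as computing.

```python
import math


def combined_p_value(p_values):
    """Return P(product of n independent uniform p-values <= observed product).

    Illustrative sketch only: assumes the p-values are independent and
    uniformly distributed under the null hypothesis.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 <= p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    if any(p == 0.0 for p in p_values):
        return 0.0  # a zero p-value forces the product to zero

    # Work in log space so that many small p-values do not underflow early.
    log_product = sum(math.log(p) for p in p_values)
    x = -log_product  # x = -ln(product), always >= 0

    # Accumulate sum_{i=0}^{n-1} x^i / i! term by term.
    term = 1.0
    total = 1.0
    for i in range(1, len(p_values)):
        term *= x / i
        total += term

    return math.exp(log_product) * total


if __name__ == "__main__":
    # Two motif matches, each individually significant at 0.05.
    print(combined_p_value([0.05, 0.05]))
```

With a single p-value the function returns that value unchanged. With several, the result is larger than the raw product, which is why the raw product cannot itself be read as a p-value.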


Plant Physiology | 2003

The Arabidopsis CDPK-SnRK Superfamily of Protein Kinases

Estelle M. Hrabak; Catherine W.M. Chan; Michael Gribskov; Jeffrey F. Harper; Jung H. Choi; Nigel G. Halford; Jörg Kudla; Sheng Luan; Hugh G. Nimmo; Michael R. Sussman; Martine Thomas; Kay Walker-Simmons; Jian-Kang Zhu; Alice C. Harmon

The CDPK-SnRK superfamily consists of seven types of serine-threonine protein kinases: calcium-dependent protein kinases (CDPKs), CDPK-related kinases (CRKs), phosphoenolpyruvate carboxylase kinases (PPCKs), PEP carboxylase kinase-related kinases (PEPRKs), calmodulin-dependent protein kinases (CaMKs), calcium and calmodulin-dependent protein kinases (CCaMKs), and SnRKs. Within this superfamily, individual isoforms and subfamilies contain distinct regulatory domains, subcellular targeting information, and substrate specificities. Our analysis of the Arabidopsis genome identified 34 CDPKs, eight CRKs, two PPCKs, two PEPRKs, and 38 SnRKs. No definitive examples were found for a CCaMK similar to those previously identified in lily (Lilium longiflorum) and tobacco (Nicotiana tabacum) or for a CaMK similar to those in animals or yeast. CDPKs are present in plants and a specific subgroup of protists, but CRKs, PPCKs, PEPRKs, and two of the SnRK subgroups have been found only in plants. CDPKs and at least one SnRK have been implicated in decoding calcium signals in Arabidopsis. Analysis of intron placements supports the hypothesis that CDPKs, CRKs, PPCKs, and PEPRKs have a common evolutionary origin; however, there are no conserved intron positions between these kinases and the SnRK subgroup. CDPKs and SnRKs are found on all five Arabidopsis chromosomes. The presence of closely related kinases in regions of the genome known to have arisen by genome duplication indicates that these kinases probably arose by divergence from common ancestors. The PlantsP database provides a resource of continuously updated information on protein kinases from Arabidopsis and other plants.


Methods in Enzymology | 1990

[9] Profile analysis

Michael Gribskov; Roland Lüthy; David Eisenberg

Publisher Summary: The profile method provides a convenient way to represent information about groups or families of sequences as well as a means to ask questions about the definition of protein families, the relationships between distantly related proteins, and the presence of sequence or structural motifs in proteins. The observed positions of insertions and deletions in the sequences provide similar structural information. This information is incorporated in the profile, and used to improve the detection of sequence patterns that represent structural motifs. It is clear that certain three-dimensional structural motifs are shared by many proteins. These structural motifs have patterns in their amino acid sequences that permit them to be recognized from the sequences alone. Another use of profile analysis is based on profiles generated from sequences aligned using sequence information alone. In this case, the profile can be considered to be specific for the protein family or superfamily as defined by sequence criteria. A characteristic of profile analysis is that the score for aligning a residue at a given position varies depending on the observed conservation of residues at that position. This may be contrasted with the approach of deriving a single consensus sequence for a family and using it, as a kind of family-specific probe, in standard alignment algorithms.


Plant Physiology | 2003

Genomic Comparison of P-Type ATPase Ion Pumps in Arabidopsis and Rice

Ivan Baxter; Jason Tchieu; Michael R. Sussman; Marc Boutry; Michael G. Palmgren; Michael Gribskov; Jeffrey F. Harper; Kristian B. Axelsen

Members of the P-type ATPase ion pump superfamily are found in all three branches of life. Forty-six P-type ATPase genes were identified in Arabidopsis, the largest number yet identified in any organism. The recent completion of two draft sequences of the rice (Oryza sativa) genome allows for comparison of the full complement of P-type ATPases in two different plant species. Here, we identify a similar number (43) in rice, despite the rice genome being more than three times the size of Arabidopsis. The similarly large families suggest that both dicots and monocots have evolved with a large preexisting repertoire of P-type ATPases. Both Arabidopsis and rice have representative members in all five major subfamilies of P-type ATPases: heavy-metal ATPases (P1B), Ca2+-ATPases (endoplasmic reticulum-type Ca2+-ATPase and autoinhibited Ca2+-ATPase, P2A and P2B), H+-ATPases (autoinhibited H+-ATPase, P3A), putative aminophospholipid ATPases (ALA, P4), and a branch with unknown specificity (P5). The close pairing of similar isoforms in rice and Arabidopsis suggests potential orthologous relationships for all 43 rice P-type ATPases. A phylogenetic comparison of protein sequences and intron positions indicates that the common angiosperm ancestor had at least 23 P-type ATPases. Although little is known about unique and common features of related pumps, clear differences between some members of the calcium pumps indicate that evolutionarily conserved clusters may distinguish pumps with either different subcellular locations or biochemical functions.


Plant Physiology | 2002

The Complement of Protein Phosphatase Catalytic Subunits Encoded in the Genome of Arabidopsis

David Kerk; Joshua Bulgrien; Douglas W. Smith; Brooke Barsam; Stella Veretnik; Michael Gribskov

Reversible protein phosphorylation is critically important in the modulation of a wide variety of cellular functions. Several families of protein phosphatases remove phosphate groups placed on key cellular proteins by protein kinases. The complete genomic sequence of the model plant Arabidopsis permits a comprehensive survey of the phosphatases encoded by this organism. Several errors in the sequencing project gene models were found via analysis of predicted phosphatase coding sequences. Structural sequence probes from aligned and unaligned sequence models, and all-against-all BLAST searches, were used to identify 112 phosphatase catalytic subunit sequences, distributed among the serine (Ser)/threonine (Thr) phosphatases (STs) of the protein phosphatase P (PPP) family, STs of the protein phosphatase M (PPM) family (protein phosphatases 2C [PP2Cs] subfamily), protein tyrosine (Tyr) phosphatases (PTPs), low-Mr protein Tyr phosphatases, and dual-specificity (Tyr and Ser/Thr) phosphatases (DSPs). The Arabidopsis genome contains an abundance of PP2Cs (69) and a dearth of PTPs (one). Eight sequences were identified as new protein phosphatase candidates: five dual-specificity phosphatases and three PP2Cs. We used phylogenetic analyses to infer clustering patterns reflecting sequence similarity and evolutionary ancestry. These clusters, particularly for the largely unexplored PP2C set, will be a rich source of material for plant biologists, allowing the systematic sampling of protein function by genetic and biochemical means.


Journal of Computational Biology | 1998

Methods and Statistics for Combining Motif Match Scores

Timothy L. Bailey; Michael Gribskov

Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http://www.sdsc.edu/MEME.


Plant Physiology | 2009

A Rice Kinase-Protein Interaction Map

Xiaodong Ding; Todd Richter; Mei Chen; Hiroaki Fujii; Young Su Seo; Mingtang Xie; Xianwu Zheng; Siddhartha Kanrar; Rebecca A. Stevenson; Christopher Dardick; Ying Li; Hao Jiang; Yan Zhang; Fahong Yu; Laura E. Bartley; Mawsheng Chern; Rebecca Bart; Xiuhua Chen; Lihuang Zhu; William G. Farmerie; Michael Gribskov; Jian-Kang Zhu; Michael E. Fromm; Pamela C. Ronald; Wen-Yuan Song

Plants uniquely contain large numbers of protein kinases, and for the vast majority of the 1,429 kinases predicted in the rice (Oryza sativa) genome, little is known of their functions. Genetic approaches often fail to produce observable phenotypes; thus, new strategies are needed to delineate kinase function. We previously developed a cost-effective high-throughput yeast two-hybrid system. Using this system, we have generated a protein interaction map of 116 representative rice kinases and 254 of their interacting proteins. Overall, the resulting interaction map supports a large number of known or predicted kinase-protein interactions from both plants and animals and reveals many new functional insights. Notably, we found a potential widespread role for E3 ubiquitin ligases in pathogen defense signaling mediated by receptor-like kinases, particularly by the kinases that may have evolved from recently expanded kinase subfamilies in rice. We anticipate that the data provided here will serve as a foundation for targeted functional studies in rice and other plants. The application of yeast two-hybrid and TAPtag analyses for large-scale plant protein interaction studies is also discussed.


Molecular & Cellular Proteomics | 2011

A physical interaction network of dengue virus and human proteins

Sudip Khadka; Abbey D. Vangeloff; Chaoying Zhang; Prasad Siddavatam; Nicholas S. Heaton; Ling Wang; Ranjan Sengupta; Sudhir Sahasrabudhe; Glenn Randall; Michael Gribskov; Richard J. Kuhn; Rushika Perera; Douglas J. LaCount

Dengue virus (DENV), an emerging mosquito-transmitted pathogen capable of causing severe disease in humans, interacts with host cell factors to create a more favorable environment for replication. However, few interactions between DENV and human proteins have been reported to date. To identify DENV-human protein interactions, we used high-throughput yeast two-hybrid assays to screen the 10 DENV proteins against a human liver activation domain library. From 45 DNA-binding domain clones containing either full-length viral genes or partially overlapping gene fragments, we identified 139 interactions between DENV and human proteins, the vast majority of which are novel. These interactions involved 105 human proteins, including six previously implicated in DENV infection and 45 linked to the replication of other viruses. Human proteins with functions related to the complement and coagulation cascade, the centrosome, and the cytoskeleton were enriched among the DENV interaction partners. To determine if the cellular proteins were required for DENV infection, we used small interfering RNAs to inhibit their expression. Six of 12 proteins targeted (CALR, DDX3X, ERC1, GOLGA2, TRIP11, and UBE2I) caused a significant decrease in the replication of a DENV replicon. We further showed that calreticulin colocalized with viral dsRNA and with the viral NS3 and NS5 proteins in DENV-infected cells, consistent with a direct role for calreticulin in DENV replication. Human proteins that interacted with DENV had significantly higher average degree and betweenness than expected by chance, which provides additional support for the hypothesis that viruses preferentially target cellular proteins that occupy central positions in the human protein interaction network. This study provides a valuable starting point for additional investigations into the roles of human proteins in DENV infection.


Methods in Enzymology | 1996

[13] Identification of sequence patterns with profile analysis

Michael Gribskov; Stella Veretnik

Publisher Summary: This chapter discusses the identification of sequence patterns with profile analysis. The profile is a two-dimensional weight matrix in which the rows correspond to aligned positions in a group of sequences, and the columns correspond to each of the 20 possible amino acid residues. The profile analysis package of programs provides a suite of tools for creating profiles and matching them with sequences. The average profile method seeks to extract information from a single set of prior information embodied in the scoring table used in the averaging process. When the scoring table is based on the observed mutational exchanges among amino acid residues, as is the percent accepted mutation (PAM 250) table typically used, it represents a superposition of all of the chemical similarities among the residues. Evolutionary profiles perform better than average profiles in generating discriminators for sequence classification. Using this approach, models with very good discriminatory power can be generated from very small numbers of sequences, a sharp contrast to the fairly large numbers of sequences required to train hidden Markov models.
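The weight-matrix layout described above (rows for aligned positions, columns for the 20 amino acids) can be sketched in a few lines. This is a simplified illustration, not the profile analysis package: it scores only ungapped windows and ignores the position-specific gap penalties that real profiles carry. The helper names and the toy scores (+2 for a favored residue, -1 otherwise) are assumptions chosen for the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
COLUMN = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def make_row(favored, match=2, mismatch=-1):
    """Build one profile row: one score per amino acid (toy values)."""
    return [match if aa in favored else mismatch for aa in AMINO_ACIDS]


def score_window(profile, window):
    """Sum the position-specific scores for an ungapped window."""
    return sum(row[COLUMN[res]] for row, res in zip(profile, window))


def best_ungapped_match(profile, sequence):
    """Slide the profile along the sequence; return (start, score) of the best window."""
    width = len(profile)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the profile")
    starts = range(len(sequence) - width + 1)
    best = max(starts, key=lambda s: score_window(profile, sequence[s:s + width]))
    return best, score_window(profile, sequence[best:best + width])


if __name__ == "__main__":
    # Toy three-position profile: G, then K or R, then S or T.
    profile = [make_row("G"), make_row("KR"), make_row("ST")]
    print(best_ungapped_match(profile, "AAGKSAA"))
```

Because each row holds its own scores, the same residue can be rewarded at one position and penalized at another, which is the property that distinguishes a profile from a single consensus sequence.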


Plant Physiology | 2003

Arabidopsis proteins containing similarity to the universal stress protein domain of bacteria

David Kerk; Joshua Bulgrien; Douglas W. Smith; Michael Gribskov

We have collected a set of 44 Arabidopsis proteins with similarity to the USPA (universal stress protein A of Escherichia coli) domain of bacteria. The USPA domain is found either in small proteins, or it makes up the N-terminal portion of a larger protein, usually a protein kinase. Phylogenetic tree analysis based upon a multiple sequence alignment of the USPA domains shows that these domains of protein kinases 1.3.1 and 1.3.2 form distinct groups, as do the protein kinases 1.4.1. This indicates that their USPA domain structures have diverged appreciably and suggests that they may subserve distinct cellular functions. Two USPA fold classes have been proposed: one based on Methanococcus jannaschii MJ0577 (1MJH) that binds ATP, and the other based on the Haemophilus influenzae universal stress protein (1JMV), highly similar to E. coli UspA, which does not bind ATP. A set of common residues involved in ATP binding in 1MJH and conserved in similar bacterial sequences is also found in a distinct cluster of Arabidopsis sequences. Threading analysis, which examines aspects of secondary and tertiary structure, confirms this Arabidopsis sequence cluster as highly similar to 1MJH. This structural approach can distinguish between the characteristic fold differences of 1MJH-like and 1JMV-like bacterial proteins and was used to assign the complete set of candidate Arabidopsis proteins to one of these fold classes. It is clear that all the plant sequences have arisen from a 1MJH-like ancestor.

Collaboration


Top Co-Authors

Philip E. Bourne

National Institutes of Health


Jeffrey F. Harper

Scripps Research Institute


Mei Chen

University of Nebraska–Lincoln


Michael E. Fromm

University of Nebraska–Lincoln