Gary D. Stormo

Washington University in St. Louis

Publication

Featured research published by Gary D. Stormo.

Bioinformatics | 2000

DNA binding sites: representation and discovery

Gary D. Stormo

The purpose of this article is to provide a brief history of the development and application of computer algorithms for the analysis and prediction of DNA binding sites. This problem can be conveniently divided into two subproblems. The first is, given a collection of known binding sites, develop a representation of those sites that can be used to search new sequences and reliably predict where additional binding sites occur. The second is, given a set of sequences known to contain binding sites for a common factor, but not knowing where the sites are, discover the location of the sites in each sequence and a representation for the specificity of the protein.

international conference on bioinformatics | 1999

Identifying DNA and protein patterns with statistically significant alignments of multiple sequences.

Gerald Z. Hertz; Gary D. Stormo

MOTIVATION: Molecular biologists frequently can obtain interesting insight by aligning a set of related DNA, RNA or protein sequences. Such alignments can be used to determine either evolutionary or functional relationships. Our interest is in identifying functional relationships. Unless the sequences are very similar, it is necessary to have a specific strategy for measuring (or scoring) the relatedness of the aligned sequences. If the alignment is not known, one can be determined by finding an alignment that optimizes the scoring scheme.

RESULTS: We describe four components to our approach for determining alignments of multiple sequences. First, we review a log-likelihood scoring scheme we call information content. Second, we describe two methods for estimating the P value of an individual information content score: (i) a method that combines a technique from large-deviation statistics with numerical calculations; (ii) a method that is exclusively numerical. Third, we describe how we count the number of possible alignments given the overall amount of sequence data. This count is multiplied by the P value to determine the expected frequency of an information content score and, thus, the statistical significance of the corresponding alignment. Statistical significance can be used to compare alignments having differing widths and containing differing numbers of sequences. Fourth, we describe a greedy algorithm for determining alignments of functionally related sequences. Finally, we test the accuracy of our P value calculations, and give an example of using our algorithm to identify binding sites for the Escherichia coli CRP protein.

AVAILABILITY: Programs were developed under the UNIX operating system and are available by anonymous ftp from ftp://beagle.colorado.edu/pub/consensus.
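
As a rough illustration of the scoring idea in this abstract, the sketch below computes a log-likelihood information content for an alignment and multiplies a P value by an alignment count to get an expected frequency. It is not the authors' program: the function names, the natural-log formula summed over columns, and the input layout (one dictionary of letter frequencies per column) are assumptions made for this example, and the P value itself is taken as given rather than estimated.

```python
import math


def information_content(columns, background):
    """Log-likelihood score of an alignment (assumed form, in nats).

    columns: list of dicts, one per alignment column, mapping letter -> frequency.
    background: dict mapping letter -> a priori frequency of that letter.
    """
    total = 0.0
    for column in columns:
        for letter, freq in column.items():
            if freq > 0:  # a letter that never occurs contributes nothing
                total += freq * math.log(freq / background[letter])
    return total


def expected_frequency(p_value, num_alignments):
    """Expected number of alignments scoring at least this well by chance."""
    return p_value * num_alignments


# Hypothetical two-column alignment with a uniform DNA background.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
conserved = {"A": 1.0}
score = information_content([conserved, uniform], uniform)
```

A fully conserved column contributes ln 4 under a uniform background, while a column that matches the background contributes zero, so the score rewards columns that differ from what chance alone would produce.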

Journal of Molecular Biology | 1986

Information content of binding sites on nucleotide sequences

Thomas D. Schneider; Gary D. Stormo; Larry Gold; Andrzej Ehrenfeucht

Repressors, polymerases, ribosomes and other macromolecules bind to specific nucleic acid sequences. They can find a binding site only if the sequence has a recognizable pattern. We define a measure of the information (R sequence) in the sequence patterns at binding sites. It allows one to investigate how information is distributed across the sites and to compare one site to another. One can also calculate the amount of information (R frequency) that would be required to locate the sites, given that they occur with some frequency in the genome. Several Escherichia coli binding sites were analyzed using these two independent empirical measurements. The two amounts of information are similar for most of the sites we analyzed. In contrast, bacteriophage T7 RNA polymerase binding sites contain about twice as much information as is necessary for recognition by the T7 polymerase, suggesting that a second protein may bind at T7 promoters. The extra information can be accounted for by a strong symmetry element found at the T7 promoters. This element may be an operator. If this model is correct, these promoters and operators do not share much information. The comparisons between R sequence and R frequency suggest that the information at binding sites is just sufficient for the sites to be distinguished from the rest of the genome.
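
The two measures named in this abstract can be sketched in a few lines. The version below is a simplification under stated assumptions: bases are treated as equiprobable in the genome (2 bits of maximum uncertainty per position), no small-sample correction is applied, and the function names and example sites are hypothetical.

```python
import math
from collections import Counter


def r_sequence(sites):
    """Information in aligned sites, in bits: sum over positions of (2 - entropy)."""
    width = len(sites[0])
    total = 0.0
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += 2.0 - entropy
    return total


def r_frequency(genome_positions, num_sites):
    """Bits needed to pick num_sites sites out of genome_positions candidates."""
    return math.log2(genome_positions / num_sites)


# Hypothetical aligned sites: three conserved positions and one variable one.
example_sites = ["ACGT", "ACGT", "ACGA", "ACGC"]
```

Comparing `r_sequence(example_sites)` with `r_frequency(...)` for the same factor mirrors the comparison the abstract describes: if the two numbers are close, the pattern carries about as much information as is needed to locate the sites.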

Archive | 2002

Current Protocols in Bioinformatics

Alex Bateman; William R. Pearson; Lincoln Stein; Gary D. Stormo; John R. Yates

Cell | 2004

Comparative genomics identifies a flagellar and basal body proteome that includes the BBS5 human disease gene.

Jin Billy Li; Jantje M. Gerdes; Courtney J. Haycraft; Yanli Fan; Tanya M. Teslovich; Helen May-Simera; Haitao Li; Oliver E. Blacque; Linya Li; Carmen C. Leitch; Ra Lewis; Jane Green; Patrick S. Parfrey; Michel R. Leroux; William S. Davidson; Philip L. Beales; Lisa M. Guay-Woodford; Bradley K. Yoder; Gary D. Stormo; Nicholas Katsanis; Susan K. Dutcher

Cilia and flagella are microtubule-based structures nucleated by modified centrioles termed basal bodies. These biochemically complex organelles have more than 250 and 150 polypeptides, respectively. To identify the proteins involved in ciliary and basal body biogenesis and function, we undertook a comparative genomics approach that subtracted the nonflagellated proteome of Arabidopsis from the shared proteome of the ciliated/flagellated organisms Chlamydomonas and human. We identified 688 genes that are present exclusively in organisms with flagella and basal bodies and validated these data through a series of in silico, in vitro, and in vivo studies. We then applied this resource to the study of human ciliation disorders and have identified BBS5, a novel gene for Bardet-Biedl syndrome. We show that this novel protein localizes to basal bodies in mouse and C. elegans, is under the regulatory control of daf-19, and is necessary for the generation of both cilia and flagella.
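
The subtraction step described in this abstract amounts to a set operation once proteins have been grouped across species. The sketch below assumes that grouping has already been done and that each proteome is given as a collection of shared ortholog-group identifiers; the function name and the identifiers are hypothetical, and the sequence comparisons needed to build such groups are not shown.

```python
def flagellar_candidates(chlamydomonas, human, arabidopsis):
    """Groups shared by the two ciliated organisms but absent from Arabidopsis."""
    shared = set(chlamydomonas) & set(human)
    return shared - set(arabidopsis)


# Hypothetical ortholog-group identifiers for each proteome.
chlamy_groups = {"og1", "og2", "og3", "og4"}
human_groups = {"og2", "og3", "og4", "og5"}
arabidopsis_groups = {"og3", "og6"}

candidates = flagellar_candidates(chlamy_groups, human_groups, arabidopsis_groups)
```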

Nature | 2009

The AP-1 transcription factor Batf controls TH17 differentiation

Barbara U. Schraml; Kai Hildner; Wataru Ise; Wan-Ling Lee; Whitney A.-E. Smith; Ben Solomon; Gurmukh Sahota; Julia Sim; Ryuta Mukasa; Saso Cemerski; Robin D. Hatton; Gary D. Stormo; Casey T. Weaver; John H. Russell; Theresa L. Murphy; Kenneth M. Murphy

Activator protein 1 (AP-1, also known as JUN) transcription factors are dimers of JUN, FOS, MAF and activating transcription factor (ATF) family proteins characterized by basic region and leucine zipper domains. Many AP-1 proteins contain defined transcriptional activation domains, but BATF and the closely related BATF3 (refs 2, 3) contain only a basic region and leucine zipper, and are considered to be inhibitors of AP-1 activity. Here we show that Batf is required for the differentiation of IL17-producing T helper (TH17) cells. TH17 cells comprise a CD4+ T-cell subset that coordinates inflammatory responses in host defence but is pathogenic in autoimmunity. Batf-/- mice have normal TH1 and TH2 differentiation, but show a defect in TH17 differentiation, and are resistant to experimental autoimmune encephalomyelitis. Batf-/- T cells fail to induce known factors required for TH17 differentiation, such as RORγt (encoded by Rorc) and the cytokine IL21 (refs 14–17). Neither the addition of IL21 nor the overexpression of RORγt fully restores IL17 production in Batf-/- T cells. The Il17 promoter is BATF-responsive, and after TH17 differentiation, BATF binds conserved intergenic elements in the Il17a–Il17f locus and to the Il17, Il21 and Il22 (ref. 18) promoters. These results demonstrate that the AP-1 protein BATF has a critical role in TH17 differentiation.

Trends in Biochemical Sciences | 1998

Specificity, free energy and information content in protein–DNA interactions

Gary D. Stormo; Dana S. Fields

Site-specific DNA-protein interactions can be studied using experimental and computational methods. Experimental approaches typically analyze a protein-DNA interaction by measuring the free energy of binding under a variety of conditions. Computational methods focus on alignments of known binding sites for a protein, and, from these alignments, make estimates of the binding energy. Understanding the relationship between these two perspectives, and finding ways to improve both, is a major challenge of modern molecular biology.

Bioinformatics | 1990

Identification of consensus patterns in unaligned DNA sequences known to be functionally related

Gerald Z. Hertz; George W. Hartzell; Gary D. Stormo

We have developed a method for identifying consensus patterns in a set of unaligned DNA sequences known to bind a common protein or to have some other common biochemical function. The method is based on a matrix representation of binding site patterns. Each row of the matrix represents one of the four possible bases, each column represents one of the positions of the binding site and each element is determined by the frequency with which the indicated base occurs at the indicated position. The goal of the method is to find the most significant matrix (i.e. the one with the lowest probability of occurring by chance) out of all the matrices that can be formed from the set of related sequences. The reliability of the method improves with the number of sequences, while the time required increases only linearly with the number of sequences. To test this method, we analysed 11 DNA sequences containing promoters regulated by the Escherichia coli LexA protein. The matrices we found were consistent with the known consensus sequence, and could distinguish the generally accepted LexA binding sites from other DNA sequences.
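
The matrix representation described in this abstract can be illustrated with a minimal sketch. It covers only the representation, building the matrix from sites that are already aligned, and does not attempt the search for the most significant matrix among unaligned sequences. The function name and the example sites are hypothetical.

```python
BASES = "ACGT"


def frequency_matrix(sites):
    """Rows are bases, columns are site positions, elements are base frequencies."""
    width = len(sites[0])
    counts = {base: [0] * width for base in BASES}
    for site in sites:
        for pos, base in enumerate(site):
            counts[base][pos] += 1
    # Convert raw counts to frequencies by dividing by the number of sites.
    return {base: [c / len(sites) for c in row] for base, row in counts.items()}


# Hypothetical aligned sites of width 4.
matrix = frequency_matrix(["ACGT", "ACGA"])
```

Each column of the resulting matrix sums to 1, so a column can be read directly as the observed base distribution at that position of the site.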

Molecular Microbiology | 1992

Translation initiation in Escherichia coli: sequences within the ribosome-binding site

Steven Ringquist; Sidney Shinedling; Doug Barrick; Louis S. Green; Jonathan Binkley; Gary D. Stormo; Larry Gold

The translational roles of the Shine‐Dalgarno sequence, the initiation codon, the space between them, and the second codon have been studied. The Shine‐Dalgarno sequence UAAGGAGG initiated translation roughly four times more efficiently than did the shorter AAGGA sequence. Each Shine‐Dalgarno sequence required a minimum distance to the initiation codon in order to drive translation; spacing, however, could be rather long. Initiation at AUG was more efficient than at GUG or UUG at each spacing examined; initiation at GUG was only slightly better than UUG. Translation was also affected by residues 3′ to the initiation codon. The second codon can influence the rate of initiation, with the magnitude depending on the initiation codon. The data are consistent with a simple kinetic model in which a variety of rate constants contribute to the process of translation initiation.

Cell | 2008

Analysis of homeodomain specificities allows the family-wide prediction of preferred recognition sites

Marcus Blaine Noyes; Atsuya Wakabayashi; Gary D. Stormo; Michael H. Brodsky; Scot A. Wolfe

We describe the comprehensive characterization of homeodomain DNA-binding specificities from a metazoan genome. The analysis of all 84 independent homeodomains from D. melanogaster reveals the breadth of DNA sequences that can be specified by this recognition motif. The majority of these factors can be organized into 11 different specificity groups, where the preferred recognition sequence between these groups can differ at up to four of the six core recognition positions. Analysis of the recognition motifs within these groups led to a catalog of common specificity determinants that may cooperate or compete to define the binding site preference. With these recognition principles, a homeodomain can be reengineered to create factors where its specificity is altered at the majority of recognition positions. This resource also allows prediction of homeodomain specificities from other organisms, which is demonstrated by the prediction and analysis of human homeodomain specificities.
