Publication


Featured research published by Gavin Sherlock.


Nature Genetics | 2000

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Michael Ashburner; Catherine A. Ball; Judith A. Blake; David Botstein; Heather L. Butler; J. Michael Cherry; Allan Peter Davis; Kara Dolinski; Selina S. Dwight; Janan T. Eppig; Midori A. Harris; David P. Hill; Laurie Issel-Tarver; Andrew Kasarskis; Suzanna E. Lewis; John C. Matese; Joel E. Richardson; Martin Ringwald; Gerald M. Rubin; Gavin Sherlock

Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.


Nature | 2000

Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling.

Ash A. Alizadeh; Michael B. Eisen; R. Eric Davis; Izidore S. Lossos; Andreas Rosenwald; Jennifer C. Boldrick; Hajeer Sabet; Truc Tran; Xin Yu; John Powell; Liming Yang; Gerald E. Marti; Troy Moore; James I. Hudson; Lisheng Lu; David B. Lewis; Robert Tibshirani; Gavin Sherlock; Wing C. Chan; Timothy C. Greiner; Dennis D. Weisenburger; James O. Armitage; Roger A. Warnke; Ronald Levy; Wyndham H. Wilson; Michael R. Grever; John C. Byrd; David Botstein; Patrick O. Brown; Louis M. Staudt

Diffuse large B-cell lymphoma (DLBCL), the most common subtype of non-Hodgkin's lymphoma, is clinically heterogeneous: 40% of patients respond well to current therapy and have prolonged survival, whereas the remainder succumb to the disease. We proposed that this variability in natural history reflects unrecognized molecular heterogeneity in the tumours. Using DNA microarrays, we have conducted a systematic characterization of gene expression in B-cell malignancies. Here we show that there is diversity in gene expression among the tumours of DLBCL patients, apparently reflecting the variation in tumour proliferation rate, host response and differentiation state of the tumour. We identified two molecularly distinct forms of DLBCL which had gene expression patterns indicative of different stages of B-cell differentiation. One type expressed genes characteristic of germinal centre B cells (‘germinal centre B-like DLBCL’); the second type expressed genes normally induced during in vitro activation of peripheral blood B cells (‘activated B-like DLBCL’). Patients with germinal centre B-like DLBCL had a significantly better overall survival than those with activated B-like DLBCL. The molecular classification of tumours on the basis of gene expression can thus identify previously undetected and clinically significant subtypes of cancer.


Nature Genetics | 2001

Minimum information about a microarray experiment (MIAME)-toward standards for microarray data.

Alvis Brazma; Pascal Hingamp; John Quackenbush; Gavin Sherlock; Paul T. Spellman; Christian J. Stoeckert; John Aach; Wilhelm Ansorge; Catherine A. Ball; Helen C. Causton; Terry Gaasterland; Patrick Glenisson; Irene F. Kim; John C. Matese; Helen Parkinson; Alan Robinson; Ugis Sarkans; Jason Stewart; Ronald C. Taylor; Jaak Vilo; Martin Vingron

Microarray analysis has become a widely used tool for the generation of gene expression data on a genomic scale. Although many significant results have been derived from microarray studies, one limitation has been the lack of standards for presenting and exchanging such data. Here we present a proposal, the Minimum Information About a Microarray Experiment (MIAME), that describes the minimum information required to ensure that microarray data can be easily interpreted and that results derived from its analysis can be independently verified. The ultimate goal of this work is to establish a standard for recording and reporting microarray-based gene expression data, which will in turn facilitate the establishment of databases and public repositories and enable the development of data analysis tools. With respect to MIAME, we concentrate on defining the content and structure of the necessary information rather than the technical format for capturing it.


Bioinformatics | 2001

Missing value estimation methods for DNA microarrays

Olga G. Troyanskaya; Michael N. Cantor; Gavin Sherlock; Patrick O. Brown; Trevor Hastie; Robert Tibshirani; David Botstein; Russ B. Altman

MOTIVATION Gene expression microarray experiments can generate data sets with multiple missing expression values. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. For example, methods such as hierarchical clustering and K-means clustering are not robust to missing data, and may lose effectiveness even with a few missing values. Methods for imputing missing data are needed, therefore, to minimize the effect of incomplete data sets on analyses, and to increase the range of data sets to which these algorithms can be applied. In this report, we investigate automated methods for estimating missing data. RESULTS We present a comparative study of several methods for the estimation of missing values in gene microarray data. We implemented and evaluated three methods: a Singular Value Decomposition (SVD) based method (SVDimpute), weighted K-nearest neighbors (KNNimpute), and row average. We evaluated the methods using a variety of parameter settings and over different real data sets, and assessed the robustness of the imputation methods to the amount of missing data over the range of 1-20% missing values. We show that KNNimpute appears to provide a more robust and sensitive method for missing value estimation than SVDimpute, and both SVDimpute and KNNimpute surpass the commonly used row average method (as well as filling missing values with zeros). We report results of the comparative experiments and provide recommendations and tools for accurate estimation of missing microarray data under a variety of conditions.
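
As an illustration of the weighted K-nearest-neighbors idea named in the abstract above, here is a minimal Python sketch. It is not the published KNNimpute code: the function name knn_impute, the root-mean-square distance over shared columns, and the inverse-distance weighting are all assumptions chosen for the example. Rows stand for genes, columns for experiments, and None marks a missing value.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries with a distance-weighted average of the k most similar rows.

    Hypothetical helper for illustration; distance and weighting are assumptions.
    """
    filled = [row[:] for row in matrix]  # copy so the input is not mutated
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_i, other in enumerate(matrix):
                # A neighbor must be a different row with an observed value in column j.
                if other_i == i or other[j] is None:
                    continue
                # Compare the two rows only on columns observed in both.
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if not candidates:
                continue  # no usable neighbor: leave the entry missing
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            # Closer neighbors get larger weights; the epsilon avoids division by zero.
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]
            filled[i][j] = (
                sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
            )
    return filled


if __name__ == "__main__":
    data = [
        [1.0, 2.0, 3.0],
        [1.0, 2.0, None],   # missing value to estimate
        [10.0, 20.0, 30.0],
    ]
    print(knn_impute(data, k=1))
```

With k=1 the missing entry is taken from the single most similar row; larger k blends several neighbors, which is the parameter the abstract describes varying.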


Bioinformatics | 2004

GO::TermFinder: open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes

Elizabeth I. Boyle; Shuai Weng; Jeremy Gollub; Heng Jin; David Botstein; J. Michael Cherry; Gavin Sherlock

SUMMARY GO::TermFinder comprises a set of object-oriented Perl modules for accessing Gene Ontology (GO) information and evaluating and visualizing the collective annotation of a list of genes to GO terms. It can be used to draw conclusions from microarray and other biological data, calculating the statistical significance of each annotation. GO::TermFinder can be used on any system on which Perl can be run, either as a command line application, in single or batch mode, or as a web-based CGI script. AVAILABILITY The full source code and documentation for GO::TermFinder are freely available from http://search.cpan.org/dist/GO-TermFinder/.
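
The abstract above does not say which statistical test is used to score an annotation. As a hedged illustration only, the Python sketch below assumes a hypergeometric tail probability, a common choice for asking whether a GO term is over-represented in a gene list. The function name enrichment_p_value and its parameter names are hypothetical and are not part of the GO::TermFinder Perl API.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, annotated_in_sample):
    """Probability of seeing at least `annotated_in_sample` annotated genes.

    population          -- total number of genes considered
    annotated           -- genes in the population annotated to the term
    sample              -- size of the gene list being tested
    annotated_in_sample -- genes in the list annotated to the term
    Assumes sampling without replacement (hypergeometric distribution).
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, x) * comb(population - annotated, sample - x)
        for x in range(annotated_in_sample, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 10 genes in total, 5 annotated to the term, and a list of 5 genes
    # in which all 5 carry the annotation.
    print(enrichment_p_value(10, 5, 5, 5))
```

In practice many terms are tested at once, so a raw p-value like this one would normally be followed by a multiple-testing correction; that step is left out of the sketch.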


Nature | 2009

Evolution of pathogenicity and sexual reproduction in eight Candida genomes.

Geraldine Butler; Matthew D. Rasmussen; Michael F. Lin; Manuel A. S. Santos; Sharadha Sakthikumar; Carol A. Munro; Esther Rheinbay; Manfred Grabherr; Anja Forche; Jennifer L. Reedy; Ino Agrafioti; Martha B. Arnaud; Steven Bates; Alistair J. P. Brown; Sascha Brunke; Maria C. Costanzo; David A. Fitzpatrick; Piet W. J. de Groot; David Harris; Lois L. Hoyer; Bernhard Hube; Frans M. Klis; Chinnappa D. Kodira; Nicola Lennard; Mary E. Logue; Ronny Martin; Aaron M. Neiman; Elissavet Nikolaou; Michael A. Quail; Janet Quinn

Candida species are the most common cause of opportunistic fungal infection worldwide. Here we report the genome sequences of six Candida species and compare these and related pathogens and non-pathogens. There are significant expansions of cell wall, secreted and transporter gene families in pathogenic species, suggesting adaptations associated with virulence. Large genomic tracts are homozygous in three diploid species, possibly resulting from recent recombination events. Surprisingly, key components of the mating and meiosis pathways are missing from several species. These include major differences at the mating-type loci (MTL); Lodderomyces elongisporus lacks MTL, and components of the a1/α2 cell identity determinant were lost in other species, raising questions about how mating and cell types are controlled. Analysis of the CUG leucine-to-serine genetic-code change reveals that 99% of ancestral CUG codons were erased and new ones arose elsewhere. Lastly, we revise the Candida albicans gene catalogue, identifying many new genes.


The Lancet | 2002

Molecular characterisation of soft tissue tumours: a gene expression study

Torsten O. Nielsen; Robert B. West; Sabine C. Linn; Orly Alter; Margaret A. Knowling; John X. O'Connell; Shirley Zhu; Mike Fero; Gavin Sherlock; Jonathan R. Pollack; Patrick O. Brown; David Botstein; Matt van de Rijn

BACKGROUND Soft-tissue tumours are derived from mesenchymal cells such as fibroblasts, muscle cells, or adipocytes, but for many such tumours the histogenesis is controversial. We aimed to start molecular characterisation of these rare neoplasms and to do a genome-wide search for new diagnostic markers. METHODS We analysed gene-expression patterns of 41 soft-tissue tumours with spotted cDNA microarrays. After removal of errors introduced by use of different microarray batches, the expression patterns of 5520 genes that were well defined were used to separate tumours into discrete groups by hierarchical clustering and singular value decomposition. FINDINGS Synovial sarcomas, gastrointestinal stromal tumours, neural tumours, and a subset of the leiomyosarcomas, showed strikingly distinct gene-expression patterns. Other tumour categories (malignant fibrous histiocytoma, liposarcoma, and the remaining leiomyosarcomas) shared molecular profiles that were not predicted by histological features or immunohistochemistry. Strong expression of known genes, such as KIT in gastrointestinal stromal tumours, was noted within gene sets that distinguished the different sarcomas. However, many uncharacterised genes also contributed to the distinction between tumour types. INTERPRETATION These results suggest a new method for classification of soft-tissue tumours, which could improve on the method based on histological findings. Large numbers of uncharacterised genes contributed to distinctions between the tumours, and some of these could be useful markers for diagnosis, have prognostic significance, or prove possible targets for treatment.


Nucleic Acids Research | 2001

The Stanford Microarray Database

Gavin Sherlock; Tina Hernandez-Boussard; Andrew Kasarskis; Gail Binkley; John C. Matese; Selina S. Dwight; Shuai Weng; Heng Jin; Catherine A. Ball; Michael B. Eisen; Paul T. Spellman; Patrick O. Brown; David Botstein; J. Michael Cherry

The Stanford Microarray Database (SMD) stores raw and normalized data from microarray experiments, and provides web interfaces for researchers to retrieve, analyze and visualize their data. The two immediate goals for SMD are to serve as a storage site for microarray data from ongoing research at Stanford University, and to facilitate the public dissemination of that data once published, or released by the researcher. Of paramount importance is the connection of microarray data with the biological data that pertains to the DNA deposited on the microarray (genes, clones etc.). SMD makes use of many public resources to connect expression information to the relevant biology, including SGD [Ball,C.A., Dolinski,K., Dwight,S.S., Harris,M.A., Issel-Tarver,L., Kasarskis,A., Scafe,C.R., Sherlock,G., Binkley,G., Jin,H. et al. (2000) Nucleic Acids Res., 28, 77-80], YPD and WormPD [Costanzo,M.C., Hogan,J.D., Cusick,M.E., Davis,B.P., Fancher,A.M., Hodges,P.E., Kondu,P., Lengieza,C., Lew-Smith,J.E., Lingner,C. et al. (2000) Nucleic Acids Res., 28, 73-76], Unigene [Wheeler,D.L., Chappey,C., Lash,A.E., Leipe,D.D., Madden,T.L., Schuler,G.D., Tatusova,T.A. and Rapp,B.A. (2000) Nucleic Acids Res., 28, 10-14], dbEST [Boguski,M.S., Lowe,T.M. and Tolstoshev,C.M. (1993) Nature Genet., 4, 332-333] and SWISS-PROT [Bairoch,A. and Apweiler,R. (2000) Nucleic Acids Res., 28, 45-48] and can be accessed at http://genome-www.stanford.edu/microarray.


Nucleic Acids Research | 2003

SOURCE: a unified genomic resource of functional annotations, ontologies, and gene expression data

Maximilian Diehn; Gavin Sherlock; Gail Binkley; Heng Jin; John C. Matese; Tina Hernandez-Boussard; Christian A. Rees; J. Michael Cherry; David Botstein; Patrick O. Brown; Ash A. Alizadeh

The explosion in the number of functional genomic datasets generated with tools such as DNA microarrays has created a critical need for resources that facilitate the interpretation of large-scale biological data. SOURCE is a web-based database that brings together information from a broad range of resources, and provides it in a manner particularly useful for genome-scale analyses. SOURCE's GeneReports include aliases, chromosomal location, functional descriptions, GeneOntology annotations, gene expression data, and links to external databases. We curate published microarray gene expression datasets and allow users to rapidly identify sets of co-regulated genes across a variety of tissues and a large number of conditions using a simple and intuitive interface. SOURCE provides content both in gene and cDNA clone-centric pages, and thus simplifies analysis of datasets generated using cDNA microarrays. SOURCE is continuously updated and contains the most recent and accurate information available for human, mouse, and rat genes. By allowing dynamic linking to individual gene or clone reports, SOURCE facilitates browsing of large genomic datasets. Finally, SOURCE's batch interface allows rapid extraction of data for thousands of genes or clones at once and thus facilitates statistical analyses such as assessing the enrichment of functional attributes within clusters of genes. SOURCE is available at http://source.stanford.edu.


Current Opinion in Immunology | 2000

Analysis of large-scale gene expression data.

Gavin Sherlock

The advent of cDNA and oligonucleotide microarray technologies has led to a paradigm shift in biological investigation, such that the bottleneck in research is shifting from data generation to data analysis. Hierarchical clustering, divisive clustering, self-organizing maps and k-means clustering have all been recently used to make sense of this mass of data.
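
To make one of the methods named above concrete, the following Python sketch shows agglomerative hierarchical clustering on expression profiles. It is an illustration under stated assumptions, not a method taken from the review: the function name average_linkage_clusters, the use of Euclidean distance, and the average-linkage rule are all choices made for the example.

```python
import math


def average_linkage_clusters(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    profiles   -- list of equal-length numeric lists (one per gene or sample)
    n_clusters -- number of clusters to stop at (values below 1 are treated as 1)
    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(profiles))]
    target = max(n_clusters, 1)

    def distance(a, b):
        # Average linkage: mean Euclidean distance over all cross-cluster pairs.
        total = sum(math.dist(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > target:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = clusters[x] + clusters[y]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (x, y)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    profiles = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
    print(average_linkage_clusters(profiles, 2))
```

This brute-force version recomputes every pairwise distance at each merge, which is fine for a handful of profiles but too slow for genome-scale data, where a library implementation would be the practical choice.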

Collaboration


Dive into Gavin Sherlock's collaborations.

Top Co-Authors


Alvis Brazma

European Bioinformatics Institute
