Gregory D. Schuler | Researchain

Publication

Featured research published by Gregory D. Schuler.

Nucleic Acids Research | 2004

Database resources of the National Center for Biotechnology Information.

David Wheeler; Deanna M. Church; Ron Edgar; Scott Federhen; Wolfgang Helmberg; Thomas L. Madden; Joan Pontius; Gregory D. Schuler; Lynn M. Schriml; Edwin Sequeira; Tugba O. Suzek; Tatiana Tatusova; Lukas Wagner

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s website. NCBI resources include Entrez, PubMed, PubMed Central, LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SARS Coronavirus Resource, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.

Proceedings of the National Academy of Sciences of the United States of America | 2002

Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences.

Robert L. Strausberg; Elise A. Feingold; Lynette H. Grouse; Jeffery G. Derge; Richard D. Klausner; Francis S. Collins; Lukas Wagner; Carolyn M. Shenmen; Gregory D. Schuler; Stephen F. Altschul; Barry R. Zeeberg; Kenneth H. Buetow; Carl F. Schaefer; Narayan K. Bhat; Ralph F. Hopkins; Heather Jordan; Troy Moore; Steve I. Max; Jun Wang; Florence Hsieh; Luda Diatchenko; Kate Marusina; Andrew A. Farmer; Gerald M. Rubin; Ling Hong; Mark Stapleton; M. Bento Soares; Maria F. Bonaldo; Tom L. Casavant; Todd E. Scheetz

The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).

Nucleic Acids Research | 2003

Database resources of the National Center for Biotechnology

David Wheeler; Deanna M. Church; Scott Federhen; Alex E. Lash; Thomas L. Madden; Joan Pontius; Gregory D. Schuler; Lynn M. Schriml; Edwin Sequeira; Tatiana Tatusova; Lukas Wagner

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s Web site. NCBI resources include Entrez, PubMed, PubMed Central (PMC), LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR (e-PCR), Open Reading Frame (ORF) Finder, Reference Sequence (RefSeq), UniGene, HomoloGene, ProtEST, Database of Single Nucleotide Polymorphisms (dbSNP), Human/Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker (MM), Evidence Viewer (EV), Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.

Journal of Molecular Medicine | 1997

Pieces of the puzzle: expressed sequence tags and the catalog of human genes

Gregory D. Schuler

Imagine trying to solve a jigsaw puzzle without having all of the pieces. This is exactly the dilemma faced by researchers in the field of molecular medicine when attempting to understand how human genes and their protein products interact with one another to lead to normal biological functions, how these functions can break down in various disease states, and how normal functions can be restored through molecular intervention. This description of the Puzzle of Life is not meant to deny the importance of environmental and other epigenetic factors, but is simply meant to define the boundaries of a puzzle whose solution is easily within our grasp. To further our basic understanding of human biology and the genetics of inherited diseases, it would be immensely valuable to compile a complete catalog of human gene sequences and to make this information available over the Internet to scientists around the world. Over the past few years huge amounts of data relevant to this puzzle have become available, but solving the puzzle remains a bioinformatics challenge. Before setting out to solve the Puzzle of Life, it would be useful to have a rough sense of how many pieces it contains. In other words, how many human genes are there? Based on indirect evidence, estimates ranging from approximately 64,000 [1] to 80,000 [2] genes have been advanced. Complete genomic sequencing has been used to generate gene catalogs for several organisms with relatively small genomes [3]. However, sequencing the human genome is a much more daunting task due to its immense size (about 3 billion bases). The United States Genome Project began in 1990 with the ambitious goal of sequencing the human genome within 15 years (i.e., by the year 2005) [4]. 
Unfortunately, only about 2% of the total bases make up the protein-coding portions of our genes; the remaining 98% is of unknown function and often referred to as “junk DNA.” Thus, sequencing the genome may not be the most efficient way to generate a catalog of human genes. A number of investigators have advocated large-scale sequencing of the transcription products of genes, in the form of complementary DNA (cDNA) clones, as a prelude to sequencing of the entire human genome. As Brenner [5] put it, “If something like 98% of the genome is junk, then the best strategy would be to find the important 2%, and sequence it first.”

Nature Genetics | 1998

Data management and analysis for gene expression arrays

Olga Ermolaeva; Mohit Rastogi; Kim D. Pruitt; Gregory D. Schuler; Michael L. Bittner; Yidong Chen; Richard Simon; Paul S. Meltzer; Jeffrey M. Trent; Mark S. Boguski

Microarray technology makes it possible to simultaneously study the expression of thousands of genes during a single experiment. We have developed an information system, ArrayDB, to manage and analyse large-scale expression data. The underlying relational database was designed to allow flexibility in the nature and structure of data input and also in the generation of standard or customized reports through a web-browser interface. ArrayDB provides varied options for data retrieval and analysis tools that should facilitate the interpretation of complex hybridization results. A sampling of ArrayDB storage, retrieval and analysis capabilities is available (http://www.nhgri.nih.gov/DIR/LCG/15K/HTML/), along with information on a set of approximately 15,000 genes used to fabricate several widely used microarrays. Information stored in ArrayDB is used to provide integrated gene expression reports by linking array target sequences with NCBI’s Entrez retrieval system, UniGene and KEGG pathway views. The integration of external information resources is essential in interpreting intrinsic patterns and relationships in large-scale gene expression data.

American Journal of Pathology | 2000

Molecular profiling of clinical tissue specimens: feasibility and applications.

Michael R. Emmert-Buck; Robert L. Strausberg; David B. Krizman; M. Fatima Bonaldo; Robert F. Bonner; David G. Bostwick; Monica R. Brown; Kenneth H. Buetow; Rodrigo F. Chuaqui; Kristina A. Cole; Paul H. Duray; Chad R. Englert; John W. Gillespie; Susan F. Greenhut; Lynette H. Grouse; LaDeana W. Hillier; Kenneth S. Katz; Richard D. Klausner; Vladimir Kuznetzov; Alex E. Lash; Greg Lennon; W. Marston Linehan; Lance A. Liotta; Marco A. Marra; Peter J. Munson; David K. Ornstein; Vinay V. Prabhu; Christa Prange; Gregory D. Schuler; Marcelo B. Soares

The relationship between gene expression profiles and cellular behavior in humans is largely unknown. Expression patterns of individual cell types have yet to be precisely measured, and, at present, we know or can predict the function of a relatively small percentage of genes. However, biomedical research is in the midst of an informational and technological revolution with the potential to increase dramatically our understanding of how expression modulates cellular phenotype and response to the environment. The entire sequence of the human genome will be known by the year 2003 or earlier [1,2]. In concert, the pace of efforts to complete identification and full-length cDNA sequencing of all genes has accelerated, and these goals will be attained within the next few years [3-7]. Accompanying the expanding base of genetic information are several new technologies capable of global gene expression measurements [8-16]. Taken together, the expanding genetic database and developing expression technologies are leading to an exciting new paradigm in biomedical research known as molecular profiling.

The Journal of Molecular Diagnostics | 2000

Molecular Profiling of Clinical Tissue Specimens: Feasibility and Applications

The relationship between gene expression profiles and cellular behavior in humans is largely unknown. Expression patterns of individual cell types have yet to be precisely measured, and, at present, we know or can predict the function of a relatively small percentage of genes. However, biomedical research is in the midst of an informational and technological revolution with the potential to increase dramatically our understanding of how expression modulates cellular phenotype and response to the environment. The entire sequence of the human genome will be known by the year 2003 or earlier [1,2]. In concert, the pace of efforts to complete identification and full-length cDNA sequencing of all genes has accelerated, and these goals will be attained within the next few years [3-7]. Accompanying the expanding base of genetic information are several new technologies capable of global gene expression measurements [8-16]. Taken together, the expanding genetic database and developing expression technologies are leading to an exciting new paradigm in biomedical research known as molecular profiling.

Nature | 2001

Guide to the draft human genome

Tyra G. Wolfsberg; Johanna McEntyre; Gregory D. Schuler

There are a number of ways to investigate the structure, function and evolution of the human genome. These include examining the morphology of normal and abnormal chromosomes, constructing maps of genomic landmarks, following the genetic transmission of phenotypes and DNA sequence variations, and characterizing thousands of individual genes. To this list we can now add the elucidation of the genomic DNA sequence, albeit at ‘working draft’ accuracy. The current challenge is to weave together these disparate types of data to produce the information infrastructure needed to support the next generation of biomedical research. Here we provide an overview of the different sources of information about the human genome and how modern information technology, in particular the internet, allows us to link them together.

Nucleic Acids Research | 2011

NCBI Epigenomics: a new public resource for exploring epigenomic data sets

Ian M. Fingerman; Lee McDaniel; Xuan Zhang; Walter Ratzat; Tarek Hassan; Zhifang Jiang; Robert F. Cohen; Gregory D. Schuler

The Epigenomics database at the National Center for Biotechnology Information (NCBI) is a new, comprehensive public resource for whole-genome epigenetic data sets (www.ncbi.nlm.nih.gov/epigenomics). Epigenetics is the study of stable and heritable changes in gene expression that occur independently of the primary DNA sequence. Epigenetic mechanisms include post-translational modifications of histones, DNA methylation, chromatin conformation and non-coding RNAs. Misregulation of epigenetic processes has been associated with human disease. We have constructed the new resource by selecting the subset of epigenetics-specific data from general-purpose archives, such as the Gene Expression Omnibus and the Sequence Read Archive, and then subjecting them to further review, annotation and reorganization. Raw data are processed and mapped to genomic coordinates to generate ‘tracks’ that are a visual representation of the data. These data tracks can be viewed using popular genome browsers or downloaded for local analysis. The Epigenomics resource also provides the user with a unique interface that allows for intuitive browsing and searching of data sets based on biological attributes. Currently, there are 69 studies, 337 samples and over 1100 data tracks from five well-studied species that are viewable and downloadable in Epigenomics.

Nucleic Acids Research | 2012

NCBI Epigenomics: What’s new for 2013

Ian M. Fingerman; Xuan Zhang; Walter Ratzat; Nora Husain; Robert F. Cohen; Gregory D. Schuler

The Epigenomics resource at the National Center for Biotechnology Information (NCBI) has been created to serve as a comprehensive public repository for whole-genome epigenetic data sets (www.ncbi.nlm.nih.gov/epigenomics). We have constructed this resource by selecting the subset of epigenetics-specific data from the Gene Expression Omnibus (GEO) database and then subjecting them to further review and annotation. Associated data tracks can be viewed using popular genome browsers or downloaded for local analysis. We have performed extensive user testing throughout the development of this resource, and new features and improvements are continuously being implemented based on the results. We have made substantial usability improvements to user interfaces, enhanced functionality, made identification of data tracks of interest easier and created new tools for preliminary data analyses. Additionally, we have made efforts to enhance the integration between the Epigenomics resource and other NCBI databases, including the Gene database and PubMed. Data holdings have also increased dramatically since the initial publication describing the NCBI Epigenomics resource and currently consist of >3700 viewable and downloadable data tracks from 955 biological sources encompassing five well-studied species. This updated manuscript highlights these changes and improvements.
