Ron Edgar
National Institutes of Health
Publication
Featured research published by Ron Edgar.
Nucleic Acids Research | 2004
David Wheeler; Deanna M. Church; Ron Edgar; Scott Federhen; Wolfgang Helmberg; Thomas L. Madden; Joan Pontius; Gregory D. Schuler; Lynn M. Schriml; Edwin Sequeira; Tugba O. Suzek; Tatiana Tatusova; Lukas Wagner
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s website. NCBI resources include Entrez, PubMed, PubMed Central, LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SARS Coronavirus Resource, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
Nucleic Acids Research | 2002
Ron Edgar; Michael Domrachev; Alex E. Lash
The Gene Expression Omnibus (GEO) project was initiated in response to the growing demand for a public repository for high-throughput gene expression data. GEO provides a flexible and open design that facilitates submission, storage and retrieval of heterogeneous data sets from high-throughput gene expression and genomic hybridization experiments. GEO is not intended to replace in house gene expression databases that benefit from coherent data sets, and which are constructed to facilitate a particular analytic method, but rather complement these by acting as a tertiary, central data distribution hub. The three central data entities of GEO are platforms, samples and series, and were designed with gene expression and genomic hybridization experiments in mind. A platform is, essentially, a list of probes that define what set of molecules may be detected. A sample describes the set of molecules that are being probed and references a single platform used to generate its molecular abundance data. A series organizes samples into the meaningful data sets which make up an experiment. The GEO repository is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.
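The relationship between the three entities described in this abstract can be sketched as a small data model. This is an illustrative sketch only, not GEO's actual schema: the class and field names, the accession strings, and the check that a sample's probes appear on its platform are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Platform:
    """A list of probes that defines which molecules may be detected."""
    accession: str      # hypothetical identifier, not a real GEO accession
    probes: list[str]


@dataclass
class Sample:
    """Abundance data for one probed set of molecules, tied to a single platform."""
    accession: str
    platform: Platform
    abundance: dict[str, float]  # probe ID -> measured abundance

    def __post_init__(self) -> None:
        # Assumed consistency rule: every measured probe must exist on the platform.
        unknown = set(self.abundance) - set(self.platform.probes)
        if unknown:
            raise ValueError(
                f"probes not on platform {self.platform.accession}: {sorted(unknown)}"
            )


@dataclass
class Series:
    """Groups samples into the data set that makes up one experiment."""
    accession: str
    samples: list[Sample] = field(default_factory=list)

    def platforms(self) -> set[str]:
        # A series may mix samples from several platforms.
        return {s.platform.accession for s in self.samples}


if __name__ == "__main__":
    array = Platform("PLATFORM_1", ["probe_a", "probe_b"])
    control = Sample("SAMPLE_1", array, {"probe_a": 1.2, "probe_b": 0.4})
    treated = Sample("SAMPLE_2", array, {"probe_a": 3.1, "probe_b": 0.5})
    experiment = Series("SERIES_1", [control, treated])
    print(experiment.platforms())
```

Each sample holds a reference to exactly one platform, while a series only aggregates samples, which mirrors the one-platform-per-sample rule stated in the abstract.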
Nucleic Acids Research | 2007
Tanya Barrett; Dennis B. Troup; Stephen E. Wilhite; Pierre Ledoux; Dmitry Rudnev; Carlos Evangelista; Irene F. Kim; Alexandra Soboleva; Maxim Tomashevsky; Ron Edgar
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community. The database has a minimum information about a microarray experiment (MIAME)-compliant infrastructure that captures fully annotated raw and processed data. Several data deposit options and formats are supported, including web forms, spreadsheets, XML and Simple Omnibus Format in Text (SOFT). In addition to data storage, a collection of user-friendly web-based interfaces and applications are available to help users effectively explore, visualize and download the thousands of experiments and tens of millions of gene expression patterns stored in GEO. This paper provides a summary of the GEO database structure and user facilities, and describes recent enhancements to database design, performance, submission format options, data query and retrieval utilities. GEO is accessible at
Nucleic Acids Research | 2004
Tanya Barrett; Tugba O. Suzek; Dennis B. Troup; Stephen E. Wilhite; Wing-Chi Ngau; Pierre Ledoux; Dmitry Rudnev; Alex E. Lash; Wataru Fujibuchi; Ron Edgar
The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest fully public repository for high-throughput molecular abundance data, primarily gene expression data. The database has a flexible and open design that allows the submission, storage and retrieval of many data types. These data include microarray-based experiments measuring the abundance of mRNA, genomic DNA and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. GEO currently holds over 30 000 submissions representing approximately half a billion individual molecular abundance measurements, for over 100 organisms. Here, we describe recent database developments that facilitate effective mining and visualization of these data. Features are provided to examine data from both experiment- and gene-centric perspectives using user-friendly Web-based interfaces accessible to those without computational or microarray-related analytical expertise. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.
Nucleic Acids Research | 2009
Tanya Barrett; Dennis B. Troup; Stephen E. Wilhite; Pierre Ledoux; Dmitry Rudnev; Carlos Evangelista; Irene F. Kim; Alexandra Soboleva; Maxim Tomashevsky; Kimberly A. Marshall; Katherine Phillippy; Patti M. Sherman; Rolf N. Muertter; Ron Edgar
The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data. Additionally, GEO hosts other categories of high-throughput functional genomic data, including those that examine genome copy number variations, chromatin structure, methylation status and transcription factor binding. These data are generated by the research community using high-throughput technologies like microarrays and, more recently, next-generation sequencing. The database has a flexible infrastructure that can capture fully annotated raw and processed data, enabling compliance with major community-derived scientific reporting standards such as ‘Minimum Information About a Microarray Experiment’ (MIAME). In addition to serving as a centralized data storage hub, GEO offers many tools and features that allow users to effectively explore, analyze and download expression data from both gene-centric and experiment-centric perspectives. This article summarizes the GEO repository structure, content and operating procedures, as well as recently introduced data mining features. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
Methods in Enzymology | 2006
Tanya Barrett; Ron Edgar
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information archives and freely distributes high-throughput molecular abundance data, predominantly gene expression data generated by DNA microarray technology. The database has a flexible design that can handle diverse styles of both unprocessed and processed data in a Minimum Information About a Microarray Experiment-supportive infrastructure that promotes fully annotated submissions. GEO currently stores about a billion individual gene expression measurements, derived from over 100 organisms, submitted by over 1500 laboratories, addressing a wide range of biological phenomena. To maximize the utility of these data, several user-friendly web-based interfaces and applications have been implemented that enable effective exploration, query, and visualization of these data at the level of individual genes or entire studies. This chapter describes how data are stored, submission procedures, and mechanisms for data retrieval and query. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.
Nature Biotechnology | 2006
Ron Edgar; Tanya Barrett
The Minimum Information About a Microarray Experiment (MIAME) guidelines are a data content document developed by the Microarray Gene Expression Data (MGED) Society that outlines the information that should be provided when describing a microarray experiment [1]. Many journals and funding agencies have adopted the guidelines, with the aim of facilitating access to the elements of a study that would enable independent evaluation of results. However, the MIAME requirements have been criticized recently [2, 3]. The criticism stems, in part, from different interpretations of the level of detail required to adequately report a microarray experiment, and debates as to whether there is a genuine benefit to making microarray data public. The Gene Expression Omnibus (GEO) database at the National Center for Biotechnology Information (NCBI) [4] and ArrayExpress at the European Bioinformatics Institute (EBI) [5] are the two major public databases of microarray data. Although they have different designs, both databases support capture of all data elements defined by MIAME. Figure 1 presents a timeline of major landmarks in the evolution of the GEO database, together with concomitant growth in submissions. GEO was launched in 2000, more than a year before the MIAME guidelines were proposed. Because there was not yet a consensus on reporting standards for microarray data, or even an obligation to make microarray data public, GEO initially allowed a minimal level of experimental detail to be supplied. Over the ensuing years we continually monitored the needs and requests of end-users, and gauged the level of effort submitters were realistically willing to invest in making their data public. We responded with incremental improvements to database design and curation standards, and we developed easy-to-generate batch deposit formats that significantly reduced the burden of submission and allowed contributors to focus on the content submitted rather than the mechanism of submission.
Figure 1. Timeline of GEO growth and major landmarks in evolution of GEO database, and a screenshot of GEO tools which allow users to query, analyze, and visualize the data in GEO.
In June 2005, we released major database revisions that included specific provisions for all MIAME data elements. In 2006, mechanisms for provision of raw data were further streamlined, and several MIAME elements that were previously optional became mandatory. Yet, even with these advances, it is still possible for a submitter to supply data that do not strictly adhere to the MIAME requirements. The difficulty lies in the fact that MIAME is a subjective set of guidelines where the level of detail to report is open to interpretation and, thus, cannot be unequivocally validated or enforced by computational means. All data submitted to GEO are syntactically validated for correct document structure, organization, and provision of basic elements. Next, each submission is inspected by curators for content integrity. GEO curators employ a pragmatic approach; we aim to ensure that sufficient information has been supplied to allow general interpretation of the experiment. Although encouraged, we have been less dogmatic with regards to provision of all-inclusive experimental protocols that would possibly permit practical replication of the entire experiment. Our reasoning is that provision of granulated experimental details adds a significant burden on the submitter, for (arguably) minimal real benefit for most end-users who are usually less concerned with this level of detail. When content or format problems are identified, curators work with the submitter until the issue is resolved. Submissions lacking critical descriptive elements necessary for overall experiment interpretation are not approved for public release.
However, given the large diversity of biological themes, technologies, and statistical transformations applied to microarray data, it is impractical for curators to decisively determine the accuracy and validity of the data, or to assess if all relevant information has been supplied. This is where the role of reviewers and editors becomes important. The GEO database has had mechanisms for anonymous reviewer access to prepublication data since 2003. Over the last several years, authors have occasionally requested curator comment regarding the level of MIAME-compliance of their submissions, and we have been happy to offer feedback on areas that could be improved. GEO staff are similarly available to support reviewers and editors by providing tailored inspections of MIAME compliance of specific submissions upon request of the journal, as ArrayExpress is proposing to do [6]. If a reviewer determines that insufficient information has been supplied, the GEO database is designed such that authors can quickly respond by updating their records accordingly. It has been challenging to find the optimal balance between submitter effort and the appropriate level of metadata detail to request, all within a rapidly evolving technological and social environment [7]. However, the relative simplicity of the GEO database structure, together with common-sense curation policies that focus on gathering germane MIAME elements, have made it possible for us to develop an extensive suite of utilities that make the volumes of complex data archived at GEO accessible and easy to use by the research community at large [8]. Ultimately, the value of a database is reflected by how it is used by the community it serves. In the past month, GEO received approximately one million query hits, and over 200,000 file transfer downloads amounting to over 2.5 terabytes of compressed data.
Furthermore, it is clear that researchers are applying these data to their own studies, as evidenced by over 100 recent publications citing data found in GEO to support or otherwise complement their own studies [9]. We view this as testament that the effort involved in making expression data public via GEO is fully justified.
Methods in Molecular Biology | 2006
Tanya Barrett; Ron Edgar
The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) has emerged as the leading fully public repository for gene expression data. This chapter describes how to use Web-based interfaces, applications, and graphics to effectively explore, visualize, and interpret the hundreds of microarray studies and millions of gene expression patterns stored in GEO. Data can be examined from both experiment-centric and gene-centric perspectives using user-friendly tools that do not require specialized expertise in microarray analysis or time-consuming download of massive data sets. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.
Nature Biotechnology | 2017
Yu-Feng Yvonne Chan; Pei Wang; Linda Rogers; Nicole Tignor; Micol Zweig; Steven Gregory Hershman; Nicholas Genes; Erick R. Scott; Eric Krock; Marcus A. Badgeley; Ron Edgar; Samantha Violante; Rosalind J. Wright; Charles A. Powell; Joel T. Dudley; Eric E. Schadt
The feasibility of using mobile health applications to conduct observational clinical studies requires rigorous validation. Here, we report initial findings from the Asthma Mobile Health Study, a research study, including recruitment, consent, and enrollment, conducted entirely remotely by smartphone. We achieved secure bidirectional data flow between investigators and 7,593 participants from across the United States, including many with severe asthma. Our platform enabled prospective collection of longitudinal, multidimensional data (e.g., surveys, devices, geolocation, and air quality) in a subset of users over the 6-month study period. Consistent trending and correlation of interrelated variables support the quality of data obtained via this method. We detected increased reporting of asthma symptoms in regions affected by heat, pollen, and wildfires. Potential challenges with this technology include selection bias, low retention rates, reporting bias, and data security. These issues require attention to realize the full potential of mobile platforms in research and patient care.
Nature Methods | 2008
Tanya Barrett; Ron Edgar
To the editor: Although your recent editorial [1], “An intelligently designed response”, was apposite, an important omission was apparent. Yes, debunking intelligent design (ID) by scientific reasoning requires good lay communication skills. Yes, merely (correctly) dismissing ID as nonsense will only fuel charges of scientific arrogance. And yes, the point about the nature of science has to be made because doing so makes palpable that ID is not science. However, the advice to avoid a religious discussion is questionable, particularly as so doing does not necessarily entail an atheistic rant. As well as emphasizing what ID is not, we also need to consider what it is. ID proponents eschew its association with literalist creationism but couple religious conservatism with a technology-friendly modernity. In the UK, we have a Christian organization, absurdly named ‘Truth in Science’, which has distributed glossy paraphernalia to the science departments of secondary schools and sixth form colleges, advocating ID inclusion in science lessons. Despite contravening the national curriculum, this marketing ploy has apparently proven effective in persuading a number of schools that it has scientific credentials. ID appeals to fundamentalists of other religions. Harun Yahya, the pseudonymous vehicle for Muslim creationist propaganda, has distributed a lavish, 800-page tome to schools and universities, scientists and museums in France and the US. Thus, referring to religion is both unavoidable and necessary to understand the strategy at work here. ID is nothing more than sexed-up creationism for the media age, a realization necessary for an effective refutation of its scientific posturing.