Carl F. Schaefer | Researchain

Publication

Featured research published by Carl F. Schaefer.

Proceedings of the National Academy of Sciences of the United States of America | 2002

Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences.

Robert L. Strausberg; Elise A. Feingold; Lynette H. Grouse; Jeffery G. Derge; Richard D. Klausner; Francis S. Collins; Lukas Wagner; Carolyn M. Shenmen; Gregory D. Schuler; Stephen F. Altschul; Barry R. Zeeberg; Kenneth H. Buetow; Carl F. Schaefer; Narayan K. Bhat; Ralph F. Hopkins; Heather Jordan; Troy Moore; Steve I. Max; Jun Wang; Florence Hsieh; Luda Diatchenko; Kate Marusina; Andrew A. Farmer; Gerald M. Rubin; Ling Hong; Mark Stapleton; M. Bento Soares; Maria F. Bonaldo; Tom L. Casavant; Todd E. Scheetz

The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).

Nucleic Acids Research | 2009

PID: the Pathway Interaction Database

Carl F. Schaefer; Kira Anthony; Shiva Krupa; Jeffrey R. Buchoff; Matthew Day; Timo Hannay; Kenneth H. Buetow

The Pathway Interaction Database (PID, http://pid.nci.nih.gov) is a freely available collection of curated and peer-reviewed pathways composed of human molecular signaling and regulatory events and key cellular processes. Created in a collaboration between the US National Cancer Institute and Nature Publishing Group, the database serves as a research tool for the cancer research community and others interested in cellular pathways, such as neuroscientists, developmental biologists and immunologists. PID offers a range of search features to facilitate pathway exploration. Users can browse the predefined set of pathways or create interaction network maps centered on a single molecule or cellular process of interest. In addition, the batch query tool allows users to upload long list(s) of molecules, such as those derived from microarray experiments, and either overlay these molecules onto predefined pathways or visualize the complete molecular connectivity map. Users can also download molecule lists, citation lists and complete database content in extensible markup language (XML) and Biological Pathways Exchange (BioPAX) Level 2 format. The database is updated with new pathway content every month and supplemented by specially commissioned articles on the practical uses of other relevant online tools.

Nature Biotechnology | 2010

The BioPAX community standard for pathway data sharing

Emek Demir; Michael P. Cary; Suzanne M. Paley; Ken Fukuda; Christian Lemer; Imre Vastrik; Guanming Wu; Peter D'Eustachio; Carl F. Schaefer; Joanne S. Luciano; Frank Schacherer; Irma Martínez-Flores; Zhenjun Hu; Verónica Jiménez-Jacinto; Geeta Joshi-Tope; Kumaran Kandasamy; Alejandra López-Fuentes; Huaiyu Mi; Elgar Pichler; Igor Rodchenkov; Andrea Splendiani; Sasha Tkachev; Jeremy Zucker; Gopal Gopinath; Harsha Rajasimha; Ranjani Ramakrishnan; Imran Shah; Mustafa Syed; Nadia Anwar; Özgün Babur

Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. The rapid growth of the volume of pathway data has spurred the development of databases and computational tools to aid interpretation; however, use of these data is hampered by the current fragmentation of pathway information across many databases with incompatible formats. BioPAX, which was created through a community process, solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. Using BioPAX, millions of interactions, organized into thousands of pathways, from many organisms are available from a growing number of databases. This large amount of pathway data in a computable form will support visualization, analysis and biological discovery.

Proceedings of the National Academy of Sciences of the United States of America | 2002

An anatomy of normal and malignant gene expression

Kathy Boon; Elisson Osório; Susan F. Greenhut; Carl F. Schaefer; Jennifer Shoemaker; Kornelia Polyak; Patrice J. Morin; Kenneth H. Buetow; Robert L. Strausberg; Sandro J. de Souza; Gregory J. Riggins

A gene's expression pattern provides clues to its role in normal physiology and disease. To provide quantitative expression levels on a genome-wide scale, the Cancer Genome Anatomy Project (CGAP) uses serial analysis of gene expression (SAGE). Over 5 million transcript tags from more than 100 human cell types have been assembled. To enhance the utility of this data, the CGAP SAGE project created SAGE Genie, a web site for the analysis and presentation of SAGE data (http://cgap.nci.nih.gov/SAGE). SAGE Genie provides an automatic link between gene names and SAGE transcript levels, accounting for alternative transcription and many potential errors. These informatics advances provide a rapid and intuitive view of transcript expression in the human body or brain, displayed on the SAGE Anatomic Viewer. We report here an easily accessible view of nearly any gene's expression in a wide variety of malignant and normal tissues.

Bioinformatics | 2003

caCORE: A common infrastructure for cancer informatics

Peter A. Covitz; Frank W. Hartel; Carl F. Schaefer; Sherri de Coronado; Gilberto Fragoso; Himanso Sahni; Scott Gustafson; Kenneth H. Buetow

MOTIVATION: Sites with substantive bioinformatics operations are challenged to build data processing and delivery infrastructure that provides reliable access and enables data integration. Locally generated data must be processed and stored such that relationships to external data sources can be presented. Consistency and comparability across data sets requires annotation with controlled vocabularies and, further, metadata standards for data representation. Programmatic access to the processed data should be supported to ensure the maximum possible value is extracted. Confronted with these challenges at the National Cancer Institute Center for Bioinformatics, we decided to develop a robust infrastructure for data management and integration that supports advanced biomedical applications.
RESULTS: We have developed an interconnected set of software and services called caCORE. Enterprise Vocabulary Services (EVS) provide controlled vocabulary, dictionary and thesaurus services. The Cancer Data Standards Repository (caDSR) provides a metadata registry for common data elements. Cancer Bioinformatics Infrastructure Objects (caBIO) implements an object-oriented model of the biomedical domain and provides Java, Simple Object Access Protocol and HTTP-XML application programming interfaces. caCORE has been used to develop scientific applications that bring together data from distinct genomic and clinical science sources.
AVAILABILITY: caCORE downloads and web interfaces can be accessed from links on the caCORE web site (http://ncicb.nci.nih.gov/core). caBIO software is distributed under an open source license that permits unrestricted academic and commercial use. Vocabulary and metadata content in the EVS and caDSR, respectively, is similarly unrestricted, and is available through web applications and FTP downloads.
SUPPLEMENTARY INFORMATION: http://ncicb.nci.nih.gov/core/publications contains links to the caBIO 1.0 class diagram and the caCORE 1.0 Technical Guide, which provide detailed information on the present caCORE architecture, data sources and APIs. Updated information appears on a regular basis on the caCORE web site (http://ncicb.nci.nih.gov/core).

PLOS ONE | 2007

Identification of Key Processes Underlying Cancer Phenotypes Using Biologic Pathway Analysis

Sol Efroni; Carl F. Schaefer; Kenneth H. Buetow

Cancer is recognized to be a family of gene-based diseases whose causes are to be found in disruptions of basic biologic processes. An increasingly deep catalogue of canonical networks details the specific molecular interaction of genes and their products. However, mapping of disease phenotypes to alterations of these networks of interactions is accomplished indirectly and non-systematically. Here we objectively identify pathways associated with malignancy, staging, and outcome in cancer through application of an analytic approach that systematically evaluates differences in the activity and consistency of interactions within canonical biologic processes. Using large collections of publicly accessible genome-wide gene expression, we identify small, common sets of pathways – Trka Receptor, Apoptosis response to DNA Damage, Ceramide, Telomerase, CD40L and Calcineurin – whose differences robustly distinguish diverse tumor types from corresponding normal samples, predict tumor grade, and distinguish phenotypes such as estrogen receptor status and p53 mutation state. Pathways identified through this analysis perform as well or better than phenotypes used in the original studies in predicting cancer outcome. This approach provides a means to use genome-wide characterizations to map key biological processes to important clinical features in disease.

Clinical Proteomics | 2004

Analysis of the human serum proteome

King C. Chan; David A. Lucas; Denise Hise; Carl F. Schaefer; Zhen Xiao; George M. Janini; Kenneth H. Buetow; Haleem J. Issaq; Timothy D. Veenstra; Thomas P. Conrads

Changes in serum proteins that signal histopathological states, such as cancer, are useful diagnostic and prognostic biomarkers. Unfortunately, the large dynamic concentration range of proteins in serum makes it a challenging proteome to effectively characterize. Typically, methods to deplete highly abundant proteins to decrease this dynamic protein concentration range are employed, yet such depletion results in removal of important low abundant proteins. A multi-dimensional peptide separation strategy utilizing conventional separation techniques combined with tandem mass spectrometry (MS/MS) was employed for a proteome analysis of human serum. Serum proteins were digested with trypsin and resolved into 20 fractions by ampholyte-free liquid phase isoelectric focusing. These 20 peptide fractions were further fractionated by strong cation-exchange chromatography, each of which was analyzed by microcapillary reversed-phase liquid chromatography coupled online with MS/MS analysis. This investigation resulted in the identification of 1444 unique proteins in serum. Proteins from all functional classes, cellular localization, and abundance levels were identified. This study illustrates that a majority of lower abundance proteins identified in serum are present as secreted or shed species by cells as a result of signalling, necrosis, apoptosis, and hemolysis. These findings show that the protein content of serum is quite reflective of the overall profile of the human organism and a conventional multidimensional fractionation strategy combined with MS/MS is entirely capable of characterizing a significant fraction of the serum proteome. We have constructed a publicly available human serum proteomic database (http://bpp.nci.nih.gov) to provide a reference resource to facilitate future investigations of the vast archive of pathophysiological content in serum.

Cancer Research | 2006

Cancers as Wounds that Do Not Heal: Differences and Similarities between Renal Regeneration/Repair and Renal Cell Carcinoma

Joseph Riss; Chand Khanna; Seongjoon Koo; Gadisetti V.R. Chandramouli; Howard H. Yang; Ying Hu; David E. Kleiner; Andreas Rosenwald; Carl F. Schaefer; Shmuel A. Ben-Sasson; Liming Yang; John Powell; David W. Kane; Robert A. Star; Olga Aprelikova; Kristin Bauer; James R. Vasselli; Jodi K. Maranchie; Kurt W. Kohn; Kenneth H. Buetow; W. Marston Linehan; John N. Weinstein; Maxwell P. Lee; Richard D. Klausner; J. Carl Barrett

Cancers have been described as wounds that do not heal, suggesting that the two share common features. By comparing microarray data from a model of renal regeneration and repair (RRR) with reported gene expression in renal cell carcinoma (RCC), we asked whether those two processes do, in fact, share molecular features and regulatory mechanisms. The majority (77%) of the genes expressed in RRR and RCC were concordantly regulated, whereas only 23% were discordant (i.e., changed in opposite directions). The orchestrated processes of regeneration, involving cell proliferation and immune response, were reflected in the concordant genes. The discordant gene signature revealed processes (e.g., morphogenesis and glycolysis) and pathways (e.g., hypoxia-inducible factor and insulin-like growth factor-I) that reflect the intrinsic pathologic nature of RCC. This is the first study that compares gene expression patterns in RCC and RRR. It does so, in particular, with relation to the hypothesis that RCC resembles the wound healing processes seen in RRR. However, careful attention to the genes that are regulated in the discordant direction provides new insights into the critical differences between renal carcinogenesis and wound healing. The observations reported here provide a conceptual framework for further efforts to understand the biology and to develop more effective diagnostic biomarkers and therapeutic strategies for renal tumors and renal ischemia.

Trends in Cell Biology | 2001

In silico analysis of cancer through the Cancer Genome Anatomy Project

Robert L. Strausberg; Susan F. Greenhut; Lynette H. Grouse; Carl F. Schaefer; Kenneth H. Buetow

The Cancer Genome Anatomy Project (CGAP) was designed and implemented to provide public datasets, material resources and informatics tools to serve as a platform to support the elucidation of the molecular signatures of cancer. This overview of CGAP describes the status of this effort to develop resources based on gene expression, polymorphism identification and chromosome aberrations, and we describe a variety of analytical tools designed to facilitate in silico analysis of these datasets.

PLOS Genetics | 2009

Needles in the Haystack: Identifying individuals present in pooled genomic data

Rosemary Braun; William Rowe; Carl F. Schaefer; Jinghui Zhang; Kenneth H. Buetow

Recent publications have described and applied a novel metric that quantifies the genetic distance of an individual with respect to two population samples, and have suggested that the metric makes it possible to infer the presence of an individual of known genotype in a sample for which only the marginal allele frequencies are known. However, the assumptions, limitations, and utility of this metric remained incompletely characterized. Here we present empirical tests of the method using publicly accessible genotypes, as well as analytical investigations of the method's strengths and limitations. The results reveal that the null distribution is sensitive to the underlying assumptions, making it difficult to accurately calibrate thresholds for classifying an individual as a member of the population samples. As a result, the false-positive rates obtained in practice are considerably higher than previously believed. However, despite the metric's inadequacies for identifying the presence of an individual in a sample, our results suggest potential avenues for future research on tuning this method to problems of ancestry inference or disease prediction. By revealing both the strengths and limitations of the proposed method, we hope to elucidate situations in which this distance metric may be used in an appropriate manner. We also discuss the implications of our findings in forensics applications and in the protection of GWAS participant privacy.
