Publication


Featured research published by Matthew N. McCall.


Biostatistics | 2010

Frozen robust multiarray analysis (fRMA).

Matthew N. McCall; Benjamin M. Bolstad; Rafael A. Irizarry

Robust multiarray analysis (RMA) is the most widely used preprocessing algorithm for Affymetrix and Nimblegen gene expression microarrays. RMA performs background correction, normalization, and summarization in a modular way. The last 2 steps require multiple arrays to be analyzed simultaneously. The ability to borrow information across samples provides RMA various advantages. For example, the summarization step fits a parametric model that accounts for probe effects, assumed to be fixed across arrays, and improves outlier detection. Residuals, obtained from the fitted model, permit the creation of useful quality metrics. However, the dependence on multiple arrays has 2 drawbacks: (1) RMA cannot be used in clinical settings where samples must be processed individually or in small batches and (2) data sets preprocessed separately are not comparable. We propose a preprocessing algorithm, frozen RMA (fRMA), which allows one to analyze microarrays individually or in small batches and then combine the data for analysis. This is accomplished by utilizing information from the large publicly available microarray databases. In particular, estimates of probe-specific effects and variances are precomputed and frozen. Then, with new data sets, these are used in concert with information from the new arrays to normalize and summarize the data. We find that fRMA is comparable to RMA when the data are analyzed as a single batch and outperforms RMA when analyzing multiple batches. The methods described here are implemented in the R package fRMA and are currently available for download from the software section of http://rafalab.jhsph.edu.
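The idea of summarizing a single array with frozen estimates can be sketched as follows. This is a minimal illustration, not the fRMA package's implementation: the function name `frozen_summarize`, its arguments, and the choice of a plain inverse-variance weighted mean are assumptions made for the example, and the normalization step is omitted.

```python
def frozen_summarize(log2_intensities, frozen_probe_effects, frozen_variances):
    """Summarize one probeset on one array using precomputed (frozen) values.

    Hypothetical sketch: subtract each probe's frozen effect, then take a
    weighted mean with weights equal to the inverse of the frozen variances.
    """
    n = len(log2_intensities)
    if n == 0 or len(frozen_probe_effects) != n or len(frozen_variances) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(v <= 0 for v in frozen_variances):
        raise ValueError("frozen variances must be positive")

    # Remove the probe-specific effect estimated in advance from public data.
    corrected = [y - phi for y, phi in zip(log2_intensities, frozen_probe_effects)]

    # Noisier probes (larger frozen variance) get less weight.
    weights = [1.0 / v for v in frozen_variances]
    return sum(w * c for w, c in zip(weights, corrected)) / sum(weights)


if __name__ == "__main__":
    # Three probes of one probeset on a single new array (made-up numbers).
    print(frozen_summarize([8.0, 9.0, 7.5], [0.0, 1.0, -0.5], [1.0, 1.0, 1.0]))
```

Because the probe effects and variances are fixed in advance, the function needs only the intensities from the one array being processed, which is what makes single-array use possible.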


Nucleic Acids Research | 2011

The Gene Expression Barcode: leveraging public data repositories to begin cataloging the human and murine transcriptomes

Matthew N. McCall; Karan Uppal; Harris A. Jaffee; Michael J. Zilliox; Rafael A. Irizarry

Various databases have harnessed the wealth of publicly available microarray data to address biological questions ranging from across-tissue differential expression to homologous gene expression. Despite their practical value, these databases rely on relative measures of expression and are unable to address the most fundamental question—which genes are expressed in a given cell type. The Gene Expression Barcode is the first database to provide reliable absolute measures of expression for most annotated genes for 131 human and 89 mouse tissue types, including diseased tissue. This is made possible by a novel algorithm that leverages information from the GEO and ArrayExpress public repositories to build statistical models that permit converting data from a single microarray into expressed/unexpressed calls for each gene. For selected platforms, users may upload data and obtain results in a matter of seconds. The raw data, curated annotation, and code used to create our resource are also available at http://rafalab.jhsph.edu/barcode.
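Converting one array into expressed/unexpressed calls can be illustrated with a short sketch. It assumes, for illustration only, that each gene has a precomputed mean and standard deviation describing its intensity when unexpressed, and that a gene is called expressed when its z-score exceeds a cutoff. The function name `barcode_calls`, the parameter names, and the default cutoff are hypothetical and are not taken from the Barcode software.

```python
def barcode_calls(expression, unexpressed_mean, unexpressed_sd, z_cutoff=3.0):
    """Return a 1/0 (expressed/unexpressed) call per gene for a single array.

    Hypothetical sketch: a gene is called expressed when its intensity lies
    more than `z_cutoff` standard deviations above the precomputed mean of
    its unexpressed distribution.
    """
    calls = {}
    for gene, value in expression.items():
        z = (value - unexpressed_mean[gene]) / unexpressed_sd[gene]
        calls[gene] = 1 if z > z_cutoff else 0
    return calls


if __name__ == "__main__":
    # Made-up log2 intensities for two genes on one array.
    array = {"GENE_A": 10.0, "GENE_B": 4.2}
    mean = {"GENE_A": 4.0, "GENE_B": 4.0}
    sd = {"GENE_A": 0.5, "GENE_B": 0.5}
    print(barcode_calls(array, mean, sd))
```

The per-gene reference values are computed once from public data, so a newly uploaded array can be scored on its own.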


PLOS ONE | 2014

A critical evaluation of microRNA biomarkers in non-neoplastic disease.

Baqer A. Haider; Alexander S. Baras; Matthew N. McCall; Joshua A. Hertel; Toby C. Cornish; Marc K. Halushka

Background: MicroRNAs (miRNAs) are small (∼22-nt), stable RNAs that critically modulate post-transcriptional gene regulation. MicroRNAs can be found in the blood as components of serum, plasma and peripheral blood mononuclear cells (PBMCs). Many microRNAs have been reported to be specific biomarkers in a variety of non-neoplastic diseases. To date, no one has globally evaluated these proposed clinical biomarkers for general quality or disease specificity. We hypothesized that the cellular source of circulating microRNAs should correlate with cells involved in specific non-neoplastic disease processes. Appropriate cell expression data would inform on the quality and usefulness of each microRNA as a biomarker for specific diseases. We further hypothesized a useful clinical microRNA biomarker would have specificity to a single disease. Methods and Findings: We identified 416 microRNA biomarkers, of which 192 were unique, in 104 publications covering 57 diseases. One hundred and thirty-nine microRNAs (33%) represented biologically plausible biomarkers, corresponding to non-ubiquitous microRNAs expressed in disease-appropriate cell types. However, at a global level, many of these microRNAs were reported as “specific” biomarkers for two or more unrelated diseases with 6 microRNAs (miR-21, miR-16, miR-146a, miR-155, miR-126 and miR-223) being reported as biomarkers for 9 or more distinct diseases. Other biomarkers corresponded to common patterns of cellular injury, such as the liver-specific microRNA, miR-122, which was elevated in a disparate set of diseases that injure the liver primarily or secondarily including hepatitis B, hepatitis C, sepsis, and myocardial infarction. Conclusions: Only a subset of reported blood-based microRNA biomarkers have specificity for a particular disease. The remainder of the reported non-neoplastic biomarkers are either biologically implausible, non-specific, or uninterpretable due to limitations of our current understanding of microRNA expression.


Nucleic Acids Research | 2014

Lessons from miR-143/145: the importance of cell-type localization of miRNAs

Oliver A. Kent; Matthew N. McCall; Toby C. Cornish; Marc K. Halushka

miR-143 and miR-145 are co-expressed microRNAs (miRNAs) that have been extensively studied as potential tumor suppressors. These miRNAs are highly expressed in the colon and are consistently reported as being downregulated in colorectal and other cancers. Through regulation of multiple targets, they elicit potent effects on cancer cell growth and tumorigenesis. Importantly, a recent discovery demonstrates that miR-143 and miR-145 are not expressed in colonic epithelial cells; rather, these two miRNAs are highly expressed in mesenchymal cells such as fibroblasts and smooth muscle cells. The expression patterns of miR-143 and miR-145 and other miRNAs were initially determined from tissue level data without consideration that multiple different cell types, each with their own unique miRNA expression patterns, make up each tissue. Herein, we discuss the early reports on the identification of dysregulated miR-143 and miR-145 expression in colorectal cancer and how lack of consideration of cellular composition of normal tissue led to the misconception that these miRNAs are downregulated in cancer. We evaluate mechanistic data from miR-143/145 studies in context of their cell type-restricted expression pattern and the potential of these miRNAs to be considered tumor suppressors. Further, we examine other examples of miRNAs being investigated in inappropriate cell types modulating pathways in a non-biological fashion. Our review highlights the importance of determining the cellular expression pattern of each miRNA, so that downstream studies are conducted in the appropriate cell type.


BMC Medical Genomics | 2011

MicroRNA profiling of diverse endothelial cell types

Matthew N. McCall; Oliver A. Kent; Jianshi Yu; Karen Fox-Talbot; Ari Zaiman; Marc K. Halushka

Background: MicroRNAs are ~22-nt long regulatory RNAs that serve as critical modulators of post-transcriptional gene regulation. The diversity of miRNAs in endothelial cells (ECs) and the relationship of this diversity to epithelial and hematologic cells is unknown. We investigated the baseline miRNA signature of human ECs cultured from the aorta (HAEC), coronary artery (HCEC), umbilical vein (HUVEC), pulmonary artery (HPAEC), pulmonary microvasculature (HPMVEC), dermal microvasculature (HDMVEC), and brain microvasculature (HBMVEC) to understand the diversity of miRNA expression in ECs. Results: We identified 166 expressed miRNAs, of which 3 miRNAs (miR-99b, miR-20b and let-7b) differed significantly between EC types and predicted EC clustering. We confirmed the significance of these miRNAs by RT-PCR analysis and in a second data set by Sylamer analysis. We found wide diversity of miRNAs between endothelial, epithelial and hematologic cells with 99 miRNAs shared across cell types and 31 miRNAs unique to ECs. We show polycistronic miRNA chromosomal clusters have common expression levels within a given cell type. Conclusions: EC miRNA expression levels are generally consistent across EC types. Three microRNAs were variable within the dataset indicating potential regulatory changes that could impact on EC phenotypic differences. MiRNA expression in endothelial, epithelial and hematologic cells differentiate these cell types. This data establishes a valuable resource characterizing the diverse miRNA signature of ECs.


BMC Bioinformatics | 2011

Assessing Affymetrix GeneChip microarray quality.

Matthew N. McCall; Peter Murakami; Margus Lukk; Wolfgang Huber; Rafael A. Irizarry

Background: Microarray technology has become a widely used tool in the biological sciences. Over the past decade, the number of users has grown exponentially, and with the number of applications and secondary data analyses rapidly increasing, we expect this rate to continue. Various initiatives such as the External RNA Control Consortium (ERCC) and the MicroArray Quality Control (MAQC) project have explored ways to provide standards for the technology. For microarrays to become generally accepted as a reliable technology, statistical methods for assessing quality will be an indispensable component; however, there remains a lack of consensus in both defining and measuring microarray quality. Results: We begin by providing a precise definition of microarray quality and reviewing existing Affymetrix GeneChip quality metrics in light of this definition. We show that the best-performing metrics require multiple arrays to be assessed simultaneously. While such multi-array quality metrics are adequate for bench science, as microarrays begin to be used in clinical settings, single-array quality metrics will be indispensable. To this end, we define a single-array version of one of the best multi-array quality metrics and show that this metric performs as well as the best multi-array metrics. We then use this new quality metric to assess the quality of microarray data available via the Gene Expression Omnibus (GEO) using more than 22,000 Affymetrix HGU133a and HGU133plus2 arrays from 809 studies. Conclusions: We find that approximately 10 percent of these publicly available arrays are of poor quality. Moreover, the quality of microarray measurements varies greatly from hybridization to hybridization, study to study, and lab to lab, with some experiments producing unusable data. Many of the concepts described here are applicable to other high-throughput technologies.


Nucleic Acids Research | 2014

The Gene Expression Barcode 3.0: improved data processing and mining tools.

Matthew N. McCall; Harris A. Jaffee; Susan Zelisko; Neeraj Sinha; Guido Hooiveld; Rafael A. Irizarry; Michael J. Zilliox

The Gene Expression Barcode project, http://barcode.luhs.org, seeks to determine the genes expressed for every tissue and cell type in humans and mice. Understanding the absolute expression of genes across tissues and cell types has applications in basic cell biology, hypothesis generation for gene function and clinical predictions using gene expression signatures. In its current version, this project uses the abundant publicly available microarray data sets combined with a suite of single-array preprocessing, quality control and analysis methods. In this article, we present the improvements that have been made since the previous version of the Gene Expression Barcode in 2011. These include a variety of new data mining tools and summaries, estimated transcriptomes and curated annotations.


Bioinformatics | 2014

On non-detects in qPCR data

Matthew N. McCall; Helene McMurray; Hartmut Land; Anthony Almudevar

Motivation: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. Despite extensive research in qPCR laboratory protocols, normalization and statistical analysis, little attention has been given to qPCR non-detects, those reactions failing to produce a minimum amount of signal. Results: We show that the common methods of handling qPCR non-detects lead to biased inference. Furthermore, we show that non-detects do not represent data missing completely at random and likely represent missing data occurring not at random. We propose a model of the missing data mechanism and develop a method to directly model non-detects as missing data. Finally, we show that our approach results in a sizeable reduction in bias when estimating both absolute and differential gene expression. Availability and implementation: The proposed algorithm is implemented in the R package, nondetects. This package also contains the raw data for the three example datasets used in this manuscript. The package is freely available at http://mnmccall.com/software and as part of the Bioconductor project.
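A toy calculation shows why simple handling of non-detects can bias estimates. The sketch below assumes, purely for illustration, that a reaction becomes a non-detect whenever its true threshold cycle (Ct) exceeds a hard limit of 40 cycles; the paper instead models the missing-data mechanism, which this example does not attempt. The function name `mean_ct` and its parameters are hypothetical.

```python
def mean_ct(true_cts, cycle_limit=40.0, strategy="set_to_limit"):
    """Average Ct after applying a simple non-detect handling strategy.

    Hypothetical sketch: reactions with a true Ct above `cycle_limit` are
    treated as non-detects, then either replaced by the limit
    ("set_to_limit") or discarded ("drop").
    """
    observed = [ct if ct <= cycle_limit else None for ct in true_cts]
    if strategy == "set_to_limit":
        values = [cycle_limit if ct is None else ct for ct in observed]
    elif strategy == "drop":
        values = [ct for ct in observed if ct is not None]
    else:
        raise ValueError("strategy must be 'set_to_limit' or 'drop'")
    if not values:
        raise ValueError("no values left to average")
    return sum(values) / len(values)


if __name__ == "__main__":
    true_cts = [37.0, 39.0, 41.0, 43.0]  # made-up replicates
    print("true mean Ct:", sum(true_cts) / len(true_cts))
    print("non-detects set to limit:", mean_ct(true_cts, strategy="set_to_limit"))
    print("non-detects dropped:", mean_ct(true_cts, strategy="drop"))
```

Under this assumption both strategies return a mean Ct below the true mean, which corresponds to overstating expression, because the values that go missing are systematically the high-Ct ones.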


American Journal of Transplantation | 2012

Micro RNA Expression Profiles as Adjunctive Data to Assess the Risk of Hepatocellular Carcinoma Recurrence After Liver Transplantation

Christopher T. Barry; M. D'Souza; Matthew N. McCall; Saman Safadjou; Charlotte K. Ryan; Randeep Kashyap; C.E. Marroquin; Mark S. Orloff; Anthony Almudevar; T. E. Godfrey

Donor livers are precious resources and it is, therefore, ethically imperative that we employ optimally sensitive and specific transplant selection criteria. Current selection criteria, the Milan criteria, for liver transplant candidates with hepatocellular carcinoma (HCC) are primarily based on radiographic characteristics of the tumor. Although the Milan criteria result in reasonably high survival and low recurrence rates, they do not assess an individual patient's tumor biology and recurrence risk. Consequently, it is difficult to predict on an individual basis the risk for recurrent disease. To address this, we employed microarray profiling of microRNA (miRNA) expression from formalin-fixed paraffin-embedded tissues to define a biomarker that distinguishes between patients with and without HCC recurrence after liver transplant. In our cohort of 64 patients, this biomarker outperforms the Milan criteria in that it identifies patients outside of Milan who did not have recurrent disease and patients within Milan who had recurrence. We also describe a method to account for multifocal tumors in biomarker signature discovery.


Nucleic Acids Research | 2008

Consolidated strategy for the analysis of microarray spike-in data

Matthew N. McCall; Rafael A. Irizarry

As the number of users of microarray technology continues to grow, so does the importance of platform assessments and comparisons. Spike-in experiments have been successfully used for internal technology assessments by microarray manufacturers and for comparisons of competing data analysis approaches. The microarray literature is saturated with statistical assessments based on spike-in experiment data. Unfortunately, the statistical assessments vary widely and are applicable only in specific cases. This has introduced confusion into the debate over which platforms, protocols and data analysis tools are best. Furthermore, cross-platform comparisons have proven difficult because reported concentrations are not comparable. In this article, we introduce two new spike-in experiments, present a novel statistical solution that enables cross-platform comparisons, and propose a comprehensive procedure for assessments based on spike-in experiments. The ideas are implemented in a user-friendly Bioconductor package: spkTools. We demonstrate the utility of our tools by presenting the first spike-in-based comparison of the three major platforms: Affymetrix, Agilent and Illumina.

Collaboration

Top Co-Authors

Anthony Almudevar
University of Rochester Medical Center

Helene McMurray
University of Rochester Medical Center

Hartmut Land
University of Rochester Medical Center

Oliver A. Kent
Johns Hopkins University

Toby C. Cornish
Johns Hopkins University School of Medicine