Publication


Featured research published by Paul Marjoram.


Nature | 2010

Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines

Susanna Atwell; Yu S. Huang; Bjarni J. Vilhjálmsson; Glenda Willems; Matthew Horton; Yan Li; Dazhe Meng; Alexander Platt; Aaron M. Tarone; Tina T. Hu; Rong Jiang; N. Wayan Muliyati; Xu Zhang; Muhammad Ali Amer; Ivan Baxter; Benjamin Brachi; Joanne Chory; Caroline Dean; Marilyne Debieu; Juliette de Meaux; Joseph R. Ecker; Nathalie Faure; Joel M. Kniskern; Jonathan D. G. Jones; Todd P. Michael; Adnane Nemri; Fabrice Roux; David E. Salt; Chunlao Tang; Marco Todesco

Although pioneered by human geneticists as a potential solution to the challenging problem of finding the genetic basis of common human diseases, genome-wide association (GWA) studies have, owing to advances in genotyping and sequencing technology, become an obvious general approach for studying the genetics of natural variation and traits of agricultural importance. They are particularly useful when inbred lines are available, because once these lines have been genotyped they can be phenotyped multiple times, making it possible (as well as extremely cost effective) to study many different traits in many different environments, while replicating the phenotypic measurements to reduce environmental noise. Here we demonstrate the power of this approach by carrying out a GWA study of 107 phenotypes in Arabidopsis thaliana, a widely distributed, predominantly self-fertilizing model plant known to harbour considerable genetic variation for many adaptively important traits. Our results are dramatically different from those of human GWA studies, in that we identify many common alleles of major effect, but they are also, in many cases, harder to interpret because confounding by complex genetics and population structure makes it difficult to distinguish true associations from false. However, a priori candidates are significantly over-represented among these associations as well, making many of them excellent candidates for follow-up experiments. Our study demonstrates the feasibility of GWA studies in A. thaliana and suggests that the approach will be appropriate for many other organisms.


Proceedings of the National Academy of Sciences of the United States of America | 2003

Markov chain Monte Carlo without likelihoods

Paul Marjoram; John Molitor; Vincent Plagnol; Simon Tavaré

Many stochastic simulation approaches for generating observations from a posterior distribution depend on knowing a likelihood function. However, for many complex probability models, such likelihoods are either impossible or computationally prohibitive to obtain. Here we present a Markov chain Monte Carlo method for generating observations from a posterior distribution without the use of likelihoods. It can also be used in frequentist applications, in particular for maximum-likelihood estimation. The approach is illustrated by an example of ancestral inference in population genetics. A number of open problems are highlighted in the discussion.
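The idea in the abstract above can be illustrated with a toy example. The following is a minimal sketch of a likelihood-free Metropolis-style sampler, not the authors' implementation: the likelihood is never evaluated, and a proposal is accepted only if data simulated under it land close to the observed data. Everything specific here is an assumption made for illustration: the Normal(theta, 1) model, the sample mean as summary statistic, the uniform prior, the symmetric random-walk proposal, and the names `simulate_mean` and `abc_mcmc`.

```python
import random


def simulate_mean(theta, n, rng):
    """Simulate n draws from Normal(theta, 1) and return their sample mean."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def abc_mcmc(observed_mean, n, steps, epsilon, step_sd,
             prior_low, prior_high, theta0, seed=0):
    """Toy likelihood-free MCMC for the mean of a Normal(theta, 1) model.

    Assumptions (illustrative only): uniform prior on [prior_low, prior_high]
    and a symmetric Gaussian random-walk proposal, so the prior/proposal
    ratio is 1 inside the prior support and 0 outside it.
    """
    rng = random.Random(seed)
    theta = theta0
    chain = []
    for _ in range(steps):
        proposal = theta + rng.gauss(0.0, step_sd)
        # Reject immediately if the proposal has zero prior density.
        if prior_low <= proposal <= prior_high:
            # Simulate a data set under the proposal instead of
            # evaluating a likelihood.
            simulated = simulate_mean(proposal, n, rng)
            # Move only if the simulated summary is within epsilon
            # of the observed summary.
            if abs(simulated - observed_mean) <= epsilon:
                theta = proposal
        chain.append(theta)
    return chain


if __name__ == "__main__":
    chain = abc_mcmc(observed_mean=2.0, n=20, steps=5000, epsilon=0.2,
                     step_sd=0.5, prior_low=-10.0, prior_high=10.0,
                     theta0=2.0, seed=1)
    kept = chain[1000:]  # discard an arbitrary burn-in
    print("approximate posterior mean:", sum(kept) / len(kept))
```

Smaller values of `epsilon` give a closer approximation to the true posterior but lower the acceptance rate, so the tolerance is a trade-off between accuracy and run time.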


PLOS Genetics | 2005

An Arabidopsis Example of Association Mapping in Structured Samples

Keyan Zhao; Maria Jose Aranzana; Sung Kim; Clare Lister; Chikako Shindo; Chunlao Tang; Christopher Toomajian; Honggang Zheng; Caroline Dean; Paul Marjoram; Magnus Nordborg

A potentially serious disadvantage of association mapping is the fact that marker-trait associations may arise from confounding population structure as well as from linkage to causative polymorphisms. Using genome-wide marker data, we have previously demonstrated that the problem can be severe in a global sample of 95 Arabidopsis thaliana accessions, and that established methods for controlling for population structure are generally insufficient. Here, we use the same sample together with a number of flowering-related phenotypes and data-perturbation simulations to evaluate a wider range of methods for controlling for population structure. We find that, in terms of reducing the false-positive rate while maintaining statistical power, a recently introduced mixed-model approach that takes genome-wide differences in relatedness into account via estimated pairwise kinship coefficients generally performs best. By combining the association results with results from linkage mapping in F2 crosses, we identify one previously known true positive and several promising new associations, but also demonstrate the existence of both false positives and false negatives. Our results illustrate the potential of genome-wide association scans as a tool for dissecting the genetics of natural variation, while at the same time highlighting the pitfalls. The importance of study design is clear; our study is severely under-powered both in terms of sample size and marker density. Our results also provide a striking demonstration of confounding by population structure. While statistical methods can be used to ameliorate this problem, they cannot always be effective and are certainly not a substitute for independent evidence, such as that obtained via crosses or transgenic experiments. Ultimately, association mapping is a powerful tool for identifying a list of candidates that is short enough to permit further genetic study.


PLOS Genetics | 2005

Genome-Wide Association Mapping in Arabidopsis Identifies Previously Known Flowering Time and Pathogen Resistance Genes

Maria Jose Aranzana; Sung Kim; Keyan Zhao; Erica G. Bakker; Matthew Horton; Katrin Jakob; Clare Lister; John Molitor; Chikako Shindo; Chunlao Tang; Christopher Toomajian; Brian Traw; Honggang Zheng; Joy Bergelson; Caroline Dean; Paul Marjoram; Magnus Nordborg

There is currently tremendous interest in the possibility of using genome-wide association mapping to identify genes responsible for natural variation, particularly for human disease susceptibility. The model plant Arabidopsis thaliana is in many ways an ideal candidate for such studies, because it is a highly selfing hermaphrodite. As a result, the species largely exists as a collection of naturally occurring inbred lines, or accessions, which can be genotyped once and phenotyped repeatedly. Furthermore, linkage disequilibrium in such a species will be much more extensive than in a comparable outcrossing species. We tested the feasibility of genome-wide association mapping in A. thaliana by searching for associations with flowering time and pathogen resistance in a sample of 95 accessions for which genome-wide polymorphism data were available. In spite of an extremely high rate of false positives due to population structure, we were able to identify known major genes for all phenotypes tested, thus demonstrating the potential of genome-wide association mapping in A. thaliana and other species with similar patterns of variation. The rate of false positives differed strongly between traits, with more clinal traits showing the highest rate. However, the false positive rates were always substantial regardless of the trait, highlighting the necessity of an appropriate genomic control in association studies.


Journal of Computational Biology | 1996

Ancestral inference from samples of DNA sequences with recombination

R. C. Griffiths; Paul Marjoram

The sampling distribution of a collection of DNA sequences is studied under a model where recombination can occur in the ancestry of the sequences. The infinitely-many-sites model of mutation is assumed where there may only be one mutation at a given site. Ancestral inference procedures are discussed for: estimating recombination and mutation rates; estimating the times to the most recent common ancestors along the sequences; estimating ages of mutations; and estimating the number of recombination events in the ancestry of the sample. Inferences are made conditional on the configuration of the pattern of mutations at sites in observed sample sequences. A computational algorithm based on a Markov chain simulation is developed, implemented, and illustrated with examples for these inference procedures. This algorithm is very computationally intensive.


Genome Research | 2008

Fast and flexible simulation of DNA sequence data

Gary K. Chen; Paul Marjoram; Jeffrey D. Wall

Simulation of genomic sequences under the coalescent with recombination has conventionally been impractical for regions beyond tens of megabases. This work presents an algorithm, implemented as the program MaCS (Markovian Coalescent Simulator), that can efficiently simulate haplotypes under any arbitrary model of population history. We present several metrics comparing the performance of MaCS with other available simulation programs. Practical usage of MaCS is demonstrated through a comparison of measures of linkage disequilibrium between generated program output and real genotype data from populations considered to be structured.


Nature Reviews Genetics | 2006

Modern computational approaches for analysing molecular genetic variation data

Paul Marjoram; Simon Tavaré

An explosive growth is occurring in the quantity, quality and complexity of molecular variation data that are being collected. Historically, such data have been analysed by using model-based methods. Models are useful for sharpening intuition, for explanation and for prediction: they add to our understanding of how the data were formed, and they can provide quantitative answers to questions of interest. We outline some of these model-based approaches, including the coalescent, and discuss the applicability of the computational methods that are necessary given the highly complex nature of current and future data sets.
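As a small illustration of the coalescent mentioned above, the sketch below simulates the time to the most recent common ancestor (TMRCA) of a sample under the standard neutral coalescent without recombination. It is not taken from the article. It assumes time is measured in coalescent units in which each pair of lineages merges at rate 1, so that with k lineages the waiting time to the next merger is exponential with rate k(k-1)/2. The function names are hypothetical.

```python
import random


def simulate_tmrca(n, rng):
    """Simulate the TMRCA of n sampled lineages under the standard coalescent.

    With k lineages remaining, the time to the next coalescence is
    exponential with rate k * (k - 1) / 2 (coalescent time units).
    """
    t = 0.0
    k = n
    while k > 1:
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)
        k -= 1  # two lineages merge into one
    return t


def expected_tmrca(n):
    """Closed-form expectation of the TMRCA: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)


if __name__ == "__main__":
    rng = random.Random(42)
    reps = 20000
    mean_t = sum(simulate_tmrca(10, rng) for _ in range(reps)) / reps
    print("simulated mean TMRCA:", mean_t)
    print("expected TMRCA:", expected_tmrca(10))
```

Comparing the simulated mean with the closed-form expectation is a quick check that the simulator behaves as the model predicts.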


BMC Genetics | 2006

Fast "coalescent" simulation

Paul Marjoram; Jeffrey D. Wall

Background: The amount of genome-wide molecular data is increasing rapidly, as is interest in developing methods appropriate for such data. There is a consequent increasing need for methods that are able to efficiently simulate such data. In this paper we implement the sequentially Markovian coalescent algorithm described by McVean and Cardin and present a further modification to that algorithm which slightly improves the closeness of the approximation to the full coalescent model. The algorithm ignores a class of recombination events known to affect the behavior of the genealogy of the sample, but which do not appear to affect the behavior of generated samples to any substantial degree. Results: We show that our software is able to simulate large chromosomal regions, such as those appropriate in a consideration of genome-wide data, in a way that is several orders of magnitude faster than existing coalescent algorithms. Conclusion: This algorithm provides a useful resource for those needing to simulate large quantities of data for chromosomal-length regions using an approach that is much more efficient than traditional coalescent models.


Proceedings of the National Academy of Sciences of the United States of America | 2009

Inferring clonal expansion and cancer stem cell dynamics from DNA methylation patterns in colorectal cancers

Kimberly D. Siegmund; Paul Marjoram; Yen-Jung Woo; Simon Tavaré; Darryl Shibata

Cancers are clonal expansions, but how a single, transformed human cell grows into a billion-cell tumor is uncertain because serial observations are impractical. Potentially, this history is surreptitiously recorded within genomes that become increasingly numerous, polymorphic, and physically separated after transformation. To correlate physical with epigenetic pairwise distances, small 2,000- to 10,000-cell gland fragments were sampled from left and right sides of 12 primary colorectal cancers, and passenger methylation at 2 CpG-rich regions was measured by bisulfite sequencing. Methylation patterns were polymorphic but differences were similar between different parts of the same tumor, consistent with relatively isotropic or “flat” clonal expansions that could be simulated by rapid initial population expansions. Methylation patterns were too diverse to be consistent with very rare cancer stem cells but were more consistent with multiple (≈4 to 1,000) long-lived cancer stem cell lineages per cancer gland. Our study illustrates the potential to reconstruct the unperturbed biology of human cancers from epigenetic passenger variations in their present-day genomes.


Oncogene | 2004

A multigene expression panel for the molecular diagnosis of Barrett's esophagus and Barrett's adenocarcinoma of the esophagus

Jan Brabender; Paul Marjoram; Dennis Salonga; Ralf Metzger; Paul M. Schneider; Ji Min Park; Sylke Schneider; Arnulf H. Hölscher; Jing Yin; Stephen J. Meltzer; Kathleen D. Danenberg; Peter V. Danenberg; Reginald V. Lord

In order to identify genes or combinations of genes that have the power to discriminate between premalignant Barrett's esophagus and Barrett's-associated adenocarcinoma, we analysed a panel of 23 genes using quantitative real-time RT–PCR (qRT–PCR, Taqman®) and bioinformatic tools. The genes chosen were either known to be associated with Barrett's carcinogenesis or were filtered from a previous cDNA microarray study on Barrett's adenocarcinoma. A total of 98 tissues, obtained from 19 patients with Barrett's esophagus (BE group) and 20 patients with Barrett's-associated esophageal adenocarcinoma (EA group), were studied. Triplicate analysis for the full panel of 23 genes of interest, and analysis of an internal control gene, was performed for all samples, for a total of more than 9016 single PCR reactions. We found distinct classes of gene expression patterns in the different types of tissues. The most informative genes clustered in six different classes and had significantly different expression levels in Barrett's esophagus tissues compared to adenocarcinoma tissues. Linear discriminant analysis (LDA) distinguished four genetically different groups. The normal squamous esophagus tissues from patients with BE or EA were not distinguishable from one another, but Barrett's esophagus tissues could be distinguished from adenocarcinoma tissues. Using the most informative genes, obtained from a logistic regression analysis, we were able to completely distinguish between benign Barrett's and Barrett's adenocarcinomas. This study provides the first non-array parallel mRNA quantitation analysis of a panel of genes in the Barrett's esophagus model of multistage carcinogenesis. Our results suggest that mRNA expression quantitation of a panel of genes can discriminate between premalignant and malignant Barrett's disease. Logistic regression and LDA can be used to further identify, from the complete panel, gene subsets with the power to make these diagnostic distinctions. Expression analysis of a limited number of highly selected genes may have clinical usefulness for the treatment of patients with this disease.

Collaboration


Dive into Paul Marjoram's collaborations.

Top Co-Authors

John Molitor, University of Southern California

Kimberly D. Siegmund, University of Southern California

Darryl Shibata, University of Southern California

Magnus Nordborg, Austrian Academy of Sciences

Sergey V. Nuzhdin, University of Southern California

Rohit Varma, University of Southern California

Boris Z. Simkhovich, University of Southern California

David V. Conti, University of Southern California