Christopher Yau
Wellcome Trust Centre for Human Genetics
Publications
Featured research published by Christopher Yau.
Nucleic Acids Research | 2007
Stefano Colella; Christopher Yau; Jennifer M. Taylor; Ghazala Mirza; Helen Butler; Penny Clouston; Anne S. Bassett; Anneke Seller; Christopher Holmes; Jiannis Ragoussis
Array-based technologies have been used to detect chromosomal copy number changes (aneuploidies) in the human genome. Recent studies identified numerous copy number variants (CNV) and some are common polymorphisms that may contribute to disease susceptibility. We developed, and experimentally validated, a novel computational framework (QuantiSNP) for detecting regions of copy number variation from BeadArray™ SNP genotyping data using an Objective Bayes Hidden-Markov Model (OB-HMM). Objective Bayes measures are used to set certain hyperparameters in the priors using a novel re-sampling framework to calibrate the model to a fixed Type I (false positive) error rate. Other parameters are set via maximum marginal likelihood to prior training data of known structure. QuantiSNP provides probabilistic quantification of state classifications and significantly improves the accuracy of segmental aneuploidy identification and mapping, relative to existing analytical tools (Beadstudio, Illumina), as demonstrated by validation of breakpoint boundaries. QuantiSNP identified both novel and validated CNVs. QuantiSNP was developed using BeadArray™ SNP data but it can be adapted to other platforms and we believe that the OB-HMM framework has widespread applicability in genomic research. In conclusion, QuantiSNP is a novel algorithm for high-resolution CNV/aneuploidy detection with application to clinical genetics, cancer and disease association studies.
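To make the hidden Markov model idea concrete, here is a minimal sketch of Viterbi decoding over copy-number states. It is not the QuantiSNP implementation: the three states, the emission means and standard deviation, and the transition probability are all hypothetical values chosen for illustration, and the Objective Bayes calibration and B allele frequency modelling are left out entirely.

```python
import numpy as np

# Hypothetical states and emission parameters (illustrative values only).
STATES = ["loss", "normal", "gain"]
MEANS = np.array([-0.6, 0.0, 0.4])   # assumed mean log R ratio per state
SIGMA = 0.15                          # assumed common emission std. dev.
STAY = 0.99                           # assumed probability of staying in a state


def viterbi(log_r):
    """Return the most probable state path for a sequence of log R ratios."""
    log_r = np.asarray(log_r, dtype=float)
    n, k = len(log_r), len(STATES)
    # Gaussian log-likelihood of each observation under each state (up to a constant).
    log_emit = -0.5 * ((log_r[:, None] - MEANS[None, :]) / SIGMA) ** 2
    # Transition matrix: stay with probability STAY, otherwise move uniformly.
    trans = np.full((k, k), (1.0 - STAY) / (k - 1))
    np.fill_diagonal(trans, STAY)
    log_trans = np.log(trans)

    score = np.log(np.full(k, 1.0 / k)) + log_emit[0]
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + log_trans      # cand[i, j]: move from state i to j
        back[t] = np.argmax(cand, axis=0)      # best predecessor for each state j
        score = cand[back[t], np.arange(k)] + log_emit[t]

    # Trace the best path backwards from the highest-scoring final state.
    path = [int(np.argmax(score))]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [STATES[s] for s in reversed(path)]


# Simulated probes: a 30-probe loss flanked by normal copy number.
rng = np.random.default_rng(0)
truth = ["normal"] * 50 + ["loss"] * 30 + ["normal"] * 50
signal = np.array([MEANS[STATES.index(s)] for s in truth])
calls = viterbi(signal + rng.normal(0.0, SIGMA, size=len(truth)))
```

The high self-transition probability is what makes the decoder report contiguous segments rather than reacting to single noisy probes.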
Journal of Computational and Graphical Statistics | 2010
Anthony Lee; Christopher Yau; Michael B. Giles; Arnaud Doucet; Christopher Holmes
We present a case study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multicore processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35- to 500-fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modeling into complex data-rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design. This article has supplementary material online.
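The data-parallel pattern behind such speedups can be sketched as many independent chains advanced in lockstep, with every chain executing the same instruction on its own state. The example below is a simplified stand-in, not the authors' GPU code: it uses NumPy vectorization on a CPU, independent random-walk Metropolis chains rather than an interacting population, and a hypothetical two-component Gaussian mixture as the target.

```python
import numpy as np


def log_target(x):
    """Log density (up to a constant) of an equal mixture of N(-3, 1) and N(3, 1)."""
    return np.logaddexp(-0.5 * (x + 3.0) ** 2, -0.5 * (x - 3.0) ** 2)


def parallel_metropolis(n_chains, n_steps, step=1.0, seed=0):
    """Advance n_chains independent random-walk Metropolis chains in lockstep."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 5.0, size=n_chains)        # over-dispersed starting points
    for _ in range(n_steps):
        proposal = x + step * rng.normal(size=n_chains)
        log_alpha = log_target(proposal) - log_target(x)
        accept = np.log(rng.uniform(size=n_chains)) < log_alpha
        x = np.where(accept, proposal, x)           # same operation for every chain
    return x


samples = parallel_metropolis(n_chains=4096, n_steps=200)
```

Each loop iteration is a handful of elementwise array operations with no dependence between chains, which is the kind of workload that maps naturally onto many-core hardware.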
Briefings in Functional Genomics | 2009
Laura Winchester; Christopher Yau; Jiannis Ragoussis
Data from whole genome association studies can now be used for dual purposes, genotyping and copy number detection. In this review we discuss some of the methods for using SNP data to detect copy number events. We examine a number of algorithms designed to detect copy number changes through the use of signal-intensity data and consider methods to evaluate the changes found. We describe the use of several statistical models in copy number detection in germline samples. We also present a comparison of data using these methods to assess accuracy of prediction and detection of changes in copy number.
Molecular Cancer | 2013
Neel Sengupta; Christopher Yau; Anuratha Sakthianandeswaren; Dmitri Mouradov; Peter Gibbs; Nirosha Suraweera; Jean-Baptiste Cazier; Guadalupe Polanco-Echeverry; Anil Ghosh; M. A. Thaha; Shafi Ahmed; Roger Feakins; David Propper; Sina Dorudi; Oliver M. Sieber; Andrew Silver; Cecilia Lai
Background: Prevalence of colorectal cancer (CRC) in the British Bangladeshi population (BAN) is low compared to British Caucasians (CAU). Genetic background may influence mutations and disease features. Methods: We characterized the clinicopathological features of BAN CRCs and interrogated their genomes using mutation profiling and high-density single nucleotide polymorphism (SNP) arrays and compared findings to CAU CRCs. Results: Age of onset of BAN CRC was significantly lower than for CAU patients (p=3.0 × 10⁻⁵) and this difference was not due to Lynch syndrome or the polyposis syndromes. KRAS mutations in BAN microsatellite stable (MSS) CRCs were rare (5.4%) compared to CAU MSS CRCs (25%; p=0.04), which correlates with the high percentage of mucinous histotype observed (31%) in the BAN samples. No BRAF mutations were seen in our BAN MSS CRCs (CAU CRCs, 12%; p=0.08). Array data revealed similar patterns of gains (chromosome 7 and 8q), losses (8p, 17p and 18q) and LOH (4q, 17p and 18q) in BAN and CAU CRCs. A small deletion on chromosome 16p13.2 involving only the alternative splicing factor RBFOX1 was found in significantly more BAN (50%) than CAU (15%) CRC cases (p=0.04). Focal deletions targeting the 5’ end of the gene were also identified. Novel RBFOX1 mutations were found in CRC cell lines and tumours; mRNA and protein expression was reduced in tumours. Conclusions: KRAS mutations were rare in BAN MSS CRC and a mucinous histotype was common. Loss of RBFOX1 may explain the anomalous splicing activity associated with CRC.
Genome Biology | 2010
Christopher Yau; Dmitri Mouradov; Robert N. Jorissen; Stefano Colella; Ghazala Mirza; Graham Steers; Adrian L. Harris; Jiannis Ragoussis; Oliver M. Sieber; Christopher Holmes
We describe a statistical method for the characterization of genomic aberrations in single nucleotide polymorphism microarray data acquired from cancer genomes. Our approach allows us to model the joint effect of polyploidy, normal DNA contamination and intra-tumour heterogeneity within a single unified Bayesian framework. We demonstrate the efficacy of our method on numerous datasets including laboratory generated mixtures of normal-cancer cell lines and real primary tumours.
Genome Biology | 2015
Emma Pierson; Christopher Yau
Single-cell RNA-seq data allows insight into normal cellular function and various disease states through molecular characterization of gene expression on the single cell level. Dimensionality reduction of such high-dimensional data sets is essential for visualization and analysis, but single-cell RNA-seq data are challenging for classical dimensionality-reduction methods because of the prevalence of dropout events, which lead to zero-inflated data. Here, we develop a dimensionality-reduction method, (Z)ero (I)nflated (F)actor (A)nalysis (ZIFA), which explicitly models the dropout characteristics, and show that it improves modeling accuracy on simulated and biological data sets.
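To illustrate what zero-inflated data from dropout events look like, the sketch below simulates expression values from a small linear factor model and then sets some entries to zero. It is not the ZIFA code: the function name, the parameter values, and in particular the dropout rule (probability exp(-decay * x^2), so that low values drop out more often) are assumptions made for this example, since the abstract does not specify the dropout model.

```python
import numpy as np


def simulate_zero_inflated(n_cells, n_genes, n_factors=2, decay=0.1, seed=0):
    """Draw factor-model expression values, then zero some out as dropouts."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n_cells, n_factors))             # latent coordinates
    loadings = rng.normal(size=(n_factors, n_genes))
    mu = rng.uniform(2.0, 6.0, size=n_genes)              # per-gene mean level
    x = z @ loadings + mu + rng.normal(0.0, 0.3, size=(n_cells, n_genes))
    # Assumed dropout rule: values near zero are more likely to be recorded as zero.
    p_dropout = np.exp(-decay * x ** 2)
    dropped = rng.uniform(size=x.shape) < p_dropout
    y = np.where(dropped, 0.0, x)
    return x, y, dropped


x, y, dropped = simulate_zero_inflated(n_cells=500, n_genes=50)
zero_fraction = dropped.mean()   # share of entries lost to dropout
```

A classical method applied directly to `y` treats every zero as a true measurement, whereas a method that models dropout explicitly can treat those zeros as possibly missing observations of `x`.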
The American Journal of Gastroenterology | 2013
Dmitri Mouradov; Enric Domingo; Peter Gibbs; Robert N. Jorissen; Shan Li; Pik Ying Soo; Lara Lipton; Jayesh Desai; Håvard E. Danielsen; Dahmane Oukrif; Marco Novelli; Christopher Yau; Christopher Holmes; Ian Jones; Stephen McLaughlin; Peter L. Molloy; Nicholas J. Hawkins; Robyn L. Ward; Rachel Midgely; David Kerr; Ian Tomlinson; Oliver M. Sieber
OBJECTIVES: Microsatellite instability (MSI) is an established marker of good prognosis in colorectal cancer (CRC). Chromosomal instability (CIN) is strongly negatively associated with MSI and has been shown to be a marker of poor prognosis in a small number of studies. However, a substantial group of “double-negative” (MSI−/CIN−) CRCs exists. The prognosis of these patients is unclear. Furthermore, MSI and CIN are each associated with specific molecular changes, such as mutations in KRAS and BRAF, that have been associated with prognosis. It is not known which of MSI, CIN, and the specific gene mutations are primary predictors of survival. METHODS: We evaluated the prognostic value (disease-free survival, DFS) of CIN, MSI, mutations in KRAS, NRAS, BRAF, PIK3CA, FBXW7, and TP53, and chromosome 18q loss-of-heterozygosity (LOH) in 822 patients from the VICTOR trial of stage II/III CRC. We followed up promising associations in an Australian community-based cohort (N=375). RESULTS: In the VICTOR patients, no specific mutation was associated with DFS, but individually MSI and CIN showed significant associations after adjusting for stage, age, gender, tumor location, and therapy. A combined analysis of the VICTOR and community-based cohorts showed that MSI and CIN were independent predictors of DFS (for MSI, hazard ratio (HR)=0.58, 95% confidence interval (CI) 0.36–0.93, and P=0.021; for CIN, HR=1.54, 95% CI 1.14–2.08, and P=0.005), and joint CIN/MSI testing significantly improved the prognostic prediction of MSI alone (P=0.028). Higher levels of CIN were monotonically associated with progressively poorer DFS, and a semi-quantitative measure of CIN was a better predictor of outcome than a simple CIN+/− variable. All measures of CIN predicted DFS better than the recently described Watanabe LOH ratio. CONCLUSIONS: MSI and CIN are independent predictors of DFS for stage II/III CRC. Prognostic molecular tests for CRC relapse should currently use MSI and a quantitative measure of CIN rather than specific gene mutations.
Bioinformatics | 2008
Eleni Giannoulatou; Christopher Yau; Stefano Colella; Jiannis Ragoussis; Christopher Holmes
Current genotyping algorithms typically call genotypes by clustering allele-specific intensity data on a single nucleotide polymorphism (SNP) by SNP basis. This approach assumes the availability of a large number of control samples that have been sampled on the same array and platform. We have developed a SNP genotyping algorithm for the Illumina Infinium SNP genotyping assay that is entirely within-sample and requires neither a population of control samples nor parameters derived from such a population. Our algorithm exhibits high concordance with current methods and >99% call accuracy on HapMap samples. The ability to call genotypes using only within-sample information makes the method computationally light and practical for studies involving small sample sizes and provides a valuable independent quality control metric for other population-based approaches. Availability: http://www.stats.ox.ac.uk/~giannoul/GenoSNP/
Nature Communications | 2014
Jean-Baptiste Cazier; S. R. Rao; C. M. McLean; A. K. Walker; B. J. Wright; Emma Jaeger; Christiana Kartsonaki; L. Marsden; Christopher Yau; Carme Camps; Pamela J. Kaisaki; Jenny C. Taylor; James Catto; Ian Tomlinson; Anne E. Kiltie; F C Hamdy
Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions with a high risk of invasive progression.
Leukemia | 2012
Samantha J. L. Knight; Christopher Yau; Ruth Clifford; Adele Timbs; E Sadighi Akha; Helene Dreau; Adam Burns; C Ciria; David Oscier; Andrew R. Pettitt; S Dutton; Christopher Holmes; Jenny C. Taylor; J-B Cazier; Anna Schuh
Genome-wide array approaches and sequencing analyses are powerful tools for identifying genetic aberrations in cancers, including leukemias and lymphomas. However, the clinical and biological significance of such aberrations and their subclonal distribution are poorly understood. Here, we present the first genome-wide array-based study of pre-treatment and relapse samples from patients with B-cell chronic lymphocytic leukemia (B-CLL) that uses the computational statistical tool OncoSNP. We show that quantification of the proportion of copy number alterations (CNAs) and copy neutral loss of heterozygosity regions (cnLOHs) in each sample is feasible. Furthermore, we (i) reveal complex changes in the subclonal architecture of paired samples at relapse compared with pre-treatment, (ii) provide evidence supporting an association between increased genomic complexity and poor clinical outcome, and (iii) report previously undefined, recurrent CNA/cnLOH regions that expand or newly occur at relapse and therefore might harbor candidate driver genes of relapse and/or chemotherapy resistance. Our findings are likely to impact on future therapeutic strategies aimed towards selecting effective and individually tailored targeted therapies. Leukemia advance online publication, 14 February 2012; doi:10.1038/leu.2012.13