Graham R. S. Ritchie
Wellcome Trust Sanger Institute
Publication
Featured research published by Graham R. S. Ritchie.
Genome Biology | 2016
William M. McLaren; Laurent Gil; Sarah Hunt; Harpreet Singh Riat; Graham R. S. Ritchie; Anja Thormann; Paul Flicek; Fiona Cunningham
The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.
Science | 2013
Ekta Khurana; Yao Fu; Vincenza Colonna; Xinmeng Jasmine Mu; Hyun Min Kang; Tuuli Lappalainen; Andrea Sboner; Lucas Lochovsky; Jieming Chen; Arif Harmanci; Jishnu Das; Alexej Abyzov; Suganthi Balasubramanian; Kathryn Beal; Dimple Chakravarty; Daniel Challis; Yuan Chen; Declan Clarke; Laura Clarke; Fiona Cunningham; Uday S. Evani; Paul Flicek; Robert Fragoza; Erik Garrison; Richard A. Gibbs; Zeynep H. Gümüş; Javier Herrero; Naoki Kitabayashi; Yong Kong; Kasper Lage
Introduction: Plummeting sequencing costs have led to a great increase in the number of personal genomes. Interpreting the large number of variants in them, particularly in noncoding regions, is a current challenge. This is especially the case for somatic variants in cancer genomes, a large proportion of which are noncoding.
Figure: Prioritization of candidate noncoding cancer drivers based on patterns of selection. (Step 1) Filter somatic variants to exclude 1000 Genomes polymorphisms; (2) retain variants in noncoding annotations; (3) retain those in “sensitive” regions; (4) prioritize those disrupting a transcription-factor binding motif and (5) residing near the center of a biological network; (6) prioritize ones in annotation blocks mutated in multiple cancer samples.
Methods: We investigated patterns of selection in DNA elements from the ENCODE project using the full spectrum of variants from 1092 individuals in the 1000 Genomes Project (Phase 1), including single-nucleotide variants (SNVs), short insertions and deletions (indels), and structural variants (SVs). Although we analyzed broad functional annotations, such as all transcription-factor binding sites, we focused more on highly specific categories such as distal binding sites of factor ZNF274. The greater statistical power of the Phase 1 data set compared with earlier ones allowed us to differentiate the selective constraints on these categories. We also used connectivity information between elements from protein-protein-interaction and regulatory networks. We integrated all the information on selection to develop a workflow (FunSeq) to prioritize personal-genome variants on the basis of their deleterious impact. As a proof of principle, we experimentally validated and characterized a few candidate variants.
Results: We identified a specific subgroup of noncoding categories with almost as much selective constraint as coding genes: “ultrasensitive” regions. We also uncovered a number of clear patterns of selection. Elements more consistently active across tissues and both maternal and paternal alleles (in terms of allele-specific activity) are under stronger selection. Variants disruptive because of mechanistic effects on transcription-factor binding (i.e. “motif-breakers”) are selected against. Higher network connectivity (i.e. for hubs) is associated with higher constraint. Additionally, many hub promoters and regulatory elements show evidence of recent positive selection. Overall, indels and SVs follow the same pattern as SNVs; however, there are notable exceptions. For instance, enhancers are enriched for SVs formed by nonallelic homologous recombination. We integrated these patterns of selection into the FunSeq prioritization workflow and applied it to cancer variants, because they present a strong contrast to inherited polymorphisms. In particular, application to ~90 cancer genomes (breast, prostate and medulloblastoma) reveals nearly a hundred candidate noncoding drivers.
Discussion: Our approach can be readily used to prioritize variants in cancer and is immediately applicable in a precision-medicine context. It can be further improved by incorporation of larger-scale population sequencing, better annotations, and expression data from large cohorts.
Identifying Important Identifiers
Each of us has millions of sequence variations in our genomes. Signatures of purifying or negative selection should help identify which of those variations is functionally important. Khurana et al. (1235587) used sequence polymorphisms from 1092 humans across 14 populations to identify patterns of selection, especially in noncoding regulatory regions. Noncoding regions under very strong negative selection included binding sites of some chromatin and general transcription factors (TFs) and core motifs of some important TF families. Positive selection in TF binding sites tended to occur in network hub promoters. Many recurrent somatic cancer variants occurred in noncoding regulatory regions and thus might indicate mutations that drive cancer.
Regions under strong selection in the human genome identify noncoding regulatory elements with possible roles in disease.
Interpreting variants, especially noncoding ones, in the increasing number of personal genomes is challenging. We used patterns of polymorphisms in functionally annotated regions in 1092 humans to identify deleterious variants; then we experimentally validated candidates. We analyzed both coding and noncoding regions, with the former corroborating the latter. We found regions particularly sensitive to mutations (“ultrasensitive”) and variants that are disruptive because of mechanistic effects on transcription-factor binding (that is, “motif-breakers”). We also found variants in regions with higher network centrality tend to be deleterious. Insertions and deletions followed a similar pattern to single-nucleotide variants, with some notable exceptions (e.g., certain deletions and enhancers). On the basis of these patterns, we developed a computational tool (FunSeq), whose application to ~90 cancer genomes reveals nearly a hundred candidate noncoding drivers.
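The six prioritization steps listed in the abstract above (three filters followed by three prioritization criteria) can be illustrated with a minimal Python sketch. This is not the FunSeq implementation: the `SomaticVariant` record, its boolean field names, and the equal-weight count used for ranking are all assumptions made for illustration, since the abstract does not say how the criteria are encoded or weighted.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SomaticVariant:
    """Hypothetical per-variant annotation record (field names are illustrative)."""
    name: str
    in_1000_genomes: bool          # step 1: known polymorphism
    in_noncoding_annotation: bool  # step 2
    in_sensitive_region: bool      # step 3
    breaks_tf_motif: bool          # step 4
    near_network_center: bool      # step 5
    in_recurrent_block: bool       # step 6


def passes_filters(v: SomaticVariant) -> bool:
    """Steps 1-3: drop known polymorphisms, keep noncoding variants in sensitive regions."""
    return (not v.in_1000_genomes) and v.in_noncoding_annotation and v.in_sensitive_region


def priority(v: SomaticVariant) -> int:
    """Steps 4-6: count satisfied criteria (equal weights are an assumption)."""
    return int(v.breaks_tf_motif) + int(v.near_network_center) + int(v.in_recurrent_block)


def prioritize(variants: List[SomaticVariant]) -> List[SomaticVariant]:
    """Filter, then rank the survivors from highest to lowest priority."""
    kept = [v for v in variants if passes_filters(v)]
    return sorted(kept, key=priority, reverse=True)
```

Calling `prioritize` on a list of annotated variants returns only those that survive the three filters, ordered by how many of the three prioritization criteria they meet.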
Nature Methods | 2014
Graham R. S. Ritchie; Ian Dunham; Eleftheria Zeggini; Paul Flicek
Identifying functionally relevant variants against the background of ubiquitous genetic variation is a major challenge in human genetics. For variants in protein-coding regions, our understanding of the genetic code and splicing allows us to identify likely candidates, but interpreting variants outside genic regions is more difficult. Here we present genome-wide annotation of variants (GWAVA), a tool that supports prioritization of noncoding variants by integrating various genomic and epigenomic annotations.
Nature | 2015
Deepti Gurdasani; Tommy Carstensen; Fasil Tekola-Ayele; Luca Pagani; Ioanna Tachmazidou; Konstantinos Hatzikotoulas; Savita Karthikeyan; Louise Iles; Martin Pollard; Ananyo Choudhury; Graham R. S. Ritchie; Yali Xue; Jennifer L. Asimit; Rebecca N. Nsubuga; Elizabeth H. Young; Cristina Pomilla; Katja Kivinen; Kirk Rockett; Anatoli Kamali; Ayo Doumatey; Gershim Asiki; Janet Seeley; Fatoumatta Sisay-Joof; Muminatou Jallow; Stephen Tollman; Ephrem Mekonnen; Rosemary Ekong; Tamiru Oljira; Neil Bradman; Kalifa Bojang
Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.
PLOS Biology | 2011
Deanna M. Church; Valerie Schneider; Tina Graves; Katherine Auger; Fiona Cunningham; Nathan Bouk; Hsiu Chuan Chen; Richa Agarwala; William M. McLaren; Graham R. S. Ritchie; Derek Albracht; Milinn Kremitzki; Susan Rock; Holland Kotkiewicz; Colin Kremitzki; Aye Wollam; Lee Trani; Lucinda Fulton; Robert S. Fulton; Lucy Matthews; S. Whitehead; William Chow; James Torrance; Matthew Dunn; Glenn Harden; Glen Threadgold; Jonathan Wood; Joanna Collins; Paul Heath; Guy Griffiths
I have read the journal's policy and have the following conflicts: Paul Flicek is married to the deputy editor of PLoS Medicine, Melissa Norton. Evan Eichler is on the board of Pacific Biosciences. Support for this work came from the Intramural Research Program of the NIH, The National Library of Medicine, the European Molecular Biology Laboratory, the Wellcome Trust (grant number 077198), and the Howard Hughes Medical Institute (EEE). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Nature Methods | 2013
Abel Gonzalez-Perez; Ville Mustonen; Boris Reva; Graham R. S. Ritchie; Pau Creixell; Rachel Karchin; Miguel Vazquez; J. Lynn Fink; Karin S. Kassahn; John V. Pearson; Gary D. Bader; Paul C. Boutros; Lakshmi Muthuswamy; B. F. Francis Ouellette; Jüri Reimand; Rune Linding; Tatsuhiro Shibata; Alfonso Valencia; Adam Butler; Serge Dronov; Paul Flicek; Nick B. Shannon; Hannah Carter; Li Ding; Chris Sander; Josh Stuart; Lincoln Stein; Nuria Lopez-Bigas
The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor but only a minority of these drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype.
Nature Genetics | 2016
G. David Poznik; Yali Xue; Fernando L. Mendez; Thomas Willems; Andrea Massaia; Melissa A. Wilson Sayres; Qasim Ayub; Shane McCarthy; Apurva Narechania; Seva Kashin; Yuan Chen; Ruby Banerjee; Juan L. Rodriguez-Flores; Maria Cerezo; Haojing Shao; Melissa Gymrek; Ankit Malhotra; Sandra Louzada; Rob DeSalle; Graham R. S. Ritchie; Eliza Cerveira; Tomas Fitzgerald; Erik Garrison; Anthony Marcketta; David Mittelman; Mallory Romanovitch; Chengsheng Zhang; Xiangqun Zheng-Bradley; Gonçalo R. Abecasis; Steven A. McCarroll
We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.
Cognition | 2009
Thomas C. Scott-Phillips; Simon Kirby; Graham R. S. Ritchie
A unique hallmark of human language is that it uses signals that are both learnt and symbolic. The emergence of such signals was therefore a defining event in human cognitive evolution, yet very little is known about how such a process occurs. Previous work provides some insights on how meaning can become attached to form, but a more foundational issue is presently unaddressed. How does a signal signal its own signalhood? That is, how do humans even know that communicative behaviour is indeed communicative in nature? We introduce an experimental game that has been designed to tackle this problem. We find that it is commonly resolved with a bootstrapping process, and that this process influences the final form of the communication system. Furthermore, sufficient common ground is observed to be integral to the recognition of signalhood, and the emergence of dialogue is observed to be the key step in the development of a system that can be employed to achieve shared goals.
Human Heredity | 2012
Margarida Lopes; Christopher J. Joyce; Graham R. S. Ritchie; Sally John; Fiona Cunningham; Jennifer L. Asimit; Eleftheria Zeggini
Aims: Next-generation sequencing has opened the possibility of large-scale sequence-based disease association studies. A major challenge in interpreting whole-exome data is predicting which of the discovered variants are deleterious or neutral. To address this question in silico, we have developed a score called Combined Annotation scoRing toOL (CAROL), which combines information from 2 bioinformatics tools: PolyPhen-2 and SIFT, in order to improve the prediction of the effect of non-synonymous coding variants. Methods: We used a weighted Z method that combines the probabilistic scores of PolyPhen-2 and SIFT. We defined 2 dataset pairs to train and test CAROL using information from the dbSNP: ‘HGMD-PUBLIC’ and 1000 Genomes Project databases. The training pair comprises a total of 980 positive control (disease-causing) and 4,845 negative control (non-disease-causing) variants. The test pair consists of 1,959 positive and 9,691 negative controls. Results: CAROL has higher predictive power and accuracy for the effect of non-synonymous variants than each individual annotation tool (PolyPhen-2 and SIFT) and benefits from higher coverage. Conclusion: The combination of annotation tools can help improve automated prediction of whole-genome/exome non-synonymous variant functional consequences.
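As a rough illustration of the weighted Z method mentioned in the Methods above, the following Python sketch combines probability-like scores using the generic weighted form of Stouffer's Z. It is not the CAROL formula: the abstract does not state how PolyPhen-2 and SIFT scores are transformed or how the weights are chosen, so the inputs here are assumed to be p-value-like numbers strictly between 0 and 1, and the weights are supplied by the caller.

```python
from math import sqrt
from statistics import NormalDist
from typing import Sequence

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def weighted_z(p_values: Sequence[float], weights: Sequence[float]) -> float:
    """Generic weighted Z: sum(w_i * z_i) / sqrt(sum(w_i ** 2)), with z_i = Phi^-1(1 - p_i)."""
    if not p_values or len(p_values) != len(weights):
        raise ValueError("p_values and weights must be non-empty and the same length")
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return numerator / denominator


def combined_p(p_values: Sequence[float], weights: Sequence[float]) -> float:
    """Convert the combined Z back to a one-sided probability."""
    return 1.0 - _STD_NORMAL.cdf(weighted_z(p_values, weights))
```

With equal weights, two inputs that each sit at 0.05 combine to a smaller probability than either alone, which is the sense in which pooling two predictors can strengthen a call.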
Nature Communications | 2013
Ioanna Tachmazidou; George V. Dedoussis; Lorraine Southam; Aliki-Eleni Farmaki; Graham R. S. Ritchie; Dionysia K. Xifara; Angela Matchan; Konstantinos Hatzikotoulas; N W Rayner; Yuning Chen; Toni I. Pollin; O'Connell; Laura M. Yerges-Armstrong; Chrysoula Kiagiadaki; Kalliope Panoutsopoulou; Jeremy Schwartzentruber; Loukas Moutsianas; Emmanouil Tsafantakis; Chris Tyler-Smith; Gilean McVean; Yali Xue; Eleftheria Zeggini
Isolated populations can empower the identification of rare variation associated with complex traits through next-generation association studies, but the generalizability of such findings remains unknown. Here we genotype 1,267 individuals from a Greek population isolate on the Illumina HumanExome Beadchip, in search of functional coding variants associated with lipid traits. We find genome-wide significant evidence for association of R19X, a functional variant in APOC3, with increased high-density lipoprotein and decreased triglyceride levels. Approximately 3.8% of individuals are heterozygous for this cardioprotective variant, which was previously thought to be private to the Amish founder population. R19X is rare (<0.05% frequency) in outbred European populations. The increased frequency of R19X enables discovery of this lipid traits signal at genome-wide significance in a small sample size. This work exemplifies the value of isolated populations in successfully detecting transferable rare variant associations of high medical relevance.
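To put the carrier figure above in allele-frequency terms, here is a small Python sketch of the standard diploid calculation. It rests on assumptions not stated in the abstract: that there are no homozygous carriers in the sample, and that roughly 48 of the 1,267 individuals are heterozygous (3.8% rounded to a whole count).

```python
def allele_frequency(n_individuals: int, n_heterozygous: int, n_homozygous_alt: int = 0) -> float:
    """Alternate-allele frequency at a diploid autosomal locus.

    Each heterozygote carries one copy of the allele and each homozygote two,
    out of 2 * n_individuals chromosomes in total.
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return (n_heterozygous + 2 * n_homozygous_alt) / (2 * n_individuals)


# Assumed inputs: 1,267 genotyped individuals, ~3.8% heterozygous, no homozygotes.
n_total = 1267
n_het = round(0.038 * n_total)
isolate_frequency = allele_frequency(n_total, n_het)
```

Under these assumptions the allele frequency in the isolate is a little under 2%, well above the stated upper bound of 0.05% for outbred European populations, which is why a sample of this size can reach genome-wide significance.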