
Publication


Featured research published by Petr Danecek.


Nature | 2011

Mouse genomic variation and its effect on phenotypes and gene regulation.

Thomas M. Keane; Leo Goodstadt; Petr Danecek; Michael A. White; Kim Wong; Binnaz Yalcin; Andreas Heger; Avigail Agam; Guy Slater; Martin Goodson; N A Furlotte; Eleazar Eskin; Christoffer Nellåker; H Whitley; James Cleak; Deborah Janowitz; Polinka Hernandez-Pliego; Andrew Edwards; T G Belgard; Peter L. Oliver; Rebecca E McIntyre; Amarjit Bhomra; Jérôme Nicod; Xiangchao Gan; Wei Yuan; L van der Weyden; Charles A. Steward; Sendu Bala; Jim Stalker; Richard Mott

We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism.


Genome Biology | 2013

A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains

Michelle Simon; Simon Greenaway; Jacqueline K. White; Helmut Fuchs; Valérie Gailus-Durner; Sara Wells; Tania Sorg; Kim Wong; Elodie Bedu; Elizabeth J. Cartwright; Romain Dacquin; Sophia Djebali; Jeanne Estabel; Jochen Graw; Neil Ingham; Ian J. Jackson; Andreas Lengeling; Silvia Mandillo; Jacqueline Marvel; Hamid Meziane; Frédéric Preitner; Oliver Puk; Michel J. Roux; David J. Adams; Sarah Atkins; Abdel Ayadi; Lore Becker; Andrew Blake; Debra Brooker; Heather Cater

Background: The mouse inbred line C57BL/6J is widely used in mouse genetics and its genome has been incorporated into many genetic reference populations. More recently large initiatives such as the International Knockout Mouse Consortium (IKMC) are using the C57BL/6N mouse strain to generate null alleles for all mouse genes. Hence both strains are now widely used in mouse genetics studies. Here we perform a comprehensive genomic and phenotypic analysis of the two strains to identify differences that may influence their underlying genetic mechanisms. Results: We undertake genome sequence comparisons of C57BL/6J and C57BL/6N to identify SNPs, indels and structural variants, with a focus on identifying all coding variants. We annotate 34 SNPs and 2 indels that distinguish C57BL/6J and C57BL/6N coding sequences, as well as 15 structural variants that overlap a gene. In parallel we assess the comparative phenotypes of the two inbred lines utilizing the EMPReSSslim phenotyping pipeline, a broad-based assessment encompassing diverse biological systems. We perform additional secondary phenotyping assessments to explore other phenotype domains and to elaborate phenotype differences identified in the primary assessment. We uncover significant phenotypic differences between the two lines, replicated across multiple centers, in a number of physiological, biochemical and behavioral systems. Conclusions: Comparison of C57BL/6J and C57BL/6N demonstrates a range of phenotypic differences that have the potential to impact upon penetrance and expressivity of mutational effects in these strains. Moreover, the sequence variants we identify provide a set of candidate genes for the phenotypic differences observed between the two strains.


Genome Biology | 2012

High levels of RNA-editing site conservation amongst 15 laboratory mouse strains

Petr Danecek; Christoffer Nellåker; Rebecca E McIntyre; Jorge E Buendia-Buendia; Suzannah Bumpstead; Chris P. Ponting; Jonathan Flint; Richard Durbin; Thomas M. Keane; David J. Adams

Background: Adenosine-to-inosine (A-to-I) editing is a site-selective post-transcriptional alteration of double-stranded RNA by ADAR deaminases that is crucial for homeostasis and development. Recently the Mouse Genomes Project generated genome sequences for 17 laboratory mouse strains and rich catalogues of variants. We also generated RNA-seq data from whole brain RNA from 15 of the sequenced strains. Results: Here we present a computational approach that takes an initial set of transcriptome/genome mismatch sites and filters these calls taking into account systematic biases in alignment, single nucleotide variant calling, and sequencing depth to identify RNA editing sites with high accuracy. We applied this approach to our panel of mouse strain transcriptomes identifying 7,389 editing sites with an estimated false-discovery rate of between 2.9 and 10.5%. The overwhelming majority of these edits were of the A-to-I type, with less than 2.4% not of this class, and only three of these edits could not be explained as alignment artifacts. We validated 24 novel RNA editing sites in coding sequence, including two non-synonymous edits in the Cacna1d gene that fell into the IQ domain portion of the Cav1.3 voltage-gated calcium channel, indicating a potential role for editing in the generation of transcript diversity. Conclusions: We show that despite over two million years of evolutionary divergence, the sites edited and the level of editing at each site are remarkably consistent across the 15 strains. In the Cds2 gene we find evidence for RNA editing acting to preserve the ancestral transcript sequence despite genomic sequence divergence.


Nature Communications | 2015

Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Jie Huang; Bryan Howie; Shane McCarthy; Yasin Memari; Klaudia Walter; Jl Min; Petr Danecek; Giovanni Malerba; Elisabetta Trabetti; Hou-Feng Zheng; Giovanni Gambaro; Jb Richards; Richard Durbin; Nj Timpson; Jonathan Marchini; Nicole Soranzo

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.


Nature Genetics | 2016

Reference-based phasing using the Haplotype Reference Consortium panel

Po-Ru Loh; Petr Danecek; Pier Francesco Palamara; Christian Fuchsberger; Yakir A. Reshef; Hilary Finucane; Sebastian Schoenherr; Lukas Forer; Shane McCarthy; Gonçalo R. Abecasis; Richard Durbin; Alkes L. Price

Haplotype phasing is a fundamental problem in medical and population genetics. Phasing is generally performed via statistical phasing in a genotyped cohort, an approach that can yield high accuracy in very large cohorts but attains lower accuracy in smaller cohorts. Here we instead explore the paradigm of reference-based phasing. We introduce a new phasing algorithm, Eagle2, that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium; HRC) using a new data structure based on the positional Burrows-Wheeler transform. We demonstrate that Eagle2 attains a ∼20× speedup and ∼10% increase in accuracy compared to reference-based phasing using SHAPEIT2. On European-ancestry samples, Eagle2 with the HRC panel achieves >2× the accuracy of 1000 Genomes–based phasing. Eagle2 is open source and freely available for HRC-based phasing via the Sanger Imputation Service and the Michigan Imputation Server.
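
The abstract above mentions a data structure based on the positional Burrows-Wheeler transform (PBWT). As a rough illustration only, the sketch below shows the basic PBWT ordering step on binary haplotypes: at each site the haplotypes are stably partitioned by allele, so that haplotypes sharing a long match ending at that site end up adjacent. The function name `pbwt_prefix_arrays` and the toy input are assumptions made for this example; this is not Eagle2's data structure or code.

```python
def pbwt_prefix_arrays(haplotypes):
    """Return the positional prefix arrays of a set of binary haplotypes.

    haplotypes: list of equal-length lists containing 0/1 alleles.
    The k-th returned array lists haplotype indices sorted by their
    reversed prefixes over the first k sites (array 0 is the input order).
    """
    m = len(haplotypes)
    n = len(haplotypes[0]) if m else 0
    order = list(range(m))
    arrays = [order[:]]
    for k in range(n):
        # Stable partition: allele-0 haplotypes first, then allele-1,
        # each group keeping the order inherited from the previous site.
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
        arrays.append(order[:])
    return arrays


if __name__ == "__main__":
    # Hypothetical toy panel: three haplotypes, two sites.
    panel = [[0, 1], [1, 0], [0, 0]]
    for k, arr in enumerate(pbwt_prefix_arrays(panel)):
        print(k, arr)
```

Because neighbours in each prefix array share the longest matches ending at that site, a search for good reference haplotypes can be restricted to adjacent entries rather than scanning the whole panel.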


Nature | 2017

Common genetic variation drives molecular heterogeneity in human iPSCs

Helena Kilpinen; Angela Goncalves; Andreas Leha; Vackar Afzal; Kaur Alasoo; Sofie Ashford; Sendu Bala; Dalila Bensaddek; Francesco Paolo Casale; Oliver J. Culley; Petr Danecek; Adam Faulconbridge; Peter W. Harrison; Annie Kathuria; Davis J. McCarthy; Shane McCarthy; Ruta Meleckyte; Yasin Memari; Nathalie Moens; Filipa Soares; Alice L. Mann; Ian Streeter; Chukwuma A. Agu; Alex Alderton; Rachel Nelson; Sarah Harper; Minal Patel; Alistair White; Sharad R Patel; Laura Clarke

Technology utilizing human induced pluripotent stem cells (iPS cells) has enormous potential to provide improved cellular models of human disease. However, variable genetic and phenotypic characterization of many existing iPS cell lines limits their potential use for research and therapy. Here we describe the systematic generation, genotyping and phenotyping of 711 iPS cell lines derived from 301 healthy individuals by the Human Induced Pluripotent Stem Cells Initiative. Our study outlines the major sources of genetic and phenotypic variation in iPS cells and establishes their suitability as models of complex human traits and cancer. Through genome-wide profiling we find that 5–46% of the variation in different iPS cell phenotypes, including differentiation capacity and cellular morphology, arises from differences between individuals. Additionally, we assess the phenotypic consequences of genomic copy-number alterations that are repeatedly observed in iPS cells. In addition, we present a comprehensive map of common regulatory variants affecting the transcriptome of human pluripotent cells.


American Journal of Human Genetics | 2015

Tracing the Route of Modern Humans out of Africa by Using 225 Human Genome Sequences from Ethiopians and Egyptians

Luca Pagani; Stephan Schiffels; Deepti Gurdasani; Petr Danecek; Aylwyn Scally; Yuan Chen; Yali Xue; Marc Haber; Rosemary Ekong; Tamiru Oljira; Ephrem Mekonnen; Donata Luiselli; Neil Bradman; Endashaw Bekele; Pierre Zalloua; Richard Durbin; Toomas Kivisild; Chris Tyler-Smith

The predominantly African origin of all modern human populations is well established, but the route taken out of Africa is still unclear. Two alternative routes, via Egypt and Sinai or across the Bab el Mandeb strait into Arabia, have traditionally been proposed as feasible gateways in light of geographic, paleoclimatic, archaeological, and genetic evidence. Distinguishing among these alternatives has been difficult. We generated 225 whole-genome sequences (225 at 8× depth, of which 8 were increased to 30×; Illumina HiSeq 2000) from six modern Northeast African populations (100 Egyptians and five Ethiopian populations each represented by 25 individuals). West Eurasian components were masked out, and the remaining African haplotypes were compared with a panel of sub-Saharan African and non-African genomes. We showed that masked Northeast African haplotypes overall were more similar to non-African haplotypes and more frequently present outside Africa than were any sets of haplotypes derived from a West African population. Furthermore, the masked Egyptian haplotypes showed these properties more markedly than the masked Ethiopian haplotypes, pointing to Egypt as the more likely gateway in the exodus to the rest of the world. Using five Ethiopian and three Egyptian high-coverage masked genomes and the multiple sequentially Markovian coalescent (MSMC) approach, we estimated the genetic split times of Egyptians and Ethiopians from non-African populations at 55,000 and 65,000 years ago, respectively, whereas that of West Africans was estimated to be 75,000 years ago. Both the haplotype and MSMC analyses thus suggest a predominant northern route out of Africa via Egypt.


Nature Communications | 2014

A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans

Nicholas J. Timpson; Klaudia Walter; Josine L. Min; Ioanna Tachmazidou; Giovanni Malerba; So-Youn Shin; Lu Chen; Marta Futema; Lorraine Southam; Valentina Iotchkova; Massimiliano Cocca; Jie Huang; Yasin Memari; Shane McCarthy; Petr Danecek; Dawn Muddyman; Massimo Mangino; Cristina Menni; John Perry; Susan M. Ring; Amadou Gaye; George Dedoussis; Aliki-Eleni Farmaki; Paul R. Burton; Philippa J. Talmud; Giovanni Gambaro; Tim D. Spector; George Davey Smith; Richard Durbin; J. Brent Richards

The analysis of rich catalogues of genetic variation from population-based sequencing provides an opportunity to screen for functional effects. Here we report a rare variant in APOC3 (rs138326449-A, minor allele frequency ~0.25% (UK)) associated with plasma triglyceride (TG) levels (−1.43 s.d. (s.e.=0.27) per minor allele, P-value=8.0 × 10⁻⁸) discovered in 3,202 individuals with low read-depth, whole-genome sequence. We replicate this in 12,831 participants from five additional samples of Northern and Southern European origin (−1.0 s.d. (s.e.=0.173), P-value=7.32 × 10⁻⁹). This is consistent with an effect between 0.5 and 1.5 mmol l⁻¹ dependent on population. We show that a single predicted splice donor variant is responsible for association signals and is independent of known common variants. Analyses suggest an independent relationship between rs138326449 and high-density lipoprotein (HDL) levels. This represents one of the first examples of a rare, large effect variant identified from whole-genome sequencing at a population scale.


Nature Communications | 2015

Whole-genome sequence-based analysis of thyroid function

Peter N. Taylor; Eleonora Porcu; Shelby Chew; Purdey J. Campbell; Michela Traglia; Suzanne J. Brown; Benjamin H. Mullin; Hashem A. Shihab; Josine Min; Klaudia Walter; Yasin Memari; Jie Huang; Michael R. Barnes; John Beilby; Pimphen Charoen; Petr Danecek; Frank Dudbridge; Vincenzo Forgetta; Celia M. T. Greenwood; Elin Grundberg; Andrew D. Johnson; Jennie Hui; Ee Mun Lim; Shane McCarthy; Dawn Muddyman; Vijay Panicker; John Perry; Jordana T. Bell; Wei Yuan; Caroline L Relton

Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10⁻⁹) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10⁻¹⁴). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10⁻⁹) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10⁻¹¹). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function.


Bioinformatics | 2016

BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data

Vagheesh Narasimhan; Petr Danecek; Aylwyn Scally; Yali Xue; Chris Tyler-Smith; Richard Durbin

Summary: Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoH used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package, that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and real data from the 1000 Genomes Project we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity. Availability and implementation: BCFtools/RoH and its associated binary/source files are freely available from https://github.com/samtools/BCFtools. Supplementary information: Supplementary data are available at Bioinformatics online.
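
To make the hidden Markov model idea in the summary above concrete, here is a minimal sketch of a two-state Viterbi decoder that labels each site as autozygous ("AZ") or not ("HW") from a sequence of heterozygous/homozygous calls. It is a simplified illustration, not the BCFtools/RoH implementation: the function name `viterbi_roh`, the constant emission and transition probabilities, and the use of hard genotype calls instead of genotype likelihoods and allele frequencies are all assumptions made for this example.

```python
import math


def viterbi_roh(is_het, p_het_hw=0.3, p_het_az=0.01, p_switch=0.05):
    """Label each site "AZ" (autozygous) or "HW" (not autozygous).

    is_het: sequence of booleans/0-1 flags, True where the call is heterozygous.
    p_het_hw, p_het_az: assumed probability of a heterozygous call in each state
        (in AZ a heterozygous call can only come from error or mutation).
    p_switch: assumed per-site probability of changing state.
    """
    states = ("HW", "AZ")
    p_het = {"HW": p_het_hw, "AZ": p_het_az}

    def log_emit(state, het):
        p = p_het[state]
        return math.log(p if het else 1.0 - p)

    def log_trans(prev, cur):
        return math.log(1.0 - p_switch if prev == cur else p_switch)

    if not is_het:
        return []

    # Uniform prior over the two states at the first site.
    score = {s: math.log(0.5) + log_emit(s, is_het[0]) for s in states}
    back = []
    for het in is_het[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: score[r] + log_trans(r, s))
            new_score[s] = score[best_prev] + log_trans(best_prev, s) + log_emit(s, het)
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)

    # Trace back the most probable state path.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Hypothetical calls: heterozygous flanks around a long homozygous stretch.
    calls = [1, 1, 1] + [0] * 30 + [1, 1, 1]
    print("".join("A" if s == "AZ" else "." for s in viterbi_roh(calls)))
```

The switch probability controls how long a homozygous stretch must be before the decoder prefers the autozygous state, which mirrors the observation in the summary that only longer runs are unlikely to have arisen by chance.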

Collaboration


Dive into Petr Danecek's collaborations.

Top Co-Authors

Shane McCarthy

Wellcome Trust Sanger Institute

Richard Durbin

Wellcome Trust Sanger Institute

Yasin Memari

Wellcome Trust Sanger Institute

Chris Tyler-Smith

Wellcome Trust Sanger Institute

Yali Xue

Wellcome Trust Sanger Institute

Davis J. McCarthy

European Bioinformatics Institute

Klaudia Walter

Wellcome Trust Sanger Institute

Marc Haber

Wellcome Trust Sanger Institute

Sendu Bala

Wellcome Trust Sanger Institute

Adam Faulconbridge

European Bioinformatics Institute
