Publication


Featured research published by Gil McVean.


Nature Genetics | 2007

A new multipoint method for genome-wide association studies by imputation of genotypes

Jonathan Marchini; Bryan Howie; Simon Myers; Gil McVean; Peter Donnelly

Genome-wide association studies are set to become the method of choice for uncovering the genetic basis of human diseases. A central challenge in this area is the development of powerful multipoint methods that can detect causal variants that have not been directly genotyped. We propose a coherent analysis framework that treats the problem as one involving missing or uncertain genotypes. Central to our approach is a model-based imputation method for inferring genotypes at observed or unobserved SNPs, leading to improved power over existing methods for multipoint association mapping. Using real genome-wide association study data, we show that our approach (i) is accurate and well calibrated, (ii) provides detailed views of associated regions that facilitate follow-up studies and (iii) can be used to validate and correct data at genotyped markers. A notable future use of our method will be to boost power by combining data from genome-wide scans that use different SNP sets.


Nature | 2011

Mapping copy number variation by population-scale genome sequencing

Ryan E. Mills; Klaudia Walter; Chip Stewart; Robert E. Handsaker; Ken Chen; Can Alkan; Alexej Abyzov; Seungtai Yoon; Kai Ye; R. Keira Cheetham; Asif T. Chinwalla; Donald F. Conrad; Yutao Fu; Fabian Grubert; Iman Hajirasouliha; Fereydoun Hormozdiari; Lilia M. Iakoucheva; Zamin Iqbal; Shuli Kang; Jeffrey M. Kidd; Miriam K. Konkel; Joshua M. Korn; Ekta Khurana; Deniz Kural; Hugo Y. K. Lam; Jing Leng; Ruiqiang Li; Yingrui Li; Chang-Yun Lin; Ruibang Luo

Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.


Nature Genetics | 2006

A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC

Paul I. W. de Bakker; Gil McVean; Pardis C. Sabeti; Marcos M. Miretti; Todd Green; Jonathan Marchini; Xiayi Ke; Alienke J. Monsuur; Pamela Whittaker; Marcos Delgado; Jonathan Morrison; Angela Richardson; Emily Walsh; Xiaojiang Gao; Luana Galver; John Hart; David A. Hafler; Margaret A. Pericak-Vance; John A. Todd; Mark J. Daly; John Trowsdale; Cisca Wijmenga; Tim J. Vyse; Stephan Beck; Sarah S. Murray; Mary Carrington; Simon G. Gregory; Panos Deloukas; John D. Rioux

The proteins encoded by the classical HLA class I and class II genes in the major histocompatibility complex (MHC) are highly polymorphic and are essential in self versus non-self immune recognition. HLA variation is a crucial determinant of transplant rejection and susceptibility to a large number of infectious and autoimmune diseases. Yet identification of causal variants is problematic owing to linkage disequilibrium that extends across multiple HLA and non-HLA genes in the MHC. We therefore set out to characterize the linkage disequilibrium patterns between the highly polymorphic HLA genes and background variation by typing the classical HLA genes and >7,500 common SNPs and deletion-insertion polymorphisms across four population samples. The analysis provides informative tag SNPs that capture much of the common variation in the MHC region and that could be used in disease association studies, and it provides new insight into the evolutionary dynamics and ancestral origins of the HLA loci and their haplotypes.


Science | 2010

Drive Against Hotspot Motifs in Primates Implicates the PRDM9 Gene in Meiotic Recombination

Simon Myers; Rory Bowden; Afidalina Tumian; Ronald E. Bontrop; Colin Freeman; Tammie S. MacFie; Gil McVean; Peter Donnelly

Homing in on Hotspots. The clustering of recombination in the genome, around locations known as hotspots, is associated with specific DNA motifs. Now, using a variety of techniques, three studies implicate a chromatin-modifying protein, the histone-methyltransferase PRDM9, as a major factor involved in human hotspots (see the Perspective by Cheung et al.). Parvanov et al. (p. 835, published online 31 December) mapped the locus in mice, and analyzed allelic variation in mice and humans, whereas Myers et al. (p. 876, published online 31 December) used a comparative analysis between humans and chimpanzees to show that the recombination process leads to a self-destructive drive in which the very motifs that recruit hotspots are eliminated from our genome. Baudat et al. (p. 836, published online 31 December) took this analysis a step further to identify human allelic variants within Prdm9 that differed in the frequency at which they used hotspots. Furthermore, differential binding of this protein to different human alleles suggests that this protein interacts with specific DNA sequences. Thus, PRDM9 functions in the determination of recombination loci within the genome and may be a significant factor in the genomic differences between closely related species.

Bioinformatics identifies a chromatin-modifying enzyme as a factor in determining recombination hotspots.

Although present in both humans and chimpanzees, recombination hotspots, at which meiotic crossover events cluster, differ markedly in their genomic location between the species. We report that a 13–base pair sequence motif previously associated with the activity of 40% of human hotspots does not function in chimpanzees and is being removed by self-destructive drive in the human lineage. Multiple lines of evidence suggest that the rapidly evolving zinc-finger protein PRDM9 binds to this motif and that sequence changes in the protein may be responsible for hotspot differences between species. The involvement of PRDM9, which causes histone H3 lysine 4 trimethylation, implies that there is a common mechanism for recombination hotspots in eukaryotes but raises questions about what forces have driven such rapid change.


Nature Genetics | 2008

A common sequence motif associated with recombination hot spots and genome instability in humans

Simon Myers; Colin Freeman; Adam Auton; Peter Donnelly; Gil McVean

In humans, most meiotic crossover events are clustered into short regions of the genome known as recombination hot spots. We have previously identified DNA motifs that are enriched in hot spots, particularly the 7-mer CCTCCCT. Here we use the increased hot-spot resolution afforded by the Phase 2 HapMap and novel search methods to identify an extended family of motifs based around the degenerate 13-mer CCNCCNTNNCCNC, which is critical in recruiting crossover events to at least 40% of all human hot spots and which operates on diverse genetic backgrounds in both sexes. Furthermore, these motifs are found in hypervariable minisatellites and are clustered in the breakpoint regions of both disease-causing nonallelic homologous recombination hot spots and common mitochondrial deletion hot spots, implicating the motif as a driver of genome instability.
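To make the degenerate 13-mer concrete, the following is a minimal Python sketch of scanning a sequence for CCNCCNTNNCCNC on both strands. It is an illustration only, not the search method used in the paper: the function names (`find_motif`, `motif_to_regex`, `reverse_complement`) and the test sequence are hypothetical, and `N` is assumed to mean any of A, C, G or T.

```python
import re

# Degenerate 13-mer quoted in the abstract; N is assumed to match any base.
MOTIF = "CCNCCNTNNCCNC"
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile a degenerate motif into a regex that reports overlapping hits."""
    body = "".join("[ACGT]" if base == "N" else base for base in motif)
    # The lookahead consumes no characters, so overlapping matches are found.
    return re.compile(f"(?=({body}))")


def find_motif(seq, motif=MOTIF):
    """Return sorted (start, strand, matched_text) tuples for both strands."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in motif_to_regex(pattern).finditer(seq):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)


# Hypothetical example sequence containing one forward-strand instance.
print(find_motif("AAACCTCCCTAGCCACTT"))
```

Positions are 0-based offsets on the forward strand; a hit marked `-` means the reverse complement of the motif occurs at that offset.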


Nature Genetics | 2012

De novo assembly and genotyping of variants using colored de Bruijn graphs

Zamin Iqbal; Mario Caccamo; Isaac Turner; Paul Flicek; Gil McVean

Detecting genetic variants that are highly divergent from a reference sequence remains a major challenge in genome sequencing. We introduce de novo assembly algorithms using colored de Bruijn graphs for detecting and genotyping simple and complex genetic variants in an individual or population. We provide an efficient software implementation, Cortex, the first de novo assembler capable of assembling multiple eukaryotic genomes simultaneously. Four applications of Cortex are presented. First, we detect and validate both simple and complex structural variations in a high-coverage human genome. Second, we identify more than 3 Mb of sequence absent from the human reference genome, in pooled low-coverage population sequence data from the 1000 Genomes Project. Third, we show how population information from ten chimpanzees enables accurate variant calls without a reference sequence. Last, we estimate classical human leukocyte antigen (HLA) genotypes at HLA-B, the most variable gene in the human genome.
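The idea of a colored de Bruijn graph can be sketched in a few lines of Python. This toy version is an assumption-laden illustration, not Cortex's implementation: it stores each k-mer with the set of samples ("colors") in which it occurs, ignores reverse complements and sequencing errors, and uses hypothetical names (`build_colored_graph`, `successors`, `branch_points`) and made-up sequences.

```python
from collections import defaultdict


def build_colored_graph(samples, k):
    """Map each k-mer to the set of sample names (colors) that contain it."""
    graph = defaultdict(set)
    for color, reads in samples.items():
        for read in reads:
            for i in range(len(read) - k + 1):
                graph[read[i:i + k]].add(color)
    return dict(graph)


def successors(graph, kmer):
    """Return k-mers in the graph that overlap `kmer` by k-1 bases."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in graph]


def branch_points(graph):
    """Return nodes with more than one successor, where a bubble may open."""
    return sorted(kmer for kmer in graph if len(successors(graph, kmer)) > 1)


# Hypothetical toy data: two samples differing by a single base.
samples = {"ref": ["ACGTACCAG"], "alt": ["ACGTTCCAG"]}
graph = build_colored_graph(samples, k=4)

for node in branch_points(graph):
    for nxt in successors(graph, node):
        print(node, "->", nxt, sorted(graph[nxt]))
```

In this toy example the two paths leaving the branch node carry different colors and rejoin at a shared k-mer, which is the bubble structure a variant produces in such a graph.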


Nature Genetics | 2014

Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications

Andy Rimmer; Hang Phan; Iain Mathieson; Zamin Iqbal; Stephen R.F. Twigg; Andrew O.M. Wilkie; Gil McVean; Gerton Lunter

High-throughput DNA sequencing technology has transformed genetic research and is starting to make an impact on clinical practice. However, analyzing high-throughput sequencing data remains challenging, particularly in clinical settings where accuracy and turnaround times are critical. We present a new approach to this problem, implemented in a software package called Platypus. Platypus achieves high sensitivity and specificity for SNPs, indels and complex polymorphisms by using local de novo assembly to generate candidate variants, followed by local realignment and probabilistic haplotype estimation. It is an order of magnitude faster than existing tools and generates calls from raw aligned read data without preprocessing. We demonstrate the performance of Platypus in clinically relevant experimental designs by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.


PLOS Genetics | 2009

A genealogical interpretation of principal components analysis

Gil McVean

Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's FST and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference.


Nature | 2012

TNF receptor 1 genetic risk mirrors outcome of anti-TNF therapy in multiple sclerosis

Adam Patrick Gregory; Calliope A. Dendrou; Kathrine E. Attfield; Aiden Haghikia; Dionysia K. Xifara; Falk Butter; Gereon Poschmann; Gurman Kaur; Lydia Lambert; Oliver A. Leach; Simone Prömel; Divya Punwani; James H. Felce; Simon J. Davis; Ralf Gold; Finn C. Nielsen; Richard M. Siegel; Matthias Mann; John I. Bell; Gil McVean; Lars Fugger

Although there has been much success in identifying genetic variants associated with common diseases using genome-wide association studies (GWAS), it has been difficult to demonstrate which variants are causal and what role they have in disease. Moreover, the modest contribution that these variants make to disease risk has raised questions regarding their medical relevance. Here we have investigated a single nucleotide polymorphism (SNP) in the TNFRSF1A gene, which encodes tumour necrosis factor receptor 1 (TNFR1), which was discovered through GWAS to be associated with multiple sclerosis (MS), but not with other autoimmune conditions such as rheumatoid arthritis, psoriasis and Crohn’s disease. By analysing MS GWAS data in conjunction with the 1000 Genomes Project data we provide genetic evidence that strongly implicates this SNP, rs1800693, as the causal variant in the TNFRSF1A region. We further substantiate this through functional studies showing that the MS risk allele directs expression of a novel, soluble form of TNFR1 that can block TNF. Importantly, TNF-blocking drugs can promote onset or exacerbation of MS, but they have proven highly efficacious in the treatment of autoimmune diseases for which there is no association with rs1800693. This indicates that the clinical experience with these drugs parallels the disease association of rs1800693, and that the MS-associated TNFR1 variant mimics the effect of TNF-blocking drugs. Hence, our study demonstrates that clinical practice can be informed by comparing GWAS across common autoimmune diseases and by investigating the functional consequences of the disease-associated genetic variation.


Nature Genetics | 2012

Differential confounding of rare and common variants in spatially structured populations

Iain Mathieson; Gil McVean

Well-powered genome-wide association studies, now made possible through advances in technology and large-scale collaborative projects, promise to characterize the contribution of rare variants to complex traits and disease. However, while population structure is a known confounder of association studies, it remains unknown whether methods developed to control stratification are equally effective for rare variants. Here, we demonstrate that rare variants can show a stratification that is systematically different from, and typically stronger than, common variants, and this is not necessarily corrected by existing methods. We show that the same process leads to inflation for load-based tests and can obscure signals at truly associated variants. Furthermore, we show that populations can display spatial structure in rare variants, even when Wright's fixation index FST is low, but that allele frequency–dependent metrics of allele sharing can reveal localized stratification. These results underscore the importance of collecting and integrating spatial information in the genetic analysis of complex traits.

Collaboration


Top Co-Authors

Zamin Iqbal

Wellcome Trust Centre for Human Genetics


Alexander Dilthey

National Institutes of Health


Adam Auton

Albert Einstein College of Medicine


Chris C. A. Spencer

Wellcome Trust Centre for Human Genetics