Benjamin M. Neale | Researchain

Publication

Featured research published by Benjamin M. Neale.

American Journal of Human Genetics | 2007

PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses

Shaun Purcell; Benjamin M. Neale; Kathe Todd-Brown; Lori Thomas; Manuel A. Ferreira; David Bender; Julian Maller; Pamela Sklar; Paul I. W. de Bakker; Mark J. Daly; Pak Sham

Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.
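
The identity-by-state (IBS) information mentioned in the abstract can be illustrated with a minimal sketch. It assumes genotypes coded as minor-allele counts (0, 1 or 2) with `None` for missing calls; the function name `ibs_proportion` and the toy genotype vectors are hypothetical and are not taken from PLINK itself, whose implementation is considerably more involved.

```python
from typing import Optional, Sequence


def ibs_proportion(g1: Sequence[Optional[int]], g2: Sequence[Optional[int]]) -> float:
    """Mean proportion of alleles shared identical by state between two people.

    Genotypes are minor-allele counts (0, 1 or 2); None marks a missing call.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    shared = 0
    n_markers = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip markers where either genotype is missing
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this marker
        n_markers += 1
    if n_markers == 0:
        raise ValueError("no markers with both genotypes called")
    return shared / (2 * n_markers)


# Hypothetical toy data: six markers, one missing call in person_a.
person_a = [0, 1, 2, 2, None, 1]
person_b = [0, 2, 2, 0, 1, 1]
print(ibs_proportion(person_a, person_b))
```

Averaging over many markers gives a pairwise similarity that could feed into clustering or relatedness checks; estimating identity by descent from it requires allele frequencies and is not attempted here.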

Nature | 2016

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek; Konrad J. Karczewski; Eric Vallabh Minikel; Kaitlin E. Samocha; Eric Banks; Timothy Fennell; Anne H. O’Donnell-Luria; James S. Ware; Andrew Hill; Beryl B. Cummings; Taru Tukiainen; Daniel P. Birnbaum; Jack A. Kosmicki; Laramie Duncan; Karol Estrada; Fengmei Zhao; James Zou; Emma Pierce-Hoffman; Joanne Berghout; David Neil Cooper; Nicole Deflaux; Mark A. DePristo; Ron Do; Jason Flannick; Menachem Fromer; Laura Gauthier; Jackie Goldstein; Namrata Gupta; Daniel P. Howrigan; Adam Kiezun

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human ‘knockout’ variants in protein-coding genes.

The Lancet | 2013

Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis

Jordan W. Smoller; Kenneth S. Kendler; Nicholas John Craddock; Phil H. Lee; Benjamin M. Neale; John I. Nurnberger; Stephan Ripke; Susan L. Santangelo; Patrick F. Sullivan; Shaun Purcell; Richard Anney; Jan K. Buitelaar; Ayman H. Fanous; Stephen V. Faraone; Witte J. G. Hoogendijk; Klaus-Peter Lesch; Douglas F. Levinson; Roy H. Perlis; Marcella Rietschel; Brien P. Riley; Edmund Sonuga-Barke; Russell Schachar; Thomas G. Schulze; Anita Thapar; Michael C. Neale; Patrick Bender; Sven Cichon; Mark J. Daly; John R. Kelsoe; Thomas Lehner

BACKGROUND: Findings from family and twin studies suggest that genetic contributions to psychiatric disorders do not in all cases map to present diagnostic categories. We aimed to identify specific variants underlying genetic effects shared between the five disorders in the Psychiatric Genomics Consortium: autism spectrum disorder, attention deficit-hyperactivity disorder, bipolar disorder, major depressive disorder, and schizophrenia. METHODS: We analysed genome-wide single-nucleotide polymorphism (SNP) data for the five disorders in 33,332 cases and 27,888 controls of European ancestry. To characterise allelic effects on each disorder, we applied a multinomial logistic regression procedure with model selection to identify the best-fitting model of relations between genotype and phenotype. We examined cross-disorder effects of genome-wide significant loci previously identified for bipolar disorder and schizophrenia, and used polygenic risk-score analysis to examine such effects from a broader set of common variants. We undertook pathway analyses to establish the biological associations underlying genetic overlap for the five disorders. We used enrichment analysis of expression quantitative trait loci (eQTL) data to assess whether SNPs with cross-disorder association were enriched for regulatory SNPs in post-mortem brain-tissue samples. FINDINGS: SNPs at four loci surpassed the cutoff for genome-wide significance (p < 5 × 10⁻⁸) in the primary analysis: regions on chromosomes 3p21 and 10q24, and SNPs within two L-type voltage-gated calcium channel subunits, CACNA1C and CACNB2. Model selection analysis supported effects of these loci for several disorders. Loci previously associated with bipolar disorder or schizophrenia had variable diagnostic specificity. Polygenic risk scores showed cross-disorder associations, notably between adult-onset disorders. Pathway analysis supported a role for calcium channel signalling genes for all five disorders. Finally, SNPs with evidence of cross-disorder association were enriched for brain eQTL markers. INTERPRETATION: Our findings show that specific SNPs are associated with a range of psychiatric disorders of childhood onset or adult onset. In particular, variation in calcium-channel activity genes seems to have pleiotropic effects on psychopathology. These results provide evidence relevant to the goal of moving beyond descriptive syndromes in psychiatry, and towards a nosology informed by disease cause. FUNDING: National Institute of Mental Health.

Nature | 2012

Patterns and rates of exonic de novo mutations in autism spectrum disorders

Benjamin M. Neale; Yan Kou; Li Liu; Avi Ma'ayan; Kaitlin E. Samocha; Aniko Sabo; Chiao-Feng Lin; Christine Stevens; Li-San Wang; Vladimir Makarov; Pazi Penchas Polak; Seungtai Yoon; Jared Maguire; Emily L. Crawford; Nicholas G. Campbell; Evan T. Geller; Otto Valladares; Chad Shafer; Han Liu; Tuo Zhao; Guiqing Cai; Jayon Lihm; Ruth Dannenfelser; Omar Jabado; Zuleyma Peralta; Uma Nagaswamy; Donna M. Muzny; Jeffrey G. Reid; Irene Newsham; Yuanqing Wu

Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, here we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant, and the overall rate of mutation is only modestly higher than the expected rate. In contrast, the proteins encoded by genes that harboured de novo missense or nonsense mutations showed a higher degree of connectivity among themselves and to previous ASD genes as indexed by protein-protein interaction screens. The small increase in the rate of de novo events, when taken together with the protein interaction results, are consistent with an important but limited role for de novo point mutations in ASD, similar to that documented for de novo copy number variants. Genetic models incorporating these data indicate that most of the observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (that is, not necessarily sufficient for disease). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increases risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case–control study provide strong evidence in favour of CHD8 and KATNAL2 as genuine autism risk factors.

American Journal of Human Genetics | 2004

The Future of Association Studies: Gene-Based Analysis and Replication

Benjamin M. Neale; Pak Sham

Historically, association tests were limited to single variants, so that the allele was considered the basic unit for association testing. As marker density increases and indirect approaches are used to assess association through linkage disequilibrium, association is now frequently considered at the haplotypic level. We suggest that there are difficulties in replicating association findings at the single-nucleotide-polymorphism (SNP) or the haplotype level, and we propose a shift toward a gene-based approach in which all common variation within a candidate gene is considered jointly. Inconsistencies arising from population differences are more readily resolved by use of a gene-based approach rather than either a SNP-based or a haplotype-based approach. A gene-based approach captures all of the potential risk-conferring variations; thus, negative findings are subject only to the issue of power. In addition, chance findings due to multiple testing can be readily accounted for by use of a genewide-significance level. Meta-analysis procedures can be formalized for gene-based methods through the combination of P values. It is only a matter of time before all variation within genes is mapped, at which point the gene-based approach will become the natural end point for association analysis and will inform our search for functional variants relevant to disease etiology.
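
The abstract mentions combining P values for gene-based meta-analysis without naming a specific procedure. As one illustration only, the sketch below uses Fisher's method, which is an assumption here rather than the authors' stated choice. The function name `fisher_combined_p` and the example P values are hypothetical. Fisher's method assumes the combined tests are independent, for example one gene-wide P value per independent study.

```python
import math
from typing import Iterable


def fisher_combined_p(p_values: Iterable[float]) -> float:
    """Combine independent P values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    survival function has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    ps = list(p_values)
    if not ps or any(not (0.0 < p <= 1.0) for p in ps):
        raise ValueError("need at least one P value, each in (0, 1]")
    k = len(ps)
    half_stat = -sum(math.log(p) for p in ps)  # this is X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i  # builds (X/2)^i / i! incrementally
        total += term
    return math.exp(-half_stat) * total


# Hypothetical gene-wide P values from three independent studies.
print(fisher_combined_p([0.04, 0.10, 0.03]))
```

With a single P value the function returns that value unchanged, which is a quick sanity check on the closed form.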

Nature Genetics | 2015

LD Score regression distinguishes confounding from polygenicity in genome-wide association studies

Brendan Bulik-Sullivan; Po-Ru Loh; Hilary Finucane; Stephan Ripke; Jian Yang; Nick Patterson; Mark J. Daly; Alkes L. Price; Benjamin M. Neale

Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
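
The relationship between test statistics and LD described above can be sketched as a straight-line fit. The sketch assumes the linear model from the LD Score regression paper, in which the expected chi-square statistic of a SNP is (N·h²/M)·ℓ + intercept, where ℓ is the SNP's LD Score, N the sample size, M the number of SNPs, and an intercept above 1 reflects confounding. The published method uses a weighted regression with resampling-based standard errors; this version is an unweighted ordinary least squares fit on noise-free synthetic values, and all names and numbers are hypothetical.

```python
from typing import Sequence, Tuple


def ols_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic, noise-free inputs (all values are illustrative assumptions).
n_samples = 50_000      # N: GWAS sample size
n_snps = 1_000_000      # M: number of SNPs
h2_true = 0.4           # SNP heritability used to generate the data
confounding = 0.05      # inflation from confounding, added to the intercept
ld_scores = [10.0, 50.0, 100.0, 200.0, 400.0]
chi2 = [n_samples * h2_true * l / n_snps + 1.0 + confounding for l in ld_scores]

intercept, slope = ols_line(ld_scores, chi2)
h2_est = slope * n_snps / n_samples  # rescale the slope back to heritability
print(intercept, h2_est)
```

In this toy setting the slope carries the polygenic signal and the intercept carries the confounding, which is the separation the abstract describes.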

Nature Genetics | 2007

Two independent alleles at 6q23 associated with risk of rheumatoid arthritis

Robert M. Plenge; Chris Cotsapas; Leela Davies; Alkes L. Price; Paul I. W. de Bakker; Julian Maller; Itsik Pe'er; Noël P. Burtt; Brendan Blumenstiel; Matt DeFelice; Melissa Parkin; Rachel Barry; Wendy Winslow; Claire Healy; Robert R. Graham; Benjamin M. Neale; Elena Izmailova; Ronenn Roubenoff; Alex Parker; Roberta Glass; Elizabeth W. Karlson; Nancy E. Maher; David A. Hafler; David M. Lee; Michael F. Seldin; Elaine F. Remmers; Annette Lee; Leonid Padyukov; Lars Alfredsson; Jonathan S. Coblyn

To identify susceptibility alleles associated with rheumatoid arthritis, we genotyped 397 individuals with rheumatoid arthritis for 116,204 SNPs and carried out an association analysis in comparison to publicly available genotype data for 1,211 related individuals from the Framingham Heart Study. After evaluating and adjusting for technical and population biases, we identified a SNP at 6q23 (rs10499194, ∼150 kb from TNFAIP3 and OLIG3) that was reproducibly associated with rheumatoid arthritis both in the genome-wide association (GWA) scan and in 5,541 additional case-control samples (P = 10⁻³, GWA scan; P < 10⁻⁶, replication; P = 10⁻⁹, combined). In a concurrent study, the Wellcome Trust Case Control Consortium (WTCCC) has reported strong association of rheumatoid arthritis susceptibility to a different SNP located 3.8 kb from rs10499194 (rs6920220; P = 5 × 10⁻⁶ in WTCCC). We show that these two SNP associations are statistically independent, are each reproducible in the comparison of our data and WTCCC data, and define risk and protective haplotypes for rheumatoid arthritis at 6q23.

Nature Genetics | 2015

An atlas of genetic correlations across human diseases and traits

Brendan Bulik-Sullivan; Hilary Finucane; Verneri Anttila; Alexander Gusev; Felix R. Day; Po-Ru Loh; Laramie Duncan; John Perry; Nick Patterson; Elise B. Robinson; Mark J. Daly; Alkes L. Price; Benjamin M. Neale

Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual-level genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique—cross-trait LD Score regression—for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use this method to estimate 276 genetic correlations among 24 traits. The results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity, and educational attainment and several diseases. These results highlight the power of genome-wide analyses, as there currently are no significantly associated SNPs for anorexia nervosa and only three for educational attainment.
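
The count of 276 genetic correlations is consistent with every unordered pair among 24 traits being estimated once. A one-line check, with hypothetical variable names:

```python
import math

n_traits = 24
n_pairs = math.comb(n_traits, 2)  # number of unordered trait pairs
print(n_pairs)
```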

Nature Genetics | 2011

Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease

Manuel A. Rivas; Mélissa Beaudoin; Agnès Gardet; Christine Stevens; Yashoda Sharma; Clarence K. Zhang; Gabrielle Boucher; Stephan Ripke; David Ellinghaus; Noël P. Burtt; Timothy Fennell; Andrew Kirby; Anna Latiano; Philippe Goyette; Todd Green; Jonas Halfvarson; Talin Haritunians; Joshua M. Korn; Finny Kuruvilla; Caroline Lagacé; Benjamin M. Neale; Ken Sin Lo; Phil Schumm; Leif Törkvist; Marla Dubinsky; Steven R. Brant; Mark S. Silverberg; Richard H. Duerr; David Altshuler; Stacey Gabriel

More than 1,000 susceptibility loci have been identified through genome-wide association studies (GWAS) of common variants; however, the specific genes and full allelic spectrum of causal variants underlying these findings have not yet been defined. Here we used pooled next-generation sequencing to study 56 genes from regions associated with Crohn's disease in 350 cases and 350 controls. Through follow-up genotyping of 70 rare and low-frequency protein-altering variants in nine independent case-control series (16,054 Crohn's disease cases, 12,153 ulcerative colitis cases and 17,575 healthy controls), we identified four additional independent risk factors in NOD2, two additional protective variants in IL23R, a highly significant association with a protective splice variant in CARD9 (P < 1 × 10⁻¹⁶, odds ratio ≈ 0.29) and additional associations with coding variants in IL18RAP, CUL2, C1orf106, PTPN22 and MUC19. We extend the results of successful GWAS by identifying new, rare and probably functional variants that could aid functional experiments and predictive models.

Nature Genetics | 2008

Common variants at CD40 and other loci confer risk of rheumatoid arthritis

Soumya Raychaudhuri; Elaine F. Remmers; Annette Lee; Rachel Hackett; Candace Guiducci; Noël P. Burtt; Lauren Gianniny; Benjamin D. Korman; Leonid Padyukov; Fina Kurreeman; Monica Chang; Joseph J. Catanese; Bo Ding; Sandra Wong; Annette H. M. van der Helm-van Mil; Benjamin M. Neale; Jonathan S. Coblyn; Jing Cui; Paul P. Tak; Gert Jan Wolbink; J. Bart A. Crusius; Irene E. van der Horst-Bruinsma; Lindsey A. Criswell; Christopher I. Amos; Michael F. Seldin; Daniel L. Kastner; Kristin Ardlie; Lars Alfredsson; Karen H. Costenbader; David Altshuler

To identify rheumatoid arthritis risk loci in European populations, we conducted a meta-analysis of two published genome-wide association (GWA) studies totaling 3,393 cases and 12,462 controls. We genotyped 31 top-ranked SNPs not previously associated with rheumatoid arthritis in an independent replication of 3,929 autoantibody-positive rheumatoid arthritis cases and 5,807 matched controls from eight separate collections. We identified a common variant at the CD40 gene locus (rs4810485, P = 0.0032 replication, P = 8.2 × 10⁻⁹ overall, OR = 0.87). Along with other associations near TRAF1 (refs. 2,3) and TNFAIP3 (refs. 4,5), this implies a central role for the CD40 signaling pathway in rheumatoid arthritis pathogenesis. We also identified association at the CCL21 gene locus (rs2812378, P = 0.00097 replication, P = 2.8 × 10⁻⁷ overall), a gene involved in lymphocyte trafficking. Finally, we identified evidence of association at four additional gene loci: MMEL1-TNFRSF14 (rs3890745, P = 0.0035 replication, P = 1.1 × 10⁻⁷ overall), CDK6 (rs42041, P = 0.010 replication, P = 4.0 × 10⁻⁶ overall), PRKCQ (rs4750316, P = 0.0078 replication, P = 4.4 × 10⁻⁶ overall), and KIF5A-PIP4K2C (rs1678542, P = 0.0026 replication, P = 8.8 × 10⁻⁸ overall).
