Stephen Burgess
University of Cambridge
Publication
Featured research published by Stephen Burgess.
International Journal of Epidemiology | 2015
Jack Bowden; George Davey Smith; Stephen Burgess
Background: The number of Mendelian randomization analyses including large numbers of genetic variants is rapidly increasing. This is due to the proliferation of genome-wide association studies, and the desire to obtain more precise estimates of causal effects. However, some genetic variants may not be valid instrumental variables, in particular due to them having more than one proximal phenotypic correlate (pleiotropy). Methods: We view Mendelian randomization with multiple instruments as a meta-analysis, and show that bias caused by pleiotropy can be regarded as analogous to small study bias. Causal estimates using each instrument can be displayed visually by a funnel plot to assess potential asymmetry. Egger regression, a tool to detect small study bias in meta-analysis, can be adapted to test for bias from pleiotropy, and the slope coefficient from Egger regression provides an estimate of the causal effect. Under the assumption that the association of each genetic variant with the exposure is independent of the pleiotropic effect of the variant (not via the exposure), Egger’s test gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid instrumental variables. Results: We illustrate the use of this approach by re-analysing two published Mendelian randomization studies of the causal effect of height on lung function, and the causal effect of blood pressure on coronary artery disease risk. The conservative nature of this approach is illustrated with these examples. Conclusions: An adaption of Egger regression (which we call MR-Egger) can detect some violations of the standard instrumental variable assumptions, and provide an effect estimate which is not subject to these violations. The approach provides a sensitivity analysis for the robustness of the findings from a Mendelian randomization investigation.
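The regression described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `mr_egger`, the inverse-variance weights taken from the outcome standard errors, and the toy numbers are all assumptions made here. Each variant is first oriented so that its association with the exposure is positive; the intercept then reflects average pleiotropy and the slope is the causal estimate.

```python
def mr_egger(beta_x, beta_y, se_y):
    """Return (intercept, slope) from a weighted regression of beta_y on beta_x."""
    # Orient every variant so that its exposure association is positive;
    # the outcome association flips sign together with it.
    bx = [abs(x) for x in beta_x]
    by = [y if x >= 0 else -y for x, y in zip(beta_x, beta_y)]
    # Inverse-variance weights from the outcome standard errors (assumed here).
    w = [1.0 / s ** 2 for s in se_y]
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, bx)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, by)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, bx))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, bx, by))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up summary data: every variant carries the same pleiotropic offset (0.02)
# and the true causal slope is 0.5.
beta_x = [0.1, -0.2, 0.3, 0.4]
beta_y = [0.07, -0.12, 0.17, 0.22]
se_y = [0.01, 0.02, 0.01, 0.02]
intercept, slope = mr_egger(beta_x, beta_y, se_y)
print(round(intercept, 3), round(slope, 3))
```

An intercept away from zero points to directional pleiotropy, in the same way that funnel-plot asymmetry points to small study bias in a meta-analysis.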
Genetic Epidemiology | 2013
Stephen Burgess; Adam S. Butterworth; Simon G. Thompson
Genome‐wide association studies, which typically report regression coefficients summarizing the associations of many genetic variants with various traits, are potentially a powerful source of data for Mendelian randomization investigations. We demonstrate how such coefficients from multiple variants can be combined in a Mendelian randomization analysis to estimate the causal effect of a risk factor on an outcome. The bias and efficiency of estimates based on summarized data are compared to those based on individual‐level data in simulation studies. We investigate the impact of gene–gene interactions, linkage disequilibrium, and ‘weak instruments’ on these estimates. Both an inverse‐variance weighted average of variant‐specific associations and a likelihood‐based approach for summarized data give similar estimates and precision to the two‐stage least squares method for individual‐level data, even when there are gene–gene interactions. However, these summarized data methods overstate precision when variants are in linkage disequilibrium. If the P‐value in a linear regression of the risk factor for each variant is less than 1 × 10⁻⁵, then weak instrument bias will be small. We use these methods to estimate the causal association of low‐density lipoprotein cholesterol (LDL‐C) on coronary artery disease using published data on five genetic variants. A 30% reduction in LDL‐C is estimated to reduce coronary artery disease risk by 67% (95% CI: 54% to 76%). We conclude that Mendelian randomization investigations using summarized data from uncorrelated variants are similarly efficient to those using individual‐level data, although the necessary assumptions cannot be so fully assessed.
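A minimal sketch of the inverse-variance weighted average mentioned in this abstract follows. It assumes uncorrelated variants and treats the variant-exposure associations as known without error; the function name `ivw_estimate` and the toy numbers are illustrative only and are not taken from the paper.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted causal estimate from summarized data.

    beta_x: variant-exposure associations
    beta_y: variant-outcome associations
    se_y:   standard errors of beta_y
    """
    # Each ratio beta_y / beta_x has approximate variance se_y^2 / beta_x^2,
    # so its inverse-variance weight is beta_x^2 / se_y^2.
    weights = [x ** 2 / s ** 2 for x, s in zip(beta_x, se_y)]
    numerator = sum(x * y / s ** 2 for x, y, s in zip(beta_x, beta_y, se_y))
    estimate = numerator / sum(weights)
    std_error = math.sqrt(1.0 / sum(weights))
    return estimate, std_error


# Made-up data in which every variant implies the same ratio estimate of 0.5.
estimate, std_error = ivw_estimate([0.1, 0.2, 0.4], [0.05, 0.1, 0.2], [0.01, 0.02, 0.02])
print(estimate, std_error)
```

Because the weights are simply summed, correlated variants would be counted as independent information, which is one way to see why precision is overstated under linkage disequilibrium.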
Genetic Epidemiology | 2016
Jack Bowden; George Davey Smith; Philip Haycock; Stephen Burgess
Developments in genome‐wide association studies and the increasing availability of summary genetic association data have made application of Mendelian randomization relatively straightforward. However, obtaining reliable results from a Mendelian randomization investigation remains problematic, as the conventional inverse‐variance weighted method only gives consistent estimates if all of the genetic variants in the analysis are valid instrumental variables. We present a novel weighted median estimator for combining data on multiple genetic variants into a single causal estimate. This estimator is consistent even when up to 50% of the information comes from invalid instrumental variables. In a simulation analysis, it is shown to have better finite‐sample Type 1 error rates than the inverse‐variance weighted method, and is complementary to the recently proposed MR‐Egger (Mendelian randomization‐Egger) regression method. In analyses of the causal effects of low‐density lipoprotein cholesterol and high‐density lipoprotein cholesterol on coronary artery disease risk, the inverse‐variance weighted method suggests a causal effect of both lipid fractions, whereas the weighted median and MR‐Egger regression methods suggest a null effect of high‐density lipoprotein cholesterol that corresponds with the experimental evidence. Both median‐based and MR‐Egger regression methods should be considered as sensitivity analyses for Mendelian randomization investigations with multiple genetic variants.
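The weighted median idea can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the weights are taken to be the inverse variances of the per-variant ratio estimates, the median is found by linear interpolation on the cumulative weights, and the name `weighted_median_estimate` and the toy data are hypothetical.

```python
def weighted_median_estimate(beta_x, beta_y, se_y):
    """Weighted median of per-variant ratio estimates beta_y / beta_x."""
    ratios = [y / x for x, y in zip(beta_x, beta_y)]
    # Assumed weights: inverse variance of each ratio estimate.
    weights = [x ** 2 / s ** 2 for x, s in zip(beta_x, se_y)]
    pairs = sorted(zip(ratios, weights))
    theta = [r for r, _ in pairs]
    total = sum(weights)
    w = [wi / total for _, wi in pairs]
    if len(theta) == 1:
        return theta[0]
    # Percentile position of each ordered estimate: cumulative weight up to
    # and including the variant, minus half of its own weight.
    positions = []
    running = 0.0
    for wi in w:
        running += wi
        positions.append(running - 0.5 * wi)
    # Interpolate linearly between the two estimates that straddle 50%.
    for j in range(len(theta) - 1):
        if positions[j + 1] >= 0.5:
            frac = (0.5 - positions[j]) / (positions[j + 1] - positions[j])
            return theta[j] + (theta[j + 1] - theta[j]) * frac
    return theta[-1]


# Made-up data: two variants agree roughly (ratios 1 and 2), one is an outlier (10).
result = weighted_median_estimate([0.1, 0.1, 0.1], [0.1, 0.2, 1.0], [0.01, 0.01, 0.01])
print(result)
```

With equal weights the outlying variant moves the estimate no further than the middle ratio, whereas an inverse-variance weighted mean of the same three ratios would be pulled well above it.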
International Journal of Epidemiology | 2013
Stephen Burgess; Simon G. Thompson
Background An allele score is a single variable summarizing multiple genetic variants associated with a risk factor. It is calculated as the total number of risk factor-increasing alleles for an individual (unweighted score), or the sum of weights for each allele corresponding to estimated genetic effect sizes (weighted score). An allele score can be used in a Mendelian randomization analysis to estimate the causal effect of the risk factor on an outcome. Methods Data were simulated to investigate the use of allele scores in Mendelian randomization where conventional instrumental variable techniques using multiple genetic variants demonstrate ‘weak instrument’ bias. The robustness of estimates using the allele score to misspecification (for example non-linearity, effect modification) and to violations of the instrumental variable assumptions was assessed. Results Causal estimates using a correctly specified allele score were unbiased with appropriate coverage levels. The estimates were generally robust to misspecification of the allele score, but not to instrumental variable violations, even if the majority of variants in the allele score were valid instruments. Using a weighted rather than an unweighted allele score increased power, but the increase was small when genetic variants had similar effect sizes. Naive use of the data under analysis to choose which variants to include in an allele score, or for deriving weights, resulted in substantial biases. Conclusions Allele scores enable valid causal estimates with large numbers of genetic variants. The stringency of criteria for genetic variants in Mendelian randomization should be maintained for all variants in an allele score.
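The two scores defined in this abstract reduce to a sum and a weighted sum. The sketch below assumes genotypes are coded as counts (0, 1 or 2) of the risk factor-increasing allele; the function names and the numbers are hypothetical. In line with the abstract's warning, the weights are assumed to come from an external source rather than from the data under analysis.

```python
def unweighted_score(genotypes):
    """Total number of risk factor-increasing alleles for one individual."""
    return sum(genotypes)


def weighted_score(genotypes, weights):
    """Sum of allele counts, each multiplied by its estimated effect size."""
    if len(genotypes) != len(weights):
        raise ValueError("one weight is needed per variant")
    return sum(g * w for g, w in zip(genotypes, weights))


# One individual genotyped at three variants (made-up values).
genotypes = [0, 1, 2]
# Hypothetical per-allele effect sizes taken from an independent study.
external_weights = [0.1, 0.2, 0.3]

print(unweighted_score(genotypes))
print(weighted_score(genotypes, external_weights))
```

When the weights are all similar, the two scores are close to proportional, which matches the finding that weighting adds little power in that case.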
JAMA | 2015
E Di Angelantonio; Stephen Kaptoge; David Wormser; Peter Willeit; Adam S. Butterworth; Narinder Bansal; L M O'Keeffe; Pei Gao; Angela M. Wood; Stephen Burgess; Daniel F. Freitag; Lisa Pennells; Sanne A.E. Peters; Carole Hart; Lise Lund Håheim; Richard F. Gillum; Børge G. Nordestgaard; Bruce M. Psaty; Bu B. Yeap; Matthew Knuiman; Paul J. Nietert; Jussi Kauhanen; Jukka T. Salonen; Lewis H. Kuller; Leon A. Simons; Y. T. van der Schouw; Elizabeth Barrett-Connor; Randi Selmer; Carlos J. Crespo; Beatriz L. Rodriguez
IMPORTANCE The prevalence of cardiometabolic multimorbidity is increasing. OBJECTIVE To estimate reductions in life expectancy associated with cardiometabolic multimorbidity. DESIGN, SETTING, AND PARTICIPANTS Age- and sex-adjusted mortality rates and hazard ratios (HRs) were calculated using individual participant data from the Emerging Risk Factors Collaboration (689,300 participants; 91 cohorts; years of baseline surveys: 1960-2007; latest mortality follow-up: April 2013; 128,843 deaths). The HRs from the Emerging Risk Factors Collaboration were compared with those from the UK Biobank (499,808 participants; years of baseline surveys: 2006-2010; latest mortality follow-up: November 2013; 7995 deaths). Cumulative survival was estimated by applying calculated age-specific HRs for mortality to contemporary US age-specific death rates. EXPOSURES A history of 2 or more of the following: diabetes mellitus, stroke, myocardial infarction (MI). MAIN OUTCOMES AND MEASURES All-cause mortality and estimated reductions in life expectancy. RESULTS In participants in the Emerging Risk Factors Collaboration without a history of diabetes, stroke, or MI at baseline (reference group), the all-cause mortality rate adjusted to the age of 60 years was 6.8 per 1000 person-years. Mortality rates per 1000 person-years were 15.6 in participants with a history of diabetes, 16.1 in those with stroke, 16.8 in those with MI, 32.0 in those with both diabetes and MI, 32.5 in those with both diabetes and stroke, 32.8 in those with both stroke and MI, and 59.5 in those with diabetes, stroke, and MI. 
Compared with the reference group, the HRs for all-cause mortality were 1.9 (95% CI, 1.8-2.0) in participants with a history of diabetes, 2.1 (95% CI, 2.0-2.2) in those with stroke, 2.0 (95% CI, 1.9-2.2) in those with MI, 3.7 (95% CI, 3.3-4.1) in those with both diabetes and MI, 3.8 (95% CI, 3.5-4.2) in those with both diabetes and stroke, 3.5 (95% CI, 3.1-4.0) in those with both stroke and MI, and 6.9 (95% CI, 5.7-8.3) in those with diabetes, stroke, and MI. The HRs from the Emerging Risk Factors Collaboration were similar to those from the more recently recruited UK Biobank. The HRs were little changed after further adjustment for markers of established intermediate pathways (eg, levels of lipids and blood pressure) and lifestyle factors (eg, smoking, diet). At the age of 60 years, a history of any 2 of these conditions was associated with 12 years of reduced life expectancy and a history of all 3 of these conditions was associated with 15 years of reduced life expectancy. CONCLUSIONS AND RELEVANCE Mortality associated with a history of diabetes, stroke, or MI was similar for each condition. Because any combination of these conditions was associated with multiplicative mortality risk, life expectancy was substantially lower in people with multimorbidity.
American Journal of Epidemiology | 2013
Brandon L. Pierce; Stephen Burgess
Mendelian randomization (MR) is a method for estimating the causal relationship between an exposure and an outcome using a genetic factor as an instrumental variable (IV) for the exposure. In the traditional MR setting, data on the IV, exposure, and outcome are available for all participants. However, obtaining complete exposure data may be difficult in some settings, due to high measurement costs or lack of appropriate biospecimens. We used simulated data sets to assess statistical power and bias for MR when exposure data are available for a subset (or an independent set) of participants. We show that obtaining exposure data for a subset of participants is a cost-efficient strategy, often having negligible effects on power in comparison with a traditional complete-data analysis. The size of the subset needed to achieve maximum power depends on IV strength, and maximum power is approximately equal to the power of traditional IV estimators. Weak IVs are shown to lead to bias towards the null when the subsample is small and towards the confounded association when the subset is relatively large. Various approaches for confidence interval calculation are considered. These results have important implications for reducing the costs and increasing the feasibility of MR studies.
International Journal of Epidemiology | 2011
Stephen Burgess; Simon G. Thompson
BACKGROUND Mendelian randomization is used to test and estimate the magnitude of a causal effect of a phenotype on an outcome by using genetic variants as instrumental variables (IVs). Estimates of association from IV analysis are biased in the direction of the confounded, observational association between phenotype and outcome. The magnitude of the bias depends on the F-statistic for the strength of relationship between IVs and phenotype. We seek to develop guidelines for the design and analysis of Mendelian randomization studies to minimize bias. METHODS IV analysis was performed on simulated and real data to investigate the effect on bias of size of study, number and choice of instruments and method of analysis. RESULTS Bias is shown to increase as the expected F-statistic decreases, and can be reduced by using parsimonious models of genetic association (i.e. not over-parameterized) and by adjusting for measured covariates. Using data from a single study, the causal estimate of a unit increase in log-transformed C-reactive protein on fibrinogen (μmol/l) is shown to increase from -0.005 (P = 0.99) to 0.792 (P = 0.00003) due to injudicious choice of instrument. Moreover, when the observed F-statistic is larger than expected in a particular study, the causal estimate is more biased towards the observational association and its standard error is smaller. This correlation between causal estimate and standard error introduces a second source of bias into meta-analysis of Mendelian randomization studies. Bias can be alleviated in meta-analyses by using individual level data and by pooling genetic effects across studies. CONCLUSIONS Weak instrument bias is of practical importance for the design and analysis of Mendelian randomization studies. Post hoc choice of instruments, genetic models or data based on measured F-statistics can exacerbate bias. In particular, the commonly cited rule of thumb that F > 10 avoids bias in IV analysis is misleading.
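For readers unfamiliar with the F-statistic referred to in this abstract, the sketch below computes it from the standard linear regression formula, given the proportion of variance in the phenotype explained by the instruments. The function name `f_statistic` and the inputs are illustrative assumptions; the abstract itself does not give a formula.

```python
def f_statistic(r_squared, n_participants, n_instruments):
    """F-statistic for the regression of the phenotype on the instruments.

    r_squared:      proportion of phenotype variance explained by the instruments
    n_participants: sample size
    n_instruments:  number of instrument parameters in the regression
    """
    explained = r_squared / n_instruments
    residual = (1.0 - r_squared) / (n_participants - n_instruments - 1)
    return explained / residual


# Made-up example: one instrument explaining 1% of the variance in 1000 people.
f_single = f_statistic(0.01, 1000, 1)
# The same 1% spread over ten instruments gives a much smaller F-statistic.
f_many = f_statistic(0.01, 1000, 10)
print(f_single, f_many)
```

The comparison shows why parsimonious models of genetic association help: adding instrument parameters without adding explained variance lowers the F-statistic.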
JAMA | 2016
Luca A. Lotta; Stephen J. Sharp; Stephen Burgess; John Perry; Isobel D. Stewart; Sara M. Willems; Jian'an Luan; Eva Ardanaz; Larraitz Arriola; Beverley Balkau; Heiner Boeing; Panos Deloukas; Nita G. Forouhi; Paul W. Franks; Sara Grioni; Rudolf Kaaks; Timothy J. Key; Carmen Navarro; Peter Nilsson; Kim Overvad; Domenico Palli; Salvatore Panico; José Ramón Quirós; Elio Riboli; Olov Rolandsson; Carlotta Sacerdote; Elena Salamanca-Fernández; Nadia Slimani; Annemieke M. W. Spijkerman; Anne Tjønneland
Importance Low-density lipoprotein cholesterol (LDL-C)-lowering alleles in or near NPC1L1 or HMGCR, encoding the respective molecular targets of ezetimibe and statins, have previously been used as proxies to study the efficacy of these lipid-lowering drugs. Alleles near HMGCR are associated with a higher risk of type 2 diabetes, similar to the increased incidence of new-onset diabetes associated with statin treatment in randomized clinical trials. It is unknown whether alleles near NPC1L1 are associated with the risk of type 2 diabetes. Objective To investigate whether LDL-C-lowering alleles in or near NPC1L1 and other genes encoding current or prospective molecular targets of lipid-lowering therapy (ie, HMGCR, PCSK9, ABCG5/G8, LDLR) are associated with the risk of type 2 diabetes. Design, Setting, and Participants The associations with type 2 diabetes and coronary artery disease of LDL-C-lowering genetic variants were investigated in meta-analyses of genetic association studies. Meta-analyses included 50 775 individuals with type 2 diabetes and 270 269 controls and 60 801 individuals with coronary artery disease and 123 504 controls. Data collection took place in Europe and the United States between 1991 and 2016. Exposures Low-density lipoprotein cholesterol-lowering alleles in or near NPC1L1, HMGCR, PCSK9, ABCG5/G8, and LDLR. Main Outcomes and Measures Odds ratios (ORs) for type 2 diabetes and coronary artery disease. Results Low-density lipoprotein cholesterol-lowering genetic variants at NPC1L1 were inversely associated with coronary artery disease (OR for a genetically predicted 1-mmol/L [38.7-mg/dL] reduction in LDL-C of 0.61 [95% CI, 0.42-0.88]; P = .008) and directly associated with type 2 diabetes (OR for a genetically predicted 1-mmol/L reduction in LDL-C of 2.42 [95% CI, 1.70-3.43]; P < .001). For PCSK9 genetic variants, the OR for type 2 diabetes per 1-mmol/L genetically predicted reduction in LDL-C was 1.19 (95% CI, 1.02-1.38; P = .03). 
For a given reduction in LDL-C, genetic variants were associated with a similar reduction in coronary artery disease risk (I² = 0% for heterogeneity in genetic associations; P = .93). However, associations with type 2 diabetes were heterogeneous (I² = 77.2%; P = .002), indicating gene-specific associations with metabolic risk of LDL-C-lowering alleles. Conclusions and Relevance In this meta-analysis, exposure to LDL-C-lowering genetic variants in or near NPC1L1 and other genes was associated with a higher risk of type 2 diabetes. These data provide insights into potential adverse effects of LDL-C-lowering therapy.
The New England Journal of Medicine | 2015
Christopher P. Nelson; Stephen E. Hamby; Danish Saleheen; Jenna C Hopewell; Lingyao Zeng; Themistocles L. Assimes; Stavroula Kanoni; Christina Willenborg; Stephen Burgess; Philippe Amouyel; Sonia S. Anand; Stefan Blankenberg; Bernhard O. Boehm; Robert Clarke; Rory Collins; George Dedoussis; Martin Farrall; Paul W. Franks; Leif Groop; Alistair S. Hall; Anders Hamsten; Christian Hengstenberg; G. Kees Hovingh; Erik Ingelsson; Sekar Kathiresan; Frank Kee; Inke R. König; Jaspal S. Kooner; Terho Lehtimäki; W. März
BACKGROUND The nature and underlying mechanisms of an inverse association between adult height and the risk of coronary artery disease (CAD) are unclear. METHODS We used a genetic approach to investigate the association between height and CAD, using 180 height-associated genetic variants. We tested the association between a change in genetically determined height of 1 SD (6.5 cm) with the risk of CAD in 65,066 cases and 128,383 controls. Using individual-level genotype data from 18,249 persons, we also examined the risk of CAD associated with the presence of various numbers of height-associated alleles. To identify putative mechanisms, we analyzed whether genetically determined height was associated with known cardiovascular risk factors and performed a pathway analysis of the height-associated genes. RESULTS We observed a relative increase of 13.5% (95% confidence interval [CI], 5.4 to 22.1; P<0.001) in the risk of CAD per 1-SD decrease in genetically determined height. There was a graded relationship between the presence of an increased number of height-raising variants and a reduced risk of CAD (odds ratio for height quartile 4 versus quartile 1, 0.74; 95% CI, 0.68 to 0.84; P<0.001). Of the 12 risk factors that we studied, we observed significant associations only with levels of low-density lipoprotein cholesterol and triglycerides (accounting for approximately 30% of the association). We identified several overlapping pathways involving genes associated with both development and atherosclerosis. CONCLUSIONS There is a primary association between a genetically determined shorter height and an increased risk of CAD, a link that is partly explained by the association between shorter height and an adverse lipid profile. Shared biologic processes that determine achieved height and the development of atherosclerosis may explain some of the association. (Funded by the British Heart Foundation and others.).