Alimentary Pharmacology & Therapeutics | 2021

Letter: improved parsimony of genetic risk scores for coeliac disease through refined HLA modelling


Abstract


Editors,
We read with interest the recent article by Sharp et al describing a single nucleotide polymorphism (SNP) genetic risk score (GRS) for coeliac disease.1 The authors reported a 42-SNP model, combining the four established HLA risk alleles for coeliac disease (DQ2.5, DQ2.2, DQ8 and DQ7.5) with 38 non-HLA variants, which achieved prediction quality (AUC = 0.875) similar to that of the best previously reported GRS model, which uses several hundred SNPs (AUC = 0.87–0.89).2 Using data from the UK Biobank, the authors demonstrated that the resulting model discriminated coeliac disease risk better than HLA-DQ stratification alone, in which the contribution of the HLA region is represented by three risk categories (AUC = 0.881 vs 0.815).
The report by Sharp et al highlights an important direction of research: the development of parsimonious models that can be clinically deployed at low cost. Moreover, the reduced complexity of their model allows for greater biological insight. However, this idea can be extended further when assessing coeliac disease risk.
Building on the same observation, that combinatorial HLA-DQ risk genotypes may be used to stratify risk more effectively, we recently published a GRS constructed on the same European coeliac disease GWAS datasets.3 In this work, we described two HLA-based risk models that achieve prediction quality equivalent to the best-performing genome-wide coeliac disease risk models.2 The first, HDQ15, is derived by ordering the 15 HLA-DQ risk genotypes with four tagging SNPs (AUC = 0.871). The second, HDQ17, is an extension integrating two novel HLA risk alleles and two additional SNPs (AUC = 0.879). In particular, our HDQ15 is very similar to the HLA stratification presented by Sharp et al. Importantly, we found that the contribution of non-HLA variants beyond the 4 or 6 SNPs used in these models was limited.
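To make the distinction between a genotype-level HLA term and additive non-HLA terms concrete, the following is a minimal sketch and not the published specification of any model discussed here. It assumes that the four risk haplotypes plus an "other" class (written `X`) yield 15 unordered haplotype pairs, and that a score is the sum of a per-genotype HLA weight and weighted non-HLA allele dosages. All weights and the SNP identifiers `snp_a` and `snp_b` are hypothetical placeholders.

```python
from itertools import combinations_with_replacement
from typing import Dict, Mapping, Tuple

# Four HLA risk haplotypes named in the text, plus "X" for any other haplotype.
HAPLOTYPES = ("DQ2.5", "DQ2.2", "DQ8", "DQ7.5", "X")


def genotype_key(hap_a: str, hap_b: str) -> Tuple[str, str]:
    """Return an order-independent key for a pair of haplotypes."""
    first, second = sorted((hap_a, hap_b))
    return (first, second)


# Five haplotype classes give 15 unordered pairs.
GENOTYPES = [genotype_key(a, b)
             for a, b in combinations_with_replacement(HAPLOTYPES, 2)]

# Placeholder weights invented for illustration, NOT published effect sizes.
# Genotypes missing from this table score 0.0.
HLA_WEIGHT: Dict[Tuple[str, str], float] = {
    genotype_key("DQ2.5", "DQ2.5"): 3.0,
    genotype_key("DQ2.5", "DQ2.2"): 2.5,
    genotype_key("DQ2.5", "X"): 2.0,
    genotype_key("DQ8", "X"): 1.0,
}


def hla_score(hap_a: str, hap_b: str) -> float:
    """Look up the weight of the HLA-DQ genotype as a whole."""
    return HLA_WEIGHT.get(genotype_key(hap_a, hap_b), 0.0)


def additive_score(dosages: Mapping[str, int],
                   weights: Mapping[str, float]) -> float:
    """Sum weight * risk-allele dosage (0, 1 or 2) over non-HLA SNPs."""
    return sum(w * dosages.get(snp, 0) for snp, w in weights.items())


def genetic_risk_score(hap_a: str, hap_b: str,
                       dosages: Mapping[str, int],
                       weights: Mapping[str, float]) -> float:
    """Combine the genotype-level HLA term with the additive non-HLA term."""
    return hla_score(hap_a, hap_b) + additive_score(dosages, weights)


# Hypothetical individual: one DQ2.5 haplotype, two copies of one non-HLA allele.
example = genetic_risk_score("X", "DQ2.5",
                             {"snp_a": 2},
                             {"snp_a": 0.1, "snp_b": 0.2})
```

An HLA-only model in this framing simply passes an empty `weights` mapping, so the score depends on the HLA-DQ genotype alone.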
To highlight the utility of these even more parsimonious HLA-only models, we applied each model to 409 624 Caucasian Europeans from the UK Biobank cohort, defining coeliac disease by hospital admission code and self-reported questionnaires as per Sharp et al. In this population, the 42-SNP Sharp model (AUC = 0.8674) still significantly outperformed a 3-category HLA risk model (AUC = 0.781, one-sided DeLong's test P-value = 1.3 × 10^-127). However, our 6-SNP HDQ17 was almost identical in AUC to the Sharp model (AUC = 0.8672, P-value = 0.47), while the 4-SNP HDQ15 had only marginally lower performance (AUC = 0.855, P-value = 4.9 × 10^-6) (Figure 1). These results suggest that most of the improvement in AUC reported by Sharp et al can be achieved with fewer variants through a more nuanced use of HLA-attributed risk. The HLA-only models, HDQ15 and HDQ17, can be implemented using only 4 or 6 SNPs, respectively. The reduced number of variants may improve robustness and decrease cost in clinical deployment. While non-HLA genomic risk variants are critical for coeliac disease aetiology, our results also highlight that further investigation is needed to understand the relative contributions of HLA and non-HLA variation to the prediction of coeliac disease risk.
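For readers unfamiliar with the comparison used above, the following is a minimal sketch of a one-sided DeLong test for two correlated AUCs computed on the same individuals. It is a plain-Python illustration under stated assumptions, not the code used for the analysis: the function names are our own, the scores and labels are toy values, and the quadratic-time pairwise comparison would need a rank-based implementation at UK Biobank scale.

```python
import math
from typing import List, Sequence, Tuple


def _psi(case: float, control: float) -> float:
    """Pairwise kernel: 1 if the case scores higher, 0.5 on ties, else 0."""
    if case > control:
        return 1.0
    return 0.5 if case == control else 0.0


def _components(scores: Sequence[float],
                labels: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Structural components: one value per case and one per control."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if len(cases) < 2 or len(controls) < 2:
        raise ValueError("need at least two cases and two controls")
    v10 = [sum(_psi(c, n) for n in controls) / len(controls) for c in cases]
    v01 = [sum(_psi(c, n) for c in cases) / len(cases) for n in controls]
    return v10, v01


def _cov(a: Sequence[float], b: Sequence[float]) -> float:
    """Unbiased sample covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_one_sided(scores_a: Sequence[float], scores_b: Sequence[float],
                     labels: Sequence[int]) -> Tuple[float, float, float]:
    """Return (AUC_a, AUC_b, p) for the alternative AUC_a > AUC_b."""
    a10, a01 = _components(scores_a, labels)
    b10, b01 = _components(scores_b, labels)
    auc_a, auc_b = sum(a10) / len(a10), sum(b10) / len(b10)
    # Variance of the AUC difference from case-level and control-level terms.
    var = ((_cov(a10, a10) + _cov(b10, b10) - 2 * _cov(a10, b10)) / len(a10)
           + (_cov(a01, a01) + _cov(b01, b01) - 2 * _cov(a01, b01)) / len(a01))
    if var <= 0:
        raise ValueError("zero variance: the test statistic is undefined")
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return auc_a, auc_b, p_value


# Toy data for illustration only: 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
scores_b = [0.6, 0.3, 0.5, 0.7, 0.4, 0.2]
auc_a, auc_b, p_value = delong_one_sided(scores_a, scores_b, labels)
```

Because both scores are evaluated on the same individuals, the covariance terms matter: two models with nearly identical AUCs can differ significantly, or not at all, depending on how strongly their per-individual components are correlated.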

Volume 53
DOI 10.1111/apt.16263
Language English
Journal Alimentary Pharmacology & Therapeutics
