Mathematical Biosciences | 2021

Genome-wide covariation in SARS-CoV-2


Abstract


The SARS-CoV-2 virus causing the global pandemic is a coronavirus with a genome of about 30 kilobases. The design of vaccines and the choice of therapies depend on the structure and mutational stability of the proteins encoded in the open reading frames (ORFs) of this genome. In this study, we used Expectation Reflection to compute the genome-wide covariation of the SARS-CoV-2 genome from an alignment of ≈130,000 complete SARS-CoV-2 genome sequences obtained from GISAID. We used this covariation to compute the Direct Information between pairs of positions across the whole genome, investigating potentially important relationships both within each encoded protein and between encoded proteins. We then computed the covariation within each clade of the virus. The detected covariation recapitulates all clade determinants, and each clade exhibits distinct covarying pairs.
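
To make the notion of covariation between pairs of genome positions concrete, the following is a minimal sketch that scores two alignment columns by their mutual information. This is a deliberately simpler stand-in and is not the Expectation Reflection or Direct Information computation used in the study; the function name, the toy alignment, and the choice of natural-log units are assumptions made for illustration only.

```python
from collections import Counter
from math import log


def column_mutual_information(seqs, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    Illustrative stand-in only: the study uses Expectation Reflection and
    Direct Information, which this function does not implement.
    """
    n = len(seqs)
    # Joint counts of the symbol pair observed at positions (i, j).
    pair_counts = Counter((s[i], s[j]) for s in seqs)
    # Marginal counts of the symbols at each position.
    count_i = Counter(s[i] for s in seqs)
    count_j = Counter(s[j] for s in seqs)
    mi = 0.0
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with counts to avoid extra divisions.
        mi += p_ab * log(p_ab * n * n / (count_i[a] * count_j[b]))
    return mi


# Hypothetical toy alignment: columns 0 and 1 always change together,
# while column 3 varies independently of column 0.
toy_alignment = ["ACGT", "ACGA", "TGGT", "TGGA"]
mi_linked = column_mutual_information(toy_alignment, 0, 1)
mi_unlinked = column_mutual_information(toy_alignment, 0, 3)
```

In this toy case the perfectly linked pair scores higher than the independent pair. Raw mutual information also picks up indirect correlations that pass through intermediate positions, which is the kind of effect that direct-coupling measures such as Direct Information are designed to discount.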

Volume 341
Pages 108678 - 108678
DOI 10.1016/j.mbs.2021.108678
Language English
Journal Mathematical Biosciences
