Cell Research | 2021

O-glycosylation pattern of the SARS-CoV-2 spike protein reveals an “O-Follow-N” rule


Abstract


Dear Editor, Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19), emerged in late 2019 and has since caused a pandemic. Despite extensive studies worldwide, our understanding of this newly emerged pathogen remains far from sufficient. The pathogenesis of SARS-CoV-2 infection is not fully understood, although a “two-stage” hypothesis was proposed in our previous study. As an enveloped virus in the family Coronaviridae, SARS-CoV-2 uses a densely glycosylated spike (S) protein to gain entry into host cells. The S protein is a trimeric class I transmembrane protein composed of two functional subunits. The S1 subunit binds to cellular angiotensin-converting enzyme 2 (ACE2) for host cell recognition, and the S2 subunit functions in viral–host cell membrane fusion. The S protein is the most attractive immunogen for eliciting antibody responses and is therefore the primary focus of neutralizing antibody and vaccine development. The glycosylation of viral envelope proteins has a wide range of functions, including regulating viral tropism, maintaining protein stability and shielding underlying epitopes from immune surveillance. Thus, a full understanding of the glycosylation of the SARS-CoV-2 S protein is critical for revealing the pathogenesis of the virus and for guiding the design of therapeutic and prophylactic strategies. A total of 22 N-glycosites have been mapped in the in vitro-expressed S protein ectodomain and in the S protein extracted from virions. Owing to technical limitations, only a few O-glycosylation modifications have been confirmed on purified S protein, and none has been reported on the S protein extracted from SARS-CoV-2 virions, which is the most representative antigen on virions.
To obtain a comprehensive N- and O-linked glycosylation landscape of the S protein in its native state, including glycosites, glycoforms and their relative intensities, we extracted S protein from SARS-CoV-2 virions and purified recombinant full-length wild-type (WT) S protein expressed in human embryonic kidney 293T cells (Supplementary information, Fig. S1a, b). To generate glycopeptides and ensure maximum coverage of the protein sequence, the S protein was digested separately with chymotrypsin, α-lytic endopeptidase or LysC-trypsin. The glycopeptides were analyzed using nano liquid chromatography (nLC) coupled with an ultra-high resolution Orbitrap Eclipse Tribrid mass spectrometer, and stepped collisional energy (SCE) HCD and HCDpdEThcD were applied for fragmentation. The data were processed with Byonic (v3.8.13, Protein Metrics Inc., Cupertino) and Byologic glycoanalysis software (v3.8-11 ×64, Protein Metrics Inc., Cupertino). We applied multiple approaches specifically for O-glycosylation analysis and site confirmation. First, after protease digestion, the samples were additionally treated with PNGase F in H₂¹⁸O to remove N-glycans; the accompanying deamidation of asparagine (Asn) yields an aspartic acid residue with a mass shift of +2.98 Da. This discriminated occupied N-glycosites from unoccupied Asn or glutamine (Gln) and thus excluded interference from N-glycosylation in the identification of O-glycosylation. Second, we searched for N- and O-linked glycans simultaneously in the same samples. To conclusively rule out artifactual assignment of N-linked glycans to nearby Ser (S) or Thr (T), we used four criteria to characterize O-glycopeptides: (1) the MS/MS spectrum contains glycan or oxonium ions (i.e., feature B ions); (2) the MS/MS spectrum contains feature Y ions; (3) the isotope distribution of the precursor is reasonable; and (4) the retention times of the glycopeptide and the corresponding non-glycosylated peptide are comparable.
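The +2.98 Da shift quoted for PNGase F treatment can be checked with simple mass arithmetic. The sketch below assumes the digestion was carried out in ¹⁸O-labeled water (which the quoted value implies) and uses standard monoisotopic masses; the function and variable names are illustrative and not taken from the study.

```python
# Monoisotopic masses in Da (standard values, rounded to 6 decimals).
MASS = {"H": 1.007825, "N": 14.003074, "O16": 15.994915, "O18": 17.999160}

def deamidation_shift(oxygen="O16"):
    """Mass change when Asn (-CONH2) becomes Asp (-COOH).

    The side chain gains one oxygen (taken from the solvent water)
    and loses one nitrogen and one hydrogen.
    """
    return MASS[oxygen] - MASS["N"] - MASS["H"]

shift_h2o = deamidation_shift("O16")      # ordinary water
shift_h2_18o = deamidation_shift("O18")   # assumed 18O-labeled water

print(f"Deamidation in H2(16)O: +{shift_h2o:.3f} Da")
print(f"Deamidation in H2(18)O: +{shift_h2_18o:.3f} Da")
```

Under these assumptions the labeled shift comes out at roughly 2.99 Da, in line with the value quoted in the text, while spontaneous deamidation in ordinary water adds only about 0.98 Da. That difference of about 2 Da is what lets formerly glycosylated Asn be told apart from unoccupied residues.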
We performed extensive manual validation to select valid O-glycopeptides for site-specific analysis. Diagnostic ions were required for O-glycosite determination. In line with the previous study, we confirmed 22 N-glycosites (Fig. 1a; Supplementary information, Fig. S2 and Dataset S1). For the first time, a total of 17 O-glycosites were identified on the S protein extracted from SARS-CoV-2 virions (Fig. 1a; Supplementary information, Fig. S3, Table S1 and Dataset S1), among which 14 sites were determined with diagnostic ions (Fig. 1a; Supplementary information, Fig. S3 and Table S1). The O-glycoforms of the S protein extracted from virions are diverse and differ sharply from the reported glycoforms of purified S protein (Fig. 1b). We found that O-glycosylation occurred in clusters on the S protein. The S1 domain was more heavily O-glycosylated, with 11 sites, while the remaining 6 sites were detected at the N-terminus of the S2 domain (Fig. 1a). Interestingly, 11 of the 17 identified O-glycosites were located near glycosylated Asn, including S60, T124, S151, T236, T604/S605, T618, S659, T1076, T1077, S1097 and T1100 (Fig. 1a, c; Supplementary information, Fig. S3 and Table S1). The glycopeptide containing the T604 and S605 sites was well characterized; however, we were not able to determine the exact glycosylation site owing to a lack of diagnostic ions. Therefore, we counted T604/S605 as one O-glycosite. To further investigate the dynamics between N- and O-linked glycosylation, we defined the three amino acids on each side of the glycosylated Asn within the consensus motif of NxS/T (x is not proline (P)) as the “position associated to N-sequon” (named N ± 1–3). There are 35 S/T residues within positions associated to N-sequon; 11 of them were O-glycosylated, of which 10 were assigned to an exact site.
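The N ± 1–3 definition above can be made concrete by computing, for each O-glycosite, its signed offset from the nearest glycosylated Asn. The following is a minimal sketch and not the authors' analysis code: the O-glycosite list is taken from the text (T604/S605 is omitted because its exact site was not determined), whereas the 22 Asn positions are the commonly reported spike N-glycosites and are an assumption here, since the text does not list them. The function name `sequon_offset` is hypothetical.

```python
from collections import Counter

# ASSUMPTION: commonly reported N-glycosite positions on the SARS-CoV-2
# spike protein; this list is not given in the text.
N_GLYCOSITES = [17, 61, 74, 122, 149, 165, 234, 282, 331, 343, 603,
                616, 657, 709, 717, 801, 1074, 1098, 1134, 1158, 1173, 1194]

# O-glycosites near glycosylated Asn, as listed in the text
# (T604/S605 left out because the exact site is ambiguous).
O_GLYCOSITES = [60, 124, 151, 236, 618, 659, 1076, 1077, 1097, 1100]

def sequon_offset(o_site, n_sites, window=3):
    """Signed offset (o_site minus Asn position) to the nearest
    glycosylated Asn within +/- window residues, or None if there is none."""
    candidates = [o_site - n for n in n_sites if 0 < abs(o_site - n) <= window]
    return min(candidates, key=abs) if candidates else None

offsets = {site: sequon_offset(site, N_GLYCOSITES) for site in O_GLYCOSITES}
tally = Counter(offsets.values())

for site, off in offsets.items():
    print(f"O-glycosite {site}: N{off:+d}")
print(dict(tally))
```

The resulting tally can be compared with the positional distribution reported in Fig. 1d.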
It is intriguing that 7 of the 10 sites (70%) were located at the N+2 position, which lies within the consensus motif of N-glycosylation (Fig. 1d). All the identified N-glycosites and the O-glycosites associated to N-sequon were mapped on the surface of the S protein based on the cryo-EM structure of the trimeric SARS-CoV-2 S protein (Protein Data Bank (PDB) ID 6XR8) (Fig. 1e). To further validate the phenomenon that N- and O-linked glycosylation occur together at N-sequon-associated positions, we carried out site-directed mutagenesis. An N-to-Q

Volume 31
Pages 1123 - 1125
DOI 10.1038/s41422-021-00545-2
Language English
Journal Cell Research
