China CDC Weekly | 2021
Genome Characterization of COVID-19 Lineage B.1.1.7 Detected in the First Six Patients of a Cluster Outbreak — Shenzhen City, Guangdong Province, China, May 2021
Abstract
Screening for infection with the coronavirus disease 2019 (COVID-19) virus, also known as SARS-CoV-2, was performed every seven days for high-risk populations working at the Yantian Port in Yantian District, Shenzhen City, Guangdong Province. On May 20, 2021, an oropharyngeal swab from a 44-year-old male (Case A) tested preliminarily positive for COVID-19 virus by a quantitative real-time reverse transcription polymerase chain reaction (RT-qPCR) method in a third-party laboratory. On May 21, 2021, 3 types of specimens (nasopharyngeal swab, oropharyngeal swab, and anal swab) were collected from this case by Yantian CDC and were confirmed positive for COVID-19 virus by RT-qPCR run in parallel with two commercial kits (Daan, Guangzhou, China and Bojie, Shanghai, China) in the virology laboratory of Shenzhen CDC (Table 1). Screening was then initiated for employees of the Yantian Port and for close contacts. Five more cases were confirmed with COVID-19 virus infection between May 22, 2021 and May 24, 2021 (Table 1). Once infection was confirmed, these cases were immediately transported by ambulance to the Shenzhen Third People’s Hospital for treatment in isolation. Specimens collected from these cases by the Shenzhen Third People’s Hospital were sent to the virology laboratory of Shenzhen CDC for discharge assessment.
High-throughput sequencing was performed on the six COVID-19 virus strains from this study. First, viral RNA was extracted directly from the 200-μL swab samples with the lowest cycle threshold (Ct) values in RT-qPCR tests using a High Pure Viral RNA Kit (Roche, Germany). Second, libraries were prepared using a Nextera® XT Library Prep Kit (Illumina, USA), and the resulting DNA libraries were sequenced on a MiSeq platform (Illumina) using a 300-cycle reagent kit (1). Finally, mapped assemblies were generated using the COVID-19 virus/SARS-CoV-2 reference sequence Wuhan-Hu-1 (GenBank no. NC_045512.2).
Nucleotide (nt) and amino acid (AA) differences between the six virus genome sequences from this study and the reference sequence Wuhan-Hu-1 were analyzed using the programs BioEdit 7.19 and MEGA version 7 (2). The 6 strains from Case A, Case B, Case C, Case D, Case E, and Case F were designated as hCoV-19/Guangdong/IVDC-05-01-2/2021, hCoV-19/Guangdong/IVDC-05-02-2/2021, hCoV-19/Guangdong/IVDC-05-03/2021, hCoV-19/Guangdong/IVDC-05-04/2021, hCoV-19/Guangdong/IVDC-05-05/2021, and hCoV-19/Guangdong/IVDC-05-06/2021, respectively, in this study. The genome sequences of these 6 strains were 29,844 nt, 29,867 nt, 29,808 nt, 29,846 nt, 29,760 nt, and 29,832 nt in length, respectively. Based on the “Pango lineages” rule (3), the 6 virus strains from this study were assigned to lineage B.1.1.7, which is also known as Variant of Concern 202012/01 (VOC-202012/01) or 20B/501Y.V1. Lineage B.1.1.7 was first identified in the UK in September 2020 and has 24 characteristic mutations (ORF1a: T1001I, A1708D, I2230T, del3675-3677; ORF1b: P314L; S: del69/70, del144, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H; ORF8: Q27stop, R52I, Y73C; N: D3L, R203K, G204R, S235F). Compared with the reference genome sequence Wuhan-Hu-1, 5 strains (hCoV-19/Guangdong/IVDC-05-01-2/2021, hCoV-19/Guangdong/IVDC-05-02-2/2021, hCoV-19/Guangdong/IVDC-05-03/2021, hCoV-19/Guangdong/IVDC-05-04/2021, and hCoV-19/Guangdong/IVDC-05-06/2021) displayed 38 nucleotide variation sites (C241T, C643T, C913T, C2536T, A2784G, C3037T, C3267T, C5388A, C5986T, T6954C, C7851T, G13975T, C14408T, C14676T, T15096C, C15279T, T16176C, C17430T, G17944T, G21578T, A23063T, C23271A, A23403G, C23604A, C23709T, T24506G, G24914C, C27972T, G28048T, A28111G, G28280C, A28281T, T28282A, G28739T, G28881A, G28882A, G28883C, and C28977T) and 18 deleted nucleotides in 3 deletions (ORF1a: del11288-11296/TCTGGTTTT; S: del21766-21771/ACATGT, del21994-21996/TTA).
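As a reading aid, the nucleotide-level notation above can be parsed and tallied programmatically. The following is a minimal Python sketch, not the authors' BioEdit/MEGA workflow: the mutation labels are copied from the text, while the function names and regular expressions are illustrative assumptions. It checks that the list contains 38 substitutions and that the three deletions together remove 18 nucleotides.

```python
import re

# Labels copied from the paragraph above (substitutions relative to Wuhan-Hu-1).
SUBSTITUTIONS = """
C241T, C643T, C913T, C2536T, A2784G, C3037T, C3267T, C5388A, C5986T, T6954C,
C7851T, G13975T, C14408T, C14676T, T15096C, C15279T, T16176C, C17430T,
G17944T, G21578T, A23063T, C23271A, A23403G, C23604A, C23709T, T24506G,
G24914C, C27972T, G28048T, A28111G, G28280C, A28281T, T28282A, G28739T,
G28881A, G28882A, G28883C, C28977T
"""
DELETIONS = "del11288-11296/TCTGGTTTT; del21766-21771/ACATGT; del21994-21996/TTA"

# Hypothetical patterns for the two label styles used in the text.
SUB_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")
DEL_RE = re.compile(r"^del(\d+)-(\d+)/([ACGT]+)$")


def parse_substitutions(text):
    """Return (reference base, 1-based position, variant base) tuples."""
    parsed = []
    for label in text.split(","):
        match = SUB_RE.match(label.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution label: {label!r}")
        ref, pos, alt = match.groups()
        parsed.append((ref, int(pos), alt))
    return parsed


def parse_deletions(text):
    """Return (start, end, deleted bases) tuples with 1-based inclusive ends."""
    parsed = []
    for label in text.split(";"):
        match = DEL_RE.match(label.strip())
        if match is None:
            raise ValueError(f"unrecognized deletion label: {label!r}")
        start, end, bases = int(match.group(1)), int(match.group(2)), match.group(3)
        # The coordinate span must agree with the number of listed bases.
        if end - start + 1 != len(bases):
            raise ValueError(f"span and sequence disagree in {label!r}")
        parsed.append((start, end, bases))
    return parsed


subs = parse_substitutions(SUBSTITUTIONS)
dels = parse_deletions(DELETIONS)
deleted_nt = sum(end - start + 1 for start, end, _ in dels)

print("substitutions:", len(subs))
print("deletions:", len(dels), "covering", deleted_nt, "nt")
```

The length check inside `parse_deletions` is a cheap guard against transcription errors in coordinates such as `del21766-21771/ACATGT`.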
In addition to the mutations above, two other variation sites (ORF1a: C884T and S: A23898T) were observed in the genome of strain hCoV-19/Guangdong/IVDC-05-05/2021 (Case E). Comparison of the deduced amino acid sequences showed that the 5 SARS-CoV-2 strains (hCoV-19/Guangdong/IVDC-05-01-2/2021, hCoV-19/Guangdong/IVDC-05-02-2/2021, hCoV-19/Guangdong/IVDC-05-03/2021, hCoV-19/Guangdong/IVDC-05-04/2021, and hCoV-19/Guangdong/IVDC-05-06/2021) displayed 24 AA variation sites (ORF1a: N840S, T1001I, A1708D, I2230T, A2529V; ORF1b: G170C, P314L, V1493L; S: V6F, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H; ORF8: Q27stop, R52I, Y73C; N: D3L, A156S, R203K, G204R, and S235F) and 6 deletion mutations (ORF1a: S3675del, G3676del, and F3677del; S: H69del, V70del, and Y144del). In addition to these mutations, 2 other variation sites (ORF1a: R207C; S: Q779L) were observed in the amino acid sequence of strain hCoV-19/Guangdong/IVDC-05-05/2021 (Case E). All of the characteristic mutations of SARS-CoV-2 variant B.1.1.7 were found in the genomes of the 6 SARS-CoV-2 strains from this study. Whole-genome sequencing (WGS) confirmed that all SARS-CoV-2 strains from this study were VOC 202012/01-lineage B.1.1.7, suggesting a common source of exposure at the Yantian Port. SARS-CoV-2 lineage B.1.1.7 is of growing concern because it has been shown to be significantly more transmissible than other variants (4-7). To date, 4 SARS-CoV-2 VOCs (B.1.1.7, B.1.351, P.1, and B.1.617.2) have been imported into mainland China (8-11). There is a high risk that imported SARS-CoV-2 VOCs may cause local outbreaks and epidemics. In this study, we focused on laboratory testing and genome characterization of the pathogen. A detailed epidemiological investigation will be essential in a follow-up report.
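The correspondence between a nucleotide position and the codon it falls in can be checked with simple arithmetic. The sketch below is illustrative only: the gene coordinates are assumed from the NC_045512.2 annotation and should be verified against GenBank before reuse, ORF1a is taken only up to the ribosomal frameshift site, ORF1b is omitted because the frameshift complicates its numbering, and the function name `locate` is hypothetical. The pairing of each nucleotide change with an amino acid change follows from the calculation and is not stated explicitly by the authors.

```python
# Gene coordinates on Wuhan-Hu-1 (1-based, inclusive).
# ASSUMPTION: values recalled from the NC_045512.2 annotation; verify before reuse.
GENES = {
    "ORF1a": (266, 13468),   # up to the ribosomal frameshift site only
    "S": (21563, 25384),
    "ORF8": (27894, 28259),
    "N": (28274, 29533),
}


def locate(nt_pos):
    """Map a genome position to (gene, codon number, position within codon).

    Returns None when the position lies outside the genes listed in GENES.
    """
    for gene, (start, end) in GENES.items():
        if start <= nt_pos <= end:
            offset = nt_pos - start
            return gene, offset // 3 + 1, offset % 3 + 1
    return None


# Nucleotide changes from the text, each paired with the amino acid change
# whose residue number matches the computed codon.
EXAMPLES = {
    "C884T": "ORF1a R207C",
    "C3267T": "ORF1a T1001I",
    "G21578T": "S V6F",
    "A23063T": "S N501Y",
    "A23403G": "S D614G",
    "A23898T": "S Q779L",
    "C27972T": "ORF8 Q27stop",
    "G28881A": "N R203K",
}

for label, aa_change in EXAMPLES.items():
    position = int(label[1:-1])  # strip the reference and variant bases
    gene, codon, codon_pos = locate(position)
    print(f"{label}: {gene} codon {codon}, codon position {codon_pos} ({aa_change})")
```

Deletions are left out of this check on purpose: a deletion's reported coordinates depend on how the alignment places the gap, so its nucleotide span need not line up with codon boundaries.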
Data availability: The six SARS-CoV-2 genome sequences determined in this study have been deposited in GISAID (www.gisaid.org) under the accession numbers EPI_ISL_2405168, EPI_ISL_2405169, EPI_ISL_2432955, EPI_ISL_2405170, EPI_ISL_2405171, and EPI_ISL_2405172.