Early Introduction and Community Transmission of SARS-CoV-2 Omicron Variant, New York, New York, USA

Dakai Liu; Yexiao Cheng; Hangyu Zhou; Lulan Wang; Roberto Hurtado Fiel; Yehudah Gruenstein; Jean Jingzi Luo; Vishnu Singh; Eric Konadu; Keither James; Calvin Lui; Pengcheng Gao; Carl Urban; Nishant Prasad; Sorana Segal-Maurer; Esther Wurzberger; Genhong Cheng; Aiping Wu; William Harry Rodgers

Emerging Infectious Diseases. 2023;29(2):371-380. 

Methods

Sample Collection

Located in the epicenter of the COVID-19 pandemic, NewYork-Presbyterian Queens Hospital has received 185,870 specimens for SARS-CoV-2 RNA testing by diagnostic multiplex real-time PCR since the pandemic started in March 2020. In addition to those specimens, we analyzed specimens collected by mobile service vans from residential communities and workplaces. We used demographics, including residential and business addresses associated with collection sites, for the epidemiologic analysis. We defined a household as persons sharing the same residential address, a workplace as persons sharing an identical business address, and a family as persons identified as family members. We determined traveler status from the home address and a travel inquiry performed during sampling. We advised all patients who tested positive or had been exposed to follow the Centers for Disease Control and Prevention (CDC) quarantine guidelines.
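The household definition above can be illustrated with a minimal sketch that groups persons by residential address. The function names, the address normalization rule, and the sample records are hypothetical and are not taken from the study; real address matching would likely need more careful normalization.

```python
from collections import defaultdict

def normalize_address(address):
    # Assumption: differences in case and whitespace do not distinguish addresses.
    return " ".join(address.lower().split())

def group_households(persons):
    """Group (person_id, residential_address) pairs by normalized address."""
    by_address = defaultdict(list)
    for person_id, address in persons:
        by_address[normalize_address(address)].append(person_id)
    return dict(by_address)

# Hypothetical records for illustration only.
persons = [
    ("P1", "12 Example St, Apt 3"),
    ("P2", "12 example st,  apt 3"),
    ("P3", "40 Sample Ave"),
]
households = group_households(persons)
```

The same grouping could be applied to business addresses to form workplace groups.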

Viral Genomic Next-generation Sequencing and Bioinformatics Processing

To investigate SARS-CoV-2 mutations and variant epidemiology, we performed next-generation sequencing (NGS) on positive specimens with real-time PCR cycle threshold (Ct) values <33 and analyzed virus mutations among the specimens from our laboratory and LabQ Diagnostics (New York, New York, USA). We performed NGS by using the Illumina COVID-Seq test kit (https://www.illumina.com). We extracted viral RNA from viral transport medium containing a nasopharyngeal swab specimen, then performed cDNA synthesis through reverse transcription using random hexamer primers. We amplified the cDNA of the viral genome in 2 separate PCR reactions and pooled the products. The amplified fragments underwent bead-based tagmentation, which attached the adaptor sequences. Subsequently, the adaptor-tagged fragments underwent another round of PCR amplification. We pooled the indexed, tagged libraries and cleaned them using purification beads. We clustered the pooled libraries onto a flow cell and then sequenced them on the NovaSeq 6000 sequencing system (Illumina). We used VarSeq version 2.2.2 (Golden Helix, https://www.goldenhelix.com) for sequence analysis; we used consensus sequences of these viruses as input to Nextclade version 1.10.1[19] for quality control, mutation calling, and Nextstrain clade assignment. We considered sequences <29,000 nt in length or with a Nextclade-assessed QC.overallStatus below good to be low quality and removed them.
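The quality filter described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the record type and field names are hypothetical, and the sketch assumes that "good" is the top QC status label, so any other label counts as below good.

```python
from dataclasses import dataclass

MIN_GENOME_LENGTH = 29_000  # nt, the length threshold stated in the text

@dataclass
class SequenceRecord:
    name: str
    length: int              # consensus sequence length in nucleotides
    qc_overall_status: str   # hypothetical field holding the Nextclade QC status

def passes_qc(record):
    # Keep a sequence only if it is long enough and its status is "good".
    return (record.length >= MIN_GENOME_LENGTH
            and record.qc_overall_status == "good")

# Hypothetical records for illustration only.
records = [
    SequenceRecord("seq_a", 29_750, "good"),
    SequenceRecord("seq_b", 28_400, "good"),      # too short
    SequenceRecord("seq_c", 29_800, "mediocre"),  # status below good
]
kept = [r.name for r in records if passes_qc(r)]
```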

Phylogenetic Analysis

To investigate the genetic relationship between Omicron viruses in NYC, we constructed a genotype network of all sequenced Omicron viruses; nodes represented nucleotide genotypes of viruses, and edges between nodes represented pairs of nucleotide genotypes with the highest genetic similarity. We visualized this network using Gephi version 0.9.2.[20] We also constructed a phylogenetic tree of those Omicron viruses in NYC using Nextstrain SARS-CoV-2 workflow version 3.0.6[21] and visualized it as a time-scaled tree using Auspice version 2.33.0 (https://auspice.us), which is part of the Nextstrain workflow. We then identified different clades of Omicron viruses based on the genotype network and the phylogenetic tree.
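One way to read the genotype network definition is sketched below. It assumes that each genotype is represented as a set of mutations and that similarity is measured by the number of mutations not shared between two genotypes; both are assumptions made for illustration, and the construction actually used in the study is in the authors' released source code. Mutation labels are placeholders.

```python
from itertools import combinations

def genotype_network(genotypes):
    """Link each genotype to the genotype(s) most similar to it.

    genotypes: dict mapping a genotype name to a frozenset of mutations.
    Returns a set of undirected edges, each a sorted (name, name) tuple.
    """
    names = sorted(genotypes)
    # Distance = number of mutations present in exactly one of the two genotypes.
    dist = {
        (a, b): len(genotypes[a] ^ genotypes[b])
        for a, b in combinations(names, 2)
    }
    edges = set()
    for node in names:
        neighbors = [(dist[tuple(sorted((node, other)))], other)
                     for other in names if other != node]
        if not neighbors:
            continue
        best = min(d for d, _ in neighbors)
        for d, other in neighbors:
            if d == best:  # ties are all kept
                edges.add(tuple(sorted((node, other))))
    return edges

# Placeholder genotypes for illustration only.
genotypes = {
    "g1": frozenset({"m1"}),
    "g2": frozenset({"m1", "m2"}),
    "g3": frozenset({"m1", "m2", "m3"}),
    "g4": frozenset({"m1", "m4"}),
}
edges = genotype_network(genotypes)
```

The resulting edge list could be exported to a file format that Gephi reads for layout and visualization.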

To investigate the introductions of the Omicron variant into NYC, we downloaded all global Omicron sequences collected before December 11, 2021, and their metadata from GISAID (https://www.gisaid.org).[22] We removed sequences with incomplete information, such as a missing collection date or location. We performed mutation calling of these contextual sequences using Nextclade version 1.10.1.[19] We applied the same quality control standards to the GISAID sequences as to our sequenced samples: sequences that were <29,000 nt long or had a Nextclade-assessed QC.overallStatus value below good were considered low quality and removed. To identify the genetic relationship between viruses clustered into different clades from NYC and the rest of the world, we constructed a phylogenetic tree using local viruses and global contextual viruses. We defined the viruses in NYC clustered into these clades as local viruses. For each clade, we selected as contextual viruses the global viruses detected before the time at which we detected the virus within the clade in NYC. We used Nextstrain SARS-CoV-2 workflow version 3.0.6[21] to construct this phylogenetic tree and Auspice to visualize it as a divergence-scaled tree.
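The contextual selection rule can be sketched as a date cutoff per clade. This sketch assumes that the cutoff is the earliest collection date of a local virus in the clade and that "before" means strictly earlier; the record layouts, names, and dates are hypothetical.

```python
from datetime import date

def first_detection_by_clade(local_records):
    """Return the earliest local collection date for each clade."""
    first = {}
    for _name, clade, collected in local_records:
        if clade not in first or collected < first[clade]:
            first[clade] = collected
    return first

def select_contextual(local_records, global_records):
    """For each clade, list global sequences collected before its local cutoff."""
    cutoffs = first_detection_by_clade(local_records)
    return {
        clade: [name for name, collected in global_records if collected < cutoff]
        for clade, cutoff in cutoffs.items()
    }

# Hypothetical records for illustration only.
local_records = [
    ("nyc-1", "clade_A", date(2021, 12, 1)),
    ("nyc-2", "clade_A", date(2021, 12, 5)),
    ("nyc-3", "clade_B", date(2021, 12, 8)),
]
global_records = [
    ("glob-1", date(2021, 11, 25)),
    ("glob-2", date(2021, 12, 3)),
    ("glob-3", date(2021, 12, 10)),
]
contextual = select_contextual(local_records, global_records)
```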

To investigate the genetic relationship between viruses from travelers and locals, we reanalyzed the same phylogenetic tree that was used to investigate the genetic relationship between Omicron viruses in NYC and highlighted travelers. To reveal the detailed transmission pattern of Omicron in NYC, we analyzed the mutational profiles of Omicron viruses in 2 local districts. In the mutational profiles, we presented only the substitutions that were not Omicron-defining substitutions.
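Restricting a mutational profile to substitutions that are not Omicron-defining amounts to a set difference, sketched below. The substitution labels are placeholders rather than real mutation names, and the function name is hypothetical.

```python
def non_defining_substitutions(profile, defining):
    """Return the substitutions in a profile that are not variant-defining."""
    return sorted(set(profile) - set(defining))

# Placeholder labels for illustration only; not real substitution names.
omicron_defining = {"sub_1", "sub_2", "sub_3"}
virus_profile = ["sub_1", "sub_2", "sub_3", "sub_x", "sub_y"]

shown = non_defining_substitutions(virus_profile, omicron_defining)
```

Viruses that share the same remaining substitutions can then be compared directly within a district.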

Data and Code Availability

We have provided GISAID accession numbers and metadata of Omicron sequences generated in this study (Appendix Table 1, https://wwwnc.cdc.gov/EID/article/29/2/22-0817-App1.xlsx) and the GISAID global Omicron sequences used in this study (Appendix Table 2). The source code used to generate the figures has been released at GitHub (https://github.com/wuaipinglab/sarscov2-omicron-nyc).
