Microbes & Immunity SARS-CoV-2 complementary classification
HCV, and influenza A virus); (2) estimate the minimum evolutionary timeframe required for SARS-CoV-2 to reach a level of genetic divergence equivalent to that observed in HIV-1, HCV, and influenza A virus; and (3) propose a complementary classification framework for SARS-CoV-2 that is based on evolutionary and phylogenetic principles.

2. Materials and methods

2.1. Study design, dataset acquisition, and sequence selection

This study aimed to propose a practical and evolutionarily informed classification framework for SARS-CoV-2, grounded in a conservative, objective threshold of genetic divergence that could support the identification of biologically distinct lineages as the virus continues to evolve. To achieve this, the evolutionary divergence of SARS-CoV-2 variants was compared with that of three well-characterized RNA viruses – HIV-1, HCV, and influenza A virus. These viruses were selected as their lineage differentiation is associated with substantial biological and functional differences in transmissibility, immune evasion, and pathogenesis.

The viral genomes and corresponding genes chosen for analysis were: (1) SARS-CoV-2: S gene (3,822 bases); (2) HIV-1: env gene (2,592 bases); (3) HCV: E1 gene (576 bases); and (4) influenza A virus: HA gene (1,707 bases). These regions were selected due to their relevance in viral entry, immune evasion, and vaccine targeting, besides being recognized as rapidly evolving genomic regions of each virus. 80-83 Full-length viral sequences were retrieved from the Los Alamos HIV Database, Los Alamos HCV Database, GenBank Influenza Virus Database, and NCBI Virus Database. 84-87 Selection criteria included: (1) complete coding sequences (removal of sequences with large gaps or ambiguous bases, defined as >1% nucleotide base); (2) diversity in collection date and geographical origin to ensure representation of global viral evolution; (3) exclusion of recombinant sequences, except where recombination is inherent to viral evolution (e.g., HIV-1 circulating recombinant forms [CRFs]); and (4) at least 40 representative sequences per viral subtype/lineage to ensure robust phylogenetic reconstruction. 88 The selection of 50 (SARS-CoV-2), 58 (HIV-1), 40 (HCV), and 49 (influenza A virus) sequences adhered to this standard.

2.2. Selection of viral subtypes/lineages

To ensure a comprehensive comparison of SARS-CoV-2 genetic divergence with well-characterized RNA viruses, representative subtypes and lineages for HIV-1, HCV, and influenza A virus were selected based on their global prevalence and established phylogenetic classification. For SARS-CoV-2, sequences from the major VOCs were included in the analysis. These variants were selected based on their distinct S protein mutations and their role in altering transmissibility, immune escape, and vaccine efficacy.

HIV-1 exhibits significant genetic diversity, with distinct subtypes and CRFs contributing to its global dissemination. The following subtypes/CRFs, representing the most common lineages, were selected: A1, B, C, D, CRF01_AE, and CRF02_AG. 89 These subtypes were chosen due to their high epidemiological relevance and established divergence patterns, which serve as a benchmark for evaluating the evolutionary dynamics of SARS-CoV-2. 90

HCV classification is based on well-defined genotypes and subtypes, which exhibit substantial genetic diversity. 91 The following subtypes were selected: subtype 1a, subtype 1b, subtype 3a, and subtype 4a. These subtypes represent widely studied lineages with distinct evolutionary trajectories, providing an appropriate comparison for assessing SARS-CoV-2 divergence. Influenza A virus evolves through antigenic drift and shift, leading to the emergence of distinct subtypes with significant genetic and antigenic variation. 92 The following subtypes were included: H1N1 and H3N2. These subtypes were selected as they represent the dominant circulating influenza lineages over the past century, providing a well-documented model of viral evolution and antigenic variation. For SARS-CoV-2, sequences from the five major VOCs designated by the WHO were analyzed, representing key evolutionary lineages that have dominated transmission waves globally: Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1), Delta (B.1.617.2), and Omicron (B.1.1.529). 43

2.3. Sequence alignment and genetic distance estimation

Multiple sequence alignments were performed separately for HIV-1, HCV, influenza A, and SARS-CoV-2 using MAFFT v7.490, employing the L-INS-i algorithm, which optimizes alignment accuracy for sequences with frequent insertions and deletions, particularly in the HIV-1 env and influenza HA genes. 93 Alignment quality was manually reviewed in MEGA6, and sequences with poor alignment were excluded. Genetic distances were estimated using the maximum composite likelihood (MCL) model, 94 with rate variation among sites modeled using a gamma distribution (shape parameter = 1). Codon positions included 1st, 2nd, and 3rd coding positions, as well as non-coding regions. 94,95 All positions containing gaps or missing data were removed before analysis. The final alignment of HIV-1 (n = 58) consisted of 2,214 nucleotide positions; HCV (n = 40) consisted of 513 nucleotide positions; influenza
Volume 2 Issue 3 (2025) 90 doi: 10.36922/MI025190042
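Two of the data-preparation steps described in Sections 2.1 and 2.3 are simple enough to illustrate in a few lines: screening out sequences with more than 1% ambiguous bases, and removing every alignment position that contains a gap or missing data. The Python sketch below is only an illustration under stated assumptions and is not the procedure used in the study, which relied on MAFFT and MEGA6. The function names are hypothetical. It also assumes that the 1% threshold is measured as the fraction of characters other than A, C, G, or T, which the text does not state explicitly.

```python
from __future__ import annotations

# Assumption: only A, C, G, and T count as resolved bases; gaps ("-"),
# "N", and IUPAC ambiguity codes are all treated as missing data.
UNAMBIGUOUS = set("ACGT")


def ambiguous_fraction(seq: str) -> float:
    """Return the fraction of characters that are not A, C, G, or T."""
    seq = seq.upper()
    if not seq:
        return 1.0  # an empty sequence carries no usable data
    return sum(base not in UNAMBIGUOUS for base in seq) / len(seq)


def screen_sequences(seqs: dict[str, str],
                     max_fraction: float = 0.01) -> dict[str, str]:
    """Keep sequences whose ambiguous fraction does not exceed max_fraction."""
    return {name: seq for name, seq in seqs.items()
            if ambiguous_fraction(seq) <= max_fraction}


def complete_deletion(aligned: dict[str, str]) -> dict[str, str]:
    """Drop every column in which any sequence has a gap or ambiguous base."""
    if not aligned:
        return {}
    names = list(aligned)
    rows = [aligned[name].upper() for name in names]
    if len({len(row) for row in rows}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    # A column is kept only if every sequence has a resolved base there.
    keep = [i for i in range(len(rows[0]))
            if all(row[i] in UNAMBIGUOUS for row in rows)]
    return {name: "".join(row[i] for i in keep)
            for name, row in zip(names, rows)}
```

Applying `complete_deletion` to an alignment shortens every sequence by the same number of columns. That is consistent with the final alignments reported in Section 2.3 being shorter than the gene lengths listed in Section 2.1, for example 2,214 positions against 2,592 bases for HIV-1 env.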

