Microbes & Immunity SARS-CoV-2 complementary classification
A virus (n = 49) consisted of 939 nucleotide positions; and SARS-CoV-2 (n = 50) consisted of 3,089 nucleotide positions. Evolutionary analyses were conducted using MEGA6 to compute pairwise genetic distances across all sequence groups. 94

To enhance the temporal signal in assessing the genetic divergence of the viruses, we included sequences spanning multiple years and epidemic phases as follows. For HIV-1, the sequence collection years spanned 1993 – 2017, from 20 countries, including Argentina, Australia, Brazil, Botswana, China, Cyprus, Spain, Ethiopia, Finland, Indonesia, India, Iran, Kenya, Nigeria, Sweden, Thailand, Tanzania, Uganda, the UK, and the US. For HCV, the sequence collection years spanned 1993 – 2023, from 15 countries, including Australia, Switzerland, China, Cuba, Germany, Egypt, Spain, France, Ireland, India, Japan, Pakistan, Thailand, the UK, and the US. For influenza A, sequences were collected in multiple locations in Sweden spanning 1992 – 2010. For SARS-CoV-2, sequences were collected during 2020 – 2025 from Canada, Ghana, Japan, New Zealand, and several locations in the US, including Arizona, California, the District of Columbia, Iowa, Indiana, Michigan, Minnesota, North Carolina, Nevada, New Jersey, Oklahoma, Oregon, Pennsylvania, South Carolina, and Washington.

2.4. Maximum likelihood phylogenetic analysis

To assess evolutionary relationships and determine whether SARS-CoV-2 variants exhibit lineage divergence comparable to that observed in HIV-1, HCV, and influenza A virus, maximum likelihood (ML) phylogenetic trees were constructed using MEGA6. 94,95 The maximum composite likelihood (MCL) model was employed for nucleotide substitution, with rate variation among sites modeled using a gamma distribution (shape parameter = 1). The analysis included all nucleotide sequences and considered the first, second, and third codon positions as well as non-coding regions. Sequences containing gaps or missing data were removed before analysis. 94,95 An unrooted ML tree was generated with 100 ultrafast bootstrap replicates to evaluate branch support, with bootstrap values exceeding 70% considered statistically significant. 96 The resulting tree was visualized using FigTree software to facilitate the interpretation of lineage relationships. 97 The inclusion of ≥40 sequences per virus provided sufficient phylogenetic resolution. 96,98

2.5. Defining the genetic divergence threshold for variant classification

To establish an objective cutoff for defining distinct viral variants, the genetic distances of SARS-CoV-2 were compared to the following benchmarks: (1) HIV-1 subtypes (env region) mean divergence; (2) HCV subtypes (E1 region) mean divergence; and (3) influenza A virus subtypes (HA gene) mean divergence. ANOVA and Kruskal–Wallis tests were used to compare mean genetic divergence values among viral groups, and the analysis was conducted using IBM SPSS Statistics for Windows, Version 26.0 (IBM Corp, United States). It was hypothesized that if SARS-CoV-2 variants do not surpass the established genetic divergence thresholds derived from HIV-1, HCV, and influenza A virus, this would indicate that their present classification is driven more by transient mutations than by meaningful virological differentiation.

2.6. Estimation of speciation time for SARS-CoV-2 based on genetic distances compared to HIV-1, HCV, and influenza A virus

To estimate the time required for SARS-CoV-2 to reach speciation-level divergence, we compared its genetic distances to those observed in HIV-1, HCV, and influenza A virus. Speciation thresholds were defined based on the minimum genetic distances observed between recognized subtypes or genotypes in these viruses. The evolutionary rate of SARS-CoV-2 was set at 0.0004 – 0.002 substitutions/site/year (s/s/y), consistent with published estimates. 17,51,99-103 Using this rate, we applied the formula: Years = genetic distance threshold / evolutionary rate. To evaluate how recombination affects divergence, an adjusted evolutionary rate model was incorporated based on the estimated recombination frequency in SARS-CoV-2 genomes (~2.7% recombinant ancestry). 104,105 A 1.5× acceleration factor was applied, as recombination has been shown to elevate the viral evolutionary rate, 106,107 yielding adjusted evolutionary rates: lower bound, 0.0006 s/s/y; upper bound, 0.003 s/s/y. Monte Carlo simulation was performed by generating 1,000 random evolutionary rates sampled uniformly from the adjusted range. Simulated rate values and calculated time estimates were compiled into a structured dataset and analyzed using IBM SPSS Statistics for Windows, Version 26.0 (IBM Corp, United States). Descriptive statistics (mean, standard deviation, and 95% confidence intervals [CIs]) were computed for the estimated time required to reach each threshold.

2.7. Basis of the proposal of new SARS-CoV-2 classification scheme

To establish a robust and biologically meaningful classification system for SARS-CoV-2 variants, a two-pronged methodological approach was employed, integrating (1) genetic divergence thresholds derived from well-characterized viral evolution patterns, and (2) functional impact criteria identified through a comprehensive review of literature on viral pathogenesis,
Volume 2 Issue 3 (2025) 91 doi: 10.36922/MI025190042
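The time-to-threshold calculation described in Section 2.6 (Years = genetic distance threshold / evolutionary rate, with 1,000 rates drawn uniformly from the recombination-adjusted range) can be illustrated with a minimal Python sketch. This is not the authors' workflow, which used IBM SPSS; the function names, the fixed random seed, the example threshold of 0.06, and the normal-approximation 95% CI of the mean are all assumptions made here for illustration only.

```python
import random
import statistics

# Rate bounds taken from the text (substitutions/site/year).
BASE_RATE_LOW, BASE_RATE_HIGH = 0.0004, 0.002
ACCELERATION = 1.5  # recombination acceleration factor stated in the text
RATE_LOW = BASE_RATE_LOW * ACCELERATION    # approx. 0.0006 s/s/y
RATE_HIGH = BASE_RATE_HIGH * ACCELERATION  # approx. 0.003 s/s/y


def years_to_threshold(threshold, rate):
    """Years = genetic distance threshold / evolutionary rate."""
    return threshold / rate


def simulate_years(threshold, n_draws=1000, seed=0):
    """Draw rates uniformly from the adjusted range and convert each to years.

    The seed is a hypothetical choice so that the sketch is reproducible.
    """
    rng = random.Random(seed)
    rates = [rng.uniform(RATE_LOW, RATE_HIGH) for _ in range(n_draws)]
    return [years_to_threshold(threshold, r) for r in rates]


def summarize(years):
    """Mean, sample SD, and a normal-approximation 95% CI of the mean.

    Assumption: the CI is mean +/- 1.96 * SD / sqrt(n); the source does not
    state how its CIs were computed.
    """
    mean = statistics.mean(years)
    sd = statistics.stdev(years)
    half_width = 1.96 * sd / len(years) ** 0.5
    return {"mean": mean, "sd": sd, "ci95": (mean - half_width, mean + half_width)}


if __name__ == "__main__":
    # Hypothetical threshold (substitutions/site); NOT a value from the paper.
    example_threshold = 0.06
    summary = summarize(simulate_years(example_threshold))
    print(summary)
```

Because years is a nonlinear function of the rate, the mean of the simulated time estimates generally differs from the threshold divided by the mean rate, which is one reason to summarize the simulated times directly rather than plug a single average rate into the formula.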

