Microbes & Immunity Statistical modeling of COVID-19 trends
where:
(i) $y_t$ represents the value of the dependent variable at time $t$,
(ii) $\phi_i$ are the AR coefficients,
(iii) $\theta_j$ are the MA coefficients,
(iv) $\epsilon_t$ is the error term,
(v) $X_{t-k}$ represents the exogenous variable lagged by $k$ periods.[22]

Exogenous variables were incorporated to capture additional influences on the time series that could not be explained solely by its historical values.[9]

The ARIMAX model was fitted using an automated selection of ARIMA parameters (p, d, and q) while incorporating the selected exogenous variable. Model performance was evaluated by comparing the ARIMAX model against the ARIMA model using standard evaluation metrics, such as AIC, RMSE, and MAE.[14,16]

Model forecasts were generated for a holdout period to assess predictive accuracy. The inclusion of exogenous variables in the ARIMAX model enables an assessment of whether incorporating external factors can improve the forecasting performance and provide a more comprehensive understanding of the dynamics affecting the time series. This comparison between ARIMA and ARIMAX models provided insights into the benefits and limitations of incorporating external factors into the forecasting process.[22]

3.5. Evaluating the impact of vaccination on new COVID-19 cases

To analyze the relationship between vaccination rates and the number of new COVID-19 cases, several statistical methods were employed, including Granger causality testing, segmented regression, and regression discontinuity design (RDD). These methods support a clearer understanding of both the temporal relationships and potential causal effects of vaccination on the incidence of new cases.[9,14]

3.5.1. Granger causality test

The Granger causality test was employed to evaluate whether past vaccination rates provided predictive information for future new COVID-19 case numbers. This test determines whether one time series provides statistically significant information for forecasting another time series, suggesting a potential causal relationship.[23] In this context, the null hypothesis states that vaccination rates do not Granger-cause new COVID-19 cases, implying that past vaccination rates do not provide additional predictive value for future case numbers after accounting for past cases. A detailed mathematical formulation of the model is provided in the Supplementary File.

It is important to note that the Granger causality test does not confirm true causality in a philosophical or structural sense but rather indicates that past values of one series are useful in predicting another.

3.5.2. Segmented regression analysis and Chow test

Segmented regression analysis was employed to quantify the impact of vaccination on the trend of new COVID-19 cases. This method estimates changes in trends before and after an intervention,[24] such as the introduction of a vaccination program. The resulting coefficients provide estimates of the immediate change in case numbers and the change in the trend following the intervention.

To validate these findings, a Chow test was conducted to assess the presence of a structural break at the intervention point. This test evaluates whether the relationship between time and new COVID-19 cases differs significantly before and after the intervention.[25] Rejecting the null hypothesis indicates a statistically significant change in the trend post-intervention. A detailed mathematical formulation of both the segmented regression model and the Chow test is provided in the Supplementary File.

3.5.3. RDD

An RDD was employed to estimate the causal effect of vaccine introduction on new COVID-19 cases, using the start of mass vaccination as the cutoff point.[26] RDD assumes that observations just before and after the cutoff are comparable except for the treatment effect. This effect is estimated by comparing new COVID-19 cases immediately before and after vaccination introduction. The parameter of interest ($\beta$) represents the effect of the intervention at the cutoff. A non-parametric approach was used to flexibly model the relationship between time and new cases on either side of the cutoff. Detailed mathematical formulation and implementation of the RDD model are provided in the Supplementary File.

While RDD strengthens causal inference through a quasi-experimental design, it remains dependent on the assumption that other confounding factors vary continuously at the cutoff. As such, it does not provide definitive proof of causality.

3.6. Regression analysis of COVID-19 infection rates and determinants

3.6.1. Linear regression analysis of COVID-19 infection rates and economic development

To investigate the relationship between COVID-19 infection rates and economic development, a linear regression analysis was conducted with the infection rate as the dependent variable and GDP per capita as
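
As an illustration of the Granger causality test described in Section 3.5.1, the following Python sketch compares a restricted autoregression of the target series on its own lags with an unrestricted one that also includes lags of the candidate predictor, and reports the resulting F statistic. It is a minimal sketch under stated assumptions, not the authors' implementation (which is given in the Supplementary File): the helper names `ols_rss` and `granger_f`, the lag order of 2, and the synthetic `vacc` and `cases` series are all hypothetical, and only the standard library is used, so no p-value is computed.

```python
import random


def ols_rss(rows, y):
    """Residual sum of squares of a least-squares fit of y on the columns of rows."""
    k = len(rows[0])
    # Build the normal equations (X'X) b = X'y.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, k):
            f = a[r][col] / a[col][col]
            for j in range(col, k):
                a[r][j] -= f * a[col][j]
            c[r] -= f * c[col]
    # Back substitution for the coefficients.
    beta = [0.0] * k
    for i in reversed(range(k)):
        s = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (c[i] - s) / a[i][i]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def granger_f(target, predictor, lags=2):
    """F statistic for H0: lagged `predictor` adds nothing beyond lagged `target`."""
    restricted, unrestricted, y = [], [], []
    for t in range(lags, len(target)):
        own = [target[t - i] for i in range(1, lags + 1)]
        other = [predictor[t - i] for i in range(1, lags + 1)]
        restricted.append([1.0] + own)            # intercept + own lags
        unrestricted.append([1.0] + own + other)  # ... + predictor lags
        y.append(target[t])
    rss_r = ols_rss(restricted, y)
    rss_u = ols_rss(unrestricted, y)
    df_den = len(y) - (2 * lags + 1)
    return ((rss_r - rss_u) / lags) / (rss_u / df_den)


# Synthetic stand-ins for the real series (NOT the paper's data):
# lagged "vaccination" values lower the next period's "cases".
random.seed(0)
n = 200
vacc = [random.gauss(0, 1) for _ in range(n)]
cases = [0.0]
for t in range(1, n):
    cases.append(0.5 * cases[t - 1] - 0.8 * vacc[t - 1] + random.gauss(0, 0.5))

f_vacc_to_cases = granger_f(cases, vacc)
f_cases_to_vacc = granger_f(vacc, cases)
print(f_vacc_to_cases, f_cases_to_vacc)
```

A large F statistic in one direction only is what the test reads as predictive precedence. As the text notes, this shows that one series helps forecast the other, not that it causes it in a structural sense.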
Volume 2 Issue 3 (2025) 113 doi: 10.36922/MI025040007
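
The segmented regression and Chow test of Section 3.5.2 can be sketched in the same spirit. The example below assumes the common textbook parameterization $y_t = b_0 + b_1 t + b_2 D_t + b_3 (t - t_0) D_t + \epsilon_t$, where $D_t = 1$ from the intervention time $t_0$ onward, so that $b_2$ is the immediate level change and $b_3$ the change in slope. The Chow statistic compares one pooled line with separate pre- and post-intervention lines. Whether this matches the formulation in the Supplementary File is an assumption, and the `ols` helper, the breakpoint `t0 = 30`, and the synthetic data are hypothetical.

```python
import random


def ols(rows, y):
    """Return (coefficients, residual sum of squares) for least squares of y on rows."""
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, solved by Gaussian elimination with pivoting.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, k):
            f = a[r][col] / a[col][col]
            for j in range(col, k):
                a[r][j] -= f * a[col][j]
            c[r] -= f * c[col]
    beta = [0.0] * k
    for i in reversed(range(k)):
        s = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (c[i] - s) / a[i][i]
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss


# Synthetic series (NOT the paper's data): rising trend, then a drop of 10
# and a slope change of -3 at the hypothetical intervention time t0.
random.seed(1)
n, t0 = 60, 30
times = list(range(n))
y = [5 + 2 * t + (-10 - 3 * (t - t0)) * (t >= t0) + random.gauss(0, 0.5)
     for t in times]

# Segmented regression: intercept, pre-trend, level change, slope change.
design = [[1.0, t, float(t >= t0), (t - t0) * float(t >= t0)] for t in times]
(b0, b1, level_change, slope_change), _ = ols(design, y)

# Chow test: pooled line vs. separate lines before and after t0.
line = [[1.0, t] for t in times]
_, rss_pooled = ols(line, y)
_, rss_pre = ols(line[:t0], y[:t0])
_, rss_post = ols(line[t0:], y[t0:])
k = 2  # parameters per line (intercept and slope)
rss_split = rss_pre + rss_post
chow_f = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
print(level_change, slope_change, chow_f)
```

Under the null hypothesis of no structural break, the statistic is compared with an F distribution with k and n − 2k degrees of freedom. That lookup is omitted here because the standard library provides no F distribution.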

