Microbes & Immunity Statistical modeling of COVID-19 trends
models have been utilized to predict COVID-19 case trends across different countries, often producing reasonably accurate forecasts over limited time horizons.³ However, the accuracy of ARIMA-based forecasts can vary significantly depending on the region and time period, influenced by factors such as viral mutations, government interventions, and changes in population behavior.⁴ One notable limitation of ARIMA models is their exclusive reliance on historical data: they do not consider external factors that might influence future trends, such as vaccination rates, policy changes, or behavioral adaptations, which introduces greater uncertainty in long-term predictions.

To address these limitations, the ARIMA with exogenous variables (ARIMAX) model incorporates external variables, such as vaccination rates, to enhance its predictive capabilities. By incorporating vaccination data, the model enables researchers to assess the potential impact of vaccination campaigns on future case trends, providing a more comprehensive understanding of epidemic dynamics.⁵ Although previous studies have shown that vaccination plays a crucial role in mitigating the spread of COVID-19, leading to significant reductions in new case numbers following mass immunization efforts,⁶ most of these studies are region-specific or limited to particular periods and do not fully capture the complex interactions between vaccination efforts, virus mutations, and policy interventions.

In addition to time series forecasting, examining the relationship between COVID-19 incidence and socioeconomic factors is crucial. Previous research has highlighted the influence of indicators such as gross domestic product (GDP) per capita, healthcare infrastructure, and other socioeconomic variables in shaping the impact of the pandemic across different regions.⁷ For example, countries with greater healthcare spending and more robust medical systems tend to manage the crisis more effectively, resulting in lower mortality rates and more effective containment strategies.⁸ However, most of the existing research relies on single-variable analyses and does not fully capture the complex, multifaceted interactions among socioeconomic factors, which may contribute to significant disparities in COVID-19 outcomes across different countries.

This study aims to advance existing research by applying both ARIMA and ARIMAX models to predict short-term COVID-19 case trends in the United States (US) and globally. In the ARIMAX model, vaccination rates are incorporated as an exogenous variable to enhance predictive accuracy and provide deeper insights into the relationship between vaccination efforts and new case trends. Discrepancies between predicted and actual case numbers are examined to investigate potential causes for forecast anomalies, such as policy changes and virus mutations. Additionally, the study explores the influence of socioeconomic factors, including GDP per capita, healthcare resources, and human development index (HDI), on COVID-19 case numbers across countries. This multidimensional approach allows for a more comprehensive comparison of ARIMA and ARIMAX model performance, while also offering valuable perspectives on the broader determinants of the pandemic's spread, contributing to future epidemic prevention and control strategies.

2. Data collection

To conduct a comprehensive analysis of the COVID-19 pandemic and its associated factors, a diverse set of datasets was obtained from reputable sources, including the World Health Organization (WHO), Centers for Disease Control and Prevention, World Bank, and other national and international agencies. These datasets were selected based on their relevance, comprehensiveness, and frequency of updates to ensure that the analysis reflects the most accurate and current information available. As shown in Table 1, the data include daily and weekly reports of COVID-19 cases, deaths, and vaccination trends, alongside key socioeconomic indicators such as GDP per capita, HDI, Gini index, healthcare expenditures, and healthcare infrastructure data. These variables were essential for modeling the progression of the pandemic and for evaluating the impact of various factors on infection rates.

All statistical analyses were conducted using R version 4.4.3.

3. Methodology

3.1. Theoretical basis of the ARIMA model

The ARIMA model is a statistical method commonly used for analyzing and forecasting time series data. The general form of an ARIMA model with order (p, d, q) is represented by the following equation:

$$\phi(B)\,\nabla^{d} x_t = \theta(B)\,\varepsilon_t \qquad \text{(I)}$$

where:
(i) $\nabla^{d} = (1-B)^{d}$ is the differencing operator, with $B$ representing the backshift operator,⁹
(ii) $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^{p}$ is the autoregressive (AR) coefficient polynomial,⁹
(iii) $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^{q}$ is the moving average (MA) coefficient polynomial,⁹
(iv) $\varepsilon_t$ denotes white noise error terms that satisfy the following properties: $E(\varepsilon_t) = 0$, $\mathrm{Var}(\varepsilon_t) = \sigma^{2}$,
(v) $E(\varepsilon_t \varepsilon_s) = 0$ for $s \neq t$,⁹
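
To make the differencing operator in Equation (I) concrete, the following minimal sketch applies (1 − B)^d to a short series in two equivalent ways: by differencing d times, and by expanding (1 − B)^d with binomial coefficients. It is an illustration only. The study itself reports using R, and the function names (`difference`, `difference_binomial`) and the case counts below are hypothetical, not taken from the study's data or code.

```python
from math import comb


def difference(series, d=1):
    """Apply (1 - B)^d by taking first differences d times."""
    out = list(series)
    for _ in range(d):
        # (1 - B) x_t = x_t - x_{t-1}
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


def difference_binomial(series, d=1):
    """Apply (1 - B)^d via its binomial expansion:
    sum over k of (-1)^k * C(d, k) * x_{t-k}.
    """
    coeffs = [(-1) ** k * comb(d, k) for k in range(d + 1)]
    return [
        sum(c * series[t - k] for k, c in enumerate(coeffs))
        for t in range(d, len(series))
    ]


# Hypothetical daily case counts (assumption, not study data).
cases = [100, 130, 170, 220, 280, 350]

first = difference(cases, d=1)    # removes a constant level
second = difference(cases, d=2)   # removes a linear trend in the increments

print(first)
print(second)
print(difference_binomial(cases, d=2) == second)
```

In this toy series the daily increments grow by a fixed amount, so the second difference is constant. That is the sense in which d in an ARIMA(p, d, q) model is chosen to make the differenced series look stationary before the AR and MA polynomials are fitted.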
Volume 2 Issue 3 (2025) 109 doi: 10.36922/MI025040007

