
Microbes & Immunity                                                  Statistical modeling of COVID-19 trends



models have been utilized to predict COVID-19 case trends across different countries, often producing reasonably accurate forecasts over limited time horizons.³ However, the accuracy of ARIMA-based forecasts can vary significantly depending on the region and time period, influenced by factors such as viral mutations, government interventions, and changes in population behavior.⁴ One notable limitation of ARIMA models is their exclusive reliance on historical data: they do not consider external factors that might influence future trends (such as vaccination rates, policy changes, or behavioral adaptations), which introduces greater uncertainty in long-term predictions.

To address these limitations, the ARIMA with exogenous variables (ARIMAX) model incorporates external variables, such as vaccination rates, to enhance its predictive capabilities. By incorporating vaccination data, the model enables researchers to assess the potential impact of vaccination campaigns on future case trends, providing a more comprehensive understanding of epidemic dynamics.⁵ Previous studies have shown that vaccination plays a crucial role in mitigating the spread of COVID-19, leading to significant reductions in new case numbers following mass immunization efforts.⁶ However, most of these studies are region-specific or limited to particular periods and do not fully capture the complex interactions between vaccination efforts, virus mutations, and policy interventions.

In addition to time series forecasting, examining the relationship between COVID-19 incidence and socioeconomic factors is crucial. Previous research has highlighted the influence of indicators such as gross domestic product (GDP) per capita, healthcare infrastructure, and other socioeconomic variables in shaping the impact of the pandemic across different regions.⁷ For example, countries with greater healthcare spending and more robust medical systems tend to manage the crisis more effectively, resulting in lower mortality rates and more effective containment strategies.⁸ However, most of the existing research relies on single-variable analyses and does not fully capture the complex, multifaceted interactions among socioeconomic factors, which may contribute to significant disparities in COVID-19 outcomes across different countries.

This study aims to advance existing research by applying both ARIMA and ARIMAX models to predict short-term COVID-19 case trends in the United States (US) and globally. In the ARIMAX model, vaccination rates are incorporated as an exogenous variable to enhance predictive accuracy and provide deeper insights into the relationship between vaccination efforts and new case trends. Discrepancies between predicted and actual case numbers are examined to investigate potential causes for forecast anomalies, such as policy changes and virus mutations. Additionally, the study explores the influence of socioeconomic factors, including GDP per capita, healthcare resources, and human development index (HDI), on COVID-19 case numbers across countries. This multidimensional approach allows for a more comprehensive comparison of ARIMA and ARIMAX model performance, while also offering valuable perspectives on the broader determinants of the pandemic's spread, contributing to future epidemic prevention and control strategies.

2. Data collection

To conduct a comprehensive analysis of the COVID-19 pandemic and its associated factors, a diverse set of datasets was obtained from reputable sources, including the World Health Organization (WHO), Centers for Disease Control and Prevention, World Bank, and other national and international agencies. These datasets were selected based on their relevance, comprehensiveness, and frequency of updates to ensure that the analysis reflects the most accurate and current information available. As shown in Table 1, the data include daily and weekly reports of COVID-19 cases, deaths, and vaccination trends, alongside key socioeconomic indicators such as GDP per capita, HDI, Gini index, healthcare expenditures, and healthcare infrastructure data. These variables were essential for modeling the progression of the pandemic and for evaluating the impact of various factors on infection rates.

All statistical analyses were conducted using R version 4.4.3.

3. Methodology

3.1. Theoretical basis of the ARIMA model

The ARIMA model is a statistical method commonly used for analyzing and forecasting time series data. The general form of an ARIMA model with order (p, d, q) is represented by the following equation:

$$\phi(B)\,\nabla^{d} x_t = \theta(B)\,\varepsilon_t \qquad \text{(I)}$$

where:
(i) $\nabla^{d} = (1 - B)^{d}$ is the differencing operator, with $B$ representing the backshift operator,⁹
(ii) $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^{p}$ is the autoregressive (AR) coefficient polynomial,⁹
(iii) $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^{q}$ is the moving average (MA) coefficient polynomial,⁹
(iv) $\varepsilon_t$ denotes white noise error terms that satisfy the following properties: $E(\varepsilon_t) = 0$, $\mathrm{Var}(\varepsilon_t) = \sigma^{2}$,
(v) $E(\varepsilon_t \varepsilon_s) = 0$ for $s \neq t$,⁹
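
As an illustration of the operators defined in Equation (I), the following is a minimal sketch in plain Python. It is not the authors' code (the study reports using R), and the function names `difference`, `difference_binomial`, and `simulate_arma` are hypothetical names chosen for this example. It shows two things: applying the differencing operator (1 − B)^d, and generating the differenced series w_t = ∇^d x_t from a given noise sequence under the sign convention of the polynomials above, i.e. w_t = ϕ_1 w_{t−1} + ... + ϕ_p w_{t−p} + ε_t − θ_1 ε_{t−1} − ... − θ_q ε_{t−q}, with pre-sample values assumed to be zero.

```python
from math import comb


def difference(series, d=1):
    """Apply (1 - B)^d by taking first differences d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def difference_binomial(series, d):
    """Same operator via the binomial expansion of (1 - B)^d:
    sum over k of (-1)^k * C(d, k) * x_{t-k}."""
    return [
        sum((-1) ** k * comb(d, k) * series[t - k] for k in range(d + 1))
        for t in range(d, len(series))
    ]


def simulate_arma(phi, theta, eps):
    """Generate w_t satisfying phi(B) w_t = theta(B) eps_t.

    phi and theta hold the coefficients (phi_1..phi_p, theta_1..theta_q)
    using the sign convention of Equation (I). Values before the start
    of the sample are treated as zero (an assumption of this sketch).
    """
    w = []
    for t in range(len(eps)):
        ar = sum(phi[i] * w[t - 1 - i]
                 for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j]
                 for j in range(len(theta)) if t - 1 - j >= 0)
        w.append(ar + eps[t] - ma)
    return w


if __name__ == "__main__":
    x = [0, 1, 4, 9, 16]               # x_t = t^2, a quadratic trend
    print(difference(x, d=2))           # second difference removes the trend
    print(difference_binomial(x, d=2))  # same result from the expansion
    print(simulate_arma([0.5], [], [1.0, 0.0, 0.0]))  # AR(1) impulse response
```

For the quadratic series, both differencing routines return a constant sequence, which is the sense in which d = 2 renders such a trend stationary. In practice, the noise sequence would be drawn at random and the coefficients estimated from data rather than fixed by hand.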
            Volume 2 Issue 3 (2025)                        109                           doi: 10.36922/MI025040007