
Microbes & Immunity                                                  Statistical modeling of COVID-19 trends




non-significant p=0.636. This suggests that while there may be an observable shift in the trend of new cases at the intervention point, it is not statistically significant at conventional levels. As shown in Table S4, the RDD results further support the idea that the immediate impact of vaccination is not statistically significant. Figure 5B provides a visualization of the RDD results, highlighting the discontinuity at the intervention point. The non-significant result from the RDD analysis is consistent with the findings from both the Granger causality test and the segmented regression analysis, indicating that the immediate impact of vaccination is not significant.

Figure 5. Segmented and regression discontinuity analysis of COVID-19 cases. (A) Segmented regression analysis of COVID-19 cases. (B) Regression discontinuity plot, where blue points represent the observed data and the red line indicates the fitted regression discontinuity model. The vertical black line marks the intervention date (December 13, 2020).

4.5. Forecast on COVID-19 cases using ARIMAX model, with vaccination rates as the exogenous variable

Based on the results in the previous section, where vaccination demonstrates a significant long-term impact on the reduction of new COVID-19 cases, a logical extension was made to incorporate the number of vaccinations as an exogenous variable in ARIMAX models. It was hypothesized that the inclusion of this variable could improve forecast accuracy compared to the standard ARIMA model, which does not account for such external factors.

To test this hypothesis, the dataset starting from December 13, 2020 (the beginning of the vaccination campaign) was employed. Given the evidence that the impact of vaccination is more pronounced over time, the first forecast period selected spanned from January 5, 2020, to June 27, 2021, approximately 6 months after vaccination began. This period was used to predict the cumulative number of cases for the subsequent 3 months. Following this, the training data period was gradually extended across 3 time periods:
(i) January 5, 2020 – June 27, 2021,
(ii) January 5, 2020 – December 26, 2021,
(iii) January 5, 2020 – September 25, 2022.

Figures 6, S5, and S6 compare the ARIMA and ARIMAX model predictions across these periods, while Tables 5, S5, and S6 present the evaluation metrics (AIC, RMSE, and MAE) for both models.

These results indicate that the ARIMAX model generally produces forecasts that are closer to the actual data than those of the ARIMA model, as evidenced by lower RMSE and MAE values in certain time periods. However, in some cases, the ARIMAX model exhibits greater deviation from the actual data, resulting in higher RMSE values. Notably, improvements in RMSE and MAE do not always correspond to lower AIC values. For example, in the third period, although the ARIMAX model provides more accurate predictions (reflected by lower RMSE and MAE values), its AIC value is higher than that of the ARIMA model. This highlights the trade-off between model complexity and goodness-of-fit inherent in the AIC calculations.

4.6. Multivariate regression analysis of global COVID-19 infection rates

To investigate the factors influencing COVID-19 infection rates across different countries, it was initially hypothesized that countries with advanced healthcare systems and greater access to medical resources would exhibit lower infection rates. However, an analysis of the top 10 countries by infection rate as of December 31, 2023 (Figure S7) contradicts this assumption. Several highly developed countries, including Luxembourg, Denmark, and Austria, appear among those with the highest infection rates, challenging the initial hypothesis.

To further examine this relationship, a linear regression analysis was conducted using GDP per capita as an indicator of a country’s level of development and the COVID-19 infection rate as the outcome variable. The scatterplot with the fitted regression line is illustrated in
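
The AIC versus RMSE/MAE trade-off noted in Section 4.5 can be made concrete with a small sketch. It is not the authors' code, and every number in it is hypothetical rather than taken from the paper's tables. It assumes Gaussian errors for the log-likelihood, assumes that RMSE and MAE are computed on the forecast window, and uses illustrative parameter counts (3 for ARIMA, 4 for ARIMAX with one exogenous coefficient).

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and forecast values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between observed and forecast values."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood of in-sample residuals.

    Assumes i.i.d. normal errors with variance estimated as RSS / n.
    """
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical in-sample residuals (NOT the paper's data). The ARIMAX fit is
# only marginally tighter, but it carries one extra coefficient.
arima_resid = [1.0, -1.0] * 4
arimax_resid = [0.98, -0.98] * 4
arima_aic = aic(gaussian_log_likelihood(arima_resid), n_params=3)
arimax_aic = aic(gaussian_log_likelihood(arimax_resid), n_params=4)

# Hypothetical hold-out values and forecasts (NOT the paper's data).
actual = [100.0, 110.0, 120.0]
arima_forecast = [96.0, 104.0, 112.0]
arimax_forecast = [98.0, 107.0, 116.0]

for name, forecast, score in [
    ("ARIMA", arima_forecast, arima_aic),
    ("ARIMAX", arimax_forecast, arimax_aic),
]:
    print(f"{name}: AIC={score:.2f} "
          f"RMSE={rmse(actual, forecast):.2f} MAE={mae(actual, forecast):.2f}")
```

In this toy setup the ARIMAX forecasts have smaller errors, yet the ARIMAX AIC is higher: the small gain in in-sample likelihood does not offset the penalty of 2 that AIC adds for the extra coefficient. This matches the pattern described for the third period.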


            Volume 2 Issue 3 (2025)                        120                           doi: 10.36922/MI025040007