
Microbes & Immunity                                                  Statistical modeling of COVID-19 trends



5. Conclusion

The comprehensive statistical analysis of COVID-19 trends, employing ARIMA, ARIMAX, multiple regression, and spatial autocorrelation models, provides valuable insights into the dynamics of the pandemic both globally and within the US. These findings highlight the strengths and limitations of different modeling approaches and the complexity of factors influencing COVID-19 case numbers.

The ARIMA models demonstrate robust performance in predicting short-term COVID-19 trends, particularly when case dynamics follow relatively stable patterns.¹⁴ However, the models show limitations when sudden changes occur in infection rates, such as those caused by sudden policy shifts or the emergence of new virus variants.⁴³ These situations often reduce predictive accuracy, suggesting that while ARIMA models effectively capture general trends, they may require augmentation or combination with other models to better account for sudden, non-linear changes.⁴⁴

The ARIMAX models, which incorporate exogenous variables such as vaccination data, provide a more nuanced analysis by accounting for external influences on COVID-19 case numbers.²² However, the effectiveness of the ARIMAX model depends heavily on the specific characteristics of the time period and the data. For instance, during periods when the impact of vaccination on case numbers is delayed or less pronounced, the model struggles to accurately capture the true relationship between variables.⁴⁵ This is particularly evident when vaccine uptake is gradual or when vaccination effects take time to appear in the population. Under these conditions, the model may overestimate or underestimate the influence of vaccination, leading to skewed forecasts.⁴⁶

Several challenges arise in applying the ARIMAX model. Firstly, the model assumes a direct and linear effect of the exogenous variable (vaccination rates) on the dependent variable (COVID-19 cases), which may not fully capture the complex, non-linear relationships involved.¹⁴ Factors such as varying vaccine efficacy, the emergence of new virus variants, shifts in public behavior, and policy interventions (e.g., lockdowns, mask mandates) influence the effectiveness of vaccination efforts in reducing case numbers.⁴⁷ If these factors are not properly incorporated, the ARIMAX model may incorrectly attribute changes in case numbers to vaccination, leading to inaccurate predictions.

Moreover, including vaccination data as an exogenous variable introduces the risk of multicollinearity, particularly if the vaccination rates correlate with other factors influencing the spread of COVID-19. Multicollinearity causes instability in coefficient estimates, reducing the reliability of the model’s predictions.³¹ In some cases, this instability leads the ARIMAX model to perform worse than the simpler ARIMA model, which does not encounter this complication.

Timing also plays a crucial role in the performance of the ARIMAX model. The effects of vaccination on COVID-19 cases often involve variable and unpredictable lags.⁴⁵ If the model fails to capture the appropriate lag structure, it could lead to inaccurate predictions. For example, the time required for immunity to develop post-vaccination or differences in response across population groups can cause mismatches between vaccination data and observed case changes, further complicating the accuracy of ARIMAX predictions.

Additionally, the ARIMAX model carries a risk of overfitting, especially when it becomes overly complex in relation to the available data. Overfitting occurs when the model captures noise or random fluctuations in the training data as meaningful patterns, reducing its predictive accuracy on new data.³² This issue becomes more pronounced when vaccination data are included, as the added complexity could reduce the model’s generalizability.

In the multiple regression analysis, several socioeconomic factors emerge as significant predictors of COVID-19 case numbers. For example, previous research indicates that factors such as population density, median income, and access to healthcare services demonstrate strong correlations with case numbers.⁴⁸ These findings highlight the unequal impact of the pandemic across various demographic groups and regions. Specifically, areas with higher population density and lower income levels tend to report higher case numbers, likely due to the challenges in practicing social distancing and the limited access to healthcare services.⁸

The regression analysis further emphasizes the importance of incorporating a broad range of socioeconomic factors when assessing the spread of COVID-19. However, the model also reveals certain limitations. The relationships between the independent variables and COVID-19 case numbers are not always linear, suggesting the need for more advanced modeling approaches that can capture these complexities.²⁹ Moreover, the presence of interaction effects among the variables, such as the combined impact of income and healthcare access, suggests that future models should explore these interactions to better understand the pandemic’s dynamics.

Spatial autocorrelation analyses provide additional insights, particularly regarding the geographic clustering

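The multicollinearity concern raised above for the ARIMAX model can be screened for before fitting. The following is a minimal sketch, not the procedure used in this study. It assumes exactly two exogenous predictors, in which case the variance inflation factor (VIF) of either predictor reduces to 1 / (1 - r²), where r is their Pearson correlation. The series names and values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical weekly series (illustrative values, not study data).
vaccination_rate = [5, 9, 14, 20, 27, 33, 40, 46]
mobility_index = [98, 95, 91, 88, 80, 77, 70, 66]   # moves almost in lockstep
unrelated_series = [3, 7, 2, 8, 4, 9, 1, 6]          # no clear relationship

vif_collinear = vif_two_predictors(vaccination_rate, mobility_index)
vif_unrelated = vif_two_predictors(vaccination_rate, unrelated_series)
print(f"VIF with mobility index:   {vif_collinear:.1f}")
print(f"VIF with unrelated series: {vif_unrelated:.3f}")
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a warning sign. With more than two exogenous variables, each VIF would instead come from regressing that variable on all of the others.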

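The lag-structure issue described above for the ARIMAX model can be explored by comparing the exogenous series with the case series at several candidate delays. The sketch below is only an illustration under stated assumptions, not the authors' method. It uses synthetic data in which cases respond to vaccinations exactly three steps later, and the helper name `best_lag` is hypothetical. In practice both series would usually be differenced or otherwise detrended first, because shared trends can inflate raw correlations.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_lag(exog, target, max_lag):
    """Return (lag, r) with the largest |corr(exog[t - lag], target[t])|."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = exog[: len(exog) - lag]   # exogenous values, shifted back by `lag`
        y = target[lag:]              # target values they are paired with
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Synthetic data (assumption): cases fall 3 steps after vaccinations rise.
vaccinations = [1, 4, 2, 8, 5, 7, 3, 9, 6, 10, 2, 5]
cases = [90, 90, 90] + [100 - 2 * v for v in vaccinations[:-3]]

lag, r = best_lag(vaccinations, cases, max_lag=5)
print(f"strongest association at lag {lag} (r = {r:.3f})")
```

The selected lag could then be used to shift the exogenous regressor before it is passed to an ARIMAX fit. If the strongest lag changes between sub-periods, that would be consistent with the variable, unpredictable delays noted in the text.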
Volume 2 Issue 3 (2025)    doi: 10.36922/MI025040007