Page 213 - IJOCTA-15-4

Data-driven optimization and parameter estimation for an epidemic model
population. The question of the reproduction number of the disease on the metric graph, $R_0$, is still an open problem at this time.

To validate the model (i.e., to determine whether the inclusion of additional transport terms is necessary), one would ideally start from a general PDE with candidate terms and use data to decide which terms should remain in the model [79,80]. This process may involve machine learning or other techniques beyond the scope of the present work. Here, we adopt the model as described above, consistent with the previous studies.

The system of PDEs is approximated numerically using the validated methodology introduced in our previous work [30]. In brief, we use a forward finite difference (FD) approximation in time and a centered FD in space. This explicit numerical scheme is easily scalable to larger networks without needing to invert a large matrix.

2.3. Optimization-based parameter estimation

We compare our model to the smoothed Ministry of Health data [64] from the first fully recorded wave of COVID-19 in Poland: early February through mid-May 2021 (see Appendix A for a more thorough discussion of data pre-processing). We seek to minimize the difference between the smoothed data and the function output $I_v(t)$ at each vertex over time.

There are well-documented cases of under-reporting [81], with one study estimating that only 60% of COVID cases [48] in Poland were detected, while another study claims it may be as low as 1/4 of all cases [49]. Some reasons for under-reporting, both in Poland and worldwide, may include the presence of asymptomatic cases [82], limited access to testing [83], and reluctance either to be tested [83] or to seek medical care [84].

Though the amplitudes of the incidence rates are unreliable due to under-reporting, it is reasonable to assume that the shapes of the infection curves are more reliable than the values, in particular the time of peak infection and the variance. Therefore, for vertex $v$, we use the 2-norm of the difference between the smoothed data and the model output, both first normalized by their maxima. This normalization preserves the shape of the infection curves while under-emphasizing the unknown amplitude. It conveys the trends in the data without requiring precise knowledge of the total number of infected people. We discard the first and last 20 days of the modeled period when computing the normalized 2-norm difference, so that we do not place too much emphasis on the small values where data may be lacking.

As we will see in Section 3.1, the objective function exhibits Rosenbrock-like behavior, with steep sides and a long, corrugated valley of local minima. This structure indicates that the inverse problem is likely ill-posed; there is not one unique set of parameters that acts as a global minimizer. Instead, we aim to find a plausible set of parameters that minimizes the 2-norm and qualitatively matches the data, particularly the timing and width of the infection curve [62,63].

For the metric graph under study, we must determine or adjust the following global sensitivity parameters:

• Adjustments ($c_\beta$ and $c_\eta$) to the approximated transmission rate and removal rate at the vertex.
• Adjustment ($c_\lambda \in (0, 1)$) to the vertex-to-edge exchange rate.
• Edge-to-vertex exchange rate ($\alpha \in (0, 1)$).
• Global scaling parameter for edge-to-edge exchange ($c_v$).
• Edge diffusion coefficient ($d_e$) for the network.

For the first optimization step, we assume these six scaling parameters are constant across the entire network. Since these global scaling parameters multiply our data-informed initial guesses (Appendix B), it is important to have a good initial guess. Once a good global set of scaling parameters is found, the individual parameters can be manually adjusted to improve the fit.

A global sensitivity analysis (Appendix C) showed that the model is highly sensitive to the scaling $c_\beta$ of the transmission rates $\beta_v$. Scaling the removal rates $\eta_v$ by $c_\eta$ contributes more to the time and amplitude of the peak infection than to the cumulative infection rate, while the edge-to-vertex transmission rate $\alpha$ is more influential in the number of cumulative infections. The model is not very sensitive to changes in the diffusion coefficient $d_e$, scaling of the edge-to-vertex exchange rate by $c_\lambda$, or changes to the edge-to-edge skipping parameter $c_v$ (the latter was also observed in our previous work [30]). Thus, we make some simplifications: we keep both the edge diffusion coefficient and the global scaling of the edge-to-edge skipping parameter constant for the entire network.

2.3.1. Optimization methodology

Our optimization has two phases: global and local. Starting with the initial guesses described in Appendix B, we first fit a single set of optimization parameters ($c_\beta$, $c_\eta$, $c_\lambda$, $c_v$, $\alpha$) for the entire network. As our objective function, we use