
ML-based $C_d$ for side trapezoidal labyrinth weirs
$$\mathrm{DDR} = \frac{P}{O} - 1 \qquad \text{(XV)}$$

where $O$ and $P$ are the observed and predicted values, respectively; $\bar{O}$ and $\bar{P}$ are the means of the observed and predicted values, respectively; and $N$ is the total number of data points. The first three criteria assess only the mean error values associated with the implemented models and give no information on how the errors are distributed. To address this deficiency, Noori et al.⁵¹ introduced the DDR index. To enhance interpretation and visualization, the Gaussian distribution of DDR values should be depicted as a standard normal distribution. To achieve this, two steps are followed. First, the DDR values ($C_d$) are standardized, and the normalized DDR value ($C_{d(\mathrm{DDR})}$) is calculated using a Gaussian function. Second, the $C_{d(\mathrm{DDR})}$ values are plotted against their standardized counterparts ($Z_{\mathrm{DDR}}$). In the $Z_{\mathrm{DDR}}$ vs. $C_{d(\mathrm{DDR})}$ graph, a closer alignment of the error distribution with the centerline and larger $C_{d(\mathrm{DDR})}$ values indicate higher accuracy.

2.5. Overall methodology

Machine learning prediction models were developed using the SVM, GEP, ANN, and MARS methods discussed above. Of the data gathered from the experiments, 70% was allocated to model training and 30% to model testing. The developed models were then evaluated using the performance metrics given in Equations XII-XV.

3. Results and discussion

Table 4 presents an overview of the statistical performance criteria for the prediction models. To establish the SVM, both the RBF and the polynomial kernel functions were evaluated. Testing showed that the RBF kernel function outperformed the polynomial kernel function.

In the SVM setup, the tuning parameters $C$ and $\gamma$ were set to 67 and 5.5, respectively. The values of RMSE, MAE, $R^2$, and $C_{d(\mathrm{DDRmax})}$ were 0.02679, 0.02318, 0.79808, and 7.50, respectively, during the training phase and 0.0113, 0.0066, 0.7573, and 10.36, respectively, during the testing phase.

Table 5 summarizes the setting parameters of the superior GEP model. The values of RMSE, MAE, $R^2$, and $C_{d(\mathrm{DDRmax})}$ were 0.03847, 0.03275, 0.74962, and 5.35, respectively, during the training phase and 0.0181, 0.0109, 0.5624, and 6.32, respectively, during the testing phase. The operators involved in the GEP are $+$, $-$, $\times$, $/$, $\sqrt{\ }$, $e^x$, $\ln$, $x^2$, $x^3$, and $x^{1/3}$. Figure 3 shows the expression tree (ET) of the GEP output. The corresponding constants in Figure 3 are G1C1 = −0.476807, G2C0 = 2.40567, G2C1 = −0.476807, G3C0 = −0.395355, and G3C1 = −0.665436. Note that d0, d1, and d2 stand for $B/w_1$, $H_d/P$, and $w_1/w_2$, respectively.

For the MLP model, the values of RMSE, MAE, $R^2$, and $C_{d(\mathrm{DDRmax})}$ were 0.02426, 0.02031, 0.81602, and 8.07, respectively, during the training phase and 0.0111, 0.0065, 0.6878, and 11.32, respectively, during the testing phase.

The last model employed in this investigation is the MARS model, for which the same metrics were 0.04995, 0.04381, 0.51011, and 4.02, respectively, during the training phase and 0.0245, 0.0149, 0.5593, and 4.64, respectively, during the testing phase. In developing the MARS model, an initial set of 21 basis functions was considered in the first step. In the second step (the pruning step), 18 of these basis functions were removed, leaving an optimal MARS model with three basis functions. The resulting MARS model is presented in Equation XVI, and its detailed representation is given in Table 6.
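The $C_{d(\mathrm{DDRmax})}$ values reported in this section follow from the DDR procedure of Equation XV. The following is a minimal sketch of that calculation, not the authors' implementation. It assumes that the "Gaussian function" is the normal density evaluated with the sample mean and (population) standard deviation of the DDR values, which the text does not spell out; the function names and the sample data are hypothetical.

```python
import numpy as np


def ddr(observed, predicted):
    """Developed discrepancy ratio (Equation XV): DDR = P / O - 1."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return predicted / observed - 1.0


def standardized_ddr(ddr_values):
    """Return (Z_DDR, normalized DDR) for an array of DDR values.

    Assumption: the normalized DDR is the Gaussian density with the
    sample mean and population standard deviation of the DDR values.
    """
    mu = ddr_values.mean()
    sigma = ddr_values.std(ddof=0)
    z = (ddr_values - mu) / sigma  # step 1: standardize
    density = np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))
    return z, density


# Hypothetical observed and predicted discharge coefficients.
observed = np.array([0.50, 0.55, 0.60, 0.65, 0.70])
predicted = np.array([0.52, 0.53, 0.61, 0.66, 0.68])

d = ddr(observed, predicted)
z, q = standardized_ddr(d)

# Largest normalized DDR value in the sample (the "DDRmax"-type quantity).
q_max = q.max()

# Upper bound of the density, reached only if a point sits exactly at the mean.
peak_bound = 1.0 / (d.std(ddof=0) * np.sqrt(2.0 * np.pi))

# Step 2 would plot q against z (for example with matplotlib).
print(q_max, peak_bound)
```

Under this assumption the peak of the curve scales with the inverse of the DDR standard deviation, which is consistent with the reading that a larger maximum and a tighter spread around the centerline indicate a more accurate model.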

Table 4. Summary of the model performance

Model   Training phase                                       Testing phase
        RMSE      MAE       $R^2$     $C_{d(\mathrm{DDRmax})}$   RMSE     MAE      $R^2$    $C_{d(\mathrm{DDRmax})}$
SVM     0.02679   0.02318   0.79808   7.50                   0.0113   0.0066   0.7573   10.36
GEP     0.03847   0.03275   0.74962   5.35                   0.0181   0.0109   0.5624   6.32
MLP     0.02426   0.02031   0.81602   8.07                   0.0111   0.0065   0.6878   11.32
MARS    0.04995   0.04381   0.51011   4.02                   0.0245   0.0149   0.5593   4.64

Notes: $R^2$ is the determination coefficient; $C_{d(\mathrm{DDRmax})}$ is the maximum normalized developed discrepancy ratio value.
Abbreviations: GEP: Gene expression programming; MAE: Mean absolute error; MARS: Multivariate adaptive regression splines; MLP: Multilayer perceptron; RMSE: Root mean square error; SVM: Support vector machine.
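As a reading aid, the Table 4 values can be compared per metric and per phase. The sketch below only re-tabulates the numbers from Table 4; the dictionary layout, the key names, and the helper `best_model` are illustrative assumptions and are not part of the study.

```python
# Table 4 values: (RMSE, MAE, R2, Cd_DDRmax) per model and phase.
METRICS = ("RMSE", "MAE", "R2", "Cd_DDRmax")

# Errors should be small; R2 and the maximum normalized DDR should be large.
LOWER_IS_BETTER = {"RMSE": True, "MAE": True, "R2": False, "Cd_DDRmax": False}

TABLE4 = {
    "SVM": {"train": (0.02679, 0.02318, 0.79808, 7.50),
            "test": (0.0113, 0.0066, 0.7573, 10.36)},
    "GEP": {"train": (0.03847, 0.03275, 0.74962, 5.35),
            "test": (0.0181, 0.0109, 0.5624, 6.32)},
    "MLP": {"train": (0.02426, 0.02031, 0.81602, 8.07),
            "test": (0.0111, 0.0065, 0.6878, 11.32)},
    "MARS": {"train": (0.04995, 0.04381, 0.51011, 4.02),
             "test": (0.0245, 0.0149, 0.5593, 4.64)},
}


def best_model(phase, metric):
    """Return the model name with the best value of `metric` in `phase`."""
    idx = METRICS.index(metric)
    pick = min if LOWER_IS_BETTER[metric] else max
    return pick(TABLE4, key=lambda model: TABLE4[model][phase][idx])


for phase in ("train", "test"):
    for metric in METRICS:
        print(f"{phase:5s} {metric:10s} -> {best_model(phase, metric)}")
```

With the Table 4 values, this comparison selects MLP for RMSE, MAE, and $C_{d(\mathrm{DDRmax})}$ in both phases, and SVM for the testing-phase $R^2$.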



                Volume 22 Issue 6 (2025)                        81                           doi: 10.36922/AJWEP025120081