
ML-based C_d for side trapezoidal labyrinth weirs

Table 3. Summary of hydraulic and geometric parameters

No.  W (cm)  R (cm)  L (cm)  t (mm)  N  w_2/w_1  B/w_1  Q (L/s)
1    60      45      66      5       -  -        -      5–50
2                    136             5  0.42     1.00
3                    136                0.35     1.00
4                    136                0.30     1.00
5                    152                0.42     1.25
6                    152                0.35     1.25
7                    152                0.30     1.25
8                    170                0.42     1.50
9                    170                0.35     1.50
10                   170                0.30     1.50

Notes: B is the weir height; L is the weir length; N is the number of cycles; R is the arc radius; t is the crest thickness; W is the width of a cycle; W_1 is the inside apex width of the middle cycle; W_2 is the inside apex width of the end cycle.

tasks or fit a hyperplane within an ε-insensitive region in regression tasks. Following training, model validation is conducted through cross-validation to assess performance and identify opportunities for improvement. Hyperparameters are then fine-tuned using techniques such as grid search or random search. Finally, the model is evaluated on the testing dataset using appropriate evaluation metrics, including accuracy, precision, recall, and F1-score for classification tasks, and RMSE and mean absolute error (MAE) for regression tasks. This final evaluation provides insights into the model’s generalization capability and overall effectiveness.

2.3.2. GEP
The GEP is an evolutionary algorithm that creates models and solutions by evolving computer programs or expressions. Developed by Ferreira,⁴³ the GEP is an extension of genetic programming and genetic algorithms. It combines the strengths of both approaches to effectively solve complex problems in various fields, including artificial intelligence, bioinformatics, and engineering.

Chromosomes and expression trees (ETs), functions and terminals, and fitness functions are three fundamental concepts of the GEP. Chromosomes in GEP are linear strings of fixed length composed of genes, which encode the solutions. Each gene consists of a head and a tail. The head contains functions and terminals, while the tail contains only terminals. These linear chromosomes are translated into ETs, which are the functional representation of the chromosome. ETs are hierarchical structures that represent the solution in a form that can be evaluated. Functions are mathematical operations (e.g., addition and subtraction) or logical operations, while terminals are variables or constants that serve as the inputs for the functions. A fitness function evaluates how well a chromosome solves the problem at hand. It compares the output of the ET with the desired output, assigning a fitness score to each chromosome.⁴⁴

To begin modeling with GEP, the first step is to clearly define the problem, whether it involves regression, classification, or symbolic modeling. After identifying the problem type, a suitable dataset is collected and carefully preprocessed to handle missing values and normalize or standardize features if necessary, ensuring the data quality supports accurate modeling. Once the data is ready, the next step is to split it into training and testing subsets, typically allocating 70–80% of the data for training and the remainder for testing, which allows for evaluation of the model’s predictive capability. With the data split, the structure of the GEP model is configured by setting parameters such as the number of genes, head size, function set (e.g., mathematical operators like +, –, *, and /), and terminal set (input variables and constants). The evolutionary process then begins, during which a population of random solutions is generated and evolved over several generations using genetic operators, such as mutation, transposition, and recombination. The fitness of each chromosome is evaluated based on fitness functions, such as mean squared error for regression tasks and classification accuracy for classification tasks. Through successive generations, the population evolves to produce increasingly accurate solutions. Once the model training is complete, the optimally evolved expression or model is selected, and its performance is evaluated on the testing dataset using appropriate metrics. Finally, the resulting symbolic expression can be interpreted and, if necessary, simplified for use in practical applications or further analysis.⁴⁵

2.3.3. ANN
The ANN stands as an advanced mathematical approach proficient in mapping intricate systems rooted in datasets. A prevalent category of ANN is the multilayer perceptron (MLP), which is extensively employed in research studies. The effective deployment of an MLP model necessitates the specification of suitable transfer functions and the configuration of an optimal structure,



                Volume 22 Issue 6 (2025)                        79                           doi: 10.36922/AJWEP025120081