ML-based C_d for side trapezoidal labyrinth weirs
tasks or fit a hyperplane within an ε-insensitive region in regression tasks. Following training, model validation is conducted through cross-validation to assess performance and identify opportunities for improvement. Hyperparameters are then fine-tuned using techniques such as grid search or random search. Finally, the model is evaluated on the testing dataset using appropriate evaluation metrics, including accuracy, precision, recall, F1-score for classification tasks, and RMSE and mean absolute error (MAE) for regression tasks. This final evaluation provides insights into the model’s generalization capability and overall effectiveness.

Table 3. Summary of hydraulic and geometric parameters
No. W R L t N w B Q
(cm) (cm) (cm) (mm) 2 (L/s)
w w
1 1
1 60 45 66 5 - - - 5–50
2 136 5 0.42 1.00
3 136 0.35 1.00
4 136 0.30 1.00
5 152 0.42 1.25
6 152 0.35 1.25
7 152 0.30 1.25
8 170 0.42 1.50
9 170 0.35 1.50
10 170 0.30 1.50
Notes: B is the weir height; L is the weir length; N is the number of cycles; R is the arc radius; t is the crest thickness; W is the width of a cycle; W_1 is the inside apex width of the middle cycle; W_2 is the inside apex width of the end cycle.

2.3.2. GEP
The GEP is an evolutionary algorithm that creates models and solutions by evolving computer programs or expressions. Developed by Ferreira,43 the GEP is an extension of genetic programming and genetic algorithms. It combines the strengths of both approaches to effectively solve complex problems in various fields, including artificial intelligence, bioinformatics, and engineering.
Chromosomes and expression trees (ETs), functions and terminals, and fitness functions are three fundamental concepts of the GEP. Chromosomes in GEP are linear strings of fixed length composed of genes, which encode the solutions. Each gene consists of a head and a tail. The head contains functions and terminals, while the tail contains only terminals. These linear chromosomes are translated into ETs, which are the functional representation of the chromosome. ETs are hierarchical structures that represent the solution in a form that can be evaluated. On the other hand, functions are mathematical operations (e.g., addition and subtraction) or logical operations, while terminals are variables or constants that serve as the input for the functions. A fitness function evaluates how well a chromosome solves the problem at hand. It compares the output of the ET with the desired output, assigning a fitness score to each chromosome.44
To begin modeling with GEP, the first step is to clearly define the problem, whether it involves regression, classification, or symbolic modeling. After identifying the problem type, a suitable dataset is collected and carefully preprocessed to handle missing values and normalize or standardize features if necessary, ensuring the data quality supports accurate modeling. Once the data is ready, the next step is to split it into training and testing subsets – typically allocating 70–80% of the data for training and the remainder for testing, which allows for evaluation of the model’s predictive capability. With the data split, the structure of the GEP model is configured by setting parameters such as the number of genes, head size, function set (e.g., mathematical operators like +, –, *, and /), and terminal set (input variables and constants). The evolutionary process then begins, during which a population of random solutions is generated and evolved over several generations using genetic operators, such as mutation, transposition, and recombination. The fitness of each chromosome is evaluated based on fitness functions, such as mean squared error for regression and classification accuracy for classification tasks. Through successive generations, the population evolves to produce increasingly accurate solutions. Once the model training is complete, the optimally evolved expression or model is selected, and its performance is evaluated on the testing dataset using appropriate metrics. Finally, the resulting symbolic expression can be interpreted and, if necessary, simplified for use in practical applications or further analysis.45

2.3.3. ANN
The ANN stands as an advanced mathematical approach proficient in mapping intricate systems rooted in datasets. A prevalent category of ANN is the multilayer perceptron (MLP), which is extensively employed in research studies. The effective deployment of an MLP model necessitates the specification of suitable transfer functions and the configuration of an optimal structure,
Volume 22 Issue 6 (2025) 79 doi: 10.36922/AJWEP025120081
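As an illustration of the GEP encoding described in Section 2.3.2, the following minimal Python sketch shows how a linear head/tail gene could be translated into an expression tree and scored with a mean squared error fitness. It is not the implementation used in this study. It assumes the breadth-first (Karva-style) reading of genes commonly used in GEP, a function set limited to the four arithmetic operators named in the text, single-character terminals, and a protected division that returns 1.0 on division by zero. The names FUNCTIONS, decode_gene, evaluate, and mse_fitness are hypothetical.

```python
import operator

# Assumed function set: symbol -> (arity, implementation).
# Protected division (returns 1.0 when dividing by zero) is an assumption.
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, lambda a, b: a / b if b != 0 else 1.0),
}


def decode_gene(gene):
    """Translate a linear gene into a nested expression tree.

    The gene is read breadth-first: the first symbol is the root, and each
    function takes its arguments from the next unread symbols. Symbols left
    over at the end of the gene are simply not expressed.
    Returns nested lists of the form [symbol, child, child, ...].
    """
    nodes = [[symbol] for symbol in gene]
    next_free = 1  # index of the next symbol not yet attached to a parent
    i = 0
    while i < next_free:
        symbol = nodes[i][0]
        arity = FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0
        nodes[i].extend(nodes[next_free:next_free + arity])
        next_free += arity
        i += 1
    return nodes[0]


def evaluate(tree, variables):
    """Recursively evaluate an expression tree for one set of input values."""
    symbol, *children = tree
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, variables) for child in children))
    return variables[symbol]  # terminal: look up the input variable


def mse_fitness(gene, samples):
    """Mean squared error between the tree output and the desired output."""
    tree = decode_gene(gene)
    errors = [(evaluate(tree, x) - y) ** 2 for x, y in samples]
    return sum(errors) / len(errors)


# Head size 3 and maximum arity 2 give a tail of 3 * (2 - 1) + 1 = 4 symbols.
gene = "+*a" + "bbab"  # head + tail; decodes to (b * b) + a
samples = [({"a": 1.0, "b": 2.0}, 5.0), ({"a": 0.0, "b": 3.0}, 9.0)]
print(decode_gene(gene))
print(mse_fitness(gene, samples))
```

Because the tail holds only terminals and is sized from the head length and the maximum arity, every gene decodes to a valid tree regardless of how mutation or recombination rearranges the head, which is why the fixed-length linear encoding can be evolved freely.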

