Table 2. (Continued)

Epochs | Model            | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) | Method for handling imbalanced data | Key contributions
50     | SVM + CNN Hybrid | 82           | 80            | 81         | 80.5         | No explicit method                  | Final performance stabilizes; good but not competitive with DL models
50     | Traditional ML   | 72           | 70            | 72         | 71           | No explicit method                  | Best possible performance but still behind DL approaches

Abbreviations: CNN: Convolutional neural network; DL: Deep learning; ML: Machine learning; SVM: Support vector machine.

              In contrast, models such as ResNet101 Ensemble and
            ALLNET show strong performance but do not explicitly
            address data imbalance issues, leading to slightly lower
            overall accuracy and precision. While these models are
            competitive, they fall short in scenarios where balanced
            classification is crucial.
  The Support vector machine + CNN Hybrid and traditional
ML models perform relatively well in earlier epochs, but
they are ultimately limited by their inability to handle
complex, imbalanced datasets and achieve lower performance
metrics than the DL approaches.
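
As an illustration of what an explicit imbalance-handling step can look like, the sketch below applies inverse-frequency class weights to the loss. The class counts, the weighting scheme, and the use of PyTorch are illustrative assumptions, not taken from the models compared in Table 2.

```python
import torch
import torch.nn as nn

# Illustrative class-weighting sketch for an imbalanced two-class problem.
# The class counts below are made up for illustration; they are not the
# actual C-NMC class distribution.
class_counts = torch.tensor([8000.0, 4000.0])               # [class 0, class 1]
class_weights = class_counts.sum() / (2.0 * class_counts)   # inverse-frequency weights

# Weighted cross-entropy: errors on the minority class contribute more to the loss.
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(16, 2)            # dummy model outputs for a batch of 16
labels = torch.randint(0, 2, (16,))    # dummy ground-truth labels
loss = criterion(logits, labels)
print(loss.item())
```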

            3.2. Training process
The training process for the DL models involved the following steps:
1. Data splitting: The C-NMC dataset was divided into training, validation, and test sets. This partitioning ensures that the models are trained on a substantial portion of the data while being validated and tested on separate subsets to evaluate performance.
2. Loss function: Different loss functions were used based on the classification task.
3. For binary classification (normal vs. abnormal cells), the binary cross-entropy loss function was employed.
4. For multiclass classification (e.g., different types of leukemia), the categorical cross-entropy loss function was utilized.
5. In this paper, the Tversky loss function was selected for the CNN, paired with an optimizer chosen for its efficiency in handling large datasets and its ability to adapt the learning rate during training. A learning rate scheduler was also used to dynamically adjust the learning rate to enhance model convergence (a sketch of this setup follows the list).
6. Evaluation metrics: The models were evaluated using several metrics to provide a comprehensive assessment of their performance:
   • Accuracy measures the overall correctness of the model’s predictions.
   • Sensitivity (recall) evaluates the model’s ability to correctly identify positive cases.
   • Specificity assesses the model’s ability to correctly identify negative cases.
   • Precision measures the accuracy of the positive predictions.
   • F1-score provides a balanced measure of precision and recall.
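
A minimal sketch of this training setup is shown below, assuming PyTorch. The 70/15/15 split ratios, the Tversky loss formulation (with alpha and beta weighting false positives and false negatives), and the choice of Adam with a StepLR scheduler as the adaptive optimizer and learning-rate schedule are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn
from torch.utils.data import random_split

# Step 1 - data splitting: an illustrative 70/15/15 partition
# (the paper does not state its exact split ratios).
def split_dataset(dataset, train_frac=0.70, val_frac=0.15):
    n = len(dataset)
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    return random_split(dataset, [n_train, n_val, n - n_train - n_val])

# Steps 2-4 - loss functions chosen by classification task.
bce_loss = nn.BCEWithLogitsLoss()   # binary: normal vs. abnormal cells
ce_loss = nn.CrossEntropyLoss()     # multiclass: leukemia subtypes

# Step 5 - Tversky loss used with the CNN; alpha and beta weight false
# positives and false negatives (the values here are assumptions).
def tversky_loss(probs, targets, alpha=0.7, beta=0.3, eps=1e-7):
    probs = probs.reshape(-1)
    targets = targets.reshape(-1).float()
    tp = (probs * targets).sum()
    fp = (probs * (1.0 - targets)).sum()
    fn = ((1.0 - probs) * targets).sum()
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

# Adaptive optimizer plus a learning-rate scheduler to aid convergence
# (Adam and StepLR are illustrative choices; the CNN is a placeholder).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
```

In a training loop built on this sketch, scheduler.step() would be called once per epoch after the optimizer updates, which is what dynamically adjusts the learning rate.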

Figure 5. Confusion matrix values for CNN versus RNN: (A) confusion matrix for CNN performance at epoch 3, and (B) confusion matrix for RNN performance at epoch 3. Abbreviations: CNN: Convolutional neural network; RNN: Recurrent neural network.

Figure 5 displays the performance of the CNN and recurrent neural network (RNN) algorithms over three epochs in terms of confusion matrix values for the C-NMC dataset.

At epoch 3, the performance of both the CNN and RNN models can be represented through their confusion matrices, showing their classification effectiveness. For the CNN model, the confusion matrix is as follows: 58 true positives (TP), 8 false positives (FP), 6 false negatives (FN), and 54 true negatives (TN). This indicates that the CNN model correctly identified 58 positive cases, misclassified 8 negative cases as positive, missed 6 actual positive cases, and correctly classified 54 negative cases.
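
To make the metric definitions above concrete, the short calculation below plugs the CNN's epoch-3 confusion-matrix values into the standard binary-classification formulas; the rounded results are shown in the comments.

```python
# Worked example: metrics from the CNN confusion matrix at epoch 3 (Figure 5A).
tp, fp, fn, tn = 58, 8, 6, 54

accuracy    = (tp + tn) / (tp + fp + fn + tn)   # 112 / 126 ≈ 0.889
sensitivity = tp / (tp + fn)                    # 58 / 64   ≈ 0.906 (recall)
specificity = tn / (tn + fp)                    # 54 / 62   ≈ 0.871
precision   = tp / (tp + fp)                    # 58 / 66   ≈ 0.879
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # ≈ 0.892

print(f"accuracy={accuracy:.3f}, sensitivity={sensitivity:.3f}, "
      f"specificity={specificity:.3f}, precision={precision:.3f}, f1={f1:.3f}")
```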
In comparison, the RNN model at epoch 3 has a slightly lower performance, with 55 TP, 9 FP, 7 FN, and 52 TN. This shows that while both models are performing well, the

