Artificial Intelligence in Health | Deep learning on chest X-ray and CT for COVID-19
training. It is crucial to avoid exceeding this optimal value, as it may lead to overshooting the global minimum of the loss function. This balanced approach ensures efficient training and better model generalization by avoiding both slow convergence and the risk of overshooting the global minimum.

Adjusting the learning rate plays a critical role in training neural networks. A smaller learning rate leads to gradual changes in the loss function but can prolong convergence due to small gradients. Conversely, a larger learning rate may cause overshooting of the global minimum. Hence, striking a balance between the two extremes is essential for efficient training and better generalization. Furthermore, employing a fixed learning rate can lead to challenges such as becoming trapped in local minima or saddle points where gradients are insufficient for optimization. To mitigate these issues, we set upper and lower bounds on the learning rate, typically differing by a factor of ten. For instance, in our experiments, the upper bound is set at 10⁻⁴ and the lower bound at 10⁻⁵. This range allows the model to explore the loss landscape effectively, avoiding stagnation in local minima or saddle points. In addition, in lieu of employing a uniform learning rate throughout the entire network during the training process, this study adopts a strategy known as discriminative learning rates. Here, learning rates are tailored for different layers of the classifier, typically ranging between 0.0001 and 0.01. This approach acknowledges that various network layers capture distinct types of information, thus warranting diverse learning rates. Initial layers receive lower learning rates compared to later layers, reflecting their differing roles in feature extraction and abstraction.

The training methodology incorporates a one-cycle training policy, characterized by a dynamic adjustment of learning rates across epochs. Initially, a higher learning rate is applied, gradually decreasing toward the final epoch. This technique promotes improved model performance and stability, facilitating parameter updates at an appropriate pace. By mitigating the risk of local minima entrapment, the model's generalizability is enhanced.

Implementation details involve Python programming utilizing the fastai library, a PyTorch-based open-source platform tailored for deep learning model development [45]. Execution takes place on Google Colaboratory, leveraging its provision of a free K80 GPU with 12 GB RAM, ideal for executing ML algorithms. The source code and neural network weights utilized in this study are publicly available for reference [46]. In the following, we summarize some of the key novelties in the training strategy in the present study.

We used training strategies to optimize the performance of EfficientNet, ResNet, and SeResNext for COVID-19 detection using X-ray images. Key innovations include the use of a dynamic learning rate scheduler to identify the optimal learning rate, setting adaptive boundaries (10⁻⁵ to 10⁻⁴) to avoid local minima, and employing discriminative learning rates tailored for different network layers (ranging from 0.0001 to 0.01) to enhance feature extraction and abstraction. In addition, we implemented a one-cycle training policy, dynamically adjusting learning rates across epochs to improve model performance and stability. These training methodologies significantly enhance the model's robustness and accuracy, providing a distinctive contribution to the field. Altogether, our proposed model introduces several key innovations that enhance the robustness and accuracy of these models, making our work distinct from prior studies.

3. Results

A total of 1763 images were used to build the ML model, of which 1260 images were used to train the model, and 251 and 252 images were employed for validating and testing the model, respectively. After the screening, the 563 images indicating COVID-19 were split into train (450 images), validation (71 images), and test (72 images) datasets.

Figure 4A-D shows the confusion matrices for the various CNN architectures. ResNet and DenseNet obtained the best accuracy, at 94.09%. Each confusion matrix corresponds to the predictions of the model on the test dataset. The models are able to distinguish clearly among the various classes. The confusion matrices indicate that EfficientNet is best at classifying normal images, SeResNext is best at classifying pneumonia, and ResNet performs best for classifying images pertaining to COVID-19.

Figure 5 shows the predicted class, actual class, loss, and probability of the actual class for a set of misclassified images for each model, in the format “Prediction/Actual/Loss/Probability” at the top of each image. Each image illustrates the target class results for the model, highlighting areas where the model’s predictions did not align with the true labels.

Figure 6 shows the sensitivity as well as the specificity of the CNN models used in this work. The sensitivity of a class measures the proportion of images belonging to that class that are correctly classified by the model. The specificity of a class measures the proportion of images that do not belong to the class of interest and are correctly classified by the model.

4. Discussion

EfficientNet has the highest specificity toward images of class COVID-19 and ResNet has the highest sensitivity
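Taken together, the discriminative learning rates and the one-cycle policy described above can be sketched in plain Python. This is a minimal illustration, not the study's implementation (which relies on fastai): the `pct_start`, `div`, and `div_final` values are illustrative defaults, and only the 0.0001 to 0.01 rate range comes from the text.

```python
import math

def discriminative_lrs(n_groups, lr_min=1e-4, lr_max=1e-2):
    # Geometrically spaced rates: early layer groups train slowly,
    # later (more task-specific) groups train faster.
    if n_groups == 1:
        return [lr_max]
    ratio = (lr_max / lr_min) ** (1.0 / (n_groups - 1))
    return [lr_min * ratio ** i for i in range(n_groups)]

def one_cycle_lr(step, total_steps, lr_max, pct_start=0.25,
                 div=25.0, div_final=1e5):
    # One-cycle policy: cosine warm-up from lr_max/div to lr_max over the
    # first pct_start of training, then cosine annealing down to
    # lr_max/div_final by the final step.
    def cos_anneal(start, end, t):
        return end + (start - end) * (1 + math.cos(math.pi * t)) / 2
    warm = pct_start * total_steps
    if step < warm:
        return cos_anneal(lr_max / div, lr_max, step / warm)
    return cos_anneal(lr_max, lr_max / div_final,
                      (step - warm) / (total_steps - warm))

# Per-group peak rates spanning the 0.0001-0.01 range used in the study
peaks = discriminative_lrs(n_groups=3)  # [1e-4, 1e-3, 1e-2]
# Schedule for the last (fastest) group over a hypothetical 100-step run
schedule = [one_cycle_lr(s, 100, peaks[-1]) for s in range(101)]
```

In fastai, the equivalent call is roughly `learn.fit_one_cycle(epochs, lr_max=slice(1e-4, 1e-2))`, where the slice spreads peak rates across the layer groups.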
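The per-class sensitivity and specificity definitions used for Figure 6 can be made concrete with a small helper; the confusion-matrix counts below are hypothetical, not the figures reported in this study.

```python
def class_metrics(cm, i):
    # Per-class sensitivity and specificity from a square confusion
    # matrix cm, where cm[actual][predicted] holds image counts.
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[i][i]
    fn = sum(cm[i]) - tp                       # class-i images missed
    fp = sum(cm[r][i] for r in range(n)) - tp  # others labeled as class i
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn)   # fraction of class-i images found
    specificity = tn / (tn + fp)   # fraction of non-class-i images rejected
    return sensitivity, specificity

# Hypothetical counts, rows/columns ordered [COVID-19, normal, pneumonia]
cm = [[68, 2, 2],
      [1, 80, 3],
      [2, 4, 90]]
sens, spec = class_metrics(cm, 0)  # metrics for the COVID-19 class
```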
Volume 2 Issue 1 (2025) 35 doi: 10.36922/aih.2888

