training. It is crucial to avoid exceeding this optimal value, as doing so may lead to overshooting the global minimum of the loss function. This balanced approach ensures efficient training and better model generalization by avoiding both slow convergence and overshooting.
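As a concrete illustration of locating that optimal value, the sketch below uses fastai's learning rate finder (assuming the fastai v2 API); the data path, image size, and backbone are placeholders rather than the exact configuration used in this study.

```python
from fastai.vision.all import *

# Hypothetical folder layout: data/train and data/valid, each containing one
# subfolder per class (COVID-19, normal, pneumonia). Path and sizes are
# illustrative only.
path = Path("data")
dls = ImageDataLoaders.from_folder(path, train="train", valid="valid",
                                   item_tfms=Resize(224), bs=32)

# Transfer-learning classifier on a pretrained backbone (resnet34 as a stand-in
# for the architectures compared in this study).
learn = vision_learner(dls, resnet34, metrics=accuracy)

# Sweep the learning rate over several orders of magnitude while recording the
# loss; the returned suggestion approximates the largest rate that still
# decreases the loss, i.e., the value training should not exceed.
suggestion = learn.lr_find()
print(suggestion)
```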
Adjusting the learning rate plays a critical role in training neural networks. A smaller value of the learning rate leads to gradual changes in the loss function but can prolong convergence due to small gradients. Conversely, a larger value of the learning rate may cause overshooting of the global minimum. Hence, striking a balance between the two extremes is essential for efficient training and better generalization. Furthermore, employing a fixed learning rate can lead to challenges such as becoming trapped in local minima or saddle points where gradients are insufficient for optimization. To mitigate these issues, we set upper and lower bounds on the learning rate, typically differing by a factor of ten. For instance, in our experiments, the upper bound is set at 10⁻⁴ and the lower bound at 10⁻⁵. This range allows the model to explore the loss landscape effectively, avoiding stagnation in local minima or saddle points.
In addition, in lieu of employing a uniform learning rate throughout the entire network during the training process, this study adopts a strategy known as discriminative learning rates. Here, learning rates are tailored for different layers of the classifier, typically ranging between 0.0001 and 0.01. This approach acknowledges that various network layers capture distinct types of information, thus warranting diverse learning rates. Initial layers receive lower learning rates compared to later layers, reflecting their differing roles in feature extraction and abstraction.
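A minimal sketch of how such discriminative rates can be expressed with fastai, reusing the hypothetical `learn` object from the snippet above: passing a `slice` assigns the lowest rate to the earliest layer groups and the highest to the final classifier layers.

```python
# Unfreeze the pretrained backbone so that every layer group receives updates.
learn.unfreeze()

# slice(lower, upper): the earliest layers train at about 1e-4 and the classifier
# head at about 1e-2, with intermediate groups spaced in between. The range
# mirrors the 0.0001-0.01 interval quoted in the text; in practice the exact
# values would follow the lr_find suggestion.
learn.fit_one_cycle(5, lr_max=slice(1e-4, 1e-2))
```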

The training methodology incorporates a one-cycle training policy, characterized by a dynamic adjustment of learning rates across epochs. Initially, a higher learning rate is applied, gradually decreasing toward the final epoch. This technique promotes improved model performance and stability, facilitating parameter updates at an appropriate pace. By mitigating the risk of local minima entrapment, the model's generalizability is enhanced.
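The call in the previous snippet already applies this policy: fastai's `fit_one_cycle` ramps the rate up toward `lr_max` over an initial fraction of the iterations and then anneals it for the remainder of training. A short sketch of tuning and inspecting that schedule (epoch count and warm-up fraction are illustrative, not the study's settings):

```python
# pct_start controls the fraction of iterations spent increasing the learning
# rate before it decays toward the end of training.
learn.fit_one_cycle(10, lr_max=1e-4, pct_start=0.25)

# Plot the per-batch learning-rate (and momentum) schedule that was applied.
learn.recorder.plot_sched()
```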

Implementation details involve Python programming utilizing the fastai library,⁴⁵ a PyTorch-based open-source platform tailored for deep learning model development. Execution takes place on Google Colaboratory, leveraging its provision of a free K80 GPU with 12 GB of RAM, ideal for executing ML algorithms. The source code and neural network weights utilized in this study are publicly available for reference.⁴⁶

In the following, we summarize some of the key novelties in the training strategy of the present study. We used training strategies to optimize the performance of EfficientNet, ResNet, and SeResNext for COVID-19 detection using X-ray images. Key innovations include the use of a dynamic learning rate scheduler to identify the optimal learning rate, setting adaptive boundaries (10⁻⁴–10⁻⁵) to avoid local minima, and employing discriminative learning rates tailored for different network layers (ranging from 0.0001 to 0.01) to enhance feature extraction and abstraction. In addition, we implemented a one-cycle training policy, dynamically adjusting learning rates across epochs to improve model performance and stability. Altogether, these training methodologies significantly enhance the robustness and accuracy of the models and distinguish our work from prior studies.

3. Results

A total of 1763 images were used to build the ML model, of which 1260 images were used to train the model, and 251 and 252 images were employed for validating and testing the model, respectively. After screening, the 563 images indicating COVID-19 were split into train (450 images), validation (71 images), and test (72 images) datasets.

Figure 4A-D shows the confusion matrices for the various CNN architectures. ResNet and DenseNet obtained the best accuracy, at 94.09%. Each confusion matrix corresponds to the predictions of the model on the test dataset. The models are able to distinguish clearly among the various classes. The confusion matrices indicate that EfficientNet is best at classifying normal images, SeResNext is best at classifying pneumonia, and ResNet performs best at classifying images pertaining to COVID-19.

Figure 5 shows the predicted class, actual class, loss, and probability of the actual class for a set of misclassified images for each model, in the format “Prediction/Actual/Loss/Probability” at the top of each image. Each image illustrates the target-class results for the model, highlighting cases where the model's predictions did not align with the true labels.

Figure 6 shows the sensitivity and specificity of the CNN models used in this work. The sensitivity of a class measures the proportion of images belonging to that class that are correctly classified by the model. The specificity of a class measures the proportion of images that do not belong to the class of interest and are correctly classified by the model.
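For reference, both quantities can be read off a confusion matrix with a one-vs-rest computation; the sketch below uses a made-up 3×3 matrix purely for illustration, not the counts reported in this study.

```python
import numpy as np

# Hypothetical confusion matrix (rows = actual class, columns = predicted class);
# the class order and the counts are illustrative only.
classes = ["COVID-19", "normal", "pneumonia"]
cm = np.array([[50,  3,  2],
               [ 4, 60,  1],
               [ 2,  2, 55]])

for k, name in enumerate(classes):
    tp = cm[k, k]                 # class-k images predicted as class k
    fn = cm[k, :].sum() - tp      # class-k images predicted as another class
    fp = cm[:, k].sum() - tp      # other-class images predicted as class k
    tn = cm.sum() - tp - fn - fp  # other-class images correctly kept out of class k
    sensitivity = tp / (tp + fn)  # proportion of class-k images correctly classified
    specificity = tn / (tn + fp)  # proportion of non-class-k images correctly classified
    print(f"{name}: sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```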
4. Discussion

EfficientNet has the highest specificity toward images of class COVID-19 and ResNet has the highest sensitivity

