2.2.3. Method 3: DenseNet
The DenseNet architecture was proposed by Xie et al.40 Its architecture is very similar to that of ResNet, where there are feed-forward connections from each layer to the next. In DenseNet, the feature maps of one layer are concatenated with the feature maps of all the following layers. This approach offers a benefit by leveraging features extracted from early layers for subsequent layers. Convolution blocks are sequentially stacked and interspersed with basic convolution layers to preserve dimensionality across the network’s depth. The network consists of various “dense blocks.” A simple “dense block” is depicted in Figure 2D.
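To make the dense connectivity concrete, the sketch below shows one possible PyTorch implementation of a small dense block, in which each layer’s output feature maps are concatenated with all feature maps produced before it. This is an illustrative sketch only; the article does not provide code, and the channel count, growth rate, and number of layers used here are arbitrary choices for demonstration.

```python
# Illustrative sketch of a DenseNet-style "dense block" (not code from the article).
import torch
import torch.nn as nn


class DenseLayer(nn.Module):
    """One convolutional layer whose output is concatenated with its input."""

    def __init__(self, in_channels: int, growth_rate: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv(torch.relu(self.bn(x)))
        # Concatenate the new feature maps with all previous ones (dense connectivity).
        return torch.cat([x, out], dim=1)


class DenseBlock(nn.Module):
    """A stack of dense layers; the channel count grows by `growth_rate` per layer."""

    def __init__(self, in_channels: int, growth_rate: int, num_layers: int):
        super().__init__()
        layers = []
        channels = in_channels
        for _ in range(num_layers):
            layers.append(DenseLayer(channels, growth_rate))
            channels += growth_rate
        self.block = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)


# Example: 64 input channels, growth rate 32, 4 layers -> 64 + 4*32 = 192 output channels.
features = DenseBlock(in_channels=64, growth_rate=32, num_layers=4)(torch.randn(1, 64, 56, 56))
print(features.shape)  # torch.Size([1, 192, 56, 56])
```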
2.2.4. Method 4: EfficientNet41
Most architectures, such as ResNet, VGG Net, and Inception Net, are designed manually by researchers, who specify the complete network architecture upfront, for example, the number of layers, filter size, and number of channels, based on previous experiments and experience. EfficientNets are created using neural architecture search,42 where the complete model is built algorithmically while keeping a constraint on the number of parameters. It uses ResNet as a baseline model and modifies the number of layers, the number of channels, and the input image dimensions in the baseline model to create the desired model. The smallest model, i.e., the one with the minimum number of parameters, is called b0, and seven other models are generated by changing the constraints, such as the number of parameters and the number of floating-point operations (FLOPs). EfficientNet b7 is the largest model among the EfficientNets. EfficientNets have a much shorter inference time when compared to other models with a similar number of parameters. As the image size increases, larger EfficientNet models are preferred since they have more layers and larger channel sizes, which helps in obtaining useful features from the larger image.
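As an aside, the relative sizes of the EfficientNet variants can be inspected directly from a standard implementation. The snippet below is an illustrative sketch, not code from the article, comparing the b0 and b7 models as packaged in torchvision; it assumes torchvision 0.13 or later.

```python
# Illustrative sketch: comparing the smallest and largest EfficientNet variants
# as packaged in torchvision (assumes torchvision >= 0.13; not code from the article).
from torchvision import models


def count_params(model) -> float:
    """Number of parameters, in millions."""
    return sum(p.numel() for p in model.parameters()) / 1e6


b0 = models.efficientnet_b0(weights=None)  # smallest variant, roughly 5 M parameters
b7 = models.efficientnet_b7(weights=None)  # largest variant, roughly 66 M parameters
print(f"EfficientNet-b0: {count_params(b0):.1f} M parameters")
print(f"EfficientNet-b7: {count_params(b7):.1f} M parameters")

# Larger variants are paired with larger input resolutions by the compound-scaling
# recipe (roughly 224 px for b0 up to 600 px for b7), so more layers and channels
# are available to extract features from the bigger images.
```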
Overall, the rationale for choosing these architectures is as follows: ResNet was selected for its skip-connection architecture, which facilitates stable learning during training; DenseNet for its dense connectivity, which facilitates feature reuse across layers; SeResNext for its integration of SE modules for enhanced feature recalibration; and EfficientNet for its efficient model-scaling strategy. Together, these choices provide a diverse range of architectural innovations for achieving accurate and reliable image classification. It should be noted that it is always possible (and sometimes more desirable) to build an ensemble of these models to further improve the overall prediction. Since the present article focuses on the fundamental aspects of implementing these models, the ensemble strategy is not explored in this work.
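For completeness, the ensemble idea mentioned above can be sketched as simple averaging of the class probabilities predicted by the individual trained models. The helper below is hypothetical and not part of the article’s pipeline; `models` is assumed to be a list of already-trained classifiers.

```python
# Hedged sketch of probability-averaging ensembling (not used in the article).
import torch


def ensemble_predict(models, images: torch.Tensor) -> torch.Tensor:
    """Average softmax class probabilities over a list of trained models."""
    probs = []
    with torch.no_grad():
        for model in models:
            model.eval()
            probs.append(torch.softmax(model(images), dim=1))
    return torch.stack(probs).mean(dim=0)  # shape: (batch, num_classes)
```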
Transfer learning43 uses the initial weights of a neural network pre-trained on a different database, which is ImageNet for the present study. Transfer learning plays a crucial role in improving model performance when working on a dataset where the number of images is limited.38 The amount of labeled data available in the biomedical domain is limited mainly due to the time taken to annotate the dataset. Initializing the model weights in this manner helps the model capture important information from the images. The initial layers of the network capture very generic information from an image, such as horizontal and vertical edges, whereas the later layers capture patterns that are very specific to the dataset of study, as previously described. This pre-training on a large dataset provides a solid foundation, allowing the model to start with a better understanding of general visual patterns. The fine-tuning process then adapts the model to the unique features and characteristics of the target dataset, allowing it to specialize in recognizing patterns relevant to the biomedical images at hand.
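A minimal sketch of this transfer-learning setup is shown below, assuming an ImageNet-pretrained ResNet-50 backbone from torchvision (the article does not prescribe this exact code; `NUM_CLASSES` is a placeholder for the number of classes in the target dataset, and the choice of which layers to freeze is illustrative). Early layers, which capture generic features such as edges, can optionally be frozen while the later layers and the new classification head are fine-tuned.

```python
# Hedged sketch of ImageNet transfer learning with fine-tuning (assumes torchvision >= 0.13).
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 2  # placeholder; set to the number of classes in the target dataset

# Start from weights pre-trained on ImageNet rather than random initialization.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Optionally freeze the early layers, which capture generic features such as edges.
for name, param in model.named_parameters():
    if not name.startswith(("layer3", "layer4", "fc")):
        param.requires_grad = False

# Replace the ImageNet classification head with one sized for the target dataset,
# then fine-tune so the later layers specialize to the biomedical images.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
```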
2.3. Training
We employed a learning rate scheduler44 to determine the most effective learning rate for our specific dataset. During this process, the learning rate is cautiously increased after each mini-batch, with the corresponding loss recorded at each increment. Subsequently, we plotted the loss versus the learning rate, as illustrated in Figure 3, revealing how different learning rates impacted the model’s performance. Notably, for a very low learning rate, the loss diminishes at a slower pace. As the learning rate increased, the loss showed a rapid decline, indicating the optimal range. Beyond this point, further increases in the learning rate caused the loss to rise sharply, suggesting overshooting. By identifying the point of the steepest loss decline (0.002 in our case), we determined the optimal learning rate for
Figure 3. Scheduling the learning rate by investigating its impact on the loss function. Image created by the authors.
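The learning-rate search described above follows the usual range-test recipe: sweep the learning rate geometrically from a very small to a large value, take one optimizer step per mini-batch, and record the loss at each step. The function below is a hedged sketch under that assumption; `model`, `train_loader`, and `criterion` are placeholders for the network, data loader, and loss function, and the sweep endpoints are illustrative rather than the article’s settings.

```python
# Hedged sketch of a learning-rate range test (placeholder arguments; illustrative settings).
import torch


def lr_range_test(model, train_loader, criterion, lr_min=1e-7, lr_max=1.0, num_steps=100):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr_min)
    # Multiplicative factor so the learning rate sweeps lr_min -> lr_max geometrically.
    gamma = (lr_max / lr_min) ** (1.0 / num_steps)
    lrs, losses = [], []
    data_iter = iter(train_loader)
    for _ in range(num_steps):
        try:
            images, labels = next(data_iter)
        except StopIteration:
            data_iter = iter(train_loader)
            images, labels = next(data_iter)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        lrs.append(optimizer.param_groups[0]["lr"])
        losses.append(loss.item())
        # Increase the learning rate before the next mini-batch.
        for group in optimizer.param_groups:
            group["lr"] *= gamma
    return lrs, losses
```

The returned (learning rate, loss) pairs can then be plotted as in Figure 3, and the learning rate at the steepest decline of the loss selected for training.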