2. Materials and methods

Datasets were sourced from publicly available Kaggle repositories containing several types of fractures, pre-classified as either fractures or non-fractures.26,27 Database 1 provided a curated dataset of X-ray images,26 while Database 2 offered additional annotated data for further testing.27 Since both datasets are open-access and have been thoroughly de-identified, the potential risks associated with privacy breaches, re-identification, and misuse of sensitive information are effectively eliminated. Both datasets were preprocessed to ensure consistency as follows.

2.1. Dataset preparation

The dataset used in this study consisted of 4,900 X-ray images organized into two classes: fractured and not fractured. To ensure consistency, all images were resized to 224 × 224 pixels, and grayscale images were converted to the red, green, blue (RGB) format to match the input requirements of the CNN. This preprocessing step can be represented as Equation I:

$$ I_{\mathrm{RGB}} = \mathrm{Resize}(I_{\mathrm{input}},\, 224 \times 224) \tag{I} $$

where $I_{\mathrm{RGB}}$ represents the preprocessed RGB image, and $I_{\mathrm{input}}$ is the original image.

The dataset was divided into three subsets: 70% (3,430 images) for training, 15% (735 images) for validation, and 15% (735 images) for testing. The distributions were checked to ensure balanced representation of the target classes across all subsets.

The preprocessing pipeline prioritized compatibility with established CNN architectures while aligning with clinical imaging standards. Resizing images to 224 × 224 pixels balanced computational efficiency with preservation of fracture-relevant anatomical detail, ensuring features like cortical discontinuities remained resolvable. Grayscale conversion to RGB accommodated pretrained models without altering diagnostic content, as fracture detection primarily relies on structural contrasts rather than spectral depth. Class-balanced splitting across training, validation, and testing subsets mitigated biases in fracture prevalence, ensuring model evaluations reflected real-world diagnostic challenges rather than dataset-specific artifactual advantages.
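These steps can be expressed compactly in code. The following Python sketch is illustrative only: the paper does not provide an implementation, and the library choices (Pillow, NumPy, scikit-learn) and helper names such as preprocess_image and prepare_splits are assumptions rather than the authors' pipeline.

```python
# Illustrative sketch of the preprocessing in Equation I and the stratified
# 70/15/15 split. Library choices and helper names are assumptions.
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split

def preprocess_image(path):
    """Resize to 224 x 224 and convert grayscale to RGB (Equation I)."""
    img = Image.open(path).convert("RGB")      # grayscale -> 3-channel RGB
    img = img.resize((224, 224))               # I_RGB = Resize(I_input, 224 x 224)
    return np.asarray(img, dtype=np.float32) / 255.0

# `paths` and `labels` (0 = not fractured, 1 = fractured) are assumed to have
# been collected from the two Kaggle repositories beforehand.
def prepare_splits(paths, labels, seed=42):
    X = np.stack([preprocess_image(p) for p in paths])
    y = np.asarray(labels)
    # 70% training vs. 30% held out, stratified to preserve class balance
    X_train, X_hold, y_train, y_hold = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed)
    # Split the held-out 30% evenly into validation (15%) and test (15%)
    X_val, X_test, y_val, y_test = train_test_split(
        X_hold, y_hold, test_size=0.50, stratify=y_hold, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```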
2.2. Model architecture

The CNN architecture used in this study comprised a series of convolutional layers, batch normalization layers, rectified linear unit activation functions, max-pooling layers, dropout layers, and a fully connected layer for final classification. The architecture is summarized as follows:
(i) Input layer: Accepts images of size 224 × 224 × 3
(ii) Convolutional layers: Extract spatial features using filters of size 3 × 3
(iii) Batch normalization: Normalizes activations to improve training stability
(iv) Rectified linear unit activation: Applies the activation function f(x) = max(0, x) to introduce non-linearity
(v) Max-pooling layers: Reduce spatial dimensions using pooling windows of size 2 × 2
(vi) Dropout layers: Introduce a dropout rate of 20% to mitigate overfitting
(vii) Fully connected layer: Maps feature representations to the two output classes
(viii) Softmax layer: Converts outputs to probabilities for classification:

$$ P(y = i \mid x) = \frac{\exp(z_i)}{\sum_{j=1}^{k} \exp(z_j)} \tag{II} $$

In Equation II, $z_i$ represents the logit for class i, and k is the number of classes (in this case, k = 2).
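Because the text specifies the layer types, kernel and pooling sizes, and dropout rate but not the number of convolutional blocks or filter counts, the following Keras sketch should be read as one plausible instantiation of the stack in items (i)-(viii), not as the authors' exact model.

```python
# One plausible instantiation of the layer stack listed in (i)-(viii).
# Filter counts and the number of conv blocks are assumptions; the paper
# specifies only layer types, kernel/pool sizes, and the dropout rate.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(num_classes=2):
    model = models.Sequential([
        layers.Input(shape=(224, 224, 3)),                       # (i) input layer
    ])
    for filters in (32, 64, 128):                                # assumed filter counts
        model.add(layers.Conv2D(filters, (3, 3), padding="same"))  # (ii) 3x3 convolutions
        model.add(layers.BatchNormalization())                   # (iii) batch normalization
        model.add(layers.ReLU())                                 # (iv) ReLU, f(x) = max(0, x)
        model.add(layers.MaxPooling2D((2, 2)))                   # (v) 2x2 max-pooling
        model.add(layers.Dropout(0.20))                          # (vi) 20% dropout
    model.add(layers.Flatten())
    model.add(layers.Dense(num_classes, activation="softmax"))   # (vii) + (viii), Equation II
    return model
```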
The architecture strategically interleaves feature extraction and regularization layers to optimize fracture pattern recognition while curbing overfitting. Convolutional layers progressively localized multiscale fracture signatures, from pixel-level intensity gradients to macro-scale trabecular disruptions. Batch normalization stabilized activations across variations in X-ray contrast, a common source of domain shift. Dropout layers explicitly disrupted co-adapted feature reliance during training, forcing the network to consolidate robust diagnostic cues resilient to missing inputs.

2.3. Training procedure

The model was trained using the Adam optimizer with an initial learning rate of 0.001. The mini-batch size was set to 32, and training was conducted over 10 epochs. The training loss L was computed using the categorical cross-entropy loss function (Equation III):

$$ L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{k} y_{ij} \log \hat{y}_{ij} \tag{III} $$

where $y_{ij}$ is the true label, $\hat{y}_{ij}$ is the predicted probability for class j, and N is the total number of samples in the mini-batch.
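The stated hyperparameters translate into a short training configuration. The sketch below assumes the build_model and prepare_splits helpers from the earlier sketches and one-hot encoded labels (Equation III is the categorical cross-entropy); it is not the authors' training script.

```python
# Minimal training sketch matching the stated hyperparameters: Adam with a
# 0.001 learning rate, mini-batches of 32, 10 epochs, and the categorical
# cross-entropy of Equation III. Data splits are assumed to come from the
# preprocessing sketch in Section 2.1.
import tensorflow as tf

(X_train, y_train), (X_val, y_val), _ = prepare_splits(paths, labels)
model = build_model(num_classes=2)                 # architecture sketch above
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="categorical_crossentropy",               # Equation III
    metrics=["accuracy"],
)
history = model.fit(
    X_train, tf.keras.utils.to_categorical(y_train, 2),
    validation_data=(X_val, tf.keras.utils.to_categorical(y_val, 2)),
    batch_size=32,
    epochs=10,
)
```

Note that Keras applies dropout only in training-mode forward passes, so the validation metrics reported by fit already come from the deterministic network, consistent with the dropout handling described next.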
Validation performance was monitored at regular intervals to ensure effective learning. The dropout layers were only applied during the training phase, ensuring that validation metrics reflected true model performance. The training regimen balanced convergence speed with

