Table 4. Parameter settings for the Faster R-CNN model
Parameter                    Value        Description
(Training+validation):test   9:1          Ratio of the combined training and validation data to the test data
Training:validation          9:1          Ratio of the training data to the validation data
input_shape                  (600, 600)   Each input image is resized to 600×600
Backbone                     resnet50     Backbone network used for feature extraction
anchors_size                 (4, 16, 32)  Anchor box sizes used during training; anchor boxes are generated at three scales to detect objects of varying sizes
Freeze_Epoch                 50           End epoch of the frozen stage
Freeze_batch_size            4            Batch size during the frozen stage
Freeze_lr                    1e-4         Learning rate during the frozen stage
Unfreeze_Epoch               100          End epoch of the unfrozen stage
Unfreeze_batch_size          4            Batch size during the unfrozen stage
Unfreeze_lr                  1e-5         Learning rate during the unfrozen stage
Confidence threshold         0.5          Predicted boxes with confidence scores above 0.5 are considered valid detections during object detection
nms_iou threshold            0.3          Intersection over Union (IoU) threshold for non-maximum suppression; when the overlap of two predicted boxes exceeds 0.3, the model retains the higher-scoring box and suppresses the other to reduce redundancy
Test anchors_size            (4, 8, 16)   Anchor box sizes used during model evaluation
R-CNN: Region-based convolutional neural network
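As a rough illustration of how the detection-stage settings in Table 4 could be expressed in code, the following is a minimal sketch using torchvision's Faster R-CNN with a ResNet-50 backbone. This is not the authors' implementation; the function names come from torchvision (assumed version 0.13 or later), and NUM_DEFECT_CLASSES is a hypothetical value.

import torch
from torchvision.models import ResNet50_Weights
from torchvision.models.detection import fasterrcnn_resnet50_fpn

NUM_DEFECT_CLASSES = 3  # hypothetical number of defect categories

model = fasterrcnn_resnet50_fpn(
    weights=None,                               # the detector itself is trained on the defect dataset
    weights_backbone=ResNet50_Weights.DEFAULT,  # pre-trained ResNet-50 for feature extraction
    num_classes=NUM_DEFECT_CLASSES + 1,         # +1 for the background class
    min_size=600, max_size=600,                 # resize inputs to roughly 600x600 (input_shape)
    box_score_thresh=0.5,                       # confidence threshold from Table 4
    box_nms_thresh=0.3,                         # NMS IoU threshold from Table 4
)

model.eval()
with torch.no_grad():
    dummy = [torch.rand(3, 600, 600)]           # placeholder image batch
    outputs = model(dummy)                      # list of dicts with 'boxes', 'labels', 'scores'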

faster inference speed, and sound detection performance. YOLOv5 employs mosaic data augmentation, which combines four images into one input image, enhancing the detection of small objects. The model also provides an automated hyperparameter optimization system that searches for the best training configuration for hyperparameters such as learning rate, weight decay, and mosaic probability.39
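The sketch below shows the basic mosaic idea described above (four images tiled into one training image). It is a deliberate simplification, not YOLOv5's implementation: the real version samples a random center point, rescales each image, and remaps its bounding boxes, all of which are omitted here, and the helper name and output size are assumptions.

import numpy as np

def simple_mosaic(imgs, out_size=640):
    """Tile four HxWx3 uint8 images into one out_size x out_size mosaic."""
    assert len(imgs) == 4
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    half = out_size // 2                              # fixed center; YOLOv5 randomizes this
    cells = [(0, 0), (0, half), (half, 0), (half, half)]
    for img, (y0, x0) in zip(imgs, cells):
        patch = img[:half, :half]                     # naive crop; resizing would be typical
        canvas[y0:y0 + patch.shape[0], x0:x0 + patch.shape[1]] = patch
    return canvas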
The model training consists of two stages: freezing and unfreezing. First, the dataset and pre-trained weights are loaded (to accelerate training and enhance feature extraction capabilities). In the freezing stage, only the detection head parameters are trained, while the backbone network weights remain unchanged to reduce memory usage. In the unfreezing stage, the backbone network parameters are unlocked, allowing the entire model to participate in training and improve performance. During training, loss values are recorded, and model weights are saved after each epoch. A training log is generated to evaluate the model's performance.
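A hedged sketch of this two-stage schedule, using the values from Table 4 (50 frozen epochs at a learning rate of 1e-4, then 50 unfrozen epochs at 1e-5), is given below. Here, model is assumed to be a detector with a .backbone attribute (for example, the torchvision sketch after Table 4), and train_one_epoch and train_loader are assumed helper names, not the authors' code.

import torch

def set_backbone_requires_grad(model, flag: bool):
    # Freeze (flag=False) or unfreeze (flag=True) the backbone;
    # the detection head always remains trainable.
    for p in model.backbone.parameters():
        p.requires_grad = flag

def run_stage(model, train_loader, epochs, lr, start_epoch=0):
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=lr)
    for epoch in range(start_epoch, start_epoch + epochs):
        train_one_epoch(model, optimizer, train_loader)            # assumed helper
        torch.save(model.state_dict(), f"epoch_{epoch:03d}.pth")   # weights saved after each epoch

# Stage 1: frozen backbone, only the detection head is updated.
set_backbone_requires_grad(model, False)
run_stage(model, train_loader, epochs=50, lr=1e-4)

# Stage 2: backbone unlocked, the whole network is fine-tuned at a lower rate.
set_backbone_requires_grad(model, True)
run_stage(model, train_loader, epochs=50, lr=1e-5, start_epoch=50)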
The main settings and parameters used for the Faster R-CNN and YOLOv5 models are presented in Tables 4 and 5, respectively.

3. Results

3.1. Loss and accuracy
The loss and accuracy changes for the image classification models, ResNet50 and EfficientNetV2B0, are featured in Figures 2 and 3, respectively. Both training and validation losses for ResNet-50 decrease rapidly in the initial stages and stabilize, with closely aligned trends, indicating good model fitting and effective learning without overfitting. However, its validation accuracy fluctuates, suggesting weaker generalization. In addition, the training accuracy of EfficientNetV2B0 surpasses 90% within the first few batches and remains stable at around 100%, demonstrating superior learning capability and stability compared to ResNet-50. To ensure a fair comparison, both models were trained for 50 epochs, but EfficientNetV2B0 exhibited slight overfitting in the final stages, as indicated by declining training loss, increasing validation loss, and fluctuations in validation accuracy.

The training and validation loss changes during the Faster R-CNN training process are displayed in Figure 4. In the first 50 epochs of the freezing phase, the training and validation losses rapidly decrease in the initial few epochs and then gradually stabilize. The smoothness of the training and validation losses, with a difference of <0.1, indicates that the model did not experience significant overfitting or underfitting. During the following 50 epochs, when the unfreezing phase begins and the model undergoes full parameter updates, fluctuations in the loss are normal. However, after that, the loss decreases but does not fully converge.

One limitation of the Faster R-CNN model is that it selects 256 mini-batches of anchor boxes from the same

