
Artificial Intelligence in Health                                       Advancing fetal health classification



performance, outperforming the other models. Therefore, the LightGBM classifier was employed to train and evaluate the fetal health classification model. The LGBMClassifier implementation from the LightGBM library, which follows the scikit-learn estimator interface, was utilized. The model was trained using a 20-fold cross-validation procedure, which involved dividing the dataset into 20 subsets. We trained the model on 19 subsets and evaluated its performance on the remaining subset. This process was repeated 20 times, with each subset serving as the evaluation set once. Aggregating the results across the 20 folds gave a comprehensive evaluation of the model’s performance.

SMOTE was applied to address class imbalance in the dataset by balancing the distributions of the different classes, so that the model learned from a more representative dataset. This preprocessing step enhanced the model’s ability to handle imbalanced class distributions and improved its overall performance.

The LightGBM classifier was configured with default hyperparameters, including a learning rate of 0.1, an unlimited maximum tree depth, and a minimum of 20 samples required in each leaf. One hundred boosting iterations were used, and the number of leaves in each tree was set to 31. The model was trained using all available parallel threads (−1) to make full use of the available computational resources.

To assess the model’s generalization capability, the dataset was split into a training set comprising 80% of the data and a test set containing the remaining 20%. The test set was not used during the model training process and served as an independent dataset for evaluating the model’s performance on unseen data.

4.5. Performance metrics

To evaluate the performance of the fetal health classification model, several standard metrics, including accuracy, precision, recall, and F1 score, were used. These metrics provided insights into the model’s ability to correctly classify fetal health conditions. The confusion matrix was also analyzed to understand the distribution of true positives, true negatives, false positives, and false negatives. The evaluation metrics allowed us to assess the strengths and limitations of the model in fetal health classification.

This experimental setup allowed for the development of a reliable and accurate ML model for fetal health classification. The dataset selection, data preprocessing, feature selection, model training, and evaluation process were carefully designed to ensure the validity and rigor of these experiments. The following sections present the experimental results and a discussion of the findings, offering insights into the performance and implications of the proposed model for fetal health assessment.

5. Results

Figure 2 depicts the numerical results of the LightGBM model in fetal health classification. The performance of the model was evaluated using various metrics, including accuracy, area under the curve (AUC), recall, precision, F1 score, kappa, and Matthews correlation coefficient (MCC). The highlights of the results presented in Figure 2 are as follows:
•   The LightGBM model achieved an accuracy of 98.32% in the classification of fetal health conditions. This accuracy shows that the model correctly categorized the large majority of instances within the dataset.
•   The model yielded an AUC score of 0.9985, which signifies excellent discrimination between the fetal health classes. The high AUC value indicates that the model accurately ranks instances according to their predicted probabilities, adding confidence to its predictions.
•   In terms of recall, the LightGBM model achieved a score of 0.9937. This metric reflects the model’s ability to correctly identify instances belonging to the positive class, whether indicative of a healthy or abnormal fetal health condition. The high recall value attests to the model’s sensitivity, which is particularly crucial in capturing instances with positive labels and minimizing false negatives.
•   Precision, a measure of the model’s accuracy in classifying instances as positive, reached a score of 0.9790. This result underscores the model’s precision in correctly identifying instances

Figure 3. Receiver operating characteristic curve.
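
The training setup described in Section 4.4 (hyperparameters, 80/20 hold-out split, and 20-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. The parameter names follow the LGBMClassifier constructor; the helper functions, the sample count of 1000, and the choice to cross-validate on the 80% training portion are assumptions made for illustration and are not taken from the paper.

```python
import random

# Hyperparameters stated in the text, using LGBMClassifier argument names.
# They could be passed as LGBMClassifier(**LGBM_PARAMS) if lightgbm is installed.
LGBM_PARAMS = {
    "learning_rate": 0.1,
    "max_depth": -1,          # -1 means no depth limit
    "min_child_samples": 20,  # minimum samples per leaf
    "n_estimators": 100,      # boosting iterations
    "num_leaves": 31,
    "n_jobs": -1,             # use all available threads
}

def hold_out_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and reserve `test_fraction` of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(indices, k=20):
    """Yield (train, validation) index lists; each fold validates exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# Hypothetical dataset size, chosen only to make the fold sizes easy to read.
train_idx, test_idx = hold_out_split(n_samples=1000)
splits = list(k_fold_indices(train_idx, k=20))
```

In each of the 20 iterations a model would be fitted on `train` and scored on `validation`, and the per-fold scores would then be averaged; the held-out `test_idx` samples are never touched during this loop.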

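The oversampling step mentioned in Section 4.4 can be sketched in the same way. The paper does not state which SMOTE implementation was used, so the function below is a from-scratch illustration of the core idea only (interpolating between a minority-class sample and one of its nearest minority-class neighbours); the function name, the toy points, and the value of `k` are hypothetical.

```python
import math
import random

def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of `base`, excluding `base` itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy minority-class feature vectors (illustrative values, not study data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(minority, n_new=6, k=2)
```

Because every synthetic point lies on a line segment between two real minority samples, the new points stay inside the region already occupied by that class.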

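The metrics named in Sections 4.5 and 5 can all be derived from confusion-matrix counts. The sketch below uses the standard binary definitions, assuming a single positive class (for a multi-class problem they would be computed per class or from the full matrix); the counts passed in are made-up values for illustration, not results from this study.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from binary confusion counts (assumes no zero denominators)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }

# Hypothetical counts, used only to exercise the formulas.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
```

Recall penalizes false negatives and precision penalizes false positives, which is why the two are reported separately alongside accuracy.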
            Volume 1 Issue 1 (2024)                         62                        https://doi.org/10.36922/aih.2121