
Artificial Intelligence in Health | Predicting mortality in COVID-19 using ML



3.2.6. KNNs

KNN is a non-parametric supervised learning method used in classification problems, where "non-parametric" means that the method makes no assumption about the underlying distribution of the data. The method was introduced by Fix and Hodges in 1951 and was subsequently developed by Cover.³⁵ KNN classifies new samples based on their distance from samples with a known class label,⁵⁴ relying on the logic that similar samples belong to the same class.⁵⁵

The class to which each new sample is assigned depends on its distance from the k nearest samples in the training dataset. KNN can be used for classification problems with discrete target variables or for regression problems with continuous target variables. In this study, we used the "KNeighborsClassifier" method from the sklearn library.

3.3. The importance of attributes

For each ML method, except for MLPs and KNN, we used three different sets of attributes, depending on the importance score that each attribute received according to the "feature_importances_" attribute. This sklearn attribute is an array with one score per feature, available in certain Python predictors, and provides a relative measure of the importance of each feature in the predictions of the model.⁵⁶ For the "MLPClassifier" and "KNeighborsClassifier," the score for each attribute was calculated as the normalized sum of the scores from the four previous methods: "LogisticRegression," "DecisionTreeClassifier," "RandomForestClassifier," and "XGBClassifier."

The three sets contained different numbers of attributes: one contained all 22 attributes, another included the 15 most important attributes, and the last contained only the 10 most important attributes. The diagrams in Figures 6-16 illustrate the attribute rankings and the SHAP (SHapley Additive exPlanations) summary plots for all six ML methods.

3.4. Hyperparameter values optimization

We used three different sets of hyperparameters for each ML method. The first set contained the default values (default), the second contained the first set of optimized values for the ML method's hyperparameters (opt_01), and the third contained the second set of optimized hyperparameter values (opt_02). The two optimized sets were created with sklearn's "GridSearchCV()" grid search method, which searches for the optimal value of each hyperparameter over a given grid containing all possible values of the different hyperparameters. In this study, we used "GridSearchCV()" with 10-fold cross-validation. It accepts the respective ML method and the sets of hyperparameter values as input and outputs the optimal value for each hyperparameter. This process resulted in nine different combinations for every ML method across the six datasets, creating a total of 54 different models for each ML method and 324 models in total for all six methods. We ran each model 10 times (iterations) and calculated the mean to avoid extreme values in the metrics, resulting in 540 iterations for each ML method and a total of 3,240 iterations for all ML methods. The flowchart for creating, training, and evaluating each model is shown in Figure 17.

4. Results

This section presents the metrics used in the evaluation and the evaluation results of the created models. The evaluation results are presented both for all models created and for the models with the highest overall score for each ML method. In addition, an overall ranking of all models according to their highest score is presented.

4.1. Evaluation metrics

To assess the performance of the 324 ML models, we used the metrics of precision,³⁶ recall,³⁶ F1 score,³⁶ and AUC-ROC,³⁷ computed through the confusion matrix,³⁶ as well as the runtime metric. The confusion matrix is a summary of the prediction results of a model, depicting the number of correct and incorrect predictions made by the evaluated model. The predictions are categorized into four groups: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).³⁶ A TP is a correctly predicted positive value, an FP is a wrongly predicted positive value, a TN is a correctly predicted negative value, and an FN is a wrongly predicted negative value for the samples of the training set. Based on these four counts, we can calculate precision, recall, the F1 score, and AUC-ROC.

Precision is the ratio of TP to the total number of predicted positive observations, that is, the model's percentage of correctly predicted positive values. It is given by Equation I:

    Precision = TP / (TP + FP)    (I)

Recall is the ratio of TP to the total number of actual positive values. It is given by Equation II:

    Recall = TP / (TP + FN)    (II)
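As a sanity check, Equations I and II can be computed directly from confusion-matrix counts and compared against sklearn's built-in scorers. The labels below are toy values for illustration, not study data.

```python
# Precision (Equation I) and recall (Equation II) from confusion-matrix
# counts, cross-checked against sklearn's own metric functions.
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # toy ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # toy model predictions

# For binary labels, ravel() yields the four counts in this fixed order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)   # Equation I
recall = tp / (tp + fn)      # Equation II

assert abs(precision - precision_score(y_true, y_pred)) < 1e-12
assert abs(recall - recall_score(y_true, y_pred)) < 1e-12
```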

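The grid search described in Section 3.4 can be sketched as below, using "GridSearchCV()" with 10-fold cross-validation on a "KNeighborsClassifier". The parameter grid, scoring choice, and synthetic data are illustrative assumptions; the study's actual grids are not listed in this section.

```python
# Sketch of hyperparameter optimization with GridSearchCV (10-fold CV).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the study's data (22 attributes, binary outcome).
X, y = make_classification(n_samples=300, n_features=22, random_state=0)

param_grid = {
    "n_neighbors": [3, 5, 7],           # hypothetical candidate values
    "weights": ["uniform", "distance"],
}

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid,
    cv=10,          # 10-fold cross-validation, as in the study
    scoring="f1",   # assumed scoring criterion
)
search.fit(X, y)
print(search.best_params_)   # the optimal value for each hyperparameter
```

In the study this procedure was repeated per method and per dataset to produce the opt_01 and opt_02 hyperparameter sets.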

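The attribute-ranking scheme of Section 3.3 can be sketched as follows. The data are synthetic, and only two of the four predictors are used as stand-ins, since "LogisticRegression" exposes coefficients rather than "feature_importances_" and "XGBClassifier" requires the separate xgboost package.

```python
# Ranking attributes by "feature_importances_" and combining scores
# across models with a normalized sum (two-model stand-in).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=22, random_state=0)

# Per-model relative importance scores, one value per attribute.
models = [
    DecisionTreeClassifier(random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
]
per_model = [m.fit(X, y).feature_importances_ for m in models]

# For MLP and KNN, which expose no importances of their own, the study
# sums the other methods' scores and normalizes; the same idea here:
combined = np.sum(per_model, axis=0)
combined = combined / combined.sum()          # normalized to sum to 1

ranking = np.argsort(combined)[::-1]          # most important first
top_15, top_10 = ranking[:15], ranking[:10]   # the reduced attribute sets
```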
            Volume 1 Issue 3 (2024)                         38                               doi: 10.36922/aih.2591