Table 1. Dataset variables

Categorical: Gender; Intubation; Readmission; Emergency surgery; Age; Lymphoma; GCS (Eyes response); GCS (Verbal response); GCS (Motor response); Operative/Non-operative; Immunosuppression; AIDS; Hepatic failure; Metastatic cancer; Leukemia; Cirrhosis; Thrombolysis; Dialysis

Numerical: Hematocrit; Albumin; Temperature; Heart rate; Respiratory rate; FiO2; PaO2; PaCO2; Arterial pH; Na+ (sodium); Urine output; Creatinine; Mean arterial pressure; Blood urea nitrogen; White blood cell count; Blood sugar level; Bilirubin

Output: ICU mortality

Abbreviations: AIDS: Acquired immune deficiency syndrome; FiO2: Fraction of inspired oxygen; GCS: Glasgow Coma Scale; ICU: Intensive care unit; PaO2: Partial pressure of oxygen; PaCO2: Partial pressure of carbon dioxide.
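Downstream pre-processing steps typically consume these variables as explicit column groups. The minimal Python sketch below encodes the Table 1 groupings as lists; the snake_case column names are illustrative assumptions, not the dataset’s actual headers.

```python
# Illustrative column groupings mirroring Table 1; the real dataset's
# column names may differ, so these identifiers are assumptions.
CATEGORICAL_COLS = [
    "gender", "intubation", "readmission", "emergency_surgery", "age",
    "lymphoma", "gcs_eyes", "gcs_verbal", "gcs_motor", "operative",
    "immunosuppression", "aids", "hepatic_failure", "metastatic_cancer",
    "leukemia", "cirrhosis", "thrombolysis", "dialysis",
]
NUMERICAL_COLS = [
    "hematocrit", "albumin", "temperature", "heart_rate", "respiratory_rate",
    "fio2", "pao2", "paco2", "arterial_ph", "sodium", "urine_output",
    "creatinine", "mean_arterial_pressure", "blood_urea_nitrogen",
    "white_blood_cell_count", "blood_sugar_level", "bilirubin",
]
TARGET_COL = "icu_mortality"  # binary output column from Table 1
```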

imputations, we aimed to ensure that the output column of the dataset contained non-empty values, reducing the available dataset records to 148,532 ICU patients for further pre-processing (139,917 survived and 8,615 non-survived).

To address the high class imbalance in the dataset, we undersampled the majority class (survived) during model training using random selection. This approach aimed to prevent bias toward survival and to ensure that the model accurately estimates the probability of ICU patient mortality by fine-tuning the respective model hyperparameters. In this way, we obtained a balanced dataset of 17,230 patients with an equal distribution of survived and non-survived patients (8,615 records each).

Undersampling can mitigate bias toward the majority class and offers significant computational advantages by reducing training time and memory demands. It can also reduce the model’s complexity and has the potential to improve interpretability. It may also reduce noise within the minority class and enhance generalization performance by focusing on more valuable data points.19
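A minimal sketch of this random undersampling step is given below, assuming the cleaned records sit in a pandas DataFrame with a binary icu_mortality column; the function and column names are illustrative, not taken from the source.

```python
import pandas as pd

def undersample_majority(df: pd.DataFrame, target: str = "icu_mortality",
                         seed: int = 42) -> pd.DataFrame:
    """Randomly keep as many majority-class rows as there are minority-class rows."""
    n_minority = df[target].value_counts().min()      # 8,615 non-survivors here
    balanced = df.groupby(target, group_keys=False).sample(n=n_minority,
                                                           random_state=seed)
    # Shuffle so the two classes are interleaved before any train/test split.
    return balanced.sample(frac=1, random_state=seed).reset_index(drop=True)

# balanced_df = undersample_majority(icu_df)  # ~17,230 rows, 8,615 per class
```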
3. Methodology

Various ML and deep learning algorithms were employed in this research, such as Decision Trees,20 Random Forests,21 Extra Trees,22 XGBoost,23 CatBoost,24 Light Gradient-Boosting Machine (LightGBM),25 and Neural Networks.26 Moreover, various ensemble learning algorithms were used. To ascertain the optimal architecture, a comprehensive model evaluation was undertaken.

This process involved leveraging the default feature handling capabilities of each ML model. The hyperparameter values were carefully optimized and defined during the training phase to maximize performance, as described in more detail in the subsection below. The developed models were evaluated by considering the values of specific performance metrics of ensemble learning, ultimately leading to the development of effective model combinations. The final model was trained in 127.82 s using an Apple M1 Max 32-core GPU with 32 GB of unified RAM.
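As one way to screen such a pool of candidate learners, the sketch below instantiates the algorithms listed above with library defaults and compares them by cross-validated AUC. The function name, settings, and the assumption that X already holds encoded features and y the mortality labels are ours for illustration, not the authors’ published configuration.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier

def screen_candidates(X, y):
    """Compare the candidate learners by 5-fold cross-validated AUC."""
    candidates = {
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "random_forest": RandomForestClassifier(random_state=0),
        "extra_trees": ExtraTreesClassifier(random_state=0),
        "xgboost": XGBClassifier(eval_metric="logloss", random_state=0),
        "catboost": CatBoostClassifier(verbose=0, random_state=0),
        "lightgbm": LGBMClassifier(random_state=0),
        "neural_net": MLPClassifier(max_iter=500, random_state=0),
    }
    for name, model in candidates.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
        print(f"{name}: mean AUC = {scores.mean():.3f} (+/- {scores.std():.3f})")
```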
3.1. Development of ML models

The purpose of this research was to develop several robust models capable of predicting patients’ ICU mortality. Typically, the first step was data pre-processing and hyperparameter tuning, employing various strategies. The developed models exploited the native pre-processing capabilities of each distinct algorithm, utilizing feature selection, data scaling, categorical variable encoding, and feature importance analysis techniques tailored to the architecture of each model. Through refinement, the most effective training processes were determined, maximizing the predictive power of the algorithms. This enabled the most suitable settings for each algorithm’s key hyperparameters to be identified, driving toward optimal performance. Through this iterative optimization process, the performance of our models was optimized, bias was mitigated, and the generalization ability of each model was verified on novel, previously unseen data.
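As a hedged illustration of this per-model pre-processing and tuning step, the sketch below wires numeric scaling and categorical encoding into a pipeline around a Random Forest and runs a randomized hyperparameter search. The column lists, parameter ranges, and cross-validation settings are assumptions for illustration, not the configuration reported in this paper.

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def tune_random_forest(X, y, categorical_cols, numerical_cols):
    """Fit scaling/encoding and tune key hyperparameters via randomized search."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), numerical_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    pipe = Pipeline([("prep", preprocess),
                     ("clf", RandomForestClassifier(random_state=0))])
    param_dist = {  # illustrative search space only
        "clf__n_estimators": [200, 400, 800],
        "clf__max_depth": [None, 8, 16, 32],
        "clf__min_samples_leaf": [1, 2, 5],
        "clf__max_features": ["sqrt", "log2"],
    }
    search = RandomizedSearchCV(pipe, param_dist, n_iter=25, cv=5,
                                scoring="roc_auc", random_state=0, n_jobs=-1)
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```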
As already mentioned, several ML algorithms were considered. This paper focuses on the most robust ones, namely Decision Trees, gradient-boosted trees (GBTs), and Random Forests. Decision Trees can be easily interpreted, although they are often prone to overfitting. GBTs mitigate this by sequentially building ensembles of trees, while Random Forests combine multiple decision trees trained on diverse subsets of the data and input features, leading to better generalization and accuracy.

Moreover, we took into consideration additional contemporary and state-of-the-art algorithms. XGBoost,

