
Eurasian Journal of Medicine and Oncology
Machine learning insights into heart failure outcomes


            “accuracy_score” function from the scikit-learn metrics
            module. The methodology employed in this study, as
            depicted in Figure 1, encompassed comprehensive data
            preprocessing, feature importance analysis, correlation
            matrix computation, and machine learning model
            implementation for predicting death events among HF
            patients.
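
            To make the evaluation step concrete, the following is a minimal pure-Python sketch of an 80/20 hold-out split and of the quantities behind the reported metrics. For binary labels, the accuracy computed here is the fraction of predictions that match the recorded outcome, which is what scikit-learn's accuracy_score returns by default. The helper names (split_indices, confusion_counts, classification_metrics), the random seed, the row count of 100, and the label vectors are assumptions made for illustration; they are not taken from the study.

```python
import random


def split_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and return (train_indices, test_indices)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels, where 1 marks a death event."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# 80/20 split of a hypothetical 100-row dataset (row count is an assumption).
train_idx, test_idx = split_indices(100, test_fraction=0.2)

# Invented labels and predictions for ten held-out patients (not study data).
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))
print(metrics)
```

            The AUC-ROC is omitted from this sketch because it requires predicted probabilities rather than hard class labels.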

              The verification process involved splitting the dataset
            into training and testing subsets. Specifically, after
            preprocessing the data, 80% of the dataset was utilized for
            training the machine learning models, and the remaining
            20% was reserved for testing. The models, including logistic
            regression, random forest, GBM, and others, were trained
            using the training subset to learn patterns and relationships
            between the features and the target variable (“DEATH
            EVENT”). Verification was performed on the testing
            subset, which was held out from training to ensure an
            unbiased assessment. The performance of each model was
            evaluated against real clinical outcomes recorded in the
            dataset. Key metrics, such as accuracy, precision, recall,
            F1-score, and the area under the receiver operating
            characteristic curve (AUC-ROC), were used to quantify
            predictive performance. Confusion matrices further provided
            detailed insights into true positives, false positives, true
            negatives, and false negatives, offering a nuanced
            understanding of each model’s ability to correctly identify
            death events.

            Figure 1. Flowchart illustrating the methodology used in the
            study, entailing data preprocessing, feature importance
            analysis, correlation matrix computation, and machine
            learning model implementation for predicting death events
            among heart failure patients

            3. Results

            3.1. Feature importance analysis
            The results of the study are summarized in Table 2, which
            presents the selected attributes based on correlation
            with the target variable “DEATH EVENT” and their
            corresponding feature importance scores obtained from
            a random forest regressor model. The table provides a
            comprehensive overview of the relative importance of
            each attribute in predicting the occurrence of death events
            among HF patients. Table 2 illustrates that the attribute
            “time” exhibited the highest feature importance score
            of 0.356, emphasizing its significant predictive power
            in forecasting outcomes. This is followed by “serum
            creatinine” with a score of 0.142 and “ejection fraction” with
            a score of 0.127. These findings underscore the importance
            of longitudinal follow-up duration, renal function, and
            cardiac function as key predictors of adverse outcomes
            in HF patients. In addition, attributes such as “platelets,”
            “creatinine phosphokinase,” and “age” also demonstrated
            notable feature importance scores, indicating their
            relevance in prognostic modeling. These physiological
            and demographic factors contribute valuable insights into
            risk stratification and treatment planning for HF patients.
            Moreover, the inclusion of categorical variables, such as
            “anemia,” “sex,” “smoking,” “high blood pressure,” and
            “diabetes,” in the analysis further enriches the predictive
            model. While these variables exhibit relatively lower
            feature importance scores than the previously mentioned
            variables, their contributions to the overall predictive
            performance should not be overlooked. Overall, the results
            presented in Table 2 highlight the utility of a data-driven
            approach in identifying clinically relevant predictors of

            Table 2. Selected attributes based on the correlation with
            “Death Event”

            Attributes                  Feature importance scores
            Time                        0.35
            Serum creatinine            0.14
            Ejection fraction           0.12
            Platelets                   0.082
            Creatinine phosphokinase    0.079
            Age                         0.077
            Serum sodium                0.073
            Anemia                      0.013
            Sex                         0.013
            Smoking                     0.012
            High blood pressure         0.011
            Diabetes                    0.011
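
            As a rough illustration of where scores like those in Table 2 come from, the sketch below computes a simplified, single-split stand-in for impurity-based importance. For each attribute it finds the one threshold that most reduces the variance of the target and then normalizes these reductions to sum to 1. A random forest regressor accumulates reductions of this kind over many splits and many trees, so this is not the study's model and will not reproduce the values in Table 2. The function name best_split_gain and all data values are invented for illustration.

```python
from statistics import pvariance


def best_split_gain(values, target):
    """Largest drop in target variance from one threshold split on `values`."""
    n = len(target)
    parent_var = pvariance(target)
    best_gain = 0.0
    # Candidate thresholds: every distinct value except the largest,
    # so both sides of the split are always non-empty.
    for threshold in sorted(set(values))[:-1]:
        left = [t for v, t in zip(values, target) if v <= threshold]
        right = [t for v, t in zip(values, target) if v > threshold]
        child_var = (len(left) * pvariance(left)
                     + len(right) * pvariance(right)) / n
        best_gain = max(best_gain, parent_var - child_var)
    return best_gain


# Toy data for eight hypothetical patients; NOT values from the study.
features = {
    "time": [10, 20, 30, 90, 120, 150, 200, 250],
    "serum_creatinine": [2.1, 1.0, 1.9, 1.1, 0.9, 1.8, 1.0, 1.2],
    "smoking": [1, 0, 0, 1, 0, 1, 0, 0],
}
death_event = [1, 1, 1, 0, 0, 0, 0, 0]

gains = {name: best_split_gain(vals, death_event)
         for name, vals in features.items()}
total = sum(gains.values())
importance = {name: gain / total for name, gain in gains.items()}

# Rank attributes from most to least important, as in Table 2.
ranking = sorted(importance, key=importance.get, reverse=True)
for name in ranking:
    print(f"{name:<18}{importance[name]:.3f}")
```

            In this toy data, follow-up time separates the two outcome groups perfectly, so it receives the largest share; the ordering says nothing about the real cohort.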


            Volume 9 Issue 1 (2025)                        135                              doi: 10.36922/ejmo.6583