
Eurasian Journal of Medicine and Oncology
Machine learning insights into heart failure outcomes


the need for further optimization of models to improve their sensitivity in detecting positive cases, as highlighted in previous research.¹⁹ In addition, our comparison of confusion matrices across different machine learning algorithms, such as linear regression, random forest, SVM, and GBM, provides insights into the comparative performance of these models, which is consistent with prior studies evaluating the efficacy of different algorithms in healthcare applications.²¹ Overall, our prediction analysis using confusion matrices corroborates previous findings on the importance of comprehensive model evaluation and highlights the ongoing need for model refinement to enhance predictive accuracy and clinical applicability in real-world healthcare settings.

The study has several limitations that may impact its generalizability. First, the relatively small sample size restricts the ability to capture the full variability of HF outcomes in larger and more diverse populations. In addition, algorithm-specific shortcomings were noted, such as the sensitivity of SVMs to imbalanced data and the potential for overfitting in random forest models with small datasets. Missing data were handled using mean imputation, which may oversimplify complex relationships in the dataset. The lack of external validation further limits the generalizability of the findings to other clinical settings. Finally, while longitudinal data were included, the observational nature of the dataset prevents definitive causal inferences between predictors and outcomes.

The discussion highlights key findings from the analysis of HF data, including insights from the original dataset, feature importance analysis, correlation matrix assessment, and the evaluation of machine learning models. The original dataset provides a comprehensive overview of clinical and demographic attributes among HF patients, shedding light on risk factors and disease progression. Feature importance analysis identifies significant predictors such as age, serum creatinine, and ejection fraction, which are crucial for prognosis. The correlation matrix reveals associations between clinical variables and death events, aiding in risk assessment. Evaluation of machine learning models shows varying performance levels, indicating potential for improvement in predictive accuracy. Overall, the study emphasizes the value of data-driven approaches in understanding HF outcomes and underscores the need for continued refinement of predictive models for enhanced clinical utility.

5. Conclusion

This study utilized a comprehensive dataset sourced from Kaggle to investigate clinical and demographic factors associated with HF outcomes. The dataset, “Heart Failure Clinical Records.csv,” underwent preprocessing to address missing values and was prepared for analysis. Significant predictors of death events among HF patients were identified, including age, serum creatinine, and ejection fraction, through feature importance analysis and correlation matrix computation. Various machine learning models, including logistic regression, random forest, SVM, and GBM, were employed to predict death events, revealing varying levels of performance across models. While some models demonstrated promising accuracy and predictive power, others exhibited opportunities for improvement, particularly in reducing false positive and false negative rates. Overall, the findings underscore the importance of data-driven approaches in understanding HF outcomes and highlight the need for continued refinement of predictive models to enhance clinical decision-making and patient care in real-world settings.

Acknowledgments

The authors express their gratitude to Dr. Bharath Kumar Kakkireni, Founder and Chairman of KBK Multispeciality Hospitals, for his invaluable support of this work. VKY extends sincere thanks to Prof. S. A. Kori, Honorable Vice-Chancellor of the Central University of Andhra Pradesh, for his unwavering support and encouragement.

Funding

None.

Conflict of interest

The authors declare that they have no competing interests.

Author contributions

Conceptualization: Vinod Kumar Yata, Sunil Junapudi
Formal analysis: Shivaprasad Chitta
Investigation: Vinod Kumar Yata, Sunil Junapudi
Methodology: Vinod Kumar Yata, Sunil Junapudi
Writing – original draft: Supriya Chandu, Vinod Kumar Yata, Sunil Junapudi
Writing – review & editing: Krishna Chaitanya Katha, Syam Sundar Junapudi

Ethics approval and consent to participate

Not applicable as the data was obtained from an open-source platform.

Consent for publication

The dataset utilized in this study was obtained from Kaggle, an open-source platform for sharing datasets and


            Volume 9 Issue 1 (2025)                        141                              doi: 10.36922/ejmo.6583