Eurasian Journal of Medicine and Oncology
Machine learning insights into heart failure outcomes
the need for further optimization of models to improve their sensitivity in detecting positive cases, as highlighted in previous research.[19] In addition, our comparison of confusion matrices across different machine learning algorithms, such as linear regression, random forest, SVM, and GBM, provides insights into the comparative performance of these models, which is consistent with prior studies evaluating the efficacy of different algorithms in healthcare applications.[21] Overall, our prediction analysis using confusion matrices corroborates previous findings on the importance of comprehensive model evaluation and highlights the ongoing need for model refinement to enhance predictive accuracy and clinical applicability in real-world healthcare settings.

The study has several limitations that may impact its generalizability. First, the relatively small sample size restricts the ability to capture the full variability of HF outcomes in larger and more diverse populations. In addition, algorithm-specific shortcomings were noted, such as the sensitivity of SVMs to imbalanced data and the potential for overfitting in random forest models with small datasets. Missing data were handled using mean imputation, which may oversimplify complex relationships in the dataset. The lack of external validation further limits the generalizability of the findings to other clinical settings. Finally, while longitudinal data were included, the observational nature of the dataset prevents definitive causal inferences between predictors and outcomes.

The discussion highlights key findings from the analysis of HF data, including insights from the original dataset, feature importance analysis, correlation matrix assessment, and the evaluation of machine learning models. The original dataset provides a comprehensive overview of clinical and demographic attributes among HF patients, shedding light on risk factors and disease progression. Feature importance analysis identifies significant predictors such as age, serum creatinine, and ejection fraction, which are crucial for prognosis. The correlation matrix reveals associations between clinical variables and death events, aiding in risk assessment. Evaluation of machine learning models shows varying performance levels, indicating potential for improvement in predictive accuracy. Overall, the study emphasizes the value of data-driven approaches in understanding HF outcomes and underscores the need for continued refinement of predictive models for enhanced clinical utility.

5. Conclusion

This study utilized a comprehensive dataset sourced from Kaggle to investigate clinical and demographic factors associated with HF outcomes. The dataset, “Heart Failure Clinical Records.csv,” underwent preprocessing to address missing values and was prepared for analysis. Significant predictors of death events among HF patients were identified, including age, serum creatinine, and ejection fraction, through feature importance analysis and correlation matrix computation. Various machine learning models, including logistic regression, random forest, SVM, and GBM, were employed to predict death events, revealing varying levels of performance across models. While some models demonstrated promising accuracy and predictive power, others exhibited opportunities for improvement, particularly in reducing false positive and false negative rates. Overall, the findings underscore the importance of data-driven approaches in understanding HF outcomes and highlight the need for continued refinement of predictive models to enhance clinical decision-making and patient care in real-world settings.

Acknowledgments

The authors express their gratitude to Dr. Bharath Kumar Kakkireni, Founder and Chairman of KBK Multispeciality Hospitals, for his invaluable support of this work. VKY extends sincere thanks to Prof. S. A. Kori, Honorable Vice-Chancellor of the Central University of Andhra Pradesh, for his unwavering support and encouragement.

Funding

None.

Conflict of interest

The authors declare that they have no competing interests.

Author contributions

Conceptualization: Vinod Kumar Yata, Sunil Junapudi
Formal analysis: Shivaprasad Chitta
Investigation: Vinod Kumar Yata, Sunil Junapudi
Methodology: Vinod Kumar Yata, Sunil Junapudi
Writing – original draft: Supriya Chandu, Vinod Kumar Yata, Sunil Junapudi
Writing – review & editing: Krishna Chaitanya Katha, Syam Sundar Junapudi

Ethics approval and consent to participate

Not applicable as the data was obtained from an open-source platform.

Consent for publication

The dataset utilized in this study was obtained from Kaggle, an open-source platform for sharing datasets and
Volume 9 Issue 1 (2025) 141 doi: 10.36922/ejmo.6583

