Artificial Intelligence in Health AI model for cardiovascular disease prediction

can be efficiently used to predict the outcome from the existing dataset. Predicting a dependent variable from the values of independent variables is one of the applications of these machine learning techniques, because large data resources, such as those in the health-care sector, are difficult to manage manually. Some of the techniques used for these prediction problems are SVM, NN, DT, regression, and Naive Bayes classifiers. An ensemble classification method for improving the accuracy of a weak classifier of heart disease was developed by combining multiple classifiers. The results show that the bagging and boosting ensemble techniques are effective in improving the prediction accuracy of weak classifiers [28].

Chowdhury et al. used the multilayer NN and the backpropagation learning algorithm with the heart disease dataset [29]. The initialization of NN weights was optimized using a GA. An accuracy of about 98% was obtained. Due to the limited dynamism in patterns and associations among the data mining techniques used [30], a feature subset selection method was applied to medical data. This method used a Naive Bayes classifier to select 5 attributes from 15 attributes. It was able to find a critical nugget, which reduced the irrelevant attributes, and found the top critical nuggets. The principal limitation of this method is that it can use only one data mining technique.

Srivenkatesh proposed CVD prediction using machine learning algorithms, such as SVM, random forest (RF), Naive Bayes classifier, and logistic regression, for vascular presumption, with logistic regression showing a better accuracy of 77.06% than the other machine learning algorithms [31]. Sharma and Parmar proposed a heart disease prediction method using a deep learning NN model [32]. The dataset used in their work was obtained from the UCI repository for deep training, and the results obtained are promising for CVD prediction [32]. Separately, Mohan et al. predicted heart disease using machine learning models, such as logistic regression, KNN, SVM, DT, and RF, during the dataset training [33]. The accuracy values obtained were 0.95619% for the K-neighbor classifier, 0.9561945% for SVM, 0.91050% for DT, 0.95404% for the RF classifier, and 0.95592% for LR [33]. Chowdary used RF, LR, and ANN, with activation by KNN, Gaussian Naive Bayes (GNB), and rectified linear unit (ReLU) for prediction of the heart disease infection [34]. The average performance accuracy and precision obtained were 89% and 91.6%, respectively [34,35].

Rabbi et al. evaluated the performance of data mining classification techniques for heart disease prediction using three popular classification techniques, namely KNN, SVM, and ANN, achieving 82.963%, 85.1852%, and 73.3333% in accuracy, respectively [36]. Empirical performance analysis of various machine learning techniques has been carried out to predict CVD using SVM, RF, DT, and KNN. The results are explicitly discussed with the DT classification accuracy of 73% [37]. Khourdifi and Bahaj proposed a machine learning algorithm for heart disease prediction and classification using particle swarm optimization (PSO) and ant colony optimization (ACO). The classification average accuracy for PSO and ACO is 99.65% [38].

Shah et al. proposed a methodology for the prediction of heart disease infection using Naive Bayes, DT, KNN, and RF algorithms on a dataset with 303 instances and 76 attributes [39]. The evaluation performance results showed that the KNN algorithm had the highest accuracy score of 90.789% [39]. An SVM classifier and GA have been combined to improve the performance of the SVM classifier in predicting heart disease based on risk factors. A system with an accuracy of 95% was obtained [40].

Haq et al. developed a machine learning model for CVD risk prediction in accordance with a dataset that contains 11 features used to forecast CVD [41]. The dataset was collected from Kaggle on CVD with approximately 70,000 patient records used for CVD prediction. This Kaggle dataset has plenty of training and validation records. The machine learning models used are NNs, RF, Bayesian networks, C5.0, and QUEST. The results acquired have a high prediction accuracy of 99.1%, which is significantly superior to previous methods [41].

Taylan et al. utilized machine learning to predict, classify, and improve the diagnostic accuracy of CVDs using support vector regression (SVR), multivariate adaptive regression splines, M5Tree model, and NNs for the training process [42]. While the KNN, Naive Bayes, and adaptive neuro-fuzzy inference system (ANFIS) were used to predict 17 CVD risk factors, the mixed-data transformation and classification methods were employed for categorical and continuous variables which predict CVD risk. However, the result obtained outperformed the well-known statistical and machine learning approaches, a clear indication of their versatility and utility in CVD classification. The investigation indicates that the prediction accuracy of ANFIS for the training process is 96.56%, and SVR is 91.95% [42].

Tran et al. developed a prediction mortality model for patients with CVDs to support health-care services [43]. The dataset used was obtained from the Medicare Benefits Scheme and Pharmaceutical Department, Australia, between 2004 and 2014. The dataset contains about 346,201 patient records. Some of the AI algorithms used in prediction include LR, RF, extra trees (ET), gradient boosting trees (GBT), and deep learning algorithms. However, some of the minority deceased patient records contained in the dataset were experimented separately using

Volume 1 Issue 1 (2024) 44 https://doi.org/10.36922/aih.1746
