Artificial Intelligence in Health Synthetic data for obesity level prediction
Table 1. (Continued)
| Study | Dataset | ML algorithm | Results |
|---|---|---|---|
| Vairachilai et al.³¹ | Kaggle COVID-19 Healthy Diet dataset | Protein Food Item Prediction Regression model | The proposed model achieved high predictive accuracy, with MAPE of 29% for meat and milk and 31% for oil crops and vegetable products. The integration of protein-rich food variables allowed refined modeling of feature influence in obesity prediction. |
| Forte et al.³² | FITescola® project dataset | CNN | The proposed model achieved 75% accuracy in obesity classification. The inclusion of physical fitness variables improved feature interpretability and overall model performance. |
| Yağın et al.³³ | Physical Activity and Eating Habits dataset from İnönü University; includes alcohol use, device use, and meal frequency | Trained NN with Bayesian optimization | The proposed model achieved 93.06% accuracy in obesity classification, outperforming prior methods. The integration of Bayesian optimization enhanced the model’s ability to select critical features. |
| Gözükara Bağ et al.³⁴ | Web-based public dataset on physical activity and nutrition (gender, BMI, diet, etc.) | LR, RF, XGBoost with Bayesian optimization | The proposed model achieved 99.33% accuracy using logistic regression, with improved classification accuracy after feature selection. The inclusion of nutritional and activity data further strengthened the model’s predictive capacity. |
Abbreviations: ANN: Artificial neural network; BME: Bagging meta-estimator; Bi-LSTM: Bidirectional long short-term memory; BMI: Body mass
index; CNN: Convolutional neural network; COVID-19: Coronavirus disease 2019; DT: Decision tree; GB: Gradient boosting; GBM: Gradient boosting
machine; k-NN: k-nearest neighbors; LIME: Local interpretable model-agnostic explanations; LogitBoost: Logistic regression boosting; LR: Logistic
regression; MAPE: Mean absolute percentage error; MLP: Multi-layer perceptron; NB: Naïve Bayes; NN: Neural network; POA: Pelican optimization
algorithm; PSO: Particle swarm optimization; RF: Random Forest; SGD: Stochastic Gradient Descent; SHAP: Shapley additive explanations;
SVM: Support vector machine; UCI: University of California, Irvine; XAI: Explainable artificial intelligence; XGBoost: Extreme gradient boosting.
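For readers less familiar with the MAPE metric reported in Table 1, the following is a minimal sketch of its standard definition (the mean of |actual − predicted| / |actual|, expressed as a percentage). The function name `mape` and the sample values are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must be non-empty")
    if any(a == 0 for a in actual):
        # The ratio |a - p| / |a| is undefined when the true value is zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical values for illustration only.
actual = [10.0, 20.0, 40.0]
predicted = [8.0, 25.0, 40.0]
print(round(mape(actual, predicted), 2))
```

Under this definition, a lower MAPE indicates a closer fit, so a value near 30% means that predictions deviate from the true values by roughly 30% on average.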
obesity type I, 35–39.9 as obesity type II, and 40 or higher as obesity type III.

In the study by Helforoush and Sayyad,¹⁵ titled Hybrid Metaheuristic ANN-PSO, various ML models were applied for obesity risk prediction. The authors proposed a hybrid artificial neural network optimized using particle swarm optimization (ANN-PSO). When evaluated on the University of California, Irvine (UCI) obesity dataset, which contains 2,111 records and 17 features related to dietary habits and physical conditions, the ANN-PSO model achieved an accuracy of ~92%, outperforming traditional regression models. To enhance interpretability, the study employed Shapley additive explanation analysis, which revealed that weight and height were among the most influential features in predicting obesity levels. These findings highlight the potential of metaheuristic optimization methods to improve the performance of neural networks in personalized obesity risk profiling.

Ayub et al.¹⁶ developed an attention-enhanced bidirectional long short-term memory (ABi-LSTM) model to classify individuals into obesity categories using the same dataset. Their deep learning architecture incorporated an attention mechanism to emphasize key features, such as height, weight, and activity level, allowing the model to capture complex patterns within the data. The proposed ABi-LSTM achieved a multiclass classification accuracy of 96.5%, representing a substantial improvement in precision, recall, and F1-score over existing approaches. The authors highlighted this result as a paradigm shift, demonstrating the effectiveness of attention-based deep sequential models in enabling accurate obesity risk prediction.

Shakti et al.¹⁷ evaluated multiple ML frameworks on the UCI obesity dataset, which contains 2,111 instances with 17 attributes related to eating habits and lifestyle factors. The models tested included k-nearest neighbors (k-NN), support vector machine (SVM), random forest (RF), gradient boosting (GB), and a multilayer perceptron (MLP) neural network. Among these, the MLP classifier achieved the highest accuracy at 97.2%, followed closely by GB at ~96.2%. These findings highlight that incorporating diverse features, such as dietary habits and physical activity, alongside robust learning algorithms like neural networks (NNs) can yield high classification performance. The study emphasizes that such levels of accuracy are essential for enabling targeted interventions for individuals at risk of obesity.

Yağmur¹⁸ proposed a hybrid model that combines a decision tree (DT) classifier with the pelican optimization algorithm (POA), a metaheuristic optimization technique, to enhance obesity level classification. Utilizing the 2,111-instance dataset, the model applied fuzzy parameter tuning via POA to optimize the tree’s decision thresholds for multiclass categorization. The hybrid DT-POA approach reportedly outperforms the standard DT model in predicting obesity levels. Although the precise accuracy value is not explicitly stated, the author highlights the
Volume 2 Issue 4 (2025) 50 doi: 10.36922/AIH025140027

