
Artificial Intelligence in Health Synthetic data for obesity level prediction

model’s effectiveness and suggests that it can serve as a robust tool to assist healthcare professionals in obesity risk assessment. This study illustrates how evolutionary optimization algorithms can improve the performance of traditional classifiers in this domain.

Özkurt¹⁹ implemented multiple ML algorithms in conjunction with explainability techniques to predict obesity risk. The study utilized data from 2,111 individuals in the UCI obesity dataset, which contains attributes related to dietary habits and physical conditions. A range of ML classifiers – including DT, RF, Naïve Bayes, k-NN, and extreme gradient boosting (XGBoost) – were evaluated. Among these, the XGBoost model achieved the highest classification accuracy at approximately 92% for obesity level prediction. To enhance interpretability, the author employed Shapley additive explanations to identify key features influencing the model’s decisions. The analysis revealed that family history of obesity, vegetable intake, and frequency of between-meal consumption were among the most influential predictors. These findings demonstrate that boosting algorithms, when integrated with explainable AI (XAI) techniques, can deliver both high predictive performance and valuable insights into obesity-related risk factors.

In a related study, Wang²⁰ presented their findings in E3S Web of Conferences, focusing on obesity level prediction using lifestyle habit features while deliberately excluding direct anthropometric measures such as height and weight to assess model generalizability. The study evaluated a range of ML algorithms, including logistic regression variants (ordinal and multinomial), ensemble methods (LogitBoost and XGBoost), and standard classifiers (Naïve Bayes, SVM, RF, and k-NN). Among these, the LogitBoost ensemble achieved the highest performance, with an accuracy of ~70% and a Kappa statistic of ~0.65. In contrast, the XGBoost model performed poorly, reaching an accuracy of ≤20% due to the exclusion of key features. Other models, such as SVM, k-NN, and RF, achieved accuracies ranging from 75% to 79%. Although these values are lower than those reported in studies that incorporate BMI-related features, the author provided important insights. Specifically, they emphasized that when anthropometric data are unavailable, lifestyle indicators play a critical role in obesity prediction. Feature importance analysis revealed that the mode of transportation (e.g., riding a bike) was the most influential predictor, followed by family history of overweight and frequency of vegetable consumption. This comparative study suggests that even in the absence of direct body measurements, lifestyle-related attributes can still support reasonably accurate obesity risk assessments.

Okpe et al.²¹ proposed a multilayer perceptron ANN model for multiclass obesity classification using the UCI obesity dataset. The study involved a comprehensive set of preprocessing steps, including handling missing values, encoding categorical attributes related to diet and physical activity, and additional data preparation procedures. A feedforward ANN was then implemented in Python and trained on the preprocessed dataset. The model achieved a classification accuracy of 97% across seven obesity categories, indicating its effectiveness in capturing the patterns between eating habits, physical conditions, and obesity outcomes. The authors emphasized that careful data cleaning and hyperparameter optimization were critical to achieving this high level of performance. Their findings highlight that even relatively simple NN architectures can yield accuracy comparable to more complex or ensemble-based models when properly optimized.

Azad et al.²² proposed a stacking ensemble model that integrates XAI techniques for obesity risk classification, published in early 2025. In their study, the researchers combined multiple base classifiers within a stacked architecture and employed local interpretable model-agnostic explanations (LIME) to provide local interpretability. The model was evaluated on the standard obesity dataset, achieving an accuracy of ~98%, which slightly outperformed previously reported best-performing models such as GB and XGBoost (~97.8%). Beyond the improved predictive performance, the integration of LIME offered valuable insight into individual predictions, addressing the “black-box” issue. Comparative analysis demonstrated that the proposed approach outperformed all prior studies in terms of classification accuracy. This research highlights the effectiveness of combining diverse classifiers through ensembling and underscores the importance of incorporating XAI techniques to enhance model transparency, particularly in clinical decision-making contexts.

Solomon et al.²³ introduced a majority-voting ensemble model composed of GB, XGBoost, and an MLP NN to classify obesity levels. Utilizing the Latin American obesity dataset, their hybrid ensemble achieved an accuracy of 97.16%, surpassing the best-performing individual model (XGBoost), which attained 96.37%. This result, published in Diagnostics in 2023, established a high-performance benchmark and has since been frequently cited by 2024 studies as a state-of-the-art reference. By comparing multiple algorithms, the authors demonstrated that an ensemble model can effectively leverage the strengths of its individual components. The majority-voting approach outperformed all single classifiers, highlighting the advantage of combining diverse learning paradigms. The impact of this work is further reflected in subsequent research, such as that of Azad et al.,²² which aimed to exceed the benchmark established by this study.
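
To make the majority-voting idea attributed to Solomon et al. concrete, the following is a minimal sketch of hard voting over the label predictions of three classifiers. It is not the authors' implementation: the function names (`majority_vote`, `accuracy`), the tie-breaking rule (favoring the earliest-listed model), the class label strings, and the toy predictions are all assumptions chosen for illustration, and the numbers it produces are unrelated to any accuracy reported in the cited studies.

```python
from collections import Counter


def majority_vote(per_model_predictions):
    """Combine hard label predictions from several classifiers.

    per_model_predictions: list of equal-length label sequences, one per model.
    Ties are broken in favor of the earliest-listed model (an assumption;
    the cited study does not specify a tie-breaking rule here).
    """
    if not per_model_predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(per_model_predictions[0])
    if any(len(p) != n_samples for p in per_model_predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for votes in zip(*per_model_predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # The first vote, in model order, whose label reaches the top count wins.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, hand-written labels (hypothetical; not data from any cited study).
y_true = ["Normal", "Obesity_I", "Overweight_I", "Normal", "Obesity_II"]
gb_pred = ["Normal", "Obesity_I", "Normal", "Normal", "Obesity_I"]
xgb_pred = ["Normal", "Overweight_I", "Overweight_I", "Normal", "Obesity_II"]
mlp_pred = ["Overweight_I", "Obesity_I", "Overweight_I", "Obesity_I", "Obesity_II"]

ensemble_pred = majority_vote([gb_pred, xgb_pred, mlp_pred])

for name, pred in [("GB", gb_pred), ("XGBoost", xgb_pred),
                   ("MLP", mlp_pred), ("Ensemble", ensemble_pred)]:
    print(f"{name}: accuracy = {accuracy(y_true, pred):.2f}")
```

In this toy case each base model errs on different samples, so the vote corrects every individual mistake. That is the mechanism by which an ensemble can exceed its best single member, although real gains depend on how uncorrelated the base models' errors are.
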
Volume 2 Issue 4 (2025) 51 doi: 10.36922/AIH025140027
