
Artificial Intelligence in Health                                   Synthetic data for obesity level prediction

Figure 29. Performance metrics plots of the five most successful classifiers on the tabular variational autoencoder dataset (using height and weight attributes)

meal patterns are associated with higher obesity risk, and our synthetic augmentation appeared to capture these signals effectively for the ML models.

Moreover, these findings align with recent nutrition research. Colonnello et al.¹¹ found that dysfunctional eating behaviors (e.g., night eating) are correlated with lipid and metabolic abnormalities; we note that such behaviors are indirectly represented in our features (e.g., meal frequency, alcohol use).¹¹ El-Sehrawy et al.¹² reported that elevated TyG index values and disordered eating often co-occur in individuals with obesity, suggesting metabolic–diet linkages.¹² In our models, features related to eating patterns (e.g., intake of high-calorie foods, frequency of snacks) contribute to predictions, which is consistent with these clinical findings.

Our comparison highlights practical considerations for applying generative data methods in health. Consistent with Hernadez et al.,⁷ we found that SMOTE-type oversampling and VAE-based generation can effectively balance and expand tabular health data.⁷ The poorer performance of CTGAN (in the no-BMI case) suggests that GAN-based approaches may require more data or tuning to capture complex categorical relationships in this dataset. Importantly, synthetic data offer benefits beyond model accuracy. Arora and Arora⁸ emphasize that fully anonymized synthetic patient data can “replace the use of real patient data in certain contexts.” In our work, all



            Volume 2 Issue 4 (2025)                         69                          doi: 10.36922/AIH025140027