
Artificial Intelligence in Health                                  Autonomic nervous system patterns in men



iteratively recalculated until convergence (Equation II) by minimizing the sum of squared errors (Equation III).

The following equations were applied in the K-means clustering process:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \tag{I}$$

$$m_k = \frac{1}{n_k} \sum_{i \in C_k} x_i \tag{II}$$

Where:
•   $x_i$ is the HRV values of the i-th data point
•   $n_k$ is the number of points in cluster $C_k$
•   $m_k$ is the centroid of cluster k
•   d is the Euclidean distance
•   p is a data point
•   q is a cluster centroid
•   n is the number of attributes
•   $q_i$ is the i-th attribute of centroid q
•   $p_i$ is the i-th attribute of data point p.

$$J = \sum_{k=1}^{K} \sum_{i \in C_k} (x_i - m_k)^2 \tag{III}$$

Where:
•   J is the within-cluster sum of squares (WCSS), which is the objective function for K-means
•   K is the number of clusters
•   $m_k$ is the centroid of cluster k.

To identify distinct subgroups within the dataset, the elbow method was used to determine the optimal number of clusters (K) by evaluating the WCSS, while silhouette analysis measured how well each data point fit within its assigned cluster. Once the optimal K was established, the K-means algorithm partitioned the data by iteratively refining cluster centroids until membership stabilized. The quality of this final partition was visually validated using a silhouette plot, which graphically displays the cohesion and separation of the resulting clusters. To complement this analysis, agglomerative clustering was conducted using Ward's linkage method with a Euclidean distance metric, and the output was visualized as a dendrogram.

After identifying three clusters, a one-way analysis of variance was conducted to assess whether there were significant differences in HRV parameters among the groups. Additionally, post hoc Tukey's tests were applied for pairwise comparisons. The magnitude of these differences was evaluated using Cohen's d effect size, and statistical significance was assessed using 95% confidence intervals (CI) for the mean difference. All statistical analyses were performed in MATLAB 2020b (MathWorks, United States) with a significance level set at α = 0.05.

4. Results

Table 1 presents the anthropometric, physical, and HRV data of the participants. Normality tests indicate that age, body mass, height, and MRR follow a Gaussian distribution (p ≥ 0.05), and the low standard deviations support the homogeneity of the sample. However, SDNN, RMSSD, pNN50, LF, and HF do not follow a normal distribution (p < 0.05). These findings highlight the importance of HRV data normalization in the context of machine learning, particularly when applying techniques such as PCA and K-means.

Table 1. Anthropometric characteristics and heart rate variability parameters of the participants

Variables         Mean±standard deviation    p-value
Age (years)       22.0±2.8                   0.200
Body mass (kg)    65.2±6.9                   0.935
Height (cm)       171.0±6.5                  0.745
MRR (ms)          935.0±132.2                0.571
SDNN (ms)         62.8±30.9                  0.008
RMSSD (ms)        72.7±44.6                  0.001
pNN50 (%)         36.6±24.5                  0.007
LF (%)            49.0±21.8                  0.015
HF (%)            51.8±22.2                  0.031

Abbreviations: HF: High-frequency; LF: Low-frequency; MRR: Mean R-R interval; pNN50: The proportion of adjacent normal-to-normal intervals differing by more than 50 ms; RMSSD: The root mean square of successive differences between adjacent intervals; SDNN: The standard deviation of all normal-to-normal intervals.

After dimensionality reduction using PCA, a non-hierarchical K-means clustering algorithm was applied to the first two PCs. The algorithm was initialized randomly

Figure 1. K-means clustering based on principal component coefficients derived from normalized heart rate variability data
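The K-means procedure summarized by Equations I to III in the Methods can be illustrated with a minimal sketch. The study's analyses were run in MATLAB, so the Python below is not the authors' code: the function names, the seed, and the toy two-dimensional points are all assumptions made for illustration, and the random initialization from the data points is only one possible reading of "initialized randomly".

```python
import math
import random


def euclidean(p, q):
    """Equation I: Euclidean distance between data point p and centroid q."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


def centroid(points):
    """Equation II: attribute-wise mean of the points assigned to one cluster."""
    n_k = len(points)
    return tuple(sum(col) / n_k for col in zip(*points))


def wcss(points, labels, centroids):
    """Equation III: within-cluster sum of squared distances."""
    return sum(euclidean(p, centroids[c]) ** 2 for p, c in zip(points, labels))


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm, initialized with k randomly chosen data points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # memberships stabilized
            break
        labels = new_labels
        # Update step: recompute each centroid from its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = centroid(members)
    return labels, centroids, wcss(points, labels, centroids)


# Hypothetical 2-D scores (for example, two principal components); not study data.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids, j = kmeans(data, k=2)
```

Running `kmeans` for several values of `k` and plotting the returned WCSS against `k` gives the curve that the elbow method inspects.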

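The silhouette analysis mentioned in the Methods can be sketched in the same way. This is a plain implementation of the standard silhouette coefficient, s(i) = (b - a) / max(a, b), where a is the mean distance from point i to the other members of its cluster and b is the lowest mean distance to the members of any other cluster. The function name, toy points, and labelings are illustrative assumptions, and the sketch expects at least two clusters.

```python
import math


def silhouette_values(points, labels):
    """Return the silhouette coefficient of every point (Euclidean distance)."""
    clusters = sorted(set(labels))
    values = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            dists = [
                math.dist(p, q)
                for j, (q, lab) in enumerate(zip(points, labels))
                if lab == c and j != i
            ]
            mean_dist[c] = sum(dists) / len(dists) if dists else None
        a = mean_dist[labels[i]]
        if a is None:  # a point alone in its cluster is given s(i) = 0
            values.append(0.0)
            continue
        b = min(d for c, d in mean_dist.items() if c != labels[i])
        values.append((b - a) / max(a, b))
    return values


# Hypothetical 2-D scores and two candidate partitions; not study data.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
good = silhouette_values(data, [0, 0, 0, 1, 1, 1])
bad = silhouette_values(data, [0, 1, 0, 1, 0, 1])
good_mean = sum(good) / len(good)
bad_mean = sum(bad) / len(bad)
```

Values close to 1 indicate points that sit well inside their cluster, so the partition that separates the two groups scores higher on average than the mixed one. A silhouette plot is simply these per-point values drawn as sorted bars for each cluster.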

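The between-cluster comparison described in the Methods (one-way analysis of variance followed by an effect size) can be illustrated with a short sketch. The source does not state which variant of Cohen's d was used, so the pooled standard deviation form below is an assumption, as are the function names and the toy groups. Tukey's post hoc test and the 95% CI are omitted because they need the studentized range distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def cohens_d(a, b):
    """Cohen's d with the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Hypothetical values of one parameter in three clusters; not study data.
cluster_1 = [1.0, 2.0, 3.0]
cluster_2 = [2.0, 3.0, 4.0]
cluster_3 = [5.0, 6.0, 7.0]
f_stat = one_way_anova_f([cluster_1, cluster_2, cluster_3])
d_12 = cohens_d(cluster_1, cluster_2)
```

For these toy groups the F statistic is 13.0 on 2 and 6 degrees of freedom, and d between the first two clusters is -1.0. Turning F into a p-value requires the F distribution, for example from a statistics package.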
            Volume 2 Issue 4 (2025)                        107                          doi: 10.36922/AIH025050006