
Artificial Intelligence in Health                                     ML models for heartbeat classification




Table 1. Summary of mappings between beat annotations and Association for the Advancement of Medical Instrumentation EC57 categories¹⁸

Label   Category   Annotation
0       N          • Normal
                   • Left/right bundle branch block
                   • Atrial escape
                   • Nodal escape
1       S          • Atrial premature
                   • Aberrant atrial premature
                   • Nodal premature
                   • Supraventricular premature
2       V          • Premature ventricular contraction
                   • Ventricular escape
3       F          • Fusion of ventricular and normal

Figure 2. Distribution of the training set

Figure 3. Distribution of the test set

2.3. Methods used

Enhancing the classification prediction accuracy can aid in the early diagnosis of cardiovascular diseases. However, a review of current state-of-the-art research indicates that metrics and prediction rates often fall short. The literature highlights several challenges, including (I) handling missing and imbalanced data, (II) selecting a robust classification algorithm, (III) managing ECG signal complexity, (IV) addressing computational demands, and (V) recognizing methodological limitations. To achieve the objectives of this study, seven classification algorithms were used to classify heartbeat categories, and their performance was evaluated. Herein, the following ML models were employed: K-nearest neighbors (KNN), naive Bayes (NB) classifier, random forest (RF) classifier, logistic regression (LR), eXtreme gradient boosting (XGBoost) classifier, support vector machines (SVMs), and decision trees (DTs), along with the incorporation of the FT and Gaussian noise injection techniques. The overall design of the implementation method is presented in Figure 5. In this study, we used Pearson correlation and the associated p-values as statistical tools to unveil meaningful relationships and dependencies within our data. In addition, we introduced controlled noise to enhance the robustness of our models, allowing them to better adapt to real-world variations. Furthermore, the FT technique was leveraged to extract essential frequency-domain features from our data. This combination enabled our models to make more accurate predictions and better understand complex data patterns, offering valuable insights for various applications, comparable to state-of-the-art algorithms.

The ML models enhanced through the proposed approach can exhibit improvements in terms of key metrics such as accuracy and F1 score. KNN and NB are favored for their simplicity and real-time efficiency, while RF and XGBoost excel in handling complex interactions and large datasets, offering robustness and feature importance. LR is valued for its interpretability in binary classification, SVMs are known for managing high-dimensional data and noise, and DTs are favored for their clear and interpretable results. These algorithms were selected for their effectiveness in handling complex ECG data, computational efficiency, adaptability, and noise robustness, ensuring reliable performance and scalability. Notably, the ability of ML models to handle motion artifacts in ECG signals varies. RF and XGBoost are particularly robust under noisy environments due to their ensemble nature,¹⁹ while SVMs effectively maintain accuracy using kernel methods.²⁰ KNN and NB face challenges with respect to noise sensitivity; however, preprocessing techniques such as FT can help in this regard. LR and DT require feature engineering and noise reduction to perform well,²¹,²² with ensemble methods further boosting their robustness.

As detailed in a previous study,²³ Pearson’s correlation test is a statistical method used to assess the relationship between two continuous variables. It yields a coefficient
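As an illustration of the noise injection and FT steps described above, the following is a minimal sketch and not the implementation used in this study. It assumes that heartbeats are stored as rows of a NumPy array; the function names, the noise standard deviation (sigma = 0.05), the beat length of 187 samples, and the synthetic sine-wave beats are all hypothetical choices made only for this example.

```python
import numpy as np


def add_gaussian_noise(beats, sigma=0.05, seed=0):
    """Return a copy of `beats` with zero-mean Gaussian noise added.

    `sigma` (noise standard deviation) and `seed` are illustrative
    assumptions, not values taken from the study.
    """
    rng = np.random.default_rng(seed)
    return beats + rng.normal(loc=0.0, scale=sigma, size=beats.shape)


def fft_magnitude_features(beats):
    """Return the one-sided FFT magnitude spectrum of each beat (row)."""
    return np.abs(np.fft.rfft(beats, axis=1))


# Synthetic stand-in for segmented heartbeats: 4 beats x 187 samples,
# each a sine wave with a different number of cycles per window.
t = np.linspace(0.0, 1.0, 187, endpoint=False)
beats = np.stack([np.sin(2 * np.pi * f * t) for f in (3, 5, 8, 13)])

noisy = add_gaussian_noise(beats, sigma=0.05)
features = fft_magnitude_features(noisy)

print(features.shape)           # one row of frequency-domain features per beat
print(features.argmax(axis=1))  # index of the dominant frequency bin per beat
```

In such a setup, the noisy beats would be used to augment the training data, and the magnitude spectra would be passed to the classifiers as frequency-domain features, either alone or alongside the time-domain samples.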


            Volume 1 Issue 4 (2024)                         64                               doi: 10.36922/aih.3543