leveraging the geometric distance in a multidimensional space. The Euclidean distance between two observations is expressed in (V):

    d_ij = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}    (V)

where d_ij represents the Euclidean distance between points i and j. Point i is represented as x_i = [x_i1, x_i2, x_i3, …, x_iM], y_i. Meanwhile, point j is represented as x_j = [x_j1, x_j2, x_j3, …, x_jM], y_j.
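In practice, (V) is applied coordinate-wise across all M features of the two vectors. A minimal sketch in Python (NumPy assumed; the two feature vectors are hypothetical toy values):

import numpy as np

def euclidean_distance(x_i: np.ndarray, x_j: np.ndarray) -> float:
    # Equation (V): square root of the summed squared coordinate differences.
    return float(np.sqrt(np.sum((x_i - x_j) ** 2)))

# Hypothetical M = 3 feature vectors for points i and j.
x_i = np.array([0.8, 1.2, 0.5])
x_j = np.array([1.0, 0.9, 0.1])
print(euclidean_distance(x_i, x_j))  # sqrt(0.04 + 0.09 + 0.16) ≈ 0.539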
2.3.2. LR

LR is a supervised ML algorithm widely used in multivariate statistical methods to predict binary outcomes based on a set of observed independent variables. It specifically handles categorical response variables that represent binary events, rather than continuous parameters. The detailed process of constructing an LR model can be found elsewhere.33 The final output of LR is a probability, ranging from 0 to 1, that represents the occurrence likelihood of an event. This probability is mathematically expressed in (VI):

    P_x = \frac{1}{1 + e^{-(C_0 + C_1 x)}}    (VI)

where P_x represents the occurrence probability of the event, and e denotes the base of the natural logarithm. The terms C_0 and C_1 are model parameters. The coefficients of the LR model are estimated using the maximum likelihood method, which involves choosing coefficients that maximize the probability of the model correctly predicting the observed outcomes.
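Equation (VI) can be evaluated directly, as sketched below in Python; the coefficient values are hypothetical placeholders, whereas a fitted model would estimate them by maximum likelihood:

import math

def logistic_probability(x: float, c0: float, c1: float) -> float:
    # Equation (VI): P_x = 1 / (1 + e^{-(C_0 + C_1 x)}).
    return 1.0 / (1.0 + math.exp(-(c0 + c1 * x)))

# Hypothetical coefficients, not fitted values.
c0, c1 = -2.0, 0.05
print(logistic_probability(40.0, c0, c1))  # C_0 + C_1*x = 0 here, so P_x = 0.5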
2.3.3. RF

RF is an ensemble learning technique employed for classification and regression tasks. It builds multiple classifiers and combines their outputs during training. The RF model comprises numerous DTs, each contributing to the final prediction by casting a vote for the most common class. This method enhances prediction performance by employing uncorrelated trees created through bootstrap aggregation, wherein training subsets are sampled with replacement, allowing data points to be drawn multiple times to create diverse subsets. Cross-validation within the RF model reduces estimation and out-of-bag errors, yielding highly reliable trees and improving prediction accuracy.34 Moreover, the candidate features for each split are selected using a stochastic methodology. An inherent benefit of the RF model is its ability to generate an extensive array of trees, which enhances diversity and mitigates bias-related concerns. Following tree generation, a new observation is classified by the majority vote across all the DTs.
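A brief sketch with scikit-learn's RandomForestClassifier (library availability assumed; X and y are hypothetical placeholder arrays). Setting bootstrap=True draws each tree's training subset with replacement, and oob_score=True reports the out-of-bag estimate mentioned above:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical data: 200 samples, 8 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = RandomForestClassifier(
    n_estimators=100,   # number of DTs in the ensemble
    bootstrap=True,     # sample each tree's training subset with replacement
    oob_score=True,     # evaluate on the samples each tree did not see
    random_state=0,
)
model.fit(X, y)
print(model.oob_score_)      # out-of-bag accuracy estimate
print(model.predict(X[:5]))  # majority vote across all trees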
2.3.4. DTs

DT classifiers are among the most popular supervised learning algorithms and are widely utilized for classification tasks. These trees are constructed from the provided data using straightforward equations, as they employ attribute selection measures such as the gain ratio to rank attributes and identify the most important ones. This process enables researchers to determine the most effective attributes for prediction purposes. DT is a prominent data mining technique for creating classification models, and it is highly practical due to its speed, lack of requirement for domain knowledge or parameter tuning,35 ability to handle multidimensional data, and capacity to produce easily interpretable classification rules. DT classifiers generally offer good accuracy, and common examples include ID3, C4.5/C5.0/J48, CART, and Random Tree.36
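A short sketch with scikit-learn's DecisionTreeClassifier (hypothetical X and y as before). Note that scikit-learn implements CART with Gini impurity or entropy-based information gain rather than C4.5's gain ratio; export_text prints the learned rules, illustrating the interpretability noted above:

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical data: 200 samples, 4 features, binary labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = (X[:, 2] > 0.3).astype(int)

# criterion="entropy" scores splits by information gain (not gain ratio).
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)
print(export_text(tree, feature_names=["f0", "f1", "f2", "f3"]))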
2.3.5. SVMs

SVMs are supervised ML models designed for pattern classification and regression tasks, based on structural risk minimization theory.37 These non-parametric approaches employ kernels to tackle non-linear and high-dimensional problems by mapping data into a higher-dimensional space, where linear separation is feasible. SVMs optimize the margin between classes, thus minimizing the overfitting risk and enhancing the model's generalization ability for unseen data. By constructing an optimal hyperplane that maximizes the margin between classes, SVMs ensure effective classification and regression across various fields.
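A compact sketch with scikit-learn's SVC (hypothetical data). The RBF kernel performs the implicit mapping into a higher-dimensional space described above, where a maximum-margin hyperplane can separate classes that are not linearly separable in the original space:

import numpy as np
from sklearn.svm import SVC

# Hypothetical non-linearly separable data: the label depends on the radius.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
y = (np.sqrt(X[:, 0] ** 2 + X[:, 1] ** 2) < 1.0).astype(int)

# kernel="rbf" maps inputs implicitly; C trades margin width against errors.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy of the fitted maximum-margin model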
2.3.6. XGBoost

XGBoost, introduced by Tianqi Chen, is a powerful ML algorithm grounded in the principle of gradient boosting. Officially released on March 27, 2014, XGBoost is designed to enhance the performance of DTs. Since its introduction, XGBoost has become a popular choice in data science competitions and applications due to its efficiency, accuracy, and scalability.38 This model is versatile and can be used to tackle both regression and classification problems. In regression tasks, it predicts continuous outcomes based on the input features, while in classification tasks, it categorizes data into distinct classes.
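A minimal usage sketch with the xgboost Python package (availability assumed; X and y are hypothetical placeholders), followed by a small helper that evaluates the split-gain criterion of equation (VII) below; the summed gradient and Hessian values passed to it are illustrative only:

import numpy as np
from xgboost import XGBClassifier

# Hypothetical binary classification data.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))
y = (X[:, 0] - X[:, 3] > 0).astype(int)

model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:5]))

def split_gain(g_l, h_l, g_r, h_r, beta, alpha):
    # Equation (VII): G_L/G_R and H_L/H_R are the summed gradients and
    # Hessians of the left and right children; beta and alpha are the
    # regularization terms of the equation.
    return 0.5 * (g_l ** 2 / (h_l + beta)
                  + g_r ** 2 / (h_r + beta)
                  - (g_l + g_r) ** 2 / (h_l + h_r + beta)) - alpha

print(split_gain(g_l=4.0, h_l=6.0, g_r=-3.0, h_r=5.0, beta=1.0, alpha=0.5))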
Mathematically, the gain used in XGBoost's regularized boosting technique is defined by equation (VII):

    \text{Gain} = \frac{1}{2}\left[\frac{G_L^2}{H_L + \beta} + \frac{G_R^2}{H_R + \beta} - \frac{(G_L + G_R)^2}{H_L + H_R + \beta}\right] - \alpha    (VII)

where the first and second terms represent the score of the left child and right child, respectively, and the third term

