leveraging the geometric distance in a multidimensional space. The Euclidean distance between two observations is expressed in (V):

    d_ij = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}    (V)

where d_ij represents the Euclidean distance between points i and j. Point i is represented as x_i = [x_i1, x_i2, x_i3, …, x_iM], y_i. Meanwhile, point j is represented as x_j = [x_j1, x_j2, x_j3, …, x_jM], y_j.
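In practice, (V) is applied coordinate-wise across all M features of the two vectors. A minimal sketch in Python (NumPy assumed; the two feature vectors are hypothetical toy values):

import numpy as np

def euclidean_distance(x_i: np.ndarray, x_j: np.ndarray) -> float:
    # Equation (V): square root of the summed squared coordinate differences.
    return float(np.sqrt(np.sum((x_i - x_j) ** 2)))

# Hypothetical M = 3 feature vectors for points i and j.
x_i = np.array([0.8, 1.2, 0.5])
x_j = np.array([1.0, 0.9, 0.1])
print(euclidean_distance(x_i, x_j))  # sqrt(0.04 + 0.09 + 0.16) ≈ 0.539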
2.3.2. LR

LR is a supervised ML algorithm widely used in multivariate statistical methods to predict binary outcomes based on a set of observed independent variables. It specifically handles categorical response variables that represent binary events, rather than continuous parameters. The detailed process of constructing an LR model can be found elsewhere.33 The final output of LR is a probability, ranging from 0 to 1, that represents the occurrence likelihood of an event. This probability is mathematically expressed in (VI):

    P_x = \frac{1}{1 + e^{-(C_0 + C_1 x)}}    (VI)

where P_x represents the occurrence probability of the event, and e denotes the base of the natural logarithm. The terms C_0 and C_1 are model parameters. The coefficients of the LR model are estimated using the maximum likelihood method, which involves choosing coefficients that maximize the probability of the model correctly predicting the observed outcomes.
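Equation (VI) can be evaluated directly, as sketched below in Python; the coefficient values are hypothetical placeholders, whereas a fitted model would estimate them by maximum likelihood:

import math

def logistic_probability(x: float, c0: float, c1: float) -> float:
    # Equation (VI): P_x = 1 / (1 + e^{-(C_0 + C_1 x)}).
    return 1.0 / (1.0 + math.exp(-(c0 + c1 * x)))

# Hypothetical coefficients, not fitted values.
c0, c1 = -2.0, 0.05
print(logistic_probability(40.0, c0, c1))  # C_0 + C_1*x = 0 here, so P_x = 0.5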
2.3.3. RF

RF is an ensemble learning technique employed for classification and regression tasks. It builds multiple classifiers and combines their outputs during training. The RF model comprises numerous DTs, each contributing to the final prediction by casting a vote for the most common class. This method enhances prediction performance by employing uncorrelated trees created through bootstrap aggregation, wherein training subsets are sampled with replacement, allowing data points to be drawn multiple times to create diverse subsets. Cross-validation within the RF model reduces estimation and out-of-bag errors, yielding highly reliable trees and improving prediction accuracy.34 Moreover, the candidate features for each split are selected using a stochastic methodology. An inherent benefit of the RF model is its ability to generate an extensive array of trees, which enhances diversity and mitigates bias-related concerns. Following tree generation, a new observation is classified by the majority vote across all the DTs.
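A brief sketch with scikit-learn's RandomForestClassifier (library availability assumed; X and y are hypothetical placeholder arrays). Setting bootstrap=True draws each tree's training subset with replacement, and oob_score=True reports the out-of-bag estimate mentioned above:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical data: 200 samples, 8 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = RandomForestClassifier(
    n_estimators=100,   # number of DTs in the ensemble
    bootstrap=True,     # sample each tree's training subset with replacement
    oob_score=True,     # evaluate on the samples each tree did not see
    random_state=0,
)
model.fit(X, y)
print(model.oob_score_)      # out-of-bag accuracy estimate
print(model.predict(X[:5]))  # majority vote across all trees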
2.3.4. DTs

DT classifiers are among the most popular supervised learning algorithms and are widely utilized for classification tasks. These trees are constructed from the provided data using straightforward equations, as they employ attribute selection measures such as the gain ratio to rank attributes and identify the most important ones. This process enables researchers to determine the most effective attributes for prediction purposes. DT is a prominent data mining technique for creating classification models, and it is highly practical due to its speed, lack of requirement for domain knowledge or parameter tuning,35 ability to handle multidimensional data, and capacity to produce easily interpretable classification rules. DT classifiers generally offer good accuracy, and common examples include ID3, C4.5/C5.0/J48, CART, and Random Tree.36
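A short sketch with scikit-learn's DecisionTreeClassifier (hypothetical X and y as before). Note that scikit-learn implements CART with Gini impurity or entropy-based information gain rather than C4.5's gain ratio; export_text prints the learned rules, illustrating the interpretability noted above:

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical data: 200 samples, 4 features, binary labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = (X[:, 2] > 0.3).astype(int)

# criterion="entropy" scores splits by information gain (not gain ratio).
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)
print(export_text(tree, feature_names=["f0", "f1", "f2", "f3"]))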
2.3.5. SVMs

SVMs are supervised ML models designed for pattern classification and regression tasks, based on structural risk minimization theory.37 These non-parametric approaches employ kernels to tackle non-linear and high-dimensional problems by mapping data into a higher-dimensional space, where linear separation is feasible. SVMs optimize the margin between classes, thus minimizing the overfitting risk and enhancing the model's generalization ability for unseen data. By constructing an optimal hyperplane that maximizes the margin between classes, SVMs ensure effective classification and regression across various fields.
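A compact sketch with scikit-learn's SVC (hypothetical data). The RBF kernel performs the implicit mapping into a higher-dimensional space described above, where a maximum-margin hyperplane can separate classes that are not linearly separable in the original space:

import numpy as np
from sklearn.svm import SVC

# Hypothetical non-linearly separable data: the label depends on the radius.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
y = (np.sqrt(X[:, 0] ** 2 + X[:, 1] ** 2) < 1.0).astype(int)

# kernel="rbf" maps inputs implicitly; C trades margin width against errors.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy of the fitted maximum-margin model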
2.3.6. XGBoost

XGBoost, introduced by Tianqi Chen, is a powerful ML algorithm grounded in the principle of gradient boosting. Officially released on March 27, 2014, XGBoost is designed to enhance the performance of DTs. Since its introduction, XGBoost has become a popular choice in data science competitions and applications due to its efficiency, accuracy, and scalability.38 This model is versatile and can be used to tackle both regression and classification problems. In regression tasks, it predicts continuous outcomes based on the input features, while in classification tasks, it categorizes data into distinct classes.
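A minimal usage sketch with the xgboost Python package (availability assumed; X and y are hypothetical placeholders), followed by a small helper that evaluates the split-gain criterion of equation (VII) below; the summed gradient and Hessian values passed to it are illustrative only:

import numpy as np
from xgboost import XGBClassifier

# Hypothetical binary classification data.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))
y = (X[:, 0] - X[:, 3] > 0).astype(int)

model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:5]))

def split_gain(g_l, h_l, g_r, h_r, beta, alpha):
    # Equation (VII): G_L/G_R and H_L/H_R are the summed gradients and
    # Hessians of the left and right children; beta and alpha are the
    # regularization terms of the equation.
    return 0.5 * (g_l ** 2 / (h_l + beta)
                  + g_r ** 2 / (h_r + beta)
                  - (g_l + g_r) ** 2 / (h_l + h_r + beta)) - alpha

print(split_gain(g_l=4.0, h_l=6.0, g_r=-3.0, h_r=5.0, beta=1.0, alpha=0.5))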
Mathematically, the gain used in XGBoost's regularized boosting technique is defined by equation (VII):

    \text{Gain} = \frac{1}{2}\left[\frac{G_L^2}{H_L + \beta} + \frac{G_R^2}{H_R + \beta} - \frac{(G_L + G_R)^2}{H_L + H_R + \beta}\right] - \alpha    (VII)

where the first and second terms represent the score of the left child and right child, respectively, and the third term

