Step 3: Determine the type of pollination based on a predetermined probability p. Generate a random number r ∈ [0, 1], and if r < p, where p is the switching probability, then global pollination and flower constancy take place, as described by Equation I:

x_i^{t+1} = x_i^t + γ L(λ) (g_* − x_i^t)    (I)

where x_i^t denotes solution i at iteration t, γ is a scaling factor, and g_* is the current optimal solution at iteration t. The parameter L is the pollination strength, essentially a step size drawn from a Lévy flight, which is given by Equation II:

L ~ (λ Γ(λ) sin(πλ/2)) / (π s^{1+λ}),   (s ≫ s_0 > 0)    (II)

where Γ(λ) denotes the standard gamma function, and this distribution is valid for large steps s > 0. λ is the control parameter that governs the heaviness of the distribution's tail. Commonly, λ = 1.5 is recommended, and this value is used in all simulations. A sketch of these update rules is provided after Step 7.
Step 4: Otherwise, if r > p, then local pollination and flower constancy are performed, as described by Equation III:

x_i^{t+1} = x_i^t + ε (x_j^t − x_k^t)    (III)

where x_j^t and x_k^t are pollens from other flowers of the same plant species, with j and k chosen at random from all the solutions, and ε ∈ [0, 1] is a random number.

Step 5: Evaluate each new solution x_i^{t+1} in the population and update the population according to its fitness value.

Step 6: Determine the current best solution g_* by ranking the solutions.

Step 7: Repeat Steps 3 through 6 until MaxGeneration is reached or until convergence is achieved.
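The pollination updates in Equations I–III map directly onto a short optimization loop. The Python sketch below is a minimal illustration under a few assumptions: the objective is minimized, candidate solutions are encoded in [0, 1]^dim, and the population size, scaling factor γ (gamma_scale), and switch probability p = 0.8 are illustrative defaults rather than values taken from this study. Lévy steps with λ = 1.5 are drawn with Mantegna's approximation to the distribution in Equation II.

```python
import numpy as np
from math import gamma, sin, pi


def levy_step(lam=1.5, size=None, rng=None):
    """Draw Levy-distributed step sizes with tail index lam (Equation II),
    using Mantegna's approximation to the Levy-stable distribution."""
    rng = rng or np.random.default_rng()
    sigma = (gamma(1 + lam) * sin(pi * lam / 2) /
             (gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / lam)


def flower_pollination(fitness, dim, n_flowers=25, p=0.8, gamma_scale=0.1,
                       lam=1.5, max_generation=100, rng=None):
    """Minimal FPA loop following Steps 3-7 (minimization)."""
    rng = rng or np.random.default_rng()
    X = rng.random((n_flowers, dim))               # initial population in [0, 1]^dim
    fit = np.array([fitness(x) for x in X])
    g_best = X[np.argmin(fit)].copy()              # current best solution g*

    for _ in range(max_generation):                # Step 7: iterate to MaxGeneration
        for i in range(n_flowers):
            if rng.random() < p:                   # Step 3: global pollination (Equation I)
                L = levy_step(lam, dim, rng)
                x_new = X[i] + gamma_scale * L * (g_best - X[i])
            else:                                  # Step 4: local pollination (Equation III)
                j, k = rng.choice(n_flowers, size=2, replace=False)
                x_new = X[i] + rng.random() * (X[j] - X[k])
            x_new = np.clip(x_new, 0.0, 1.0)
            f_new = fitness(x_new)                 # Step 5: evaluate and update
            if f_new < fit[i]:
                X[i], fit[i] = x_new, f_new
        g_best = X[np.argmin(fit)].copy()          # Step 6: rank solutions, keep the best
    return g_best, fit.min()
```

For feature selection, each position vector can be thresholded (e.g., components above 0.5 mark a feature as selected), with the fitness defined as the validation error of a classifier trained on that feature subset.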
In this feature selection step, 24 features were selected, namely: Area, Minor Axis Length, Convex Area, Eccentricity, Cluster Prominence 2, Cluster Prominence 3, Contrast 1, Contrast 3, Correlation 2, Correlation 3, Difference Variance 4, Dissimilarity 2, Dissimilarity 4, Energy 1, Entropy 1, Entropy 2, Entropy 4, Homogeneity 4, Information Measure of Correlation 1, Information Measure of Correlation 4, Inverse Difference 1, Sum Average 1, Sum Entropy 1, and Sum of Squares Variance 4.
3.5. Classification
The SVM algorithm was employed to train the optimal feature subset. An SVM is a supervised learning algorithm that uses hyperplanes to separate different classes. The distance of a feature vector from these hyperplanes indicates how likely it is to belong to a specific class.66,67 In ML, SVM is a model that classifies and predicts outcomes based on training data. The main goal of SVM is to identify the best hyperplane that divides two classes within a feature set. In other words, the SVM training method builds a model that assigns new examples to one of two classes based on a set of training examples for binary classification. An SVM maps training samples into a spatial arrangement that maximizes the separation between the two classes. When new samples are introduced, they are positioned in the same space, and their class is predicted based on which side of the hyperplane they fall on. The SVM classifier was trained using the set of FPA-selected features. Then, the performance of the trained SVM classifier was validated using the test dataset.67
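As a concrete illustration of this train-and-validate procedure, the sketch below fits an SVM on a matrix of FPA-selected features using scikit-learn. The RBF kernel, C = 1.0, the feature-scaling step, and the stratified 80/20 split are assumptions made for the example rather than settings reported in the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report


def train_and_validate_svm(X_selected, y, seed=42):
    """X_selected: (n_slices, 24) matrix of FPA-selected features;
    y: labels with 0 = normal and 1 = COVID-19 (illustrative encoding)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X_selected, y, test_size=0.2, stratify=y, random_state=seed)

    # Feature scaling followed by an RBF-kernel SVM (hyperparameters are assumptions)
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    model.fit(X_train, y_train)                    # train on the FPA-selected features

    y_pred = model.predict(X_test)                 # validate on the held-out test set
    print("Accuracy:", accuracy_score(y_test, y_pred))
    print(classification_report(y_test, y_pred, target_names=["normal", "COVID-19"]))
    return model
```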
4. Results
This section includes a description of the real-time dataset and public dataset used in this research, as well as the performance evaluation, comparison results of ML and DL classifiers, and experimental results.

4.1. Dataset outline
The research utilized two datasets: a real-time dataset collected from Bharat Scan Centre, Chennai, India, and a COVID-19 CT dataset obtained from the GitHub repository. In the real-time dataset, CT slices were labeled by an expert radiologist as either "normal" or "COVID-19." This dataset includes images from 41 individuals, comprising 26 with COVID-19 and 15 with healthy lungs. Among the COVID-19 patients, 19 exhibited mild severity while seven had moderate severity; the cohort included 17 females and nine males. The ages of the COVID-19 patients ranged from 23 to 49 years, with an average of 36 years. The images in the dataset have a pixel size of 512 × 512 and are in JPG format. The nodule size ranges from 3 to 30 mm, with lesions primarily located in the subpleural and posterior respiratory zones. The ROIs were patchy GGO, bilateral GGO, subpleural GGO, peripheral GGO, broncho-vascular thickening, traction bronchiectasis, consolidations, and GGO with consolidations.

The datasets have been divided into training and testing sets, with the training sets comprising 80% of the total and the testing sets 20%. To preserve privacy, we masked all personal information from the CT slices. Each ROI was differentiated based on the opinion of an experienced radiologist. In addition, the radiologist manually identified and described each ROI. Table 4 gives an overview of the real-time dataset.
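The 80/20 partition described above can be reproduced along the following lines; slice-level splitting, class stratification, and the variable names slice_paths and labels are assumptions of this sketch rather than details taken from the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_dataset(slice_paths, labels, seed=0):
    """Hold out 20% of the labeled CT slices for testing, keeping the
    normal/COVID-19 ratio similar in both subsets (stratification is assumed)."""
    return train_test_split(
        np.asarray(slice_paths), np.asarray(labels),
        test_size=0.20, stratify=labels, random_state=seed)


# Usage:
# train_paths, test_paths, train_labels, test_labels = split_dataset(paths, y)
```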