
Artificial Intelligence in Health                          COVID-19 diagnosis: FPA, k-NN, and SVM classifiers



Step 3: Determine the type of pollination based on a predetermined probability p. Generate a random number r ∈ [0, 1]. If r < p, where p is the switching probability, then global pollination and flower constancy take place, as described by Equation I:

$$x_i^{t+1} = x_i^t + \gamma L \left(g^* - x_i^t\right) \tag{I}$$

Where $x_i^t$ denotes solution i at iteration t, γ is a scaling factor, and $g^*$ is the current optimal solution at iteration t. The parameter L is the pollination strength, which is essentially a step size drawn from a Levy flight, given by Equation II:

$$L \sim \frac{\lambda \Gamma(\lambda) \sin\left(\pi \lambda / 2\right)}{\pi} \cdot \frac{1}{s^{1+\lambda}}, \quad (s \gg s_0 > 0) \tag{II}$$

Where Γ(λ) denotes the standard gamma function, and the distribution is valid for step sizes s > 0. λ is the control parameter for the tail amplitude of the distribution. A value of λ = 1.5 is commonly recommended and is used in all simulations.

Step 4: Otherwise, if r ≥ p, local pollination and flower constancy are performed, as described by Equation III:

$$x_i^{t+1} = x_i^t + \varepsilon \left(x_j^t - x_k^t\right) \tag{III}$$

Where $x_j^t$ and $x_k^t$ are pollens from other flowers of the same plant species, with j and k chosen at random from all the solutions, and ε ∈ [0, 1] is a random number.

Step 5: Evaluate each new solution $x_i^{t+1}$ in the population and update the population according to the fitness values.

Step 6: Determine the current best solution g* by ranking the solutions.

Step 7: Repeat Steps 3 through 6 until MaxGeneration is reached or until convergence is achieved.

In this feature selection step, 24 features were selected, namely: Area, Minor Axis Length, Convex Area, Eccentricity, Cluster Prominence 2, Cluster Prominence 3, Contrast 1, Contrast 3, Correlation 2, Correlation 3, Difference Variance 4, Dissimilarity 2, Dissimilarity 4, Energy 1, Entropy 1, Entropy 2, Entropy 4, Homogeneity 4, Information Measure of Correlation 1, Information Measure of Correlation 4, Inverse Difference 1, Sum Average 1, Sum Entropy 1, and Sum of Squares Variance 4.

3.5. Classification

The SVM algorithm was trained on the optimal feature subset. An SVM is a supervised learning algorithm that uses hyperplanes to separate different classes. The distance of a feature vector from these hyperplanes indicates how likely it is to belong to a specific class.66,67 In ML, SVM is a model that classifies and predicts outcomes based on training data. The main goal of SVM is to identify the best hyperplane that divides two classes within a feature set. In other words, for binary classification, the SVM training method builds a model from a set of training examples that assigns new examples to one of the two classes. An SVM positions the training samples in a space such that the separation between the two classes is maximized. When new samples are introduced, they are positioned in the same space, and their class is predicted based on which side of the hyperplane they fall. The SVM classifier was trained using the set of FPA-selected features. Then, the performance of the trained SVM classifier was validated using the test dataset.67

4. Results

This section describes the real-time dataset and the public dataset used in this research, as well as the performance evaluation, the comparison of ML and DL classifiers, and the experimental results.

4.1. Dataset outline

The research utilized two datasets: a real-time dataset collected from Bharat Scan Centre, Chennai, India, and a COVID-19 CT dataset obtained from a GitHub repository. In the real-time dataset, CT slices were labeled by an expert radiologist as either "normal" or "COVID-19." This dataset includes images from 41 individuals, comprising 26 with COVID-19 and 15 with healthy lungs. Among the COVID-19 patients, 19 exhibited mild severity while seven had moderate severity; the cohort included 17 females and nine males. The ages of the COVID-19 patients ranged from 23 to 49 years, with an average of 36 years. The images in the dataset have a resolution of 512 × 512 pixels and are in JPG format. The nodule size ranges from 3 to 30 mm, with lesions primarily located in the subpleural and posterior respiratory zones. The ROIs were patchy GGO, bilateral GGO, subpleural GGO, peripheral GGO, broncho-vascular thickening, traction bronchiectasis, consolidations, and GGO with consolidations.

The datasets were divided into training and testing sets, with the training sets comprising 80% of the total and the testing sets 20%. To preserve privacy, all personal information was masked from the CT slices. Each ROI was differentiated based on the opinion of an experienced radiologist. In addition, the radiologist manually identified and described each ROI. Table 4 gives an overview of the real-time dataset.
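As an illustration of the FPA update rules in Steps 3 to 7 (Equations I to III), the following minimal Python sketch minimizes a toy continuous objective. It is not the feature-selection wrapper used in this study: the function names, the sphere objective, the search bounds, the greedy replacement rule, and the default values p = 0.8 and γ = 0.1 are assumptions made for illustration only. The Levy step is sampled with Mantegna's algorithm, which is a common choice but is not stated in the text; only λ = 1.5 is taken from the description above.

```python
import math
import random


def levy_step(lam, rng):
    """Draw one Levy-like step using Mantegna's algorithm (assumed sampler)."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / lam)


def flower_pollination(fitness, dim, n_flowers=20, p=0.8, gamma=0.1, lam=1.5,
                       max_generation=200, bounds=(-5.0, 5.0), seed=0):
    """Minimize `fitness` with a basic FPA loop; returns (best solution, best score)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_flowers)]
    scores = [fitness(x) for x in pop]
    best_i = min(range(n_flowers), key=scores.__getitem__)
    g_best, g_score = pop[best_i][:], scores[best_i]

    for _ in range(max_generation):          # Step 7: repeat until MaxGeneration
        for i in range(n_flowers):
            x = pop[i]
            if rng.random() < p:
                # Step 3, Equation I: global pollination toward g*
                cand = [xi + gamma * levy_step(lam, rng) * (gi - xi)
                        for xi, gi in zip(x, g_best)]
            else:
                # Step 4, Equation III: local pollination between flowers j and k
                j, k = rng.sample(range(n_flowers), 2)
                eps = rng.random()
                cand = [xi + eps * (xj - xk)
                        for xi, xj, xk in zip(x, pop[j], pop[k])]
            cand = [min(max(c, lo), hi) for c in cand]  # keep within bounds
            score = fitness(cand)
            if score < scores[i]:            # Step 5: keep the fitter solution
                pop[i], scores[i] = cand, score
                if score < g_score:          # Step 6: update current best g*
                    g_best, g_score = cand[:], score
    return g_best, g_score


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    best, score = flower_pollination(sphere, dim=3)
    print(best, score)
```

For feature selection, each flower would instead encode a candidate feature subset, and the fitness function would score a classifier trained on that subset; that mapping is not shown here.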
            Volume 2 Issue 1 (2025)                         21                               doi: 10.36922/aih.3349