Two degrees of SA are possible. The first analysis method is called fine-grained analysis. Fine-grained analysis uses rule-based, dictionary-based and machine learning (ML) approaches to describe sub-sentence-level properties such as words, sentences and clauses. 6,7 The second is coarse-grained analysis, which indicates polarity at the sentence level. Shi et al. 8 and Zeng et al. 9 discovered that fine-grained SA of user comments can make suggestions more accurate. Zeng et al. 10 analyzed the sentiments of reviewers' comments in order to make product recommendations. Combining SA of reviews can significantly reduce the sparsity of user-item rating data, thereby enhancing the accuracy of the recommendation system. 11 Consequently, several studies indicate that deriving complete and precise user choices from a single granularity level of SA is challenging. 12

Feature selection is the process of identifying the most useful features in a dataset, and it strongly affects the performance of ML models. Because a model with too many unimportant variables is difficult to explain and interpret, variable selection transforms it into an easier-to-understand form. It also keeps training time from growing: as the number of variables decreases, the computational cost drops and the model can be prepared more quickly. 13 Furthermore, feature selection aims to prevent overfitting, in which the model achieves high success on the training dataset but low success on the test dataset; when the records in the test dataset do not resemble those in the training dataset, the model's error rate will be high.
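As a concrete illustration of this idea, consider a minimal sketch of filter-style feature selection on TF-IDF features; the corpus variables and the choice of k are assumed placeholders, not the configuration used in this study:

# Sketch: filter-style feature selection on TF-IDF features.
# `reviews` (list of str) and `labels` (list of 0/1) are assumed
# placeholders, not the dataset used in this study.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

X_tr, X_te, y_tr, y_te = train_test_split(reviews, labels,
                                          test_size=0.2, random_state=42)
vec = TfidfVectorizer()
Xtr = vec.fit_transform(X_tr)              # full vocabulary
Xte = vec.transform(X_te)

sel = SelectKBest(chi2, k=1000)            # keep the 1000 highest-scoring terms
Xtr_s = sel.fit_transform(Xtr, y_tr)
Xte_s = sel.transform(Xte)

clf = MultinomialNB().fit(Xtr_s, y_tr)
print("features:", Xtr.shape[1], "->", Xtr_s.shape[1])
print("test F-score:", f1_score(y_te, clf.predict(Xte_s)))

Comparing the test F-score with and without the selection step makes the trade-off between feature-set size and generalization directly observable.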

Initially, we collected product reviews from selected online platforms. Subsequently, we applied established preprocessing methods suitable for text datasets to the collected data. Following this, we applied feature selection methods to the dataset to identify the words that best differentiate the classes in SA. A significant contribution of this study to the literature is the application of the Binary Sailfish Optimizer (BSO) as a feature selector for the first time in the context of SA. We then compared the performance of the BSO with other optimization algorithms, specifically Harmony Search (HS), Bat Algorithm (BA), Atom Search Optimization (ASO), and Whale Optimization Algorithm (WOA), which have previously been utilized as feature selectors in text mining and SA. Ultimately, our findings demonstrate that the BSO is more effective for SA, as it outperformed existing methods and enhanced classification accuracy.
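For readers unfamiliar with how binary metaheuristics act as feature selectors, the following sketch shows the generic wrapper scheme that this family of algorithms shares. It is a simplified illustration, not the authors' BSO implementation: the fitness weights, the sigmoid transfer function, and the placeholder update rule are assumptions, while the actual sailfish/sardine update equations are given later in the paper.

# Sketch of the generic binary wrapper scheme behind BSO-style
# feature selectors. The update rule is a simplified placeholder.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB

def fitness(mask, X, y, alpha=0.9):
    """Trade off accuracy against feature-set size (weights assumed)."""
    if mask.sum() == 0:
        return 0.0
    acc = cross_val_score(MultinomialNB(), X[:, mask.astype(bool)], y, cv=3).mean()
    return alpha * acc + (1.0 - alpha) * (1.0 - mask.sum() / mask.size)

def binary_wrapper(X, y, pop_size=20, iters=30, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n))        # random 0/1 feature masks
    fits = np.array([fitness(m, X, y) for m in pop])
    best, best_fit = pop[fits.argmax()].copy(), fits.max()
    for _ in range(iters):
        for i in range(pop_size):
            # Placeholder move: attraction toward the best mask plus
            # a mild preference for keeping the current bit values.
            v = 4.0 * (best - pop[i]) + 2.0 * (pop[i] - 0.5) \
                + rng.normal(scale=0.5, size=n)
            prob = 1.0 / (1.0 + np.exp(-v))             # sigmoid transfer function
            pop[i] = (rng.random(n) < prob).astype(int) # binarize the step
            f = fitness(pop[i], X, y)
            if f > best_fit:
                best, best_fit = pop[i].copy(), f
    return best                                         # final 0/1 feature mask

The returned mask selects a subset of the vocabulary; the S-shaped (sigmoid) transfer function is the standard device for mapping a continuous metaheuristic step onto binary feature masks, and the accuracy-versus-size fitness reflects the balance discussed in the contributions below.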
To summarize the contributions of this study to the literature, we present the following points:

• User comments were collected from n11 and Trendyol, which are prominent online sales platforms in Turkey, for the purpose of conducting SA.
• All combinations of the preprocessing techniques used in text mining were systematically evaluated (a small sketch of such an enumeration follows this list).
• Alongside four ML algorithms, we proposed and tested one deep learning (DL) model.
• The BSO algorithm was applied to the collected dataset and utilized as a feature selector in SA for the first time, with its performance compared against HS, BA, ASO, and WOA.
• The results indicate that preprocessing methods can lead to performance degradation in certain instances, and that the Multinomial Naive Bayes algorithm outperformed both the other ML models and the DL model.
• Despite operating with a larger feature set than the other optimization algorithms, the BSO algorithm demonstrates superior performance, achieving a remarkable F-score of 0.91. This success is attributed to its ability to balance feature size and model accuracy effectively, utilizing nearly half of the total available features.
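To make the second point concrete, evaluating every combination of preprocessing steps amounts to toggling each step on or off and scoring the resulting pipeline. The sketch below uses three illustrative steps and a tiny stop-word list as assumptions; the study's own preprocessing set is larger:

# Sketch: exhaustively score every on/off combination of
# preprocessing steps. Steps and stop list are illustrative.
import re
from itertools import product

STOPWORDS = {"ve", "bir", "bu", "da", "de"}   # tiny illustrative Turkish stop list

def preprocess(text, lower, strip_punct, drop_stop):
    if lower:
        text = text.lower()
    if strip_punct:
        text = re.sub(r"[^\w\s]", " ", text)
    tokens = text.split()
    if drop_stop:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return " ".join(tokens)

def score_all_combinations(raw_reviews, evaluate):
    """`evaluate` maps a preprocessed corpus to a quality metric."""
    scores = {}
    for flags in product([False, True], repeat=3):     # 2^3 = 8 pipelines
        corpus = [preprocess(t, *flags) for t in raw_reviews]
        scores[flags] = evaluate(corpus)
    return max(scores, key=scores.get), scores

With k steps this grid has 2^k pipelines, which is why a systematic evaluation of all combinations, as performed in this study, is feasible only for a moderate number of preprocessing techniques.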
This paper is organized as follows: Section 2 provides an overview of relevant literature pertaining to the proposed technique. The proposed study, along with the materials and methods employed, is detailed in Section 3. Section 4 presents the experimental results and a discussion of the findings. Finally, Section 5 concludes the paper and outlines the implications of the study.

2. Literature review

In this section, studies on SA are examined under two headings: SA based on user-item ratings and SA based on user reviews.