Page 71 - IJOCTA-15-1

BSO: Binary Sailfish Optimization for feature selection in sentiment analysis

Table 5. Parameters utilized in the algorithms

Algorithm   Parameter   Value
BA          α           0.9
            γ           0.9
            β           [0, 1]
            r           [0, 1]
HS          hmcr        0.7
            par         0.3
BSO         A           4
            ε           0.001
ASO         α           50
            β           0.2
WOA         No parameter to set

Table 6. Results of optimization algorithms

Metrics                      HS      BA      ASO     WOA     BSO
Accuracy                     0.869   0.893   0.898   0.832   0.908
Precision                    0.847   0.878   0.879   0.801   0.895
Recall                       0.899   0.912   0.921   0.883   0.925
F-score                      0.873   0.894   0.90    0.84    0.91
Avg. selected feature size   10105   9934    9899    9378    12754
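
As a reading aid for Table 6, the sketch below recomputes the F-score from the reported precision and recall. It assumes the F-score is the standard F1 measure (the harmonic mean of precision and recall), which this page does not state explicitly. The names `TABLE_6`, `f_score`, and `recomputed_gaps` are illustrative and not taken from the paper. Small differences in the third decimal are expected, since the table values are rounded and may be averaged over several runs.

```python
# Precision, recall, and F-score copied from Table 6.
TABLE_6 = {
    "HS":  {"precision": 0.847, "recall": 0.899, "f_score": 0.873},
    "BA":  {"precision": 0.878, "recall": 0.912, "f_score": 0.894},
    "ASO": {"precision": 0.879, "recall": 0.921, "f_score": 0.90},
    "WOA": {"precision": 0.801, "recall": 0.883, "f_score": 0.84},
    "BSO": {"precision": 0.895, "recall": 0.925, "f_score": 0.91},
}


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recomputed_gaps(table: dict) -> dict:
    """Absolute gap between the recomputed and the reported F-score."""
    return {
        name: abs(f_score(row["precision"], row["recall"]) - row["f_score"])
        for name, row in table.items()
    }


if __name__ == "__main__":
    gaps = recomputed_gaps(TABLE_6)
    for name, row in TABLE_6.items():
        recomputed = f_score(row["precision"], row["recall"])
        print(f"{name}: reported={row['f_score']}, "
              f"recomputed={recomputed:.4f}, gap={gaps[name]:.4f}")
```

Under this assumption the recomputed values agree with the reported F-scores to within rounding, and BSO remains the highest-scoring algorithm.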


5. Conclusion and future work

Sentiment analysis (SA) involves the classification of emotions expressed in text reviews as either positive or negative. It serves as an information extraction technique that enables businesses to enhance or develop their operations through the analysis of user feedback. Consequently, companies can make informed decisions based on data and respond promptly to critical situations.

In this study, we aimed to improve the accuracy of SA for product reviews by developing a deep learning (DL) based model alongside four machine learning (ML) algorithms applied to user comments. Additionally, we employed feature selection methods to eliminate irrelevant, noisy data within the dataset. This approach yields a more representative dataset, reduces the search space, and enhances the performance of the algorithms. Notably, we applied the Binary Sailfish Optimizer (BSO) as a feature selection method for a textual dataset, marking its first use in SA. To evaluate the performance of the BSO, we also implemented the binary variants of Harmony Search (HS), Bat Algorithm (BA), Atom Search Optimization (ASO), and Whale Optimization Algorithm (WOA) on the same dataset. The dataset for this study was collected from Trendyol and n11, which are prominent online sales platforms in Turkey.

Furthermore, we applied four fundamental preprocessing methods. Upon analysis of the results, it was observed that these preprocessing techniques did not yield a significant improvement in performance; in some instances, they even led to a decrease in accuracy. Among the ML algorithms evaluated, the Multinomial Naive Bayes (MNB) algorithm demonstrated the best performance for SA, achieving an F-score of 0.902. In comparison, the Bidirectional Long Short-Term Memory (BiLSTM) model produced an F-score of 0.883 on the raw dataset, thereby trailing behind the MNB. Ultimately, despite the BSO selecting a greater number of features than the existing algorithms, it was concluded that the BSO outperformed the other four algorithms with an F-score of 0.91 while utilizing nearly half of the total features available.

We conducted an SA of user comments using TF-IDF, focusing on specific categories. While our findings contribute to understanding sentiment in this context, several limitations must be acknowledged. Firstly, the analysis was conducted on user comments collected from only specific categories, which may restrict the generalizability of the findings. Different categories often exhibit distinct sentiment expressions, suggesting that a more diverse dataset would enhance the robustness of the results. Additionally, while the TF-IDF method
