Page 71 - IJOCTA-15-1

BSO: Binary Sailfish Optimization for feature selection in sentiment analysis

Table 5. Parameters utilized in the algorithms

Algorithm Parameter Value
BA α 0.9
BA γ 0.9
BA β [0, 1]
BA r [0, 1]
HS hmcr 0.7
HS par 0.3
BSO A 4
BSO ε 0.001
ASO α 50
ASO β 0.2
WOA No parameter to set

Table 6. Results of optimization algorithms

Metrics HS BA ASO WOA BSO
Accuracy 0.869 0.893 0.898 0.832 0.908
Precision 0.847 0.878 0.879 0.801 0.895
Recall 0.899 0.912 0.921 0.883 0.925
F-score 0.873 0.894 0.90 0.84 0.91
Avg. selected feature size 10105 9934 9899 9378 12754
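
As a quick consistency check on Table 6, the F-score of each algorithm can be recomputed from its reported precision and recall. The sketch below assumes the F-score used here is the standard F1 measure (the harmonic mean of precision and recall); the names `table6`, `f_score`, and `recomputed` are illustrative and not taken from the study. Because the reported values are rounded, and may be averaged over several runs, small differences in the third decimal place are to be expected.

```python
# Precision, recall, and F-score exactly as reported in Table 6.
table6 = {
    "HS":  {"precision": 0.847, "recall": 0.899, "f_score": 0.873},
    "BA":  {"precision": 0.878, "recall": 0.912, "f_score": 0.894},
    "ASO": {"precision": 0.879, "recall": 0.921, "f_score": 0.90},
    "WOA": {"precision": 0.801, "recall": 0.883, "f_score": 0.84},
    "BSO": {"precision": 0.895, "recall": 0.925, "f_score": 0.91},
}


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Recompute the F-score for every algorithm from its precision and recall.
recomputed = {
    name: f_score(m["precision"], m["recall"]) for name, m in table6.items()
}

for name, m in table6.items():
    print(f"{name}: reported {m['f_score']}, recomputed {recomputed[name]:.3f}")
```

Under this assumption, the recomputed values should agree with the reported F-scores to within rounding, and BSO retains the highest F-score of the five algorithms.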

5. Conclusion and future work

Sentiment analysis (SA) involves the classification of emotions expressed in text reviews as either positive or negative. It serves as an information extraction technique that enables businesses to enhance or develop their operations through the analysis of user feedback. Consequently, companies can make informed decisions based on data and respond promptly to critical situations.

In this study, we aimed to improve the accuracy of SA for product reviews by developing a deep learning (DL) based model alongside four machine learning (ML) algorithms applied to user comments. Additionally, we employed feature selection methods to eliminate irrelevant noise data within the dataset. This approach facilitates a more representative dataset, reduces the search space, and enhances the performance of the algorithms. Notably, we applied the Binary Sailfish Optimizer (BSO) as a feature selection method for a textual dataset, marking its first use in SA. To evaluate the performance of the BSO, we also implemented the binary variants of Harmony Search (HS), Bat Algorithm (BA), Atom Search Optimization (ASO), and Whale Optimization algorithm (WOA) on the same dataset. The dataset for this study was collected from Trendyol and n11, which are prominent online sales platforms in Turkey. Furthermore, we applied four fundamental preprocessing methods. Upon analysis of the results, it was observed that these preprocessing techniques did not yield a significant improvement in performance; in some instances, they even led to a decrease in accuracy. Among the ML algorithms evaluated, the Multinomial Naive Bayes (MNB) algorithm demonstrated the best performance for SA, achieving an F-score of 0.902. In comparison, the Bidirectional Long Short-Term Memory (BiLSTM) model produced an F-score of 0.883 on the raw dataset, thereby trailing behind the MNB. Ultimately, despite the BSO selecting a greater number of features than the existing algorithms, it was concluded that the BSO outperformed the other four algorithms with an F-score of 0.91 while utilizing nearly half of the total features available.

We conducted a SA of user comments using TF-IDF, focusing on specific categories. While our findings contribute to understanding sentiment in this context, several limitations must be acknowledged. Firstly, the analysis was conducted on user comments collected from only specific categories, which may restrict the generalizability of the findings. Different categories often exhibit distinct sentiment expressions, suggesting that a more diverse dataset would enhance the robustness of the results. Additionally, while the TF-IDF method
