Page 57 - IJOCTA-15-1
BSO: Binary Sailfish Optimization for feature selection in sentiment analysis
SA can be performed at two levels of granularity. The first is fine-grained analysis, which uses rule-based, dictionary-based and machine learning (ML) approaches to describe sub-sentence-level units such as words, clauses and sentences [6,7]. The second is coarse-grained analysis, which assigns polarity at the sentence level. Shi et al. [8] and Zeng et al. [9] found that fine-grained SA of user comments can make recommendations more accurate. Zeng et al. [10] analyzed the sentiment of reviewers' comments in order to make product recommendations. Incorporating the SA of reviews can significantly reduce the sparsity of user-item rating data, thereby improving the accuracy of the recommendation system [11]. Nonetheless, several studies indicate that deriving complete and precise user preferences from a single granularity level of SA is challenging [12].

Feature selection is the process of identifying the most useful features in a data set, and it strongly affects the performance of ML models. A model with too many irrelevant variables is difficult to explain and interpret, so variable selection transforms it into an easier-to-understand form. At the same time, reducing the number of variables lowers the computational cost, which shortens the training and preparation time of the model [13]. Feature selection also aims to prevent overfitting, in which the model performs well on the training dataset but poorly on the test dataset: when the records in the test dataset are not similar to those in the training dataset, the model's error rate will be high.

Initially, we collected product reviews from selected online platforms. Subsequently, we applied established preprocessing methods suitable for text datasets to the collected data. Following this, we applied feature selection methods to the dataset to identify the words that best differentiate the classes in SA. A significant contribution of this study to the literature is the application of the Binary Sailfish Optimizer (BSO) as a feature selector for the first time in the context of SA. We then compared the performance of the BSO with other optimization algorithms, specifically Harmony Search (HS), the Bat Algorithm (BA), Atom Search Optimization (ASO), and the Whale Optimization Algorithm (WOA), which have previously been utilized as feature selectors in text mining and SA. Ultimately, our findings demonstrate that the BSO is more effective for SA, as it outperformed existing methods and enhanced classification accuracy.

To summarize the contributions of this study to the literature, we present the following points:

• User comments were collected from n11 and Trendyol, which are prominent online sales platforms in Turkey, for the purpose of conducting SA.
• All combinations of the preprocessing techniques utilized in text mining were systematically evaluated.
• Alongside four ML algorithms, we proposed and tested one deep learning (DL) model.
• The BSO algorithm was applied to the collected dataset and utilized as a feature selector in SA for the first time, with its performance compared against HS, BA, ASO, and WOA.
• The results indicate that preprocessing methods can lead to performance degradation in certain instances, and that the Multinomial Naive Bayes algorithm outperformed both the other ML models and the DL model.
• Despite operating with a larger feature set than the other optimization algorithms, the BSO algorithm demonstrates superior performance, achieving a remarkable F-score of 0.91. This success is attributed to its ability to balance feature-set size and model accuracy effectively while utilizing nearly half of the total available features.

This paper is organized as follows: Section 2 provides an overview of relevant literature pertaining to the proposed technique. The proposed study, along with the materials and methods employed, is detailed in Section 3. Section 4 presents the experimental results and a discussion of the findings. Finally, Section 5 concludes the paper and outlines the implications of the study.

2. Literature review

In this section, studies on SA will be examined under two headings: SA based on user-item ratings and SA based on user reviews.
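The wrapper-style feature selection described in the introduction, in which candidate binary masks over the feature set are scored by the classification accuracy they yield, can be sketched as follows. This is a minimal illustration, not the BSO itself: the toy data, the leave-one-out nearest-neighbour scorer, the fitness weights, and the random bit-flip search standing in for the sailfish update rules are all assumptions for demonstration.

```python
import random

# Toy dataset: 8 documents x 6 binary term features, with class labels.
# Features 0 and 1 are informative; the rest are noise (hypothetical data).
X = [
    [1, 1, 0, 1, 0, 1], [1, 0, 1, 0, 1, 0], [1, 1, 0, 0, 0, 1], [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 1], [0, 0, 0, 1, 0, 0], [0, 1, 1, 0, 1, 1], [0, 0, 0, 1, 1, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

def accuracy(mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour vote on the masked features."""
    keep = [j for j, bit in enumerate(mask) if bit]
    if not keep:
        return 0.0
    correct = 0
    for i in range(len(X)):
        # Hamming distance to every other document over the selected features only.
        dists = [(sum(X[i][j] != X[k][j] for j in keep), k)
                 for k in range(len(X)) if k != i]
        correct += y[min(dists)[1]] == y[i]
    return correct / len(X)

def fitness(mask):
    # Reward accuracy, lightly penalise subset size (the weight 0.01 is an assumption).
    return accuracy(mask) - 0.01 * sum(mask)

random.seed(0)
best = [1] * 6                      # start with every feature selected
for _ in range(200):                # random bit-flip search standing in for BSO
    cand = best[:]
    cand[random.randrange(6)] ^= 1  # flip one selection bit
    if fitness(cand) > fitness(best):
        best = cand

print("selected mask:", best, "accuracy:", accuracy(best))
```

Any binary metaheuristic (BSO, HS, BA, ASO, WOA) slots into the search loop; only the rule that proposes the next mask changes, while the mask encoding and the accuracy-versus-subset-size fitness remain the same.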
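The systematic evaluation of all preprocessing combinations mentioned in the contributions can be enumerated with the standard library. The step names below are illustrative placeholders, not the paper's exact preprocessing list:

```python
from itertools import combinations

# Illustrative preprocessing steps (placeholder names, assumed for this sketch).
steps = ["lowercase", "remove_stopwords", "stem", "remove_punctuation"]

# Every subset of the steps is a candidate pipeline, including the empty
# pipeline (no preprocessing at all), giving 2**len(steps) combinations.
pipelines = [combo for r in range(len(steps) + 1)
             for combo in combinations(steps, r)]

print(len(pipelines))  # 16 candidate pipelines for 4 steps
```

Each candidate pipeline would then be applied to the corpus and scored with the downstream classifier, which is how such an evaluation can reveal combinations that degrade rather than improve performance.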

