Page 63 - IJOCTA-15-1
BSO: Binary Sailfish Optimization for feature selection in sentiment analysis
scenario is determined in Section 4.2.3. The vectorized comments are then evaluated for SA using the DL model. Following this stage, we assess the performance of both the ML and DL models according to the specified evaluation metrics. The classification results obtained from the ML and DL algorithms are presented in Section 4.2.3.

Finally, after determining the optimal k value, as well as the algorithms and preprocessing techniques that yield the best results, the last experimental scenario of this study is conducted. In this scenario, the BSO is applied as a feature selection method in SA using the established metrics. In addition to BSO, Harmony Search (HS), Bat Algorithm (BA), Atom Search Optimization (ASO), and Whale Optimization Algorithm (WOA) are also employed with the same metrics to evaluate their performance. These optimization algorithms facilitate the identification of the optimal set of words for effective classification in SA. Detailed information regarding the BSO is provided in Section 3.3.3, while the results of this experimental scenario are discussed in Section 4.3. The structure of the proposed model is illustrated in Figure 1.

3.3.1. Term Frequency-Inverse Document Frequency

The Term Frequency (TF) value is commonly used to determine how important each term is to a given document within a collection of documents. When calculating this value, the frequency of each term in the document is taken as the basis. [77] The higher the frequency of a term, the higher its TF value. Eq. 1 measures how frequently a term t appears in a document d:

$$\mathrm{TF}(t, d) = \frac{\text{count of } t \text{ in } d}{\text{total number of terms in } d} \tag{1}$$

The statistical metric called Inverse Document Frequency (IDF) measures a term's significance in a textual corpus. As a term occurs in more documents of the collection, its IDF value decreases, meaning that the term becomes less distinctive for any particular document. The IDF value of a term t across the entire collection is given in Eq. 2:

$$\mathrm{IDF}(t) = \frac{\text{number of documents in the collection}}{\text{number of documents in which } t \text{ occurs}} \tag{2}$$

3.3.2. Sailfish Optimization Algorithm (SOA)

In this section, the working principle of the SOA is explained.

SOA is a population-based meta-heuristic algorithm motivated by the attack-alternation technique of a group of sailfish chasing a school of sardines. Shadravan et al. [23] created the SOA, which incorporates the behavior of both a predatory group of sailfish and a prey population of sardines. The sailfish is categorized as a social predator since it hunts and catches its prey in groups. Compared to individual hunting, cooperative hunting can help hunters save energy while still achieving their goal of catching prey. [78] In cooperative hunting, predators utilize diverse killing tactics. For example, a group of sailfish is characterized by its alternating attack method: each group member attacks the school of prey (sardines) alone at a given time, injuring or catching some of them, while the remaining group members conserve their energy. [79] In this technique, sailfish are treated as candidate solutions, and a sailfish's position within the search space is one of the critical variables. [23] Sailfish are assumed to be dispersed throughout the search space, whereas the positions of the sardines aid in the discovery of the optimal solution. By changing their position vectors, sailfish can search in one-, two-, three-, or hyper-dimensional space. [80] The algorithm randomizes the movement of the search agents (both sailfish and sardines) as much as possible. When a sailfish attacks a school of prey, it can update its position in relation to the school. Furthermore, the sailfish can adjust its position to occupy empty space around the prey school, which imitates encircling the prey. When a sardine is injured, the prey school adjusts its position to avoid the sailfish attacks that follow. The quality of the updated positions may be marginal, which necessitates an elitist mechanism. An elitist population strategy retains the best individuals in each iteration, while a dynamic attack parameter balances exploration and exploitation. SOA can be examined under four headings.
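
As an illustration of Eqs. 1 and 2 from Section 3.3.1, the following minimal Python sketch computes the two ratios exactly as they are written there. The function names, the toy corpus, and whitespace tokenization are assumptions made only for this example; they are not taken from the study's pipeline. The TF-IDF weight is formed as the usual product of the two values. Eq. 2 is implemented as the plain ratio shown in the text, although many library implementations apply a logarithm to this ratio.

```python
from collections import Counter


def term_frequency(term, document_tokens):
    """Eq. 1: count of the term in the document / total number of terms."""
    if not document_tokens:
        return 0.0
    return Counter(document_tokens)[term] / len(document_tokens)


def inverse_document_frequency(term, corpus_tokens):
    """Eq. 2 as printed: number of documents / number of documents containing the term."""
    containing = sum(1 for doc in corpus_tokens if term in doc)
    if containing == 0:
        # Eq. 2 is undefined for a term that occurs in no document.
        raise ValueError(f"term {term!r} does not occur in the corpus")
    return len(corpus_tokens) / containing


def tf_idf(term, document_tokens, corpus_tokens):
    """TF-IDF weight as the product of Eq. 1 and Eq. 2."""
    return term_frequency(term, document_tokens) * inverse_document_frequency(
        term, corpus_tokens
    )


# Hypothetical toy corpus of three tokenized comments.
corpus = [
    "great phone great battery".split(),
    "battery died fast".split(),
    "great service".split(),
]
```

In this toy corpus, "great" makes up two of the four terms in the first comment and occurs in two of the three comments, while "died" occurs in only one comment and therefore receives a larger IDF value.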

