E. Sonuç, E. Özcan / IJOCTA, Vol.15, No.2, pp.311-329 (2025)
instances (capa, capb, capc) decreases as the history list length increases, especially at L = 100. This leads to a significant increase in the average gap score for longer history lists, contrary to previous trends. The best overall performance is achieved with L = 20, with an average gap of 0.34, closely followed by L = 10, with an average gap of 0.45, suggesting that shorter history lists are more effective under high parallelism. This behavior indicates that excessive parallelism combined with longer history lists may cause over-exploration or thread interference, resulting in suboptimal solutions for extra-large problems.

Table 6 shows the gap scores of PLAHC using 32 threads. While optimal solutions are maintained for small and medium instances across all history list lengths, performance deteriorates dramatically for larger instances as the history list length increases. The best overall performance is achieved with the shortest history lists: L = 10 with an average gap of 0.32 and L = 20 with an average gap of 0.33. Performance degrades significantly for L = 50 and L = 100, primarily due to poor results on the largest instances. These results indicate that as parallelism increases to this level, shorter history lists become more effective, likely because high parallelism already provides diverse exploration. This highlights the need to adjust the trade-off between parallelism and history list length for optimal performance in highly parallel environments, favoring shorter history lists as the number of threads increases.

The comparison of all experimental results shows a clear performance improvement through parallelization over the sequential implementation. The 8-thread parallel implementation with L = 50 emerges as the best overall configuration, achieving the lowest average gap score of 0.29. This configuration provides an optimal balance between parallelism and history list length, and performs well across different problem sizes. As the number of threads increases, the ideal history list length generally decreases, with the 16- and 32-thread implementations favoring shorter lists (L = 20 and L = 10, respectively). However, these high thread counts show poor performance for longer history lists, especially on larger problem instances.

Figure 3 shows the average speedups for the UFLP instances. The results show consistent performance improvements as the number of threads increases. Starting with a 3.33x speedup at 4 threads, performance scales up to 5.52x at 8 threads, 6.96x at 16 threads, and reaches a maximum of 10.00x at 32 threads. Although the speedup is significant, it exhibits sub-linear scaling, meaning that the efficiency gains diminish as the number of threads increases. This is typical in parallel computing due to factors such as communication overhead and algorithm parts that cannot be perfectly parallelized. Despite these limitations, the continued improvement even at 32 threads suggests that high levels of parallelism are beneficial for these UFLP instances. The most significant performance improvements are seen in the initial increments, especially from 4 to 8 threads. Overall, these results suggest that UFLP and PLAHC are well suited to parallelization, although the optimal number of threads depends on the specific hardware capabilities and the trade-off between computation time and resource utilization.

The comparative results of PLAHC (8-thread with L = 50) and the state-of-the-art algorithms on the UFLP instances are reported in Table 7. The performance of PLAHC is evaluated by comparing its gap scores and standard deviation (Std) values with those of the reference algorithms, as presented in the individual columns of the table. The results of these methods were taken directly from the relevant studies. The best gap values for each instance are shown in bold. oBABC is a metaheuristic algorithm designed to address the limitations of existing binary variants of the ABC algorithm, particularly in solving binary optimization problems such as UFLP [39]. MBVS is a metaheuristic algorithm originally designed for continuous optimization problems and later adapted to solve binary problems by transforming continuous values into binary ones [31].

The performance comparison of PLAHC against oBABC and MBVS shows PLAHC's effectiveness in solving UFLP across various instance sizes. For small to medium instances (cap71-cap104), all three algorithms achieve optimal solutions, with PLAHC matching or slightly outperforming the others. PLAHC's strength becomes more apparent for larger instances (cap131-cap134), where it consistently outperforms or matches the other algorithms. For extra-large instances, PLAHC significantly outperforms both oBABC and MBVS on the capa instance, with a gap score of 1.2982 compared to 2.8969 and 5.8962, respectively. However, for capb and capc, MBVS outperforms PLAHC. Nevertheless, PLAHC consistently outperforms oBABC on these instances. In addition, PLAHC generally has lower Std values, especially for larger instances, indicating more consistent performance over multiple runs. This combination of competitive solution quality in most instances, especially for complex
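For context, the gap scores compared throughout this section measure how far a solution's cost deviates from the known optimum, expressed as a percentage. A minimal sketch, assuming the standard relative-gap definition used in UFLP studies (the function and argument names are illustrative, not taken from the paper's implementation):

```python
def gap_score(cost: float, optimal: float) -> float:
    """Percentage deviation of a solution's cost from the known optimum.

    Assumes the standard relative-gap definition common in UFLP studies;
    the names here are illustrative, not taken from the paper.
    """
    return 100.0 * (cost - optimal) / optimal

# A solution whose cost exceeds the optimum by 1.2982% has a gap of
# about 1.2982 -- the scale on which the capa results are reported.
```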
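The history list length L tuned throughout this section is the single parameter of the underlying Late Acceptance Hill Climbing rule: a candidate is accepted if it is no worse than either the current solution or the solution recorded L iterations ago. A minimal sequential sketch of that rule, not the paper's PLAHC implementation (`perturb` is a hypothetical move operator returning a neighbouring cost):

```python
import random

def lahc(initial_cost, perturb, L, iterations, seed=0):
    """Minimal sequential Late Acceptance Hill Climbing sketch.

    Not the paper's PLAHC implementation: `perturb` is a hypothetical
    move operator returning a neighbouring cost, and only costs (not
    solutions) are tracked, which is enough to show the role of L.
    """
    rng = random.Random(seed)
    current = initial_cost
    history = [initial_cost] * L       # circular history list of length L
    for step in range(iterations):
        candidate = perturb(current, rng)
        # Late acceptance: also compare against the cost from L steps ago,
        # so a larger L keeps older (typically worse) costs around longer
        # and admits more uphill moves, i.e. more exploration.
        if candidate <= current or candidate <= history[step % L]:
            current = candidate
        history[step % L] = current    # overwrite the oldest entry
    return current
```

With 32 threads already searching in parallel, diversity comes from the thread count itself, which is consistent with the observation above that a large L (extra uphill acceptance) then hurts rather than helps.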
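The sub-linear scaling described for Figure 3 can be made concrete by converting the reported speedups into parallel efficiency (speedup divided by thread count); the figures below are the averages quoted above:

```python
# Average speedups quoted in the discussion of Figure 3.
speedups = {4: 3.33, 8: 5.52, 16: 6.96, 32: 10.00}

# Parallel efficiency = speedup / thread count; the steady drop below 1.0
# is the sub-linear scaling caused by overheads and serial code sections.
efficiency = {p: s / p for p, s in speedups.items()}

for p in sorted(efficiency):
    print(f"{p:2d} threads: {speedups[p]:5.2f}x speedup, {efficiency[p]:.1%} efficiency")
```

Efficiency falls from roughly 83% at 4 threads to roughly 31% at 32 threads, which quantifies the diminishing returns noted in the text.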