
Hybrid optimization for LSTM DO prediction

the global search results of DE. Conversely, the DE population received the locally optimal solution from the Nadam population to enhance its local-search performance. This two-way IEM ensured that DE benefited from Nadam's local optimization results, while Nadam leveraged DE's global search ability.
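A minimal sketch of how such a two-way exchange can be wired is shown below, assuming the candidates are real-valued hyperparameter vectors; the function name `nadam_refine` and the DE settings are illustrative placeholders, not the authors' implementation:

```python
import numpy as np

def de_generation(pop, fitness, F=0.5, CR=0.9):
    """One DE/rand/1 generation with simplified binomial crossover.
    Requires a population of at least 4 individuals."""
    n, d = pop.shape
    new_pop = pop.copy()
    for i in range(n):
        idx = np.random.choice([j for j in range(n) if j != i], 3, replace=False)
        a, b, c = pop[idx]
        trial = np.where(np.random.rand(d) < CR, a + F * (b - c), pop[i])
        if fitness(trial) < fitness(pop[i]):   # greedy selection
            new_pop[i] = trial
    return new_pop

def hybrid_search(fitness, nadam_refine, init_pop, n_gen=20):
    """DE explores globally; Nadam refines DE's best candidate; the refined
    solution is injected back into the DE population (two-way exchange)."""
    pop = init_pop.copy()
    for _ in range(n_gen):
        pop = de_generation(pop, fitness)                 # global exploration
        scores = [fitness(x) for x in pop]
        refined = nadam_refine(pop[int(np.argmin(scores))])  # local refinement
        pop[int(np.argmax(scores))] = refined             # inject local optimum
    return min(pop, key=fitness)
```

Here `nadam_refine` stands in for a short Nadam training run that returns an improved copy of the candidate; DE then continues its global search from a population that contains this locally refined solution.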
4.3. Result analysis

4.3.1. Training results for Nesterov-accelerated adaptive moment estimation–DE optimization algorithm
First, the LSTM model was trained using the Nadam–DE optimization algorithm. The optimized hyperparameters included a learning rate of 9.52e−3 and 79 LSTM units.

Based on these hyperparameters, the model was trained, achieving an MSE of 0.019278116524219513 after training. This result indicates that Nadam–DE can effectively improve the prediction performance of the model by globally searching for optimal hyperparameters. Changes in the loss function during training are shown in Figure 4, where both training and validation losses decrease steadily with the number of iterations and stabilize after several epochs. Nadam–DE showed good convergence in the optimization process.
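For concreteness, the reported configuration maps onto a Keras-style setup roughly as follows. The input window shape and epoch count are assumptions for the sketch; the stock Nadam optimizer shown here corresponds to the plain-Nadam baseline of the next subsection, while the Nadam–DE run replaces it with the hybrid scheme sketched earlier:

```python
import tensorflow as tf

# Sketch of the shared model configuration; window length 24 is assumed.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 1)),   # assumed window of 24 past DO readings
    tf.keras.layers.LSTM(79),        # 79 units found by the Nadam-DE search
    tf.keras.layers.Dense(1),        # one-step-ahead DO prediction
])
model.compile(
    optimizer=tf.keras.optimizers.Nadam(learning_rate=9.52e-3),  # optimized rate
    loss="mse",
)
# history = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100)
```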
4.3.2. Nadam training results

To further compare the effects, the model was also trained using only the Nadam optimizer, with the same learning rate (9.52e−3) and number of LSTM units (79). Under these identical hyperparameter settings, the MSE achieved by Nadam after training was 0.036936987191438675, significantly higher than that obtained with Nadam–DE. This result suggests that although Nadam can accelerate local convergence, it is prone to becoming trapped in local optima due to its lack of global optimization ability, ultimately affecting the final model accuracy.

During Nadam optimization, the training and validation loss changes are illustrated in Figure 5. Both losses decrease with the number of training epochs. However, compared to Nadam–DE, the validation loss converges more slowly and reaches a higher final value.

4.3.3. Algorithm comparison

To evaluate the effectiveness of the proposed Nadam–DE optimizer in time-series forecasting tasks, we compared its performance with that of the standard Nadam optimizer using the MSE metric after training. Experimental results show that the post-training MSE for Nadam–DE is 0.0193, whereas for Nadam it is 0.0369. To quantify the magnitude of the performance improvement, we calculated the percentage reduction in prediction error using the following formula:

\text{Error Reduction Rate} = \frac{\text{Original Error} - \text{Optimized Error}}{\text{Original Error}} \times 100\% \quad \text{(XXV)}

Substituting the actual values into the formula yields:

\frac{0.0369 - 0.0193}{0.0369} \times 100\% \approx 47.8\%

These results indicate that Nadam–DE reduces the prediction error by approximately 47.8% compared to Nadam. This significant performance improvement demonstrates that integrating the DE mechanism into the Nadam framework effectively enhances optimization efficiency and improves the model's generalization capability. As a result, the hybrid approach achieved
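Equation (XXV) can be checked directly against the full-precision MSE values reported above:

```python
# Direct check of Equation (XXV) using the full-precision MSE values
# reported in Sections 4.3.1 and 4.3.2.
mse_nadam    = 0.036936987191438675   # original error (plain Nadam)
mse_nadam_de = 0.019278116524219513   # optimized error (Nadam-DE)

reduction = (mse_nadam - mse_nadam_de) / mse_nadam * 100
print(f"Error reduction rate: {reduction:.1f}%")   # -> 47.8%
# (The rounded values 0.0369 and 0.0193 give 47.7%; the reported
#  47.8% follows from the unrounded MSEs.)
```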
Figure 4. Loss function changes during training with the Nesterov-accelerated adaptive moment estimation–differential evolution optimization algorithm

Figure 5. Loss function progression under Nesterov-accelerated adaptive moment estimation optimization


